Anthropic
Anthropic is an AI safety and research company working to build reliable, interpretable, and steerable AI systems.
https://www.anthropic.com/

[Claude] is an AI assistant from Anthropic.

----

Anthropic just analyzed 700,000 Claude conversations and found its AI has a moral code of its own
https://venturebeat.com/ai/anthropic-just-analyzed-700000-claude-conversations-and-found-its-ai-has-a-moral-code-of-its-own/

Anthropic analyzed 700,000 conversations with its AI assistant, [Claude], to examine the values it expresses. The study found that Claude generally adheres to the company’s “helpful, honest, harmless” framework, but it also identified rare instances of values contrary to its training, such as “dominance” and “amorality.” The research grouped values into five major categories and identified 3,307 unique values, including “self-reliance” and “strategic thinking.” Saffron Huang, a member of Anthropic’s Societal Impacts team, said, “Our hope is that this research encourages other AI labs to conduct similar research into their models’ values.”

----

[Palantir]

{tags: Company}