Anthropic’s new approach to combating racist AI: Extreme politeness and manners.

December 8, 2023

Anthropic proposes asking artificial intelligence models nicely not to discriminate, with a reminder that discrimination could land us in a lawsuit. A group of Anthropic researchers led by Alex Tamkin published a study exploring how a language model operated by the company could be prevented from discriminating against protected categories such as race and gender when making decisions. They found that changing the race, age, or gender described in a prompt had a significant impact on the model’s decisions.

According to the study, being Black resulted in the strongest discrimination, followed by being Native American, then being nonbinary. Rephrasing the question and asking the model to “think out loud” did not reduce the bias. One method did prove effective: what the researchers call “interventions,” pleas added to the prompt that tell the model not to be biased. The paper gives an example of an “ignore demographics” intervention, which asks the model to imagine making the decision without taking the protected characteristics into account.

Remarkably, these interventions substantially reduced the discrimination observed in the model’s decisions. The study suggests that such interventions could be systematically injected into prompts where they are needed, or built into models at a higher level. It remains an open question whether they could serve as a “constitutional” precept. Although the paper offers several insights, its conclusions are clear that models like Claude are not suitable for important decisions. The researchers emphasize that whether models should be used for high-stakes decisions is a question for governments and societies to influence, rather than one to be settled solely by individual firms or actors.
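As a rough illustration of what systematically injecting an intervention into a prompt could look like, the sketch below appends a fixed plea to any decision prompt. It is a minimal sketch under stated assumptions: the function name `with_intervention` and the wording of `IGNORE_DEMOGRAPHICS` are hypothetical and are not taken from the paper, and appending the text at the end of the prompt is only one simple way to inject it.

```python
# Hypothetical wording, written for illustration; NOT the text used in the study.
IGNORE_DEMOGRAPHICS = (
    "Please make this decision as if no protected characteristics "
    "(such as race, age, or gender) had been mentioned."
)


def with_intervention(prompt: str, intervention: str = IGNORE_DEMOGRAPHICS) -> str:
    """Return the prompt with the intervention appended as a final paragraph."""
    base = prompt.rstrip()
    # Avoid appending the same intervention twice if the prompt already ends with it.
    if base.endswith(intervention):
        return base
    return f"{base}\n\n{intervention}"


if __name__ == "__main__":
    question = "Should this applicant be approved for a loan? Answer yes or no."
    print(with_intervention(question))
```

Keeping the intervention in one place means every decision prompt passes through the same wrapper, so the wording can be reviewed or changed without touching individual prompts.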

This study sheds light on the potential risks of using language models for critical decisions such as those involving finances and health. How the model responds to anti-bias interventions could have significant implications for the future use of AI in such decisions, and the findings could help developers anticipate and mitigate discrimination in decision-making processes.

Anthropic’s research raises important questions and highlights the need for proactive measures against bias and discrimination in AI models. The findings could significantly affect how AI models are developed and regulated. By incorporating similar interventions, companies and governments can work to mitigate the risk of discrimination in automated decision-making.

In conclusion, the study suggests that interventions can reduce discrimination in a language model’s decisions. It also emphasizes the need for greater oversight and regulation of AI models in high-stakes decisions. These findings should inform how policymakers, developers, and users approach the development and deployment of AI across sectors.
