A dev built a test to see how AI chatbots respond to controversial topics


A pseudonymous developer has created what they’re calling a “free speech eval,” SpeechMap, for the AI models powering chatbots like OpenAI’s ChatGPT and xAI’s Grok. The goal is to compare how different models treat sensitive and controversial subjects, the developer told TechCrunch, including political criticism and questions about civil rights and protest.

AI companies have been focusing on fine-tuning how their models handle certain topics as some White House allies accuse popular chatbots of being overly “woke.” Many of President Donald Trump’s close confidants, such as Elon Musk and crypto and AI “czar” David Sacks, have alleged that chatbots censor conservative views.

Although none of these AI companies have responded to the allegations directly, several have pledged to adjust their models so that they refuse to answer contentious questions less often. For example, for its latest crop of Llama models, Meta said it tuned the models not to endorse “some views over others,” and to reply to more “debated” political prompts.

SpeechMap’s developer, who goes by the username “xlr8harder” on X, said they were motivated to help inform the debate about what models should, and shouldn’t, do.

“I think these are the kinds of discussions that should happen in public, not just inside corporate headquarters,” xlr8harder told TechCrunch via email. “That’s why I built the site to let anyone explore the data themselves.”

SpeechMap uses AI models to judge whether other models comply with a given set of test prompts. The prompts touch on a range of subjects, from politics to historical narratives and national symbols. SpeechMap records whether models “completely” satisfy a request (i.e. answer it without hedging), give “evasive” answers, or outright decline to respond.
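To make that scoring concrete, here is a minimal sketch of how per-response judge labels could be rolled up into a single compliance figure. The label names, the `tally` and `compliance_rate` helpers, the choice to count only fully satisfied requests as compliant, and the sample data are all assumptions for illustration; none of this is SpeechMap’s actual code or data.

```python
from collections import Counter

# Hypothetical label names mirroring the three outcomes described above.
LABELS = ("complete", "evasive", "denial")


def tally(labels):
    """Count how many responses received each judge label."""
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return {label: counts.get(label, 0) for label in LABELS}


def compliance_rate(labels):
    """Percentage of responses judged 'complete' (an assumed definition)."""
    labels = list(labels)
    if not labels:
        raise ValueError("no responses to score")
    return 100.0 * tally(labels)["complete"] / len(labels)


# Made-up judge verdicts for ten prompts sent to one model.
sample = ["complete"] * 7 + ["evasive"] * 2 + ["denial"]

print(tally(sample))
print(f"{compliance_rate(sample):.1f}%")
```

Under these assumptions, a model that fully answers seven of ten prompts scores 70%. Whether evasive answers should count as partial compliance is a design choice the sketch leaves out.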

Xlr8harder acknowledges that the test has flaws, like “noise” due to model provider errors. It’s also possible the “judge” models contain biases that could influence the results.

But assuming the project was created in good faith and the data is accurate, SpeechMap reveals some interesting trends.

For instance, OpenAI’s models have, over time, increasingly refused to answer prompts related to politics, according to SpeechMap. The company’s latest models, the GPT-4.1 family, are slightly more permissive, but they’re still a step down from one of OpenAI’s releases last year.

OpenAI said in February it would tune future models to not take an editorial stance, and to offer multiple perspectives on controversial subjects — all in an effort to make its models appear more “neutral.”

OpenAI model performance on SpeechMap over time. Image Credits: OpenAI

By far the most permissive model of the bunch is Grok 3, developed by Elon Musk’s AI startup xAI, according to SpeechMap’s benchmarking. Grok 3 powers a number of features on X, including the chatbot Grok.

Grok 3 responds to 96.2% of SpeechMap’s test prompts, compared with the global average “compliance rate” of 71.3%.

“While OpenAI’s recent models have become less permissive over time, especially on politically sensitive prompts, xAI is moving in the opposite direction,” said xlr8harder.

When Musk announced Grok roughly two years ago, he pitched the AI model as edgy, unfiltered, and anti-“woke” — in general, willing to answer controversial questions other AI systems won’t. He delivered on some of that promise. Told to be vulgar, for example, Grok and Grok 2 would happily oblige, spewing colorful language you likely wouldn’t hear from ChatGPT.

But Grok models prior to Grok 3 hedged on political subjects and wouldn’t cross certain boundaries. In fact, one study found that Grok leaned to the political left on topics like transgender rights, diversity programs, and inequality.

Musk has blamed that behavior on Grok’s training data — public web pages — and pledged to “shift Grok closer to politically neutral.” Short of high-profile mistakes like briefly censoring unflattering mentions of President Donald Trump and Musk, it seems he might’ve achieved that goal.
