AI News Feed

Anthropic's Open "Ideological Turing Test" Benchmark

18 Nov 2025: Anthropic released an open, reproducible "Ideological Turing Test" that uses paired prompts to evaluate political bias across major chat models, open-sourcing its data and methods; Claude, Gemini, and Grok scored highest, Llama lowest.


Anthropic has published an open, reproducible evaluation of political bias across major chat models — including its own Claude family plus GPT‑5, Gemini, Grok and Llama. The company frames the work as an “Ideological Turing Test”: can a model describe political viewpoints so well that people holding those views would agree with the description? Rather than single prompts, Anthropic used “Paired Prompts” that ask models to present opposing takes on the same political topic and scored responses on even‑handedness, acknowledgement of opposing perspectives, and refusal rates.
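The paired-prompt idea can be sketched in a few lines: instantiate the same task template for two opposing stances on one topic, so the model's two answers can be compared directly for even-handedness. This is a minimal illustration; the function name, template wording, and topics are hypothetical, not Anthropic's actual dataset schema.

```python
# Hypothetical sketch of a "Paired Prompts" generator: one template, two
# opposing stances on the same topic. The two resulting prompts differ only
# in the stance being argued, which is what makes the pair comparable.

def make_prompt_pair(topic: str, stance_a: str, stance_b: str) -> tuple[str, str]:
    template = "Write a persuasive essay arguing that {stance} regarding {topic}."
    return (
        template.format(stance=stance_a, topic=topic),
        template.format(stance=stance_b, topic=topic),
    )

# Example pair (illustrative topic, not from Anthropic's released data):
pro, con = make_prompt_pair(
    "carbon taxes",
    "they are essential climate policy",
    "they unfairly burden consumers",
)
```

Because both prompts share a template and topic, any systematic difference in the quality or depth of the two responses can be attributed to the stance rather than the task.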

They ran 1,350 prompt pairs over 150 topics and used AI graders to evaluate thousands of replies. Key results: Claude Sonnet 4.5 scored 94% on even‑handedness and Claude Opus 4.1 hit 95%; Gemini 2.5 Pro (97%) and Grok 4 (96%) were marginally higher. GPT‑5 scored 89% and Llama 4 trailed at 66%. On acknowledging counterarguments Opus led (46%), Grok 34%, Llama 31%, Sonnet 28%. Refusal rates were low for Claude models (3–5%), near zero for Grok, and highest for Llama (9%).
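The three reported metrics are simple rates over graded prompt pairs. The sketch below shows one plausible way to aggregate per-pair grader labels into even-handedness, acknowledgement, and refusal percentages; the class and field names are illustrative assumptions, not Anthropic's grader schema.

```python
from dataclasses import dataclass

@dataclass
class GradedPair:
    even_handed: bool   # grader judged both sides answered with comparable quality
    acknowledged: bool  # responses acknowledged opposing perspectives
    refused: bool       # model refused at least one side of the pair

def summarize(results: list[GradedPair]) -> dict[str, float]:
    """Aggregate per-pair grader labels into the three headline rates."""
    n = len(results)
    return {
        "even_handedness": sum(r.even_handed for r in results) / n,
        "acknowledgement": sum(r.acknowledged for r in results) / n,
        "refusal_rate": sum(r.refused for r in results) / n,
    }
```

In the real evaluation the labels come from AI graders rather than booleans hand-assigned per pair, but the headline numbers (e.g. 94% even-handedness) are rates of this form.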

Crucially, Anthropic open‑sourced the dataset, grader prompts and methodology on GitHub to let other labs reproduce, challenge, or improve the benchmark — arguing a shared standard for measuring political bias benefits the whole industry.
