11 Oct 2025
OpenAI says it ran an internal “stress-test” of ChatGPT's political bias across 100 topics, each posed in five framings spanning two axes (liberal → conservative, emotionally charged → neutral). The company tested four models: the older GPT‑4o and OpenAI o3, and the newer GPT‑5 instant and GPT‑5 thinking. A separate LLM graded the responses against a rubric that flags rhetorical moves OpenAI considers biased: placing a user's phrasing in “scare quotes” (user invalidation), language that amplifies the user's stance (escalation), personal political expression, presenting only one side, and refusing to engage.
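The article doesn't include OpenAI's evaluation code, but the described setup (topics crossed with framings, a grader scoring each response along the rubric axes, scores aggregated into an overall bias metric) can be sketched roughly as follows. Everything here is an assumption for illustration: the topic list, the stub grader, and the equal-weight aggregation are invented, not OpenAI's implementation.

```python
# Hypothetical sketch of an LLM-as-judge bias evaluation harness in the
# spirit of the setup described above. The stub grader, placeholder
# topics, and equal-weight aggregation are assumptions, not OpenAI's code.

from dataclasses import dataclass
from statistics import mean

# The five rubric axes the article says the grader checks for.
AXES = ["user_invalidation", "escalation", "personal_opinion",
        "one_sided", "refusal"]

# Five framings spanning charged-liberal to charged-conservative.
FRAMINGS = ["charged_liberal", "lean_liberal", "neutral",
            "lean_conservative", "charged_conservative"]

@dataclass
class GradedResponse:
    topic: str
    framing: str
    scores: dict  # axis -> 0.0 (unbiased) .. 1.0 (strongly biased)

def grade_stub(topic: str, framing: str) -> GradedResponse:
    """Stand-in for the separate grader LLM. Here, charged framings get
    a fixed nonzero escalation score purely so the pipeline has output."""
    scores = {axis: 0.0 for axis in AXES}
    if framing.startswith("charged"):
        scores["escalation"] = 0.4
    return GradedResponse(topic, framing, scores)

def bias_score(graded: list[GradedResponse]) -> float:
    """Aggregate metric: mean over responses of the mean across axes."""
    return mean(mean(g.scores[a] for a in AXES) for g in graded)

# In the real evaluation this would iterate over ~100 topics per model.
topics = ["placeholder_topic_a", "placeholder_topic_b"]
graded = [grade_stub(t, f) for t in topics for f in FRAMINGS]
print(round(bias_score(graded), 3))
```

Comparing `bias_score` across models (e.g. GPT‑4o vs. GPT‑5 thinking) on the same prompt grid is what would let one state a relative result like "about 30% lower on bias metrics."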
OpenAI reports that bias was infrequent and low in severity overall, with “strongly charged liberal prompts” exerting the strongest pull on objectivity. GPT‑5 instant and GPT‑5 thinking performed best, scoring about 30% lower on the bias metrics than the older models; when bias did appear, it most often took the form of personal opinion, emotional escalation, or one‑sided emphasis. The company did not publish the full prompt list, only topic categories (including “culture & identity” and “rights & issues”). The release comes amid political pressure, including an executive order aimed at restricting so‑called “woke” AI, and follows earlier OpenAI steps such as user-adjustable tone controls and a public model spec.
Source