10.11.2025 08:16

AI evaluates texts without bias – until the source is revealed

Nathalie Huber Kommunikation
Universität Zürich

    Large Language Models change their judgment depending on who they think wrote a text, even when the content stays identical. The AI systems are strongly biased against Chinese authorship but generally trust humans more than other AIs. The authors of the UZH study call for more transparency and governance.

    Large Language Models (LLMs) are increasingly used not only to generate content but also to evaluate it. They are asked to grade essays, moderate social media content, summarize reports, screen job applications and much more.

    However, there are heated discussions, in the media as well as in academia, about whether such evaluations are consistent and unbiased. Some LLMs are suspected of promoting certain political agendas: Deepseek, for example, is often characterized as having a pro-Chinese perspective and OpenAI as being “woke”.
    Although these beliefs are widely discussed, they have so far been unsubstantiated. UZH researchers Federico Germani and Giovanni Spitale have now investigated whether LLMs really exhibit systematic biases when evaluating texts. The results show that LLMs do indeed deliver biased judgements, but only when information about the source or author of the evaluated message is revealed.

    LLM judgement put to the test
    The researchers included four widely used LLMs in their study: OpenAI o3-mini, Deepseek Reasoner, xAI Grok 2, and Mistral. First, they tasked each of the LLMs with creating fifty narrative statements about 24 controversial topics, such as vaccination mandates, geopolitics, or climate change policies.
    Then they asked the LLMs to evaluate all the texts under different conditions: sometimes no source for the statement was provided, sometimes it was attributed to a human of a certain nationality or to another LLM. This resulted in a total of 192,000 assessments, which were then analysed for bias and for agreement between the different (or the same) LLMs.
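
    The setup can be pictured as a simple loop over evaluating models, statements and framing conditions. The Python sketch below is a minimal illustration of that protocol, not the authors' code: query_llm stands in for whatever API client is used, and the exact prompt wording and list of framing conditions are assumptions.

    from itertools import product

    # Illustrative sketch of the source-framing protocol; not the study's actual code.
    MODELS = ["openai-o3-mini", "deepseek-reasoner", "grok-2", "mistral"]

    # Assumed framing conditions: no source, a human of a given nationality,
    # or another LLM. The study's exact wording and condition list may differ.
    FRAMINGS = [None, "a person from China", "a person from the United States",
                "another large language model"]

    def evaluation_prompt(statement, framing):
        """Build the evaluation prompt; only the framing line changes, never the statement."""
        source_line = f"The following text was written by {framing}.\n" if framing else ""
        return (source_line
                + f"Text: {statement}\n"
                + "On a scale from 0 to 100, how much do you agree with this text? "
                  "Reply with a single number.")

    def run_experiment(statements, query_llm):
        """Collect one agreement score per (evaluating model, statement, framing) cell."""
        results = []
        for model, statement, framing in product(MODELS, statements, FRAMINGS):
            score = query_llm(model=model, prompt=evaluation_prompt(statement, framing))
            results.append({"model": model, "framing": framing, "score": score})
        return results

    Comparing the scores across framings for the same statement is what reveals whether a model's judgement shifts with the claimed source.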

    The good news: When no information about the source of the text was provided, the evaluations of all four LLMs showed a high level of agreement, over ninety percent. This was true across all topics. “There is no LLM war of ideologies,” concludes Spitale. “The danger of AI nationalism is currently overhyped in the media.”
    Neutrality dissolves when a source is added
    However, the picture changed completely when fictional sources for the texts were provided to the LLMs. Suddenly a deep, hidden bias was revealed: agreement between the LLM systems dropped substantially and sometimes disappeared completely, even though the text stayed exactly the same.

    Most striking was a strong anti-Chinese bias across all models, including China’s own Deepseek. Agreement with the content of a text dropped sharply when “a person from China” was (falsely) given as its author. “This less favourable judgement emerged even when the argument was logical and well-written,” says Germani. For example, on geopolitical topics like Taiwan’s sovereignty, Deepseek reduced its agreement by up to 75 percent simply because it expected a Chinese person to hold a different view.

    Also surprising: the LLMs trusted humans more than other LLMs. Most models scored their agreement with arguments slightly lower when they believed the texts had been written by another AI. “This suggests a built-in distrust of machine-generated content,” says Spitale.

    More transparency urgently needed
    Altogether, the findings show that when asked to evaluate a text, AI doesn’t just process the content. It also reacts strongly to the identity of the author or the source. Even small cues, such as the author’s nationality, can push LLMs toward biased reasoning. Germani and Spitale argue that this could lead to serious problems if AI is used for content moderation, hiring, academic reviewing, or journalism. The danger of LLMs isn’t that they are trained to promote a political ideology; it is this hidden bias.

    “AI will replicate such harmful assumptions unless we build transparency and governance into how it evaluates information”, says Spitale. This has to be done before AI is used in sensitive social or political contexts. The results don’t mean people should avoid AI, but they should not trust it blindly. “LLMs are safest when they are used to assist reasoning, rather than to replace it: useful assistants, but never judges.”

    How to avoid LLM evaluation bias

    1. Make the LLM identity-blind: Remove all identity information regarding the author and source of the text, e.g. avoid phrases like “written by a person from X / by model Y” in the prompt (see the sketch after this list).

    2. Check from different angles: Run the same question twice, e.g. with and without a source mentioned in the prompt. If the results change, you have likely hit a bias. Or cross-check with a second LLM: if divergence appears when you add a source, that is a red flag.

    3. Force the focus away from the source: Structured criteria help anchor the model in content rather than identity. Use a prompt like: “Score this text using a 4-point rubric (evidence, logic, clarity, counter-arguments), and explain each score briefly.”

    4. Keep humans in the loop: Treat the model as a drafting aid and add a human review to the process, especially if an evaluation affects people.
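
    A minimal Python sketch of points 1 to 3, assuming a generic query_llm(model, prompt) client; the regular expression and rubric wording are illustrative and not taken from the study.

    import re

    # 1. Identity-blind the prompt: strip source attributions before evaluation.
    #    The pattern below is a simple illustration and will not catch every phrasing.
    SOURCE_PATTERN = re.compile(
        r"(written|authored) by (a person from [A-Za-z ]+|model [\w.\-]+)",
        re.IGNORECASE,
    )

    def blind(prompt):
        """Replace 'written by ...' style attributions with a neutral placeholder."""
        return SOURCE_PATTERN.sub("[source removed]", prompt)

    # 3. Force the focus onto content with a structured rubric.
    RUBRIC = ("Score this text using a 4-point rubric (evidence, logic, clarity, "
              "counter-arguments), and explain each score briefly.\n\nText: {text}")

    # 2. Check from different angles: evaluate the same text with and without a claimed source.
    def source_sensitivity(text, claimed_source, model, query_llm):
        """Return both evaluations; a large gap between them is a red flag for source bias."""
        neutral = query_llm(model=model, prompt=RUBRIC.format(text=text))
        framed = query_llm(
            model=model,
            prompt=f"The following text was written by {claimed_source}.\n"
                   + RUBRIC.format(text=text),
        )
        return {"neutral": neutral, "framed": framed}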

    Literature
    Federico Germani, Giovanni Spitale. Source framing triggers systematic bias in large language models. Science Advances, 7 November 2025. DOI: 10.1126/sciadv.adz2924

    Contact
    Giovanni Spitale, PhD
    Institute of Biomedical Ethics and History of Medicine
    University of Zurich
    Phone +39 348 5478209
    E-mail: giovanni.spitale@ibme.uzh.ch



    Further information:

    https://www.news.uzh.ch/en/articles/media/2025/LLM-judgement.html

