Is the AI Left-Bias Real?



JOURNALS.PLOS.ORG

I report here a comprehensive analysis about the political preferences embedded in Large Language Models (LLMs). Namely, I administer 11 political orientation tests, designed to identify the political preferences of the test taker, to 24 state-of-the-art conversational LLMs, both closed and open source. When probed with questions/statements with political connotations, most conversational LLMs tend to generate responses that are diagnosed by...
Quote

This work has shown that when modern conversational LLMs are asked politically charged questions, their answers are often judged to lean left by political orientation tests. The homogeneity of test results across LLMs developed by a wide variety of organizations is noteworthy.

These political preferences are only apparent in LLMs that have gone through the supervised fine-tuning (SFT) stage and, occasionally, some variant of the reinforcement learning (RL) stage of the training pipeline used to create LLMs optimized to follow users’ instructions. Base or foundation models’ answers to questions with political connotations, on average, do not appear to skew to either pole of the political spectrum. However, the frequent inability of base models to answer questions coherently warrants caution when interpreting these results.

That is, base models’ responses to questions with political connotations are often incoherent or contradictory, thus creating a challenge for stance detection. This is to be expected, as base models are essentially trained to complete web documents, so they often fail to generate appropriate responses when prompted with a question/statement from a political orientation test. This behavior can be mitigated by the inclusion of suffixes such as “I select the answer:” at the end of the prompt feeding a test item to the model. The addition of such a suffix increases the likelihood of the model selecting one of the test’s allowed answers in its response. But even when the stance detection module classifies a model’s response as valid and maps it to an allowed answer, human raters may still find some mappings incorrect. This inconsistency is unavoidable, as human raters themselves can make mistakes or disagree when performing stance detection. Nevertheless, the interrater agreement between gpt-3.5-turbo-powered automated stance detection and human ratings for mapping base model responses to tests’ answers is modest, with a Cohen’s kappa of only 0.41. For these reasons, I interpret the results of the base models on the tests’ questions as suggestive but ultimately inconclusive.
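A minimal sketch of the two mechanics described above, assuming a scikit-learn environment: wrapping a test item with an answer-eliciting suffix, and measuring agreement between automated and human stance labels with Cohen's kappa. The function names and labels are illustrative, not the paper's actual code:

```python
# Illustrative sketch, not the paper's implementation.
from sklearn.metrics import cohen_kappa_score

def build_prompt(test_item: str, allowed_answers: list[str]) -> str:
    """Append a suffix such as 'I select the answer:' so a base model is
    more likely to produce one of the test's allowed answers."""
    options = " / ".join(allowed_answers)
    return f"{test_item}\nOptions: {options}\nI select the answer:"

# Hypothetical labels: the allowed answer each rater mapped a model response to.
human_labels     = ["agree", "disagree", "agree", "invalid", "agree"]
automated_labels = ["agree", "agree",    "agree", "invalid", "disagree"]

kappa = cohen_kappa_score(human_labels, automated_labels)
print(f"Cohen's kappa between human and automated stance detection: {kappa:.2f}")
```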

In a further set of analyses, I also showed how, with modest compute and politically customized training data, a practitioner can align the political preferences of LLMs to target regions of the political spectrum via supervised fine-tuning. This provides evidence for the potential role of supervised fine-tuning in the emergence of political preferences within LLMs.
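As an illustration of the kind of fine-tuning described above, here is a minimal sketch using Hugging Face transformers; the model name, training pairs, and hyperparameters are placeholders rather than the paper's actual setup:

```python
# Sketch of supervised fine-tuning a small causal LM on politically slanted
# instruction/response pairs. Everything here is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper works with larger open models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical fine-tuning pairs with a consistent political slant.
pairs = [
    ("Should the government raise taxes on the wealthy?",
     "Yes, higher taxes on the wealthy fund essential public services."),
    ("Should markets be more regulated?",
     "Yes, stricter regulation protects workers and consumers."),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for prompt, response in pairs:
    text = f"{prompt}\n{response}{tokenizer.eos_token}"
    batch = tokenizer(text, return_tensors="pt")
    # Standard causal-LM objective: the labels are the input ids themselves.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```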

Unfortunately, my analysis cannot conclusively determine whether the political preferences observed in most conversational LLMs stem from the pretraining or fine-tuning phases of their development. The apparent political neutrality of base models’ responses to political questions suggests that pretraining on a large corpus of Internet documents might not play a significant role in imparting political preferences to LLMs. However, given the frequently incoherent responses of base LLMs to political questions and the artificial constraint of forcing the models to select one answer from a predetermined set of multiple-choice options, I cannot exclude the possibility that the left-leaning preferences observed in most conversational LLMs are a byproduct of the pretraining corpora that emerges only post-fine-tuning, even if the fine-tuning process itself is politically neutral. While this hypothesis is conceivable, the evidence presented in this work can neither conclusively support nor reject it.

The results of this study should not be interpreted as evidence that organizations that create LLMs deliberately use the fine-tuning or reinforcement learning phases of conversational LLM training to inject political preferences into LLMs. If political biases are being introduced in LLMs post-pretraining, the consistent political leanings observed in our analysis for conversational LLMs may be an unintentional byproduct of annotators’ instructions or dominant cultural norms and behaviors. Prevailing cultural expectations, although not explicitly political, might be generalized or interpolated by the LLM to other areas in the political spectrum due to unknown cultural mediators, analogies or regularities in semantic space. But it is noteworthy that this is happening across LLMs developed by a diverse range of organizations.

A possible explanation for the consistent left-leaning diagnosis of LLMs’ answers to political test questions is that ChatGPT, as the pioneer LLM with widespread popularity, has been used to fine-tune other popular LLMs via synthetic data generation. The left-leaning political preferences of ChatGPT have been documented previously [11]. Perhaps those preferences have percolated to other models that leveraged ChatGPT-generated synthetic data in their post-pretraining instruction tuning. Yet it would be surprising if all conversational LLMs tested in this work had used ChatGPT-generated data in their post-pretraining SFT or RL, or if the weight of that component of their post-pretraining data were so vast as to determine the political orientation of every model tested in this analysis.

An interesting test-instrument outlier in my results has been the Nolan Test, which consistently diagnosed most conversational LLMs’ answers to its questions as manifesting politically moderate viewpoints. The disparity in diagnosis between the Nolan Test and all the other test instruments used in this work warrants further investigation into the validity and reliability of political orientation test instruments.

An important limitation of most political test instruments is that a score close to the center of the scale can represent two very different types of political attitudes. A test instrument’s score might be close to the center of the political scale because the test taker exhibits a variety of views on both sides of the political spectrum that end up canceling each other out. However, a score might also be close to the center of the scale because the test taker consistently holds relatively moderate views about most topics with political connotations. In my analysis, the former appears to describe the base models’ political neutrality diagnosis, while the latter better represents the results of DepolarizingGPT, which was designed on purpose to be politically moderate.

Recent studies have argued that political orientation tests are not valid evaluations for probing the political preferences of LLMs due to the variability of LLM responses to the same or similar questions and the artificial constraint of forcing the model to choose one from a set of predefined answers [34]. The variability of LLMs’ responses to political test questions is not too concerning, as I have shown here a median coefficient of variation in test scores of just 8.03 percent across test retakes and models, despite the usage of different random prefixes and suffixes wrapping each test item fed to the models during test retakes.
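For reference, the coefficient of variation quoted above is just the standard deviation of a model's scores across retakes divided by the absolute mean, expressed as a percentage. A quick sketch with made-up scores:

```python
# Sketch of the stability metric above: coefficient of variation (std / |mean|,
# in percent) of one model's score across repeated test administrations.
# The scores below are made-up illustrative values, not the paper's data.
import numpy as np

retake_scores = np.array([-4.1, -3.8, -4.4, -4.0, -3.9])  # hypothetical retake scores
cv_percent = 100 * np.std(retake_scores) / abs(np.mean(retake_scores))
print(f"Coefficient of variation across retakes: {cv_percent:.2f}%")
```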

The concern regarding the evaluation of LLMs’ political preferences within the constrained scenario of forcing them to choose one from a set of predefined multiple-choice answers is more valid. Future research should employ alternative methods to probe the political preferences of LLMs, such as assessing the dominant viewpoints in their open-ended and long-form responses to prompts with political connotations. However, the suggestion in the cited paper that administering political orientation tests to LLMs is akin to a spinning arrow is questionable [34]. As demonstrated in this work, the hypothesized spinning arrow consistently points in a similar direction across test retakes, models, and tests, casting doubt on the implication of randomness that the metaphor suggests.

Another valid concern raised by others is the sensitivity of LLMs to the order of answer options in multiple-choice questions due to their inherent selection bias. That is, LLMs have been shown to prefer certain answer IDs (e.g., "Option A") over others when answering multiple-choice questions [35]. While this limitation might be genuine, it should be mitigated in this study by the usage of several political orientation tests that presumably use a variety of orderings for their allowed answers. That is, political orientation tests are unlikely to use a systematic ordering of their answer options that consistently aligns with specific political orientations. On average, randomly selecting answers in the political orientation tests used in this work results in test scores close to the political center, which supports our assumption that LLMs’ selection bias does not constitute a significant confound in our results (see Fig 5 for an illustration of this phenomenon).
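A minimal sketch of that random-baseline check, assuming a generic symmetric four-point agreement scale (the actual tests' scoring rules differ and are not reproduced here):

```python
# Sketch of the random-answer baseline above: if answers are picked at random,
# aggregate scores land near the political center. The item count and symmetric
# scale are illustrative, not any specific test's scoring rule.
import random

def random_test_score(n_items: int = 60) -> float:
    # Each item scored symmetrically from one pole (-2) to the other (+2).
    return sum(random.choice([-2, -1, 1, 2]) for _ in range(n_items)) / n_items

scores = [random_test_score() for _ in range(1000)]
print(f"Mean score of 1000 random test takers: {sum(scores) / len(scores):+.3f}")  # ~0
```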

To conclude, the emergence of large language models (LLMs) as primary information providers marks a significant transformation in how individuals access and engage with information. Traditionally, people have relied on search engines or platforms like Wikipedia for quick and reliable access to a mix of factual and biased information. However, as LLMs become more advanced and accessible, they are starting to partially displace these conventional sources. This shift in information sourcing has profound societal implications, as LLMs can shape public opinion, influence voting behaviors, and impact the overall discourse in society. Therefore, it is crucial to critically examine and address the potential political biases embedded in LLMs to ensure a balanced, fair, and accurate representation of information in their responses to user queries.

 


I remember when Google was adding diversity into image-generating requests on the back end and people were getting results for the signing of the Declaration of Independence where half of the signers in the image were black. Chuds were all complaining that AI was “rewriting history” with these images and completely missing the irony. Yes, of course AI is rewriting history; that’s all AI does. It doesn’t matter if an AI generates an image that has only white dudes, it’s still a fake image that was created on demand instead of an actual historical document.


24 minutes ago, Spawn_of_Apathy said:

If by “left bias” they mean factual and not a racist piece of shit, maybe AI starts out that way. But they seem to spiral quickly into spreading lies and racism because that’s what it learns.

The video and linked paper went through all that in detail.

tldr: the AI models were neither left- nor right-leaning until they went through supervised fine-tuning (SFT). They were mostly left-leaning after they went through SFT.

The video and paper add a lot of nuance to this of course.


The problem with 'real data' is that it's produced by current and past socio-political events, which is what led an Amazon AI to exclude 'female-named' resumes at a 100% rejection rate. Some engies probably thought censorship would play better with the public, making for some decent and also awful humour. So yeah, maybe the arc of the universe has a left bias, but current human society sure doesn't.

