Markus, Jost (2024) Steering Large Language Models towards Political Ideologies on Prompt-Level. Masterarbeit, Bielefeld University.
PDF (17 MB), accessible only within DLR
Abstract
Large Language Models (LLMs) achieve state-of-the-art performance on a variety of Natural Language Processing (NLP) tasks and are employed in a diverse range of applications. However, studies provide evidence that LLMs follow particular political ideologies. If the ideologies within these models can also be easily manipulated, there is a risk that LLMs will be used as political tools to promote certain ideologies. In this thesis, we go one step further and investigate the extent to which LLMs are able to adopt ideological biases at the prompt level. The aim of this work is to assess the robustness of LLMs against ideological manipulation. To this end, we employ cost-effective methods, namely Prompt Engineering and Retrieval Augmented Generation (RAG), to steer three LLMs (ChatGPT, Mixtral, and Qwen1.5) towards a target ideology. Within these methods, we ask the LLM to answer political statements from two multiple-choice ideology tests, the Wahl-O-Mat and the Political Compass Test (PCT). To demonstrate that the models can adopt political views found in complex political systems beyond those analyzed in existing work, we utilize a dataset of manifestos from the most popular German political parties to inject biased contexts from four distinct ideologies. Our findings reveal that the tested models adapted even to indirect, related ideological contexts without any model training. Nevertheless, the models showed only limited susceptibility to randomly selected ideological triggers. The thesis identifies the risks associated with the deliberate and unintentional introduction of political bias at the prompt level and provides future directions for the development of fair AI practices. Code for the framework built to measure political ideologies is available at: https://github.com/j0st/PoliticalLLM.
For an online demo, please check out https://huggingface.co/spaces/jost/PoliticalLLM
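The RAG-based steering described above boils down to retrieving an ideologically biased passage (e.g. from a party manifesto) and prepending it to a multiple-choice ideology-test item before querying the model. The following is a minimal sketch of that prompt-construction step, not the thesis's actual implementation; the snippet store, the naive keyword retriever, and all names are hypothetical placeholders for a real embedding-based retriever over the manifesto dataset.

```python
# Hypothetical stand-in for retrieved manifesto passages (real system: vector search).
MANIFESTO_SNIPPETS = {
    "climate": "Our party demands an immediate phase-out of fossil fuels.",
    "economy": "We stand for lower taxes and less market regulation.",
}


def retrieve(statement: str) -> str:
    """Naive keyword retrieval: return the snippet whose topic appears in the statement."""
    for topic, snippet in MANIFESTO_SNIPPETS.items():
        if topic in statement.lower():
            return snippet
    return ""


def build_prompt(statement: str) -> str:
    """Prepend the retrieved biased context to a multiple-choice test item."""
    context = retrieve(statement)
    return (
        f"Context: {context}\n"
        f"Statement: {statement}\n"
        "Answer with one of: agree / neutral / disagree."
    )


prompt = build_prompt("The economy should be deregulated.")
```

The resulting prompt would then be sent to each model (ChatGPT, Mixtral, Qwen1.5), and the aggregated answers scored against the Wahl-O-Mat or PCT answer keys to position the model ideologically.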
elib URL of the record: https://elib.dlr.de/205192/
Document type: University thesis (Master's thesis)
Additional information: Supervised by Roxanne El Baff and Oliver Bensch from IVS-ISS
Title: Steering Large Language Models towards Political Ideologies on Prompt-Level
Authors: Markus, Jost
Date: April 2024
Published in: Steering Large Language Models towards Political Ideologies on Prompt-Level
Open Access: No
Number of pages: 78
Status: published
Keywords: LLM, LLM-impersonation, RAG, prompt-engineering, LLM-ideology
Institution: Bielefeld University
Department: Interdisziplinäre Medienwissenschaft (Interdisciplinary Media Studies)
HGF research field: no assignment
HGF programme: no assignment
HGF programme topic: no assignment
DLR focus area: no assignment
DLR research field: no assignment
DLR subject area (project): no assignment
Location: Oberpfaffenhofen
Institutes & facilities: Institut für Softwaretechnologie > Intelligente und verteilte Systeme
Deposited by: El Baff, Roxanne
Deposited on: 12 Sep 2024 08:39
Last modified: 16 Sep 2024 13:23