Schluckebier, Ben (2025) Knowledge Graph-Enhanced Retrieval-Augmented Generation for Earth Observation Data. Bachelor's thesis, Hochschule Bonn-Rhein-Sieg.
PDF (777 kB)
Abstract
The search for scientific information is frequently exploratory, involving several open-ended tasks conducted simultaneously. These searches tend to be iterative and opportunistic and draw on various tactics, resulting in a multi-faceted research process. The sheer volume of available data reduces search efficiency, but intelligent tools that discover, link, and extract knowledge beyond the capabilities of conventional search engines can improve it. Large language models (LLMs) are increasingly being woven into scientific information retrieval infrastructures. Their ability to engage in human-like conversations accelerates both the search itself and the processing of the retrieved knowledge. Nevertheless, despite their seeming intelligence, two key limitations have to be recognized: (i) hallucinations and (ii) a limited ability to revise or expand their internal knowledge. These shortcomings become especially problematic when an LLM is employed in scientific question-answering scenarios, where up-to-date information and accuracy are of utmost importance. Retrieval-augmented generation (RAG), a recent technique for providing additional context information to the LLM, addresses these limitations. By injecting domain-specific and recent information, RAG improves the accuracy of the answer and bypasses the model's knowledge limitations. Further enhancements can be made using knowledge graphs that incorporate the semantic relationships within the data. This thesis creates such a RAG system for the scientific domain of Earth Observation (EO), a collective term for various earth sciences such as oceanography or atmospheric chemistry.
The presented approach contributes (i) the creation of a heterogeneous knowledge graph composed of EO datasets and publications, (ii) a RAG integration that uses this graph to carry out question-answering tasks, and (iii) an automatic evaluation using LLM-as-judge, based on the criteria context-relevance, answer-relevance, and groundedness, which shows the RAG system surpassing the zero-shot approach in context-relevance and groundedness.
| elib URL of this entry: | https://elib.dlr.de/216432/ |
|---|---|
| Document type: | University thesis (Bachelor's thesis) |
| Additional information: | The code of this thesis can be found under https://github.com/DLR-SC/RAG-for-Earth-Observation |
| Title: | Knowledge Graph-Enhanced Retrieval-Augmented Generation for Earth Observation Data |
| Authors: | Schluckebier, Ben |
| DLR supervisor: | |
| Date: | 2025 |
| Open Access: | Yes |
| Number of pages: | 45 |
| Status: | Published |
| Keywords: | Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), Question Answering (QA), Earth Observation (EO), Knowledge Graph, Data Curation, Context Engineering, LLM-Based Evaluation, Hallucination Mitigation |
| Institution: | Hochschule Bonn-Rhein-Sieg |
| Department: | Fachbereich Informatik |
| HGF - Research field: | Not assigned |
| HGF - Program: | Not assigned |
| HGF - Program topic: | Not assigned |
| DLR - Focus area: | Digitalization |
| DLR - Research area: | D DAT - Data |
| DLR - Sub-area (project, undertaking): | D - OpenSearch@DLR |
| Location: | Köln-Porz |
| Institutes and facilities: | Institut für Softwaretechnologie > Intelligente und verteilte Systeme; Institut für Softwaretechnologie |
| Deposited by: | Schluckebier, Ben |
| Deposited on: | 28 Oct 2025 09:36 |
| Last modified: | 17 Nov 2025 12:32 |