
A Local Access-Controlled Multi-Source RAG Framework For Privacy-Sensitive Enterprise Environments

Piazza, Daniele (2026) A Local Access-Controlled Multi-Source RAG Framework For Privacy-Sensitive Enterprise Environments. Master's thesis, Università Degli Studi di Milano.

PDF (1MB) - accessible only within DLR

Abstract

Accessing internal knowledge in large organisations is often slow and fragmented, particularly when information is distributed across multiple heterogeneous platforms. In security-critical environments such as space agencies, the challenge is further compounded by strict access control requirements and data sovereignty constraints that prevent routing queries through external infrastructure, making these not optional design goals but hard operational requirements. This thesis presents an access-controlled multi-source Retrieval-Augmented Generation (RAG) system developed for the German Aerospace Center (DLR) to improve access to enterprise knowledge while strictly enforcing source-level access control. The central architectural contribution is an on-demand retrieval strategy that eliminates the need for a persistent vector index. Rather than crawling and pre-indexing the knowledge base with a privileged service account, the system performs live searches at query time using the authenticated session of the requesting user, inherently guaranteeing Access Control List (ACL) compliance, absolute data freshness and minimal credential exposure. A unified reader interface abstracts the heterogeneity of the two connected enterprise knowledge bases, Atlassian Confluence and Microsoft SharePoint, into a single retrieval pipeline, while remaining extensible to additional platforms without requiring modifications to the core RAG logic. The system operates fully on-premise, using locally hosted Large Language Models (LLMs) managed with Ollama and accessed through Open WebUI, adopting a privacy-by-design approach in which no internal data transits external cloud services. Deployment is container-based and automated through a GitLab Continuous Integration (CI) pipeline.
To validate and iteratively refine the system, a curated evaluation dataset was constructed from internal agency documentation through a combination of synthetic generation, automated filtering and human verification. Both retrieval effectiveness and generation quality were assessed using embedding-based and LLM-as-judge metrics. In addition to the formal evaluation, practical usage within the agency confirmed the viability of a fully local, access-controlled RAG system in a large enterprise environment.
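
The on-demand retrieval strategy and the unified reader interface described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: every name here (`Document`, `UserSession`, `Reader`, `InMemoryReader`, `retrieve`, `build_prompt`) is hypothetical, and the in-memory reader is only a stand-in for a live Confluence or SharePoint search executed with the requesting user's session. The point it shows is that, with no pre-built index, the retrieval step can only ever see what the source platform returns for that user.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Document:
    source: str
    title: str
    text: str


@dataclass(frozen=True)
class UserSession:
    """Hypothetical stand-in for the requesting user's authenticated session."""
    user: str


class Reader(Protocol):
    """Unified reader interface: one method, implemented once per platform."""

    def search(self, query: str, session: UserSession) -> list[Document]: ...


class InMemoryReader:
    """Toy reader simulating a platform whose own search enforces ACLs.

    Assumption: a real reader would call the platform's search API with
    the user's session instead of filtering a local list.
    """

    def __init__(self, source: str, pages: list[tuple[str, str, set[str]]]):
        self.source = source
        self.pages = pages  # (title, text, users allowed to read the page)

    def search(self, query: str, session: UserSession) -> list[Document]:
        terms = query.lower().split()
        return [
            Document(self.source, title, text)
            for title, text, allowed in self.pages
            if session.user in allowed
            and any(term in text.lower() for term in terms)
        ]


def retrieve(query: str, session: UserSession, readers: list[Reader]) -> list[Document]:
    """Query every source live, at request time, as the requesting user."""
    docs: list[Document] = []
    for reader in readers:
        docs.extend(reader.search(query, session))
    return docs


def build_prompt(query: str, docs: list[Document]) -> str:
    """Assemble the retrieved context into a prompt for a locally hosted LLM."""
    context = "\n\n".join(f"[{d.source}] {d.title}\n{d.text}" for d in docs)
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {query}"
```

Under these assumptions, adding a further platform means writing one more class with a `search` method; `retrieve` and `build_prompt` stay unchanged, which mirrors the extensibility claim in the abstract.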

elib URL of the entry: https://elib.dlr.de/224327/
Document type: Thesis (Master's thesis)
Title: A Local Access-Controlled Multi-Source RAG Framework For Privacy-Sensitive Enterprise Environments
Authors:
Author | Institution or e-mail address | Author ORCID iD | ORCID Put Code
Piazza, Daniele | daniele.piazza (at) dlr.de | NOT SPECIFIED | NOT SPECIFIED
DLR supervisor:
Contribution type | DLR supervisor | Institution or e-mail address | DLR supervisor ORCID iD
Thesis advisor | Krummen, Sven | sven.krummen (at) dlr.de | https://orcid.org/0000-0002-4126-688X
Date: April 2026
Open Access: No
Number of pages: 91
Status: published
Keywords: Retrieval-Augmented Generation, RAG, Generative AI, Natural Language Processing, Large Language Models, LLMs, Privacy-by-Design, On-Premise Deployment, Access Control, Enterprise Knowledge Management, Enterprise Search
Institution: Università Degli Studi di Milano
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program theme: Space System Technology
DLR - Research area: Space
DLR - Program: R SY - Space System Technology
DLR - Research theme (project): R - Digital Transformation in Spaceflight [SY]
Location: Bremen
Institutes and institutions: Institute of Space Systems > System Development and Project Office
Deposited by: Krummen, Sven
Deposited on: 05 May 2026 11:47
Last modified: 05 May 2026 11:47
