Text-to-3D Scene Generation: A Transformer-Driven Framework for Custom Virtual Environments

Kothari, Akshat (2025) Text-to-3D Scene Generation: A Transformer-Driven Framework for Custom Virtual Environments. Master's thesis, Brandenburgisch-Technische Universität Cottbus-Senftenberg.

This archive cannot provide the full text.

Abstract

The rapid evolution of generative AI has transformed 3D content creation, yet current text-to-3D pipelines face a fundamental trade-off between inference speed and creative control. Asset-retrieval approaches offer rapid scene assembly but remain constrained by predefined asset libraries, while neural rendering approaches achieve photorealistic quality but suffer from high computational costs and non-editable representations. This thesis bridges this gap by proposing an integrated pipeline that combines LLM-driven spatial planning with a transformer-based mesh generator and a hybrid texturing module. At the core of this system is a modified MeshGPT architecture optimized for fast inference, embedded within an agentic workflow that iteratively validates spatial layouts. The pipeline offers a flexible trade-off between speed and fidelity through a dual-mode texturing module, supporting both rapid UV mapping and diffusion-based synthesis. Experimental evaluation demonstrates that the proposed mesh transformer achieves a 29–39% inference speedup and a 32–44% improvement in perceptual quality compared to the baseline MeshGPT model. End-to-end evaluation confirms the system’s ability to generate valid scenes in 4–12 minutes with over 99% semantic precision. Furthermore, a mobility infrastructure case study validates the system’s practical utility, showing that its modular editing capabilities reduce modification time to 56% of the initial generation cost, thereby facilitating real-time participatory design and rapid prototyping.
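
The abstract mentions an agentic workflow that iteratively validates spatial layouts but gives no implementation details. The following is only a minimal sketch of what such a propose-and-validate loop could look like, assuming layouts are axis-aligned footprints on a ground plane and that validity means "no two footprints overlap". Every name here (`Box`, `find_conflicts`, `validate_layout`, `stub_planner`) is hypothetical and not taken from the thesis; `stub_planner` is a hard-coded stand-in for an LLM call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned footprint of one object, centered at (x, z). Hypothetical type."""
    name: str
    x: float
    z: float
    width: float
    depth: float


def overlaps(a: Box, b: Box) -> bool:
    # Two centered boxes overlap when their center distance is smaller than
    # half the sum of their extents on both axes.
    return (abs(a.x - b.x) * 2 < a.width + b.width
            and abs(a.z - b.z) * 2 < a.depth + b.depth)


def find_conflicts(layout):
    """Return the names of every overlapping pair in the layout."""
    return [(a.name, b.name)
            for i, a in enumerate(layout)
            for b in layout[i + 1:]
            if overlaps(a, b)]


def validate_layout(propose, max_rounds=5):
    """Ask the planner for a layout, feed conflicts back, and repeat until valid."""
    feedback = []
    for round_index in range(1, max_rounds + 1):
        layout = propose(feedback)
        feedback = find_conflicts(layout)
        if not feedback:
            return layout, round_index
    raise ValueError(f"no valid layout after {max_rounds} rounds")


def stub_planner(feedback):
    # Assumption: stands in for an LLM planner. It moves the bench away
    # once it has been told about a conflict.
    bench_x = 0.5 if not feedback else 3.0
    return [Box("bus_shelter", 0.0, 0.0, 2.0, 1.0),
            Box("bench", bench_x, 0.0, 1.5, 0.5)]


if __name__ == "__main__":
    layout, rounds = validate_layout(stub_planner)
    print(rounds, [box.name for box in layout])
```

In this sketch the validator is deterministic and cheap, so the expensive planner is only asked again when a concrete conflict list can be handed back as feedback. The actual validation criteria used in the thesis may differ.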

elib URL of this record:

https://elib.dlr.de/221891/

Document type:

Thesis (Master's thesis)

Title:

Text-to-3D Scene Generation: A Transformer-Driven Framework for Custom Virtual Environments

Authors:

Authors	Institution or e-mail address	Author ORCID iD	ORCID Put Code
Kothari, Akshat	akshat.kothari (at) dlr.de	NOT SPECIFIED	NOT SPECIFIED

DLR supervisor:

Contribution type	DLR supervisor	Institution or e-mail address	DLR supervisor ORCID iD
Thesis advisor	Weiss, Daniel	daniel.weiss (at) dlr.de	https://orcid.org/0000-0003-2851-1040

Date:

27 November 2025

Open Access:

No

Number of pages:

105

Status:

Published

Keywords:

AI mesh generation

Institution:

Brandenburgisch-Technische Universität Cottbus-Senftenberg

Department:

Faculty 1 - Institute for Computer Science

HGF - Research field:

Aeronautics, Space and Transport

HGF - Program:

Transport

HGF - Program topic:

Transport System

DLR - Research area:

Transport

DLR - Program:

V VS - Transport System

DLR - Research theme (project):

V - DiVe - Digital organisiertes Verkehrssystem (Digitally Organized Transport System)

Location:

Berlin-Adlershof

Institutes and institutions:

Institut für Verkehrsforschung > Verkehrsmärkte und -angebote

Deposited by:

Galich, Dr. Anton

Deposited on:

15 Jan 2026 21:37

Last modified:

15 Jan 2026 21:37
