
A machine learning framework for modeling the associations between environmental factors and health: an application in the German National Cohort

Nikolaou, Nikolaos and Cea, D and Valizadeh, Mahyar and Dallavalle, Marco and Staab, Jeroen and Piraud, M and Peters, A. and Schneider, Alexandra and Taubenböck, Hannes and Wolf, Kathrin (2025) A machine learning framework for modeling the associations between environmental factors and health: an application in the German National Cohort. ISES-ISEE 2025, 2025-08-17 - 2025-08-20, Atlanta, USA.

This archive cannot provide the full text.

Abstract

Objective: Human health has been associated with individual characteristics, environmental exposures, and socio-economic and neighborhood settings, but their interplay is not adequately understood. We aimed to build a machine learning (ML) pipeline to identify the driving environmental, socio-economic and individual factors for health outcomes, using hypertension as a case study.
Material and Methods: The ML pipeline is based on three main pillars: data extract/transform/load (feature selection, imputation of missing values), modeling (hyperparameter optimization), and explainability (permutation feature importance). For our use case, we included health data from the baseline examination of the population-based German National Cohort (NAKO), conducted between 2014 and 2019 in 16 study regions across Germany. We assigned environmental exposures (e.g., air pollution, air temperature, noise, greenness) and neighborhood factors (e.g., urbanization, deprivation) based on the participants' residential addresses. To identify the main drivers of hypertension, we compared a traditional regression approach (Logistic Regression) with multiple ML methods: neighbor-based methods (K-Nearest Neighbor), statistical learning (Support Vector Machine), ensemble learning (Random Forest, XGBoost) and neural networks.
Results: Of 204,752 participants included in our analysis, 41.2% were classified as hypertensive. Most models performed well, with comparable accuracy ranging from 0.69 (K-Nearest Neighbor) to 0.73 (XGBoost) in our test set. The different approaches identified similar factors as the main drivers of hypertension, with the highest feature importance attributed to individual characteristics (age, body mass index, and sex). SHapley Additive exPlanations (SHAP) and sub-group analyses also identified environmental and neighborhood variables (minimum air temperature, noise, and deprivation index), ranking after the primary individual factors.
Conclusion: Our results indicate some variation in performance and that a guided application is needed if evidence is to be generated beyond major drivers of disease such as age and sex. The ML pipeline for binary health outcomes will be made openly accessible soon, and we plan to extend it to continuous outcomes.
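
The three pillars named in the abstract (imputation of missing values, hyperparameter optimization, and permutation feature importance) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' pipeline, which has not yet been published: the data are synthetic, the feature names are hypothetical placeholders, and a small logistic regression trained by gradient descent stands in for the models compared in the study.

```python
import math
import random

FEATURES = ["age", "noise_exposure", "unrelated"]  # hypothetical names, synthetic data

def make_data(n, rng):
    """Synthetic stand-in for cohort data: the outcome depends strongly on the
    first feature, weakly on the second, and not at all on the third."""
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0, 1) for _ in FEATURES]
        y.append(1 if 2.0 * row[0] + 0.5 * row[1] + rng.gauss(0, 0.5) > 0 else 0)
        # Knock out about 5% of the values to mimic missing data.
        X.append([None if rng.random() < 0.05 else v for v in row])
    return X, y

def column_means(X):
    """Per-column mean over the observed (non-missing) values."""
    return [sum(r[j] for r in X if r[j] is not None) /
            sum(1 for r in X if r[j] is not None) for j in range(len(X[0]))]

def impute(X, means):
    """Mean imputation: replace each missing value with its column mean."""
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in X]

def score(model, row):
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, row)) + b

def fit_logistic(X, y, lr, epochs=200):
    """Logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = max(-30.0, min(30.0, score((w, b), row)))  # clamp to avoid overflow
            err = 1.0 / (1.0 + math.exp(-z)) - target
            gw = [g + err * v for g, v in zip(gw, row)]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def accuracy(model, X, y):
    return sum((score(model, r) > 0) == (t == 1) for r, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats, rng):
    """Mean drop in accuracy when a single column is shuffled."""
    base = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [r[j] for r in X]
            rng.shuffle(col)
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
            drops.append(base - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def run_demo(seed=0):
    rng = random.Random(seed)
    X, y = make_data(600, rng)
    # Imputation statistics come from the training rows only, to avoid leakage.
    X = impute(X, column_means(X[:360]))
    X_tr, y_tr = X[:360], y[:360]
    X_va, y_va = X[360:480], y[360:480]
    X_te, y_te = X[480:], y[480:]
    # Hyperparameter optimization: keep the learning rate with the best
    # validation accuracy.
    best = max((fit_logistic(X_tr, y_tr, lr) for lr in (0.01, 0.1, 1.0)),
               key=lambda m: accuracy(m, X_va, y_va))
    importances = permutation_importance(best, X_te, y_te, 10, rng)
    return accuracy(best, X_te, y_te), dict(zip(FEATURES, importances))

if __name__ == "__main__":
    test_accuracy, importances = run_demo()
    print("test accuracy:", round(test_accuracy, 3))
    for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
        print(name, round(value, 3))
```

Permutation importance is model-agnostic, so the same function could in principle be applied to any of the classifiers listed in the abstract; in this toy setup the strongly predictive feature is expected to show the largest accuracy drop when shuffled.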

elib URL of this record: https://elib.dlr.de/219234/
Document type: Conference contribution (Poster)
Title: A machine learning framework for modeling the associations between environmental factors and health: an application in the German National Cohort
Authors:
Author | Institution or e-mail address | Author ORCID iD | ORCID Put Code
Nikolaou, Nikolaos | Institute of Epidemiology, Helmholtz Zentrum München-German Research Centre for Environmental Health, Neuherberg, Germany | Not specified | Not specified
Cea, D | Helmholtz AI, Helmholtz Munich, German Research Center for Environmental Health, Neuherberg, Germany | Not specified | Not specified
Valizadeh, Mahyar | Institute of Epidemiology, Helmholtz Zentrum München-German Research Centre for Environmental Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany | Not specified | Not specified
Dallavalle, Marco | Institute of Epidemiology, Helmholtz Zentrum München-German Research Centre for Environmental Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany | Not specified | Not specified
Staab, Jeroen | Jeroen.Staab (at) dlr.de | https://orcid.org/0000-0002-7342-4440 | Not specified
Piraud, M | Helmholtz AI, Helmholtz Munich, German Research Center for Environmental Health, Neuherberg, Germany | Not specified | Not specified
Peters, A. | Institute of Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany | Not specified | Not specified
Schneider, Alexandra | Institute of Epidemiology, Helmholtz Zentrum München-German Research Centre for Environmental Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany | Not specified | Not specified
Taubenböck, Hannes | Hannes.Taubenboeck (at) dlr.de | https://orcid.org/0000-0003-4360-9126 | Not specified
Wolf, Kathrin | Institute of Epidemiology, Helmholtz Zentrum München-German Research Centre for Environmental Health, Ingolstädter Landstrasse 1, 85764, Neuherberg, Germany | Not specified | Not specified
Date: 18 August 2025
Peer-reviewed publication: No
Open Access: No
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: No
Status: published
Keywords: Built environment, Environmental epidemiology, External exposome, Modeling, Socio-economical factors
Event title: ISES-ISEE 2025
Event location: Atlanta, USA
Event type: international conference
Event start: 17 August 2025
Event end: 20 August 2025
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Space
HGF - Program theme: Earth Observation
DLR - Research area: Space
DLR - Research field: R EO - Earth Observation
DLR - Sub-area (project): R - Remote Sensing and Geo Research
Location: Oberpfaffenhofen
Institutes & Institutions: German Remote Sensing Data Center > Geo-Risks and Civil Security
Deposited by: Schöpfer, Dr. Elisabeth
Deposited on: 20 Nov 2025 10:29
Last modified: 20 Nov 2025 10:29
