Hoffmann, Nils and Ebrahimi Pour, Neda (2024) A low overhead approach for automatically tracking provenance in machine learning workflows. In: 9th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2024, pp. 567-573. IEEE Computer Society Conference Publishing Services. 16th International Workshop on Theory and Practice of Provenance, 2024-07-12, Vienna, Austria. doi: 10.1109/EuroSPW61312.2024.00092. ISBN 979-835036729-4. ISSN 2768-0657.
PDF (568 kB), only accessible within DLR
Abstract
Computational Fluid Dynamics (CFD) simulations are essential in various engineering applications. The use of high-performance computing has significantly expanded the scope of realizable models. However, balancing reasonable time-to-solution expectations with solution accuracy remains a bottleneck for many large-scale simulations. Machine learning (ML) algorithms have gained increasing popularity in the CFD community. Various data-based analysis methods have been deployed to predict CFD solutions and reduce the computational effort. The growing use of ML methods necessitates ensuring the reproducibility and transparency of data-driven methods and their associated training data processing steps to ensure reliability and trustworthiness of predictions. This paper proposes a new method for capturing provenance or lineage data during ML model training while minimizing development overhead by introducing tooling built on the commonly used data pipeline mechanism. To demonstrate the developed tooling, a deep learning model is trained using available CFD simulation data from an engineering test case. We demonstrate that a complete provenance graph of training and test samples can be automatically generated, along with valuable development metadata such as profiling of individual processing steps during model training.
| Item URL in elib: | https://elib.dlr.de/205258/ |
|---|---|
| Document Type: | Conference or Workshop Item (Speech) |
| Title: | A low overhead approach for automatically tracking provenance in machine learning workflows |
| Authors: | Hoffmann, Nils; Ebrahimi Pour, Neda |
| Date: | July 2024 |
| Journal or Publication Title: | 9th IEEE European Symposium on Security and Privacy Workshops, Euro S and PW 2024 |
| Refereed publication: | No |
| Open Access: | No |
| Gold Open Access: | No |
| In SCOPUS: | Yes |
| In ISI Web of Science: | No |
| DOI: | 10.1109/EuroSPW61312.2024.00092 |
| Page Range: | pp. 567-573 |
| Publisher: | IEEE Computer Society Conference Publishing Services |
| Series Name: | 9th IEEE European Symposium on Security and Privacy (EuroS&P) |
| ISSN: | 2768-0657 |
| ISBN: | 979-835036729-4 |
| Status: | Published |
| Keywords: | Provenance, Machine Learning, Deep Learning, Computational Fluid Dynamics (CFD) |
| Event Title: | 16th International Workshop on Theory and Practice of Provenance |
| Event Location: | Vienna, Austria |
| Event Type: | Workshop |
| Event Date: | 12 July 2024 |
| Organizer: | IEEE EuroS&P |
| HGF - Research field: | other |
| HGF - Program: | other |
| HGF - Program Themes: | other |
| DLR - Research area: | Digitalisation |
| DLR - Program: | D CPE - Cyberphysical Engineering |
| DLR - Research theme (Project): | D - ML in digital product-development processes |
| Location: | Dresden |
| Institutes and Institutions: | Institute of Software Methods for Product Virtualization > High Performance Computing |
| Deposited By: | Ebrahimi Pour, Neda |
| Deposited On: | 12 Aug 2024 18:14 |
| Last Modified: | 12 Sep 2024 13:55 |