FoxBench: Benchmark for n-Dimensional Array File Formats in Data Analytics Environments

Osterthun, Arne and Pohl, Matthias (2025) FoxBench: Benchmark for n-Dimensional Array File Formats in Data Analytics Environments. GI. Datenbanksysteme für Business, Technologie und Web (BTW 2025), 2025-03-03 - 2025-03-07, Bamberg, Germany. doi: 10.18420/BTW2025-25.

Abstract

Choosing the right file format is crucial for effective data exchange and transfer. Different domains have specific standards for file formats. While CSV files are commonly used, they lack reusability. Data files are well-suited for computing clusters. Data analytics pipelines can be time-consuming because they handle large volumes of data, so timely data access is crucial for efficient processing and analysis. Earth system science (ESS) data commonly takes the form of dense or sparse n-dimensional data. Dense n-dimensional data is conventionally stored in arrays, while sparse n-dimensional data is typically stored in data frames. In ESS, several file formats are used to store dense n-dimensional data, including NetCDF4, TileDB, and Zarr. This paper evaluates data file formats for retrieving multidimensional data, focusing on tools within the ESS domain. The insights from this evaluation will be applicable to other data analytics projects.

Item URL in elib:https://elib.dlr.de/219246/
Document Type:Conference or Workshop Item (Speech)
Title:FoxBench: Benchmark for n-Dimensional Array File Formats in Data Analytics Environments
Authors:
Authors | Institution or Email of Authors | Author's ORCID iD | ORCID Put Code
Osterthun, Arne | UNSPECIFIED | https://orcid.org/0000-0001-6455-9119 | UNSPECIFIED
Pohl, Matthias | UNSPECIFIED | https://orcid.org/0000-0002-6241-7675 | UNSPECIFIED
Date:7 March 2025
Refereed publication:Yes
Open Access:Yes
Gold Open Access:No
In SCOPUS:No
In ISI Web of Science:No
Volume:361
DOI:10.18420/BTW2025-25
Publisher:GI
Series Name:Lecture Notes in Informatics
Status:Published
Keywords:Benchmark, Data Access, Storage, Cost-based Valuation, File Formats, Big Data
Event Title:Datenbanksysteme für Business, Technologie und Web (BTW 2025)
Event Location:Bamberg, Germany
Event Type:International Conference
Event Start Date:3 March 2025
Event End Date:7 March 2025
Organizer:GI
HGF - Research field:Aeronautics, Space and Transport
HGF - Program:Space
HGF - Program Themes:other
DLR - Research area:Space
DLR - Program:R - no assignment
DLR - Research theme (Project):R - no assignment
Location: Jena
Institutes and Institutions:Institute of Data Science > Data Management and Enrichment
Deposited By: Pohl, Matthias
Deposited On:19 Nov 2025 08:18
Last Modified:19 Nov 2025 10:59
