
Highly scaled Federated Learning Simulations for Text Classification

Rudolph, Skady und Schumann, Gerrit und Steffens, Lars und Karl, Michael und Marx Gómez, Jorge (2025) Highly scaled Federated Learning Simulations for Text Classification. 10th International Conference on Computer and Communication Systems (ICCCS 2025), 2025-04-18 - 2025-04-21, Chengdu, China.

PDF (1 MB) - accessible DLR-internally only

Abstract

When simulating federated learning scenarios, most studies use a small number of clients that each hold comparatively large amounts of local data. In this study, we investigated how the classification accuracy of a language model fine-tuned via federated learning changes when the same total amount of data is distributed over an increasing number of clients (up to 1,000), so that the amount of data per client steadily shrinks. To this end, we conducted several experiments on the well-known "contradiction detection" classification task, which showed that model accuracy decreased with an increasing number of clients when the number of federated training rounds remained constant. To counteract this effect and ensure that each client participates equally often in training, we dynamically adjusted the number of federated training rounds and modified the widely used "FedAvg" method to allow a controlled client selection per training round instead of a random one. In this way, a BERT model trained on 1,000 clients (with only 391 data instances each) achieved an accuracy 0.81% higher than that of a BERT model trained on 100 clients (with 3,910 data instances each) and only 0.18% below the accuracy of a conventionally (non-federated) trained BERT model. These results are relevant for all federated learning use cases in which accuracy losses caused by a high number of clients need to be compensated, especially for transformer-based language models such as BERT.
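The paper's implementation is not published on this page; the following is a minimal sketch, under assumed names and parameters, of the two modifications the abstract describes: standard FedAvg weight averaging, and a controlled (round-robin) client selection that replaces random sampling so every client is picked equally often. The round count then scales with the client count: with `num_clients` clients and `clients_per_round` selected per round, each client participates once every `num_clients / clients_per_round` rounds, so the number of rounds must grow proportionally to keep participations per client constant.

```python
def fedavg(weights, sizes):
    """Federated Averaging: weighted mean of client model weights,
    weighted by each client's local dataset size."""
    total = sum(sizes)
    return sum(w * (n / total) for w, n in zip(weights, sizes))


def round_robin_schedule(num_clients, clients_per_round, num_rounds):
    """Controlled client selection: cycle deterministically through all
    clients instead of sampling at random, so that over
    num_rounds = k * num_clients / clients_per_round rounds every
    client participates exactly k times."""
    schedule, cursor = [], 0
    for _ in range(num_rounds):
        picked = [(cursor + i) % num_clients for i in range(clients_per_round)]
        cursor = (cursor + clients_per_round) % num_clients
        schedule.append(picked)
    return schedule
```

For example, with 1,000 clients and 10 selected per round, `round_robin_schedule(1000, 10, 300)` guarantees each client is selected exactly 3 times, a guarantee random sampling cannot give.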

elib URL of this record: https://elib.dlr.de/215919/
Document type: Conference contribution (talk)
Title: Highly scaled Federated Learning Simulations for Text Classification
Authors:
  Rudolph, Skady (Dept. of Business Informatics, University of Oldenburg)
  Schumann, Gerrit (Dept. of Business Informatics, University of Oldenburg)
  Steffens, Lars (Lars.Steffens (at) dlr.de; ORCID: https://orcid.org/0000-0002-2561-0687)
  Karl, Michael (michael.karl (at) dlr.de)
  Marx Gómez, Jorge (Dept. of Business Informatics, University of Oldenburg)
Date: 2025
Refereed publication: Yes
Open Access: No
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: No
Status: published
Keywords: Federated Learning, Simulation, Federated Averaging, Contradiction Detection, Text Classification
Event title: 10th International Conference on Computer and Communication Systems (ICCCS 2025)
Event location: Chengdu, China
Event type: international conference
Event start: 18 April 2025
Event end: 21 April 2025
HGF research area: Aeronautics, Space and Transport
HGF program: Transport
HGF program topic: no assignment
DLR focus area: Transport
DLR research field: V - no assignment
DLR subfield (project, endeavor): V - no assignment
Location: Rhein-Sieg-Kreis
Institutes & facilities: Institut für KI-Sicherheit
Deposited by: Steffens, Lars
Deposited on: 19 Aug 2025 08:41
Last modified: 19 Aug 2025 08:41

