The impact of averaging logits over probabilities on ensembles of neural networks

Njieutcheu Tassi, Cedrique Rovile and Gawlikowski, Jakob and Fitri, Auliya Unnisa and Triebel, Rudolph (2022) The impact of averaging logits over probabilities on ensembles of neural networks. In: 2022 Workshop on Artificial Intelligence Safety, AISafety 2022, 3215 (19). AISafety 2022: Workshop on Artificial Intelligence Safety, 2022-07-23 - 2022-07-25, Vienna, Austria. ISSN 1613-0073.

PDF - Only accessible within DLR

Official URL: http://ceur-ws.org/Vol-3215/19.pdf


Model averaging has become a standard technique for improving neural networks in terms of accuracy, calibration, and the ability to detect false predictions (FPs). However, recent findings show that model averaging does not necessarily lead to calibrated confidences, especially for underconfident networks. While existing methods for improving the calibration of combined networks focus on recalibrating, building, or sampling calibrated models, we focus on the combination process. Specifically, we evaluate the impact of averaging logits instead of probabilities on the quality of confidence (QoC). We compare combining the logits of the members (networks) with combining their probabilities for models such as ensembles, Monte Carlo Dropout (MCD), and Mixture of Monte Carlo Dropout (MMCD). The comparison is based on experimental results on three datasets with three different architectures. We show that averaging logits instead of probabilities increases the confidence, thereby improving confidence calibration for underconfident models. For example, for MCD evaluated on CIFAR10, averaging logits instead of probabilities reduces the expected calibration error (ECE) from 12.03% to 5.44%. However, the increase in confidence can harm confidence calibration for overconfident models and reduce the separability between true predictions (TPs) and FPs. For example, for MMCD evaluated on MNIST, the average confidence on FPs caused by noisy data increases from 51.31% to 94.58% when averaging logits instead of probabilities. While averaging logits can be applied to underconfident models to improve calibration on test data, we suggest averaging probabilities for safety- and mission-critical applications where the separability of TPs and FPs is of paramount importance.
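The two combination rules compared in the abstract can be illustrated with a minimal sketch. It assumes NumPy and uses hypothetical helper names (`average_probabilities`, `average_logits`, `expected_calibration_error`); it is not the authors' implementation. The ECE function uses the common equal-width binning of confidences, which may differ from the binning used in the paper. The toy logits are invented purely for illustration.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - np.max(z, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)


def average_probabilities(member_logits):
    """Softmax each member first, then average the probabilities.

    member_logits has shape (members, samples, classes).
    """
    return softmax(member_logits).mean(axis=0)


def average_logits(member_logits):
    """Average the logits over members first, then apply one softmax."""
    return softmax(member_logits.mean(axis=0))


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width confidence bins (an assumed, common variant)."""
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    correct = (pred == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its share of samples.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece


# Toy example (invented values): two members agree on class 0
# but one is much more confident than the other.
toy_logits = np.array([
    [[4.0, 0.0, 0.0]],  # member 1, one sample, three classes
    [[1.0, 0.0, 0.0]],  # member 2
])
p_prob = average_probabilities(toy_logits)
p_logit = average_logits(toy_logits)
print("confidence, averaged probabilities:", p_prob.max(axis=1))
print("confidence, averaged logits:       ", p_logit.max(axis=1))
```

In this toy case the logit average yields the higher top-class confidence, in line with the direction of the effect the abstract reports; the sketch does not show that this holds for arbitrary inputs.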

Item URL in elib: https://elib.dlr.de/188833/
Document Type: Conference or Workshop Item (Speech)
Title: The impact of averaging logits over probabilities on ensembles of neural networks
Authors:
Author | Institution or Email of Authors | Author's ORCID iD | ORCID Put Code
Triebel, Rudolph | UNSPECIFIED | https://orcid.org/0000-0002-7975-036X | UNSPECIFIED
Journal or Publication Title: 2022 Workshop on Artificial Intelligence Safety, AISafety 2022
Refereed publication: Yes
Open Access: No
Gold Open Access: No
In ISI Web of Science: No
Series Name: CEUR Workshop Proceedings (CEUR-WS.org)
Keywords: Model averaging, Combination process, Logit averaging, Probability averaging, Ensemble, Monte Carlo Dropout (MCD), Mixture of Monte Carlo Dropout (MMCD), Quality of confidence (QoC), Confidence calibration, Separating true predictions (TPs) and false predictions (FPs)
Event Title: AISafety 2022: Workshop on Artificial Intelligence Safety
Event Location: Vienna, Austria
Event Type: Workshop
Event Start Date: 23 July 2022
Event End Date: 25 July 2022
Organizer: IJCAI-ECAI 2022
HGF - Research field: other
HGF - Program: other
HGF - Program Themes: other
DLR - Research area: Digitalisation
DLR - Program: D IAS - Innovative Autonomous Systems
DLR - Research theme (Project): D - SKIAS, R - Multisensory World Modelling (RM) [RO]
Location: Berlin-Adlershof
Institutes and Institutions: Institute of Optical Sensor Systems; Institute of Robotics and Mechatronics (since 2013); Institute of Data Science
Deposited By: Njieutcheu Tassi, Cedrique Rovile
Deposited On: 12 Oct 2022 07:54
Last Modified: 24 Apr 2024 20:50
