Weaknesses of voice biometrics - sensitivity of Speaker verification to emotional arousal

Rusko, Milan and Trnka, Marian and Darjaa, Sakhia and Stelkens-Kobsch, Tim H. and Finke, Michael (2018) Weaknesses of voice biometrics - sensitivity of Speaker verification to emotional arousal. 25th International Congress on Sound and Vibration, 8-12 July 2018, Hiroshima, Japan.

Abstract

In a series of experiments, we study weaknesses of voice biometric systems and look for ways to improve their robustness. The acoustical features that represent human voices in current automatic speaker verification systems change significantly when the speaker’s emotional arousal deviates from the neutral state. The speech templates of a given speaker used for enrollment are generally recorded in a neutral emotional state with "normal" speech effort. Speaking with higher or lower voice tension therefore causes a mismatch between training and testing, which results in a higher number of verification errors. The acoustical cues of increased emotional arousal in speech are highly non-specific. They are similar to those of Lombard speech, warning and insisting voice, emergency voice, extreme acute stress, shouting, and emotions such as anger, fear, and hate, among many others. Because the available spontaneous emotional speech databases do not cover the full range of emotional arousal for individual voices and do not contain enough utterances per speaker, we decided to use our CRISIS acted database, which contains speech utterances at six levels of tense emotional arousal per speaker. The sensitivity of a state-of-the-art i-vector-based speaker recognizer with PLDA scoring to arousal mismatch was validated. The speaker verification system was successfully implemented in the online “Speaker authorization” module developed within the framework of the European project Global ATM Security Management (GAMMA). Verification reliability was observed to decrease at extreme arousal levels. Mixed enrollments combining various levels of arousal were used to create more robust models and showed a promising improvement in verification reliability compared to the baseline.
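The enrollment mismatch described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: plain cosine scoring stands in for i-vector extraction with PLDA scoring, and the embeddings, the `toy_embedding` helper, the arousal step size, and the threshold are all hypothetical values chosen for illustration. The numbers say nothing about the results reported in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def enroll(embeddings):
    """Build a speaker model as the mean of the enrollment embeddings."""
    n = len(embeddings)
    return [sum(column) / n for column in zip(*embeddings)]

def verify(model, test_embedding, threshold):
    """Return (score, accepted) for one verification trial."""
    score = cosine(model, test_embedding)
    return score, score >= threshold

def toy_embedding(arousal_level):
    """Hypothetical embedding: the speaker identity lies on the first axis,
    and arousal shifts the vector along the second axis (an assumption)."""
    return [1.0, 0.4 * arousal_level, 0.0]

THRESHOLD = 0.7  # assumed decision threshold, not taken from the paper

# Enrollment from neutral speech only (level 0) versus mixed arousal levels.
neutral_model = enroll([toy_embedding(0)])
mixed_model = enroll([toy_embedding(level) for level in (0, 2, 4)])

# Test utterance at the highest of six toy arousal levels (0 to 5).
high_arousal_test = toy_embedding(5)

neutral_score, neutral_ok = verify(neutral_model, high_arousal_test, THRESHOLD)
mixed_score, mixed_ok = verify(mixed_model, high_arousal_test, THRESHOLD)

print(f"neutral enrollment: score={neutral_score:.3f} accepted={neutral_ok}")
print(f"mixed enrollment:   score={mixed_score:.3f} accepted={mixed_ok}")
```

In this toy setup the model enrolled on neutral speech alone scores the high-arousal test utterance lower than the model enrolled on several arousal levels, because the averaged model lies closer to the shifted test vector. A real system would replace `toy_embedding` with embeddings extracted from audio and `cosine` with a trained PLDA scorer.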

Item URL in elib: https://elib.dlr.de/118727/
Document Type: Conference or Workshop Item (Speech)
Title: Weaknesses of voice biometrics - sensitivity of Speaker verification to emotional arousal
Authors:
Authors | Institution or Email of Authors | Authors ORCID iD
Rusko, Milan | Milan.Rusko (at) savba.sk | UNSPECIFIED
Trnka, Marian | UNSPECIFIED | UNSPECIFIED
Darjaa, Sakhia | UNSPECIFIED | UNSPECIFIED
Stelkens-Kobsch, Tim H. | tim.stelkens-kobsch (at) dlr.de | UNSPECIFIED
Finke, Michael | michael.finke (at) dlr.de | UNSPECIFIED
Date: 2018
Refereed publication: Yes
Open Access: Yes
Gold Open Access: No
In SCOPUS: No
In ISI Web of Science: No
Status: Published
Keywords: biometry, speaker verification, stress, arousal
Event Title: 25th International Congress on Sound and Vibration
Event Location: Hiroshima, Japan
Event Type: International Conference
Event Dates: 8-12 July 2018
HGF - Research field: Aeronautics, Space and Transport
HGF - Program: Aeronautics
HGF - Program Themes: air traffic management and operations
DLR - Research area: Aeronautics
DLR - Program: L AO - Air Traffic Management and Operation
DLR - Research theme (Project): L - Communication, Navigation and Surveillance
Location: Braunschweig
Institutes and Institutions: Institute of Flight Control > ATM-Simulation; Institute of Flight Control > Controller Assistance
Deposited By: Finke, Michael
Deposited On: 26 Mar 2018 10:52
Last Modified: 31 Jul 2019 20:16
