Weaknesses of voice biometrics - sensitivity of Speaker verification to emotional arousal

Rusko, Milan and Trnka, Marian and Darjaa, Sakhia and Stelkens-Kobsch, Tim H. and Finke, Michael (2018) Weaknesses of voice biometrics - sensitivity of Speaker verification to emotional arousal. 25th International Congress on Sound and Vibration, 2018-07-08 - 2018-07-12, Hiroshima, Japan.


In our series of experiments we study weaknesses of voice biometric systems and try to find solutions that improve their robustness. The acoustical features that represent human voices in current automatic speaker verification systems change significantly when the person's emotional arousal deviates from the neutral state. Speech templates of a given speaker used for enrollment are generally recorded in a neutral emotional state using "normal" speech effort. Therefore, speaking with higher or lower voice tension causes a mismatch between training and testing, resulting in a higher number of verification errors. The acoustical cues of increased emotional arousal in speech are highly non-specific. They are similar to those of Lombard speech, warning and insisting voice, emergency voice, extreme acute stress, shouting, and emotions such as anger, fear, hate, and many others. As the available spontaneous emotional speech databases do not cover the full range of emotional arousal for individual voices and do not have enough utterances per speaker, we decided to use our CRISIS acted database, which contains speech utterances at six levels of tense emotional arousal per speaker. The sensitivity of a state-of-the-art i-vector-based speaker recognizer with PLDA scoring to arousal mismatch was validated. The speaker verification system was successfully implemented in the online "Speaker authorization" module developed within the framework of the European project Global ATM Security Management (GAMMA). It has been observed that the reliability of the verification decreases at extreme arousal levels. Mixed enrollments with various levels of arousal were used to create more robust models and have shown a promising improvement in verification reliability compared to the baseline.
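
The idea of mixed enrollment can be illustrated with a minimal sketch. It is not the system described in the abstract: cosine similarity is swapped in for PLDA scoring, the three-dimensional vectors are made-up stand-ins for i-vectors, and the names `enroll` and `cosine` are hypothetical. Under those assumptions, it shows how averaging enrollment embeddings across arousal levels may reduce the mismatch with a high-arousal test utterance.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def enroll(embeddings):
    """Build a speaker model as the mean of length-normalised embeddings."""
    normed = []
    for e in embeddings:
        norm = math.sqrt(sum(x * x for x in e))
        normed.append([x / norm for x in e])
    dim = len(normed[0])
    return [sum(e[i] for e in normed) / len(normed) for i in range(dim)]


# Toy embeddings of ONE hypothetical speaker at three arousal levels.
# The values are invented for illustration, not measured data.
neutral = [1.0, 0.0, 0.2]
medium = [0.8, 0.5, 0.2]
high = [0.5, 0.9, 0.2]

# Test utterance spoken at high arousal (also invented).
test_high_arousal = [0.55, 0.85, 0.25]

# Baseline: enrollment from neutral speech only.
neutral_model = enroll([neutral])
# Mixed enrollment: utterances from several arousal levels.
mixed_model = enroll([neutral, medium, high])

neutral_score = cosine(neutral_model, test_high_arousal)
mixed_score = cosine(mixed_model, test_high_arousal)

print(f"neutral-only enrollment score: {neutral_score:.3f}")
print(f"mixed enrollment score:        {mixed_score:.3f}")
```

In this toy setup the mixed model scores the high-arousal test utterance higher than the neutral-only model does. A real evaluation would also have to check impostor scores, since a broader model could accept more impostors.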

Item URL in elib:https://elib.dlr.de/118727/
Document Type:Conference or Workshop Item (Speech)
Title:Weaknesses of voice biometrics - sensitivity of Speaker verification to emotional arousal
Authors | Institution or Email of Authors | Author's ORCID iD | ORCID Put Code
Stelkens-Kobsch, Tim H. | UNSPECIFIED | https://orcid.org/0000-0002-8485-6628 | UNSPECIFIED
Finke, Michael | UNSPECIFIED | https://orcid.org/0000-0003-2355-7779 | UNSPECIFIED
Refereed publication:Yes
Open Access:Yes
Gold Open Access:No
In ISI Web of Science:No
Keywords:biometry, speaker verification, stress, arousal
Event Title:25th International Congress on Sound and Vibration
Event Location:Hiroshima, Japan
Event Type:international Conference
Event Start Date:8 July 2018
Event End Date:12 July 2018
HGF - Research field:Aeronautics, Space and Transport
HGF - Program:Aeronautics
HGF - Program Themes:air traffic management and operations
DLR - Research area:Aeronautics
DLR - Program:L AO - Air Traffic Management and Operation
DLR - Research theme (Project):L - Communication, Navigation and Surveillance (old)
Location: Braunschweig
Institutes and Institutions:Institute of Flight Guidance > ATM-Simulation
Institute of Flight Guidance > Controller Assistance
Deposited By: Finke, Michael
Deposited On:26 Mar 2018 10:52
Last Modified:24 Apr 2024 20:23
