|
The Voice Identification and Verification Agent (VIVA) is an implementation
of conversational speech biometrics for the telephony environment. It
identifies and verifies users through spoken natural language.
Conversational speech biometrics combines two sources of authentication evidence:
1) physical speech biometrics (the voice-print) and 2) user knowledge
(e.g., passwords and personal information). Combining the two
sources increases security and reliability and provides
a flexible framework for various authentication scenarios, so that
user convenience can be maximized. The technologies that enable conversational speech
biometrics include acoustic speaker recognition, speech recognition, natural
language understanding, and dialog management. In our prototype, verification
consists of one or more short interviews that combine randomly selected authentication
questions with an acoustic voice-print check. The length of a session depends
on the correctness of the answers and the estimated voice-print confidence.
Users can enroll in the system via an HTML form and a telephone server.
Measured on real data, the system falsely accepts casual impostors
in less than 0.00001% of cases, at a false rejection rate of 3%
for genuine clients.
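
The way session length can depend on answer correctness and voice-print confidence can be illustrated with a small sketch. This is not the VIVA implementation: the function `run_interview`, the score mapping, the reward and penalty weights, the thresholds, and the sample profile are all hypothetical choices made only to show the idea of accumulating acoustic and knowledge evidence until a decision is reached.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Decision:
    accepted: bool
    questions_asked: int
    score: float


# Hypothetical enrollment data, used only for illustration.
SAMPLE_PROFILE = {
    "favorite color": "blue",
    "first pet": "rex",
    "street name": "elm",
    "favorite food": "pasta",
}


def run_interview(
    profile: Dict[str, str],
    answer_fn: Callable[[str], str],
    voice_confidence: float,
    accept_at: float = 2.0,
    reject_at: float = -2.0,
    max_questions: int = 3,
    rng: Optional[random.Random] = None,
) -> Decision:
    """Assumed decision loop; weights and thresholds are illustrative only."""
    rng = rng or random.Random()
    # Acoustic evidence: map a confidence in [0, 1] onto a score in [-2, 2].
    score = 4.0 * (voice_confidence - 0.5)
    # Draw questions at random, without repeats within one session.
    questions = rng.sample(sorted(profile), k=min(max_questions, len(profile)))
    asked = 0
    for question in questions:
        asked += 1
        given = answer_fn(question).strip().casefold()
        expected = profile[question].strip().casefold()
        # Knowledge evidence: reward a correct answer, penalize a wrong one.
        score += 1.0 if given == expected else -1.5
        if score >= accept_at:
            return Decision(True, asked, score)
        if score <= reject_at:
            return Decision(False, asked, score)
    # Evidence stayed inconclusive after all questions: reject by default.
    return Decision(False, asked, score)
```

Under these assumed weights, a caller with a strong voice match (confidence 0.9) who answers correctly is accepted after a single question, while a caller with a poor voice match (confidence 0.1) is rejected after three questions even when every answer is correct. A real system would derive its scores and thresholds from measured speaker-recognition and answer-matching statistics rather than from fixed constants.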
Scenario 1: An authentic user accesses his account (user="jiri").
From an initial voice command (a request to open the email box),
VIVA identifies the user acoustically. It then
opens an authentication interview to verify the identity. Because the
voice match is good, a single question suffices to verify
the caller. | SCENARIO 1 WAV
 |
Scenario 2: An impostor tries to break into another user's account.
Besides the natural voice mismatch, the impostor lacks the knowledge
needed to answer the biometrics questions correctly. | SCENARIO 2 WAV
 |
Scenario 3: A (well-informed) impostor tries to break into another
user's account. In this scenario, the impostor possesses
the complete list of correct answers to the true user's biometrics
questions. In practice, capturing the complete question pool,
for example by eavesdropping, is made difficult by randomizing
the interviews and asking different questions in each consecutive
session. | SCENARIO 3 WAV
 |
Publications
- "An instantiable speech biometrics module with natural language interface:
Implementation in the telephony environment," Proc. of ICASSP
2000, Istanbul, Turkey, June 2000
- "Conversational Speech Biometrics," chapter in "E-Commerce Agents: Marketplace Solutions,
Security Issues, and Supply and Demand," J. Liu and Y. Ye (Eds.),
Springer-Verlag, 2001, pages 166-179
|