
IBM Research
User Interface Technologies

Research Spotlight (January 2003): Selected Papers
A. W. Senior, "Tracking with Probabilistic Appearance Model," in Proceedings of the
ECCV Workshop on Performance Evaluation of Tracking and Surveillance
Systems, 1 June 2002, pp. 48-55.
This paper describes a real-time computer vision system for tracking people in
monocular video sequences. The system tracks people as they move
through the camera's field of view, by a combination of background
subtraction and the learning of appearance models. The appearance models allow
objects to be tracked through occlusions using a probabilistic pixel
reclassification algorithm. The system is evaluated on the three
test sequences of the PETS 2002 dataset, for which tracking results
and processing time requirements are presented.
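
As a rough illustration of the two ingredients named in the abstract above, the following Python sketch thresholds a frame against a static background and then assigns each foreground pixel to the more likely of two appearance models. It is not the paper's algorithm: the function names, the single-Gaussian grayscale models, the threshold, and the toy data are all assumptions made for illustration.

```python
import math


def foreground_mask(frame, background, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [[abs(p - b) > threshold for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def gaussian_likelihood(value, mean, sigma):
    """Likelihood of a pixel intensity under a Gaussian appearance model."""
    z = (value - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def reclassify_pixel(value, models):
    """Assign a foreground pixel to the model with the highest likelihood.

    `models` maps a hypothetical track label to a (mean, sigma) pair.
    """
    return max(models, key=lambda label: gaussian_likelihood(value, *models[label]))


# Toy grayscale data (assumed values, not from the PETS 2002 dataset).
background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 200, 90], [10, 205, 95]]
models = {"person_a": (200.0, 10.0), "person_b": (90.0, 10.0)}

mask = foreground_mask(frame, background)
labels = [[reclassify_pixel(p, models) if m else None for p, m in zip(frow, mrow)]
          for frow, mrow in zip(frame, mask)]
```

In this toy setup, pixels close to the background stay unlabeled, while each foreground pixel goes to whichever model explains its intensity better, which is the basic idea behind resolving which person a pixel belongs to when two tracked objects overlap.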
|
|
C. Neti and G. Potamianos (et al.) wrote the editorial for the special
issue "Joint Audio-Visual Speech Processing" of the EURASIP Journal on
Applied Signal Processing, in press, November 2003.
|
|
Fairweather,
P. G., Richards, J. T., & Hanson,
V. L. (2002). Distributed accessibility control points to help
deliver a directly accessible Web. Universal Access and Inclusion
in Design: A Special Issue of Universal Access in the Information
Society. DOI 10.1007/s10209-002-0037-3.
This
paper describes a set of interfaces and mechanisms to enhance access
to the World Wide Web for persons with sensory, cognitive, or motor
limitations. Paradoxically, although complex Web architectures are
often accused of impeding accessibility, their layers expand the
range of points where interventions can be staged to improve it.
This paper identifies some of these access control points and evaluates
the particular strengths and weaknesses of each. In particular,
it describes an approach to enhance access that is distributed across
multiple control points and implemented as an aggregation of services.
|
|
G. Potamianos, C. Neti, J. Luettin, and I. Matthews, "Audio-visual
automatic speech recognition: An overview," to appear in: Audio-Visual
Speech Processing, E. Vatikiotis-Bateson, G. Bailly, and P. Perrier
(Eds.), MIT Press, pp. 121-148, 2003. MIT Press book chapter on
audio-visual speech recognition.
|
|
Lisa Brown and Yingli Tian, "Comparative Study of Coarse
Head Pose Estimation," IEEE Workshop on Motion and Video
Computing, Dec. 5-6, 2002 (Orlando, FL).
For many practical applications, it is sufficient to estimate coarse head
pose to infer gaze direction. Indeed, for any application in which the
camera is situated unobtrusively in an overhead corner, the only
possible inference is coarse pose because of the limitations of
the quality and resolution of the incoming data. However, the vast
majority of research in head pose estimation deals with tracking
full rigid body motion (6 degrees of freedom) for a limited range
of motion (typically +/-45 degrees out-of-plane) and relatively
high resolution data (usually 64x64 or more). In this paper, we
review the smaller body of research on coarse pose estimation. This
work involves image-based learning, estimation of a wide range of
pose, and is capable of real-time performance for low-resolution
imagery. We evaluate two coarse pose estimation schemes, based on
(1) a probabilistic model approach and (2) a neural network approach.
We compare the results of the two techniques for varying resolution,
head localization accuracy and required pose accuracy. We conclude
with implementation specifications for resolution and localization
accuracy, depending on system accuracy requirements.
|
Malcolm Slaney, "Image-based Facial Synthesis,"
to appear in: Audio-Visual Speech Processing, E. Vatikiotis-Bateson,
G. Bailly, and P. Perrier (Eds.), MIT Press, pp. 149-161, 2003.
|
|
N. K. Ratha, J. H. Connell, and R. M. Bolle, "Secure Fingerprint Authentication,"
Chapter 11 in Automated Biometrics: Technologies and Systems, David Zhang (Ed.),
Kluwer, 2002.
Biometrics-based
authentication systems offer advantages over the present practices
of knowledge and/or possession-based authentication systems. However,
when using biometrics, the overall authentication architecture needs
to be reexamined to ensure that no new weak security points are
introduced. After analyzing a pattern recognition-based threat model
of a biometrics authentication system, this chapter describes secure
fingerprint authentication. Several solutions are proposed to alleviate
the threats using conventional encryption as well as novel techniques
that exploit the richness of biometrics data. The proposed methods
are applicable in many areas, including system security, electronic
commerce security, point of sale, point of entry/exit, and point of
access. We also argue that an authentication
scheme with both smart card and biometrics improves the overall
security of a system.
|
|
S. Maes, J. Navratil, U. Chaudhari, "Conversational Speech Biometrics,"
chapter in "E-Commerce Agents: Marketplace Solutions, Security Issues,
and Supply and Demand," J. Liu and Y. Ye (Eds.), Springer
Verlag, 2001, pp. 166-179.
This
paper discusses a new modality for speaker recognition - conversational
biometrics - as a high security voice-based authentication method
for E-commerce applications. By combining diverse simultaneous conversational
technologies, high accuracy transparent speaker recognition becomes
possible even under channel or environment mismatch. For speaker
identification over very large populations, we combine dialogs to
reduce the set of confusable speakers and text-independent speaker
identification to pin-point the actual speaker. Similarly, dialogs
with personal random or predefined questions are used to perform
simultaneously knowledge-based and acoustic-based verifications
of the user. Adequate design of the dialog allows the ROC curves
to be tailored to the needs of most applications. We demonstrate the
conceptual advantages using our telephony prototype. Users familiar
with the system can log in with 0.8% or 1.3% false
rejection and ca. 5e-12% or 2e-06% false acceptance rates in about
40 sec or 20 sec, respectively, which is an impressive result compared
to purely voice-print-based authentication.
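
To illustrate why running knowledge-based and acoustic verification together can lower false acceptance, here is a minimal Python sketch that combines error rates under an AND rule (the user is accepted only if every verifier accepts). It assumes the verifiers err independently, which is a simplification. The function name and the example rates are hypothetical and are not taken from the chapter.

```python
def combined_rates(verifiers):
    """Combine (false_accept, false_reject) pairs under an AND rule.

    Assumes the verifiers make independent errors (an illustrative assumption).
    An impostor is accepted only if every verifier falsely accepts;
    a genuine user is rejected if any verifier falsely rejects.
    """
    false_accept_all = 1.0
    genuine_pass_all = 1.0
    for false_accept, false_reject in verifiers:
        false_accept_all *= false_accept
        genuine_pass_all *= 1.0 - false_reject
    return false_accept_all, 1.0 - genuine_pass_all


# Hypothetical rates: one knowledge-based question and one acoustic check.
knowledge = (0.01, 0.002)   # (false accept, false reject)
acoustic = (0.001, 0.006)

fa, fr = combined_rates([knowledge, acoustic])
print(f"false accept: {fa:.2e}, false reject: {fr:.4f}")
```

Under this rule the false acceptance rate shrinks multiplicatively with each added check while the false rejection rate grows only roughly additively, which is one way a dialog design can trade speed against security.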