IBM Research
User Interface Technologies

Research Spotlight (January 2003): Selected Papers

A. W. Senior, "Tracking with Probabilistic Appearance Model," in Proceedings of the ECCV Workshop on Performance Evaluation of Tracking and Surveillance Systems, 1 June 2002, pp. 48-55.

This paper describes a real-time computer vision system for tracking people in monocular video sequences. The system tracks people as they move through the camera's field of view by combining background subtraction with the learning of appearance models. The appearance models allow objects to be tracked through occlusions using a probabilistic pixel reclassification algorithm. The system is evaluated on the three test sequences of the PETS 2002 dataset, for which tracking results and processing time requirements are presented.
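As a rough illustration of the background subtraction step mentioned in the abstract, the sketch below uses a generic running-average background model on small grayscale frames. It is not the paper's algorithm: the learning rate `ALPHA`, the `THRESHOLD` value, and both function names are assumptions chosen for the example, and the appearance models and pixel reclassification are not covered.

```python
# Generic background subtraction on grayscale frames stored as nested lists.
# Illustrative only: ALPHA, THRESHOLD and the function names are assumptions,
# not values or names taken from the paper.

ALPHA = 0.05      # background learning rate (hypothetical)
THRESHOLD = 30    # minimum intensity difference for foreground (hypothetical)


def foreground_mask(frame, background, threshold=THRESHOLD):
    """Return a 0/1 mask marking pixels that differ from the background."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def update_background(frame, background, mask, alpha=ALPHA):
    """Blend the frame into the background only at non-foreground pixels."""
    return [
        [b if m else (1 - alpha) * b + alpha * p
         for p, b, m in zip(frame_row, bg_row, mask_row)]
        for frame_row, bg_row, mask_row in zip(frame, background, mask)
    ]


if __name__ == "__main__":
    background = [[10.0] * 4 for _ in range(3)]
    frame = [[10, 10, 200, 200],
             [10, 10, 200, 200],
             [12, 10, 10, 10]]
    mask = foreground_mask(frame, background)
    background = update_background(frame, background, mask)
    for row in mask:
        print(row)
```

Freezing the background at foreground pixels keeps a person who stands still from being absorbed into the background model too quickly. A real system would choose that policy according to its own requirements.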


C. Neti, G. Potamianos, et al. wrote the editorial for the special issue "Joint Audio-Visual Speech Processing" of the EURASIP Journal on Applied Signal Processing, in press, November 2003.


Fairweather, P. G., Richards, J. T., & Hanson, V. L. (2002). Distributed accessibility control points to help deliver a directly accessible Web. Universal Access and Inclusion in Design: A Special Issue of Universal Access in the Information Society. DOI 10.1007/s10209-002-0037-3.

This paper describes a set of interfaces and mechanisms to enhance access to the World Wide Web for persons with sensory, cognitive, or motor limitations. Paradoxically, although complex Web architectures are often accused of impeding accessibility, their layers expand the range of points where interventions can be staged to improve it. This paper identifies some of these access control points and evaluates the particular strengths and weaknesses of each. In particular, it describes an approach to enhance access that is distributed across multiple control points and implemented as an aggregation of services.


G. Potamianos, C. Neti, J. Luettin, and I. Matthews, "Audio-visual automatic speech recognition: An overview," to appear in: Audio-Visual Speech Processing, E. Vatikiotis-Bateson, G. Bailly, and P. Perrier (Eds.), MIT Press, pp. 121-148, 2003. (MIT Press book chapter on audio-visual speech recognition.)


Lisa Brown and Yingli Tian, "Comparative Study of Coarse Head Pose Estimation," IEEE Workshop on Motion and Video Computing, Dec. 5-6, 2002 (Orlando, FL).

For many practical applications, it is sufficient to estimate coarse head pose to infer gaze direction. Indeed, for any application in which the camera is situated unobtrusively in an overhead corner, the only possible inference is coarse pose because of the limited quality and resolution of the incoming data. However, the vast majority of research in head pose estimation deals with tracking full rigid body motion (6 degrees of freedom) for a limited range of motion (typically +/-45 degrees out-of-plane) and relatively high resolution data (usually 64x64 or more). In this paper, we review the smaller body of research on coarse pose estimation. This work involves image-based learning, covers estimation of a wide range of pose, and is capable of real-time performance on low-resolution imagery. We evaluate two coarse pose estimation schemes, based on (1) a probabilistic model approach and (2) a neural network approach. We compare the results of the two techniques for varying resolution, head localization accuracy, and required pose accuracy. We conclude with implementation specifications for resolution and localization accuracy as a function of system accuracy requirements.


Malcolm Slaney, "Image-based Facial Synthesis," to appear in: Audio-Visual Speech Processing, E. Vatikiotis-Bateson, G. Bailly, and P. Perrier (Eds.), MIT Press, pp. 149-161, 2003.

N. K. Ratha, J. H. Connell, and R. M. Bolle, "Secure Fingerprint Authentication," Chapter 11 in Automated Biometrics: Technologies and Systems, David Zhang (Ed.), Kluwer, 2002.

Biometrics-based authentication systems offer advantages over the present practices of knowledge- and/or possession-based authentication systems. However, when using biometrics, the overall authentication architecture needs to be reexamined to ensure that no new weak security points are introduced. After analyzing a pattern recognition-based threat model of a biometrics authentication system, this chapter describes secure fingerprint authentication. Several solutions are proposed to alleviate the threats using conventional encryption as well as novel techniques that exploit the richness of biometrics data. The proposed methods are applicable in many application areas, including system security, electronic commerce security, point of sale, point of entry/exit, and point of access. We also argue that an authentication scheme with both smart card and biometrics improves the overall security of a system.


S. Maes, J. Navratil, and U. Chaudhari, "Conversational Speech Biometrics," chapter in "E-Commerce Agents: Marketplace Solutions, Security Issues, and Supply and Demand," J. Liu and Y. Ye (Eds.), Springer-Verlag, 2001, pp. 166-179.

This paper discusses a new modality for speaker recognition, conversational biometrics, as a high-security voice-based authentication method for e-commerce applications. By combining diverse simultaneous conversational technologies, high-accuracy transparent speaker recognition becomes possible even under channel or environment mismatches. For speaker identification over very large populations, we combine dialogs, which reduce the set of confusable speakers, with text-independent speaker identification, which pinpoints the actual speaker. Similarly, dialogs with personal random or predefined questions are used to perform knowledge-based and acoustic-based verification of the user simultaneously. Adequate design of the dialog allows the ROC curves to be tailored to the needs of most applications. We demonstrate the conceptual advantages using our telephony prototype. Users familiar with the system can log in with 0.8% or 1.3% false rejection and ca. 5e-12% or 2e-06% false acceptance rates in about 40 sec or 20 sec, respectively, which is an impressive result compared to purely voice-print based authentication.
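To give a feel for why combining knowledge-based and acoustic checks can push false acceptance so low, the sketch below fuses several verifiers under two assumptions that the abstract does not state: the checks err independently, and a user is accepted only if every check passes (an AND rule). The function name `fuse_and_rule` and all rates in the example are hypothetical placeholders, not figures from the paper.

```python
# AND-rule fusion of independent verifiers. All rates below are hypothetical
# placeholders, not figures from the paper.

def fuse_and_rule(checks):
    """Combine (false_accept, false_reject) pairs for checks that must all pass.

    Assumes the checks err independently of one another.
    """
    false_accept = 1.0
    pass_genuine = 1.0
    for fa, fr in checks:
        false_accept *= fa          # an impostor must fool every check
        pass_genuine *= (1.0 - fr)  # a genuine user must pass every check
    return false_accept, 1.0 - pass_genuine


if __name__ == "__main__":
    voice = (1e-2, 5e-3)      # hypothetical acoustic verifier rates
    question = (1e-3, 2e-3)   # hypothetical knowledge question rates
    fa, fr = fuse_and_rule([voice, question, question])
    print(f"false acceptance: {fa:.2e}, false rejection: {fr:.2e}")
```

Under these assumptions each added question multiplies the false acceptance rate down while nudging the false rejection rate up, which is one simple way to picture how dialog design can trade one error type against the other.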


 