Chalapathy Neti - Publications
Joint processing of audio and visual information
The key thrust of this work is to understand how visual information could
be exploited to improve audio-based processing of speech and speaker in
adverse acoustic conditions. This is a first step towards understanding
general principles of combining/fusing multiple sources of information
for robust interpretation of human identity, activity and intent (Perceptual
computing).
- HJ Nock, G Iyengar, C Neti, Speaker Localisation using Audio-Visual Synchrony: An Empirical Study
To appear in CIVR 2003
- G Iyengar, HJ Nock, C Neti, Audio-Visual Synchrony for Detection of Monologues in Video Archives, Proc ICASSP 2003 (Presented at ICME 2003)
- HJ Nock, G Iyengar, C Neti, Issues in Speech-based Retrieval of Video,
Proc ISCA Tutorial Workshop (Multilingual Spoken Document Retrieval) 2003
- HJ Nock, W Adams, G Iyengar, C-Y Lin, M Naphade, A Natsev, C Neti, JR Smith, B Tseng, User-trainable Video Annotation Using Multimodal Cues
To appear in Proc SIGIR 2003
- D.M. Russell, P.P. Maglio, R. Dordick, C. Neti, Dealing with Ghosts: Managing the User Experience of Autonomic Computing, IBM Systems Journal, Vol. 42, No. 1, pp. 177-188, 2003.
- W.H. Adams, G. Iyengar, C-Y Lin, M.R. Naphade, C. Neti, H.J. Nock, J.R. Smith, Semantic Indexing of Multimedia Content Using Visual, Audio and Text Cues, Eurasip Journal on Applied Signal Processing
Vol 2003, No 2, Feb 2003
- Sabine Deligne, Gerasimos Potamianos, Chalapathy Neti, Audio-Visual Speech Enhancement With AVCDCN (Audio-Visual Codebook
Dependent Cepstral Normalization), IEEE workshop on Sensor Array and Multichannel Signal Processing in August 2002, Washington DC and ICSLP 2002, Denver.
- G. Iyengar, H. Nock, C. Neti, M. Franz, Semantic Indexing of Multimedia using Audio, Text and Visual Cues, Proceedings of ICME2002, Lausanne, Switzerland, 2002
- G. Gravier, G. Potamianos, and C. Neti, Asynchrony modeling for audio-visual speech recognition, Proc. Human Language Technology Conference, San Diego, 2002.
- G. Gravier, S. Axelrod, G. Potamianos, and C. Neti, Maximum entropy and MCE based HMM stream weight estimation for audio-visual ASR, Proc. Int. Conf. Acoust. Speech Signal Process., Orlando, 2002.
- R. Goecke, G. Potamianos, and C. Neti, Noisy audio feature enhancement using audio-visual speech data, Proc. Int. Conf. Acoust. Speech Signal Process., Orlando, 2002.
- G. Iyengar, C. Neti. A vision-based microphone switch for speech intent detection, Recognition, Analysis and Tracking of Face and Gestures in Real Time Systems (RATFG-RTS) Workshop at ICCV 2001 in Vancouver, 13th July 2001.
- G. Potamianos, C. Neti, G. Iyengar, and E. Helmuth, Large-vocabulary
audio-visual speech recognition by machines and humans, Proc.
Eurospeech, Aalborg, 2001.
- G. Potamianos and C. Neti, Automatic speechreading
of impaired speech, Proc. Work. Audio-Visual Speech Process.,
Scheelsminde, 2001.
- G. Potamianos and C. Neti, Improved
ROI and within frame discriminant features for lipreading, Proc.
Int. Conf. Image Process., Thessaloniki, 2001.
- C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, and D. Vergyri,
Large-vocabulary audio-visual speech recognition: A summary of the Johns
Hopkins Summer 2000 Workshop, Proc. IEEE Work. Multimedia Signal
Process., Cannes, 2001.
- G. Iyengar and C. Neti, Detection of faces under shadows and lighting variations, Cannes, 2001.
- G. Iyengar, G. Potamianos, C. Neti, T. Faruquie, and A. Verma, Robust
detection of visual ROI for automatic speechreading, Proc. IEEE Work.
Multimedia Signal Process., Cannes, 2001.
- I. Matthews, G. Potamianos, C. Neti, and J. Luettin, A
comparison of model and transform-based visual features for audio-visual
LVCSR, Proc. IEEE Int. Conf. Multimedia Expo., Tokyo, 2001.
- G. Potamianos, C. Neti, G. Iyengar, A.W.
Senior, and A. Verma, A cascade visual front end for speaker independent
automatic speechreading, Int. J. Speech Technology,
Vol. 4, pp. 193-208, 2001.
- G. Potamianos, J. Luettin, C. Neti. Hierarchical
discriminant features for audio-visual LVCSR, ICASSP, Salt Lake
City, May 2001.
- J. Luettin, G. Potamianos, C. Neti. Asynchronous
stream modeling for large-vocabulary audio-visual speech recognition,
ICASSP, Salt Lake City, May 2001.
- H. Glotin, D. Vergyri, C. Neti, G. Potamianos, J. Luettin. Weighting
schemes for audio-visual fusion in speech recognition, ICASSP, Salt
Lake City, May 2001.
- G. Potamianos, C. Neti. Stream confidence
estimation for audio-visual speech recognition, ICSLP, vol III,
pp. 746-749, Beijing, October 2000
- C. Neti, G. Iyengar, G. Potamianos, A. Senior, B. Maison. Perceptual
interfaces for information interaction: Joint processing of audio and visual
information for human-computer interaction, ICSLP, vol III, pp.
11-14, Beijing, October 2000.
- C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, D. Vergyri,
J. Sison, A.Mashari, and J. Zhou, Audio-Visual
Speech Recognition, Final Workshop 2000 Report, Center for
Language and Speech Processing, The Johns Hopkins University, Baltimore,
MD (Oct. 12, 2000).
- G. Potamianos,
A. Verma, C. Neti, G. Iyengar. A cascade
image transform for speaker independent automatic speechreading.
International Conference on Multimedia and Expo (ICME00), New
York, July 2000.
- C.Neti,
P.deCuetos, A.Senior. Audio-visual intent-to-speak
detection for human-computer interaction, ICASSP June 5-9
2000, Istanbul, Turkey.
- C.Neti,
B.Maison, A.Senior, G.Iyengar, P.deCuetos, S.Basu, A.Verma.
Joint processing of audio and visual information for multimedia
indexing and human-computer interaction, RIAO April 12-14
2000, Paris, France.
- G.Iyengar,
C.Neti. Speaker change detection using
joint audio-visual statistics, RIAO 12-14 April 2000,
Paris, France, Dec. 20, 1999.
- Ashish
Verma, Tanveer Faruquie, C. Neti, Sankar Basu, Andrew Senior.
Late Integration in Continuous Audio-Visual Speech Recognition,
ASRU, Colorado, 1999.
- Benoit Maison, Chalapathy Neti, Andrew Senior.
Audio-Visual speaker recognition for video broadcast news: some fusion
techniques, IEEE Multimedia Signal Processing (MMSP99),
Denmark, Sept, 1999.
- S. Basu, C. Neti, N. Rajput, A. Senior, L. Subramaniam, A. Verma.
Audio-Visual large-vocabulary continuous speech recognition in the
broadcast news domain, IEEE Multimedia Signal Processing
Conference (MMSP99), Denmark, Sept, 1999.
- Andrew Senior, Chalapathy Neti, Benoit Maison. On
the use of visual information for improving audio-based speaker recognition,
Audio-Visual Speech processing conference (AVSP99), Santa Cruz,
CA, Aug, 1999.
- Chalapathy Neti, Andrew Senior. Audio-Visual
speaker recognition for video broadcast news, DARPA HUB4
Workshop, Washington D.C., March 1999.
Position Papers
- Chalapathy Neti, Stephane Maes, Mark Lucente and Dragutin Petkovic.
Knowledge/Smart Spaces, 1999 DARPA/NSF/NIST Workshop on Research
issues in Smart Computing Environments, July 1999.
- S. Basu, E. E. Jan, Mark Lucente and Chalapathy Neti. Beyond Audio-based
speech recognition, 1998 NIST/DARPA Workshop on SmartSpaces,
Gaithersburg, MD, 1998.
Conversational (Spoken language) systems
The thrust of these papers is to develop an understanding of the design
of conversational systems that include speech recognition, natural language
understanding and dialog. The first paper is the basis of a prototype
for conversational interaction with personal information and the
second is the basis for the bilingual (English and French) ATIS prototype
demonstration. Both demonstrations are widely used by the Human Language
Technology group for customer visits.
- G. Ramaswamy, J. Kleindienst, D. Coffman, P. Gopalakrishnan and C.
Neti., A Pervasive Conversational Interface for Information
Interaction. Eurospeech99, Budapest, Hungary, 1999.
- T.Ward, S. Roukos, C. Neti, M. Epstein, S. Dharanipragada. Towards
speech understanding across multiple languages, Proceedings of
the International conference on spoken language processing (ICSLP98),
Sydney, Australia, 1998.
Speech Recognition
The thrust of these papers is to improve spontaneous, speaker and language-independent
speech recognition performance. These papers are related to algorithms
on confidence estimation, accent and language independence and noise-robust
speech representations based on the mammalian auditory system.
- C. Neti, S. Dharanipragada and Salim Roukos. Towards a universal
speech recognizer for multiple languages. Automatic Speech Recognition
and Understanding Workshop (ASRU97), Santa Barbara, CA,
1997.
- C. Neti and Salim Roukos. Phone-specific gender-dependent models
for continuous speech recognition, Automatic Speech Recognition
and Understanding Workshop (ASRU97), Santa Barbara, CA,
1997.
- C. Neti, E. Eide and Salim Roukos. Word-based confidence measures
as a guide for stack search in continuous speech recognition.
International Conference on Acoustics Speech and Signal Processing (ICASSP97),
Munich, Germany, 1997.
- R.Bakis, P.S. Gopalakrishnan, R. Gopinath, F.H. Liu, S. Maes, M.
Monkowski, C. Neti, H. Printz, P.S. Rao. Confidence Measures.
Proceedings of the Large Vocabulary Continuous Speech Recognition
Workshop. April 29, 1996.
- Nagendra Kumar, Chalapathy Neti and Andreas Andreou. Application
of Discriminant Analysis to Speech Recognition with Auditory Model Based
Features. Proceedings of the Speech Research Symposium XV,
Johns Hopkins University, Baltimore, MD, 1995.
- Chalapathy Neti. Neuromorphic Speech processing for noisy environments.
Proceedings of the IEEE International conference on Neural Network,
Orlando, FL, pp 4425-4430, 1994.
Computational Neuroscience and Biology
The thrust of these papers is to develop computational models of human
sensory processing and systemic aspects. In particular, I focussed on
models of sound localization that are structurally similar to the underlying
brain pathways. The concept of fault tolerance and model of sound localization
developed in this work is the first work of its kind and is cited by all
subsequent work.
- Chalapathy Neti, Michael Schneider and Eric Young. Maximally fault-tolerant
Neural Networks. IEEE Transactions on Neural Networks,
vol 3, no 1, pp 14-23, 1992.
- Chalapathy Neti, Eric Young and Michael Schneider. Neural network
models of Sound Localization based on Directional filtering by the Pinna.
J. Acoust. Soc. America, Vol 92, No 6, pp 3140-3156, 1992.
- Michael Schneider, Kristen Farrow and Chalapathy Neti. Low Storage
Second-Order learning algorithms. Proceedings of IJCNN, Seattle,
pp A-954, 1991.
- Chalapathy Neti, Michael Schneider and Eric Young. Maximally fault-tolerant
Neural Networks and nonlinear programming. Proceedings of the
IJCNN, San Diego, CA, 1990.
- Chalapathy Neti and Eric Young. A neural network model of
sound localization based on spectral cues. Neuroscience
Abstracts, St. Louis, MO, 1990.
- K. Campbell, J. Ringo, C. Neti and J. Alexander. Informational
analysis of Left Ventricle/Systemic-Arterial interaction. Annals
of Biomedical Engineering, pp 209-231, 1984.
VLSI design tools
The main focus was to develop sophisticated design tools for function,
timing and fault simulation of VLSI chips.
- Chalapathy Neti and David Coelho. Timing-Verification using a General
behavioural simulator. Proceedings of International conference
on Computer Design, Portchester, New York, 1984.
PATENTS
- D. Coffman, P. Gopalakrishnan, G. Ramaswamy, J. Kleindienst, C. Neti. Method and system for multi-client access to a dialog system, US Patent Number 6,377,913, April 2002.
- S. Basu, T. Faruquie, C. Neti, N. Rajput, L. V. Subramaniam and A. Verma. Speech driven lip synthesis using viseme based hidden markov models, US Patent Number 6,366,885, April 2002.
- Chalapathy Neti and Salim Roukos. Speech recognition models combining
gender-dependent and gender-independent phone states using phonetic-context-dependence,
US Patent Number 5,953,701. Issued Sept 14, 1999.
- Chalapathy Neti. Method and System for noise-robust speech processing
with cochlear filters in an auditory model, US Patent Number
5,768,474, 1998.
- Chalapathy Neti. Method and System for adapter configuration in
a data processing system, US Patent Number 5,619,701, 1997.
Email: cneti@watson.ibm.com