Beomseok Nam, Henrique Andrade, et al.
ACM/IEEE SC 2006
The Pen Technologies group at IBM Research has recently been investigating methods for retrieving handwritten documents based on user queries. This paper investigates the use of typed and handwritten queries to retrieve relevant handwritten documents. The IBM handwriting recognition engine was used to generate N-best lists for the words in each of 108 short documents. These N-best lists are concise statistical representations of the handwritten words, and they make the retrieval methods robust to machine transcription errors, allowing retrieval of documents that a traditional transcription-based retrieval system would miss. Our experimental results demonstrate that significant improvements in retrieval performance can be achieved compared to standard keyword text searching of machine-transcribed documents. We have also developed a software architecture for a multimedia document retrieval framework into which machine learning algorithms for feature extraction and matching may be easily integrated. The framework provides a "plug-and-play" mechanism for integrating new media types, new feature extraction methods, and new document types.
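To illustrate why N-best lists can recover documents that a top-1 transcription search misses, the following is a minimal sketch, not the paper's method. The scoring rule (normalized candidate score, best match per query term), the function names `top1_match` and `nbest_score`, and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple

# One N-best list per handwritten word: (candidate transcription, recognizer score).
# The scores used below are made-up illustrative values.
NBest = List[Tuple[str, float]]
Document = List[NBest]


def top1_match(query_terms: List[str], doc: Document) -> float:
    """Baseline: keyword search over only the best candidate for each word."""
    top1 = {max(nbest, key=lambda c: c[1])[0].lower() for nbest in doc if nbest}
    return float(sum(1 for term in query_terms if term.lower() in top1))


def nbest_score(query_terms: List[str], doc: Document) -> float:
    """Assumed scoring rule: for each query term, take the highest normalized
    score of a matching candidate anywhere in the document, then sum over terms."""
    score = 0.0
    for term in query_terms:
        best = 0.0
        for nbest in doc:
            total = sum(s for _, s in nbest)
            if total <= 0:
                continue  # skip empty or degenerate lists
            for candidate, s in nbest:
                if candidate.lower() == term.lower():
                    best = max(best, s / total)
        score += best
    return score


# The recognizer's top choice for the first word is wrong ("clog"),
# but the intended word ("dog") is still present in the N-best list.
doc: Document = [
    [("clog", 0.5), ("dog", 0.4), ("dug", 0.1)],
    [("barks", 0.9), ("banks", 0.1)],
]
baseline = top1_match(["dog"], doc)   # the top-1 transcription misses the term
robust = nbest_score(["dog"], doc)    # the N-best list still gives partial credit
```

In this sketch the baseline finds no match for "dog", while the N-best score is nonzero, so the document can still be ranked for the query despite the transcription error.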
Xinyi Su, Guangyu He, et al.
Dianli Xitong Zidonghua/Automation of Electric Power Systems
Kento Tsubouchi, Yosuke Mitsuhashi, et al.
npj Quantum Information
Gabriele Dominici, Pietro Barbiero, et al.
ICLR 2025