
Automatic Recognition

Overview

Selected fields of interest are processed by Optical Character Recognition (OCR), Intelligent Character Recognition (ICR), Optical Mark Recognition (OMR), and barcode recognition. In some applications, the results of several OCR/ICR engines can be combined by voting. Both machine-printed and hand-printed fields are recognized.
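The voting step can be illustrated with a minimal sketch. It assumes each engine reports one (character, confidence) pair per character position and that a confidence-weighted sum is an acceptable voting rule; the function names `vote` and `vote_field` are hypothetical, and the actual voting scheme is not described here.

```python
from collections import defaultdict


def vote(candidates):
    """Combine (character, confidence) guesses from several engines for one position.

    Returns the character with the highest summed confidence and its share
    of the total confidence. Returns (None, 0.0) when there are no candidates.
    """
    scores = defaultdict(float)
    for char, confidence in candidates:
        scores[char] += confidence
    if not scores:
        return None, 0.0
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, (scores[best] / total if total else 0.0)


def vote_field(engine_outputs):
    """Vote position by position over equally long per-engine outputs."""
    return [vote(position) for position in zip(*engine_outputs)]


# Three hypothetical engines reading the same two-character field.
engine_a = [("1", 0.9), ("7", 0.6)]
engine_b = [("1", 0.8), ("1", 0.7)]
engine_c = [("1", 0.9), ("7", 0.5)]
result = vote_field([engine_a, engine_b, engine_c])
print("".join(char for char, _ in result))
```

A low winning share (as in the second position above, where the engines disagree) is a natural signal for routing the field to manual review.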

Numeric and alphanumeric information can be handled, along with special character segmentation (as needed for checks, old typewriter output, etc.).

OMR and barcode capabilities include locating and recognizing barcodes and various other marks (e.g., checkboxes). Both one-dimensional (1D) and two-dimensional (2D) barcodes can be handled.

Character Separation Technology

Character separation precedes OCR: a field is split into its individual characters.

The field is first segmented into connected components, and each connected component is then separated into its individual characters. Optionally, a preprocessing pass over the whole page learns the specific characteristics of the writer, such as nominal values of pen width, character size, and typical distance between characters.
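The first step, segmentation into connected components, can be sketched as follows. This is an illustrative flood-fill labeling under assumed conventions (a binary image given as a list of rows, ink pixels equal to 1, 8-connectivity); the function name `connected_components` is hypothetical. It does not cover the second step of splitting a component that contains touching characters.

```python
from collections import deque


def connected_components(image):
    """Find 8-connected groups of ink pixels (value 1) in a binary image.

    Returns bounding boxes (left, top, right, bottom), sorted left to right.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited ink pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


# A tiny field containing two separate strokes.
field = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 1],
]
print(connected_components(field))  # [(0, 0, 1, 2), (4, 0, 5, 2)]
```

Writer characteristics such as nominal character width could then be estimated from the box sizes, for example to flag unusually wide components as candidates for further splitting.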

OCR - Optical Character Recognition

In typical applications, OCR is activated in the omni-font mode, where the system automatically adjusts itself to the specific font type and size.

A dedicated module identifies italic and dot-matrix text.

In some cases, it is advisable to train the system for specific standard fonts (e.g., OCR-B, CMC-7, etc.). In such cases, exceptionally high recognition rates are possible.

Special solutions can be applied for non-typical cases such as text printed on typewriters.

ICR - Intelligent Character Recognition for Handwriting

The ICR process includes algorithms for identifying the topology of the writing, character separation, and character identification by neural networks.

ICR recognizes characters using the following steps:

  • Training phase - offline:
    • Find topology
    • Calculate features
    • Train a neural network for each topology
  • Recognition phase - online:
    • Find topology
    • Calculate features
    • Use neural network trained on this topology
    • Produce a probability for each class
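The recognition phase listed above can be sketched in a few lines. This is an illustrative stand-in, not the system's implementation: the topology label is assumed to be already computed, a single linear layer with softmax stands in for each trained neural network, and the names `recognize`, `models`, `no_loop`, and `one_loop` as well as all weights are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def recognize(features, topology, models):
    """Select the model trained for this topology and return {class: probability}."""
    model = models[topology]
    scores = [
        sum(w * f for w, f in zip(weights, features)) + bias
        for weights, bias in zip(model["weights"], model["biases"])
    ]
    return dict(zip(model["classes"], softmax(scores)))


# Hypothetical models, one per topology, each covering only the classes
# that can have that topology.
models = {
    "no_loop": {
        "classes": ["1", "7"],
        "weights": [[2.0, -1.0], [-1.0, 2.0]],
        "biases": [0.0, 0.0],
    },
    "one_loop": {
        "classes": ["0", "6", "9"],
        "weights": [[1.0, 1.0], [2.0, -1.0], [-1.0, 2.0]],
        "biases": [0.0, 0.0, 0.0],
    },
}

probabilities = recognize([0.2, 0.9], "no_loop", models)
print(max(probabilities, key=probabilities.get))  # 7
```

Selecting the model by topology first means each network only has to discriminate among the few classes that share that topology.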

In typical applications, no ICR training is required; the default training results are sufficient. However, training capabilities are important for tuning the system to the writing patterns of a given country (e.g., to account for the different ways the digit "7" is written in Europe and the US).

OMR and Barcode Recognition

System capabilities for OMR and barcode recognition include the location and recognition of barcodes and other marks (e.g., checkboxes). Our system can handle both one-dimensional (1D) and two-dimensional (2D) barcodes, and can decode low-resolution or even damaged barcodes in real time.
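As an illustration of mark recognition, a checkbox can be classified by the fraction of ink pixels inside its interior. This is a simplified sketch under assumptions not stated above: the checkbox has already been located, the image is binary (ink equals 1), and a fixed fill threshold of 0.15 is adequate; the function name `is_checked` and the threshold value are hypothetical.

```python
def is_checked(image, box, threshold=0.15):
    """Decide whether a checkbox is marked from the ink ratio inside its interior.

    image: binary image as a list of rows (ink pixels are 1).
    box: interior region (left, top, right, bottom), inclusive, excluding the frame.
    """
    left, top, right, bottom = box
    pixels = [
        image[y][x]
        for y in range(top, bottom + 1)
        for x in range(left, right + 1)
    ]
    if not pixels:
        return False
    return sum(pixels) / len(pixels) >= threshold


# Two 4x4 checkbox interiors: one empty, one with a diagonal stroke.
empty = [[0] * 4 for _ in range(4)]
ticked = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
print(is_checked(empty, (0, 0, 3, 3)), is_checked(ticked, (0, 0, 3, 3)))  # False True
```

In practice the threshold would need tuning per form, since stray marks and scanner noise also contribute ink.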

Online Signature Recognition

An online signature is a signature captured with a digitizing tablet and a special pen. The system recognizes the signature using data such as coordinates, pen tilt, and pressure. All features include timing data, which makes the system more resistant to forgery.
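To show why timing data helps, the sketch below compares two pen traces with dynamic time warping (DTW) after adding pen speed as a feature. DTW is a common technique for comparing time series and is used here only as an assumed stand-in; the matching method of the system itself is not described. The sample layout (t, x, y, pressure), the omission of pen tilt, the unnormalized features, and the names `with_speed` and `dtw_distance` are all assumptions.

```python
import math


def with_speed(samples):
    """Convert (t, x, y, pressure) samples into (x, y, pressure, speed) features.

    Assumes strictly increasing timestamps. The first sample has no
    predecessor, so it only serves as the starting point.
    """
    features = []
    for (t0, x0, y0, _), (t1, x1, y1, p1) in zip(samples, samples[1:]):
        speed = math.dist((x0, y0), (x1, y1)) / (t1 - t0)
        features.append((x1, y1, p1, speed))
    return features


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A reference signature and a slow tracing of exactly the same path.
genuine = [(0.0, 0, 0, 0.5), (0.1, 1, 1, 0.6), (0.2, 2, 0, 0.4)]
traced = [(0.0, 0, 0, 0.5), (0.5, 1, 1, 0.6), (1.0, 2, 0, 0.4)]

same = dtw_distance(with_speed(genuine), with_speed(genuine))
forged = dtw_distance(with_speed(genuine), with_speed(traced))
print(same, forged > same)
```

The traced copy follows the same coordinates and pressure, so only the speed feature separates it from the reference, which is the effect the timing data is meant to provide.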