Summary of Activities

The Document Processing and Management group at the IBM Haifa Research Lab focuses on research and development in efficient data acquisition, data acquisition processes, and advanced productivity enhancement tools, among other areas. The group tailors each solution to the specific requirements of the project, adapting existing technologies and developing new ones as required.

The group has developed these main technologies and skills:

  • Processing various types of forms:
    • Structured forms - forms with consistent structure
    • Logical structured forms - similar forms with the same functionality but printed slightly differently
    • Unstructured forms - forms with similar functionality but different structure
  • Automatic character recognition and related technologies:
    • Printed character recognition (OCR)
    • Handwritten character recognition (ICR)
    • Optical mark recognition such as checkboxes (OMR)
    • One-dimensional and two-dimensional barcodes
    • Dynamic signature verification
  • Advanced post-OCR logic:
    • Disambiguation - improving OCR results using the dictionary of choice
    • Logics - utilizing predefined rules such as math logic, street names, and business process databases
    • Search on OCR - a robust solution for searching low-quality prints
  • Back-office productivity:
    • SmartKey - a unique patented tool that enables fast, accurate manual correction of automatic recognition results. Using SmartKey, overall operator productivity can be up to five times higher than with conventional automated systems.
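
The post-OCR logic listed above (disambiguation, logics, and search on OCR) can be illustrated with a minimal sketch. This is not the group's implementation: it uses Python's standard difflib as a stand-in for whatever matching the real system performs, and the function names, the sample street list, and the thresholds are all hypothetical.

```python
from difflib import SequenceMatcher, get_close_matches


def disambiguate(token, dictionary, cutoff=0.6):
    """Disambiguation: map a noisy OCR token to its closest dictionary entry.

    Returns the token unchanged if nothing in the dictionary is similar enough.
    The cutoff value is an illustrative assumption.
    """
    if token in dictionary:
        return token
    matches = get_close_matches(token, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def totals_consistent(line_items, reported_total):
    """Logics: a simple math rule, checking that recognized line items
    add up to the recognized total. A mismatch flags the form for review."""
    return sum(line_items) == reported_total


def fuzzy_contains(query, text, threshold=0.8):
    """Search on OCR: report whether some window of the text is
    approximately equal to the query, tolerating recognition errors."""
    q, t = query.lower(), text.lower()
    n = len(q)
    if n == 0:
        return True
    for i in range(max(1, len(t) - n + 1)):
        if SequenceMatcher(None, q, t[i:i + n]).ratio() >= threshold:
            return True
    return False


# Hypothetical sample data, for illustration only.
STREETS = ["Main Street", "Maple Avenue", "Harbor Road"]

if __name__ == "__main__":
    print(disambiguate("Ma1n Strcet", STREETS))
    print(totals_consistent([100, 250, 75], 425))
    print(fuzzy_contains("newsreel", "Hearst Metrotone newsree1 index"))
```

In this sketch, a token that fails disambiguation or a form that fails the totals check would be a natural candidate for manual correction by an operator, while the fuzzy search lets a query match text that still contains recognition errors.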

Several success stories demonstrate the group's technologies and skills:

  • IFP (Intelligent Form Processing) - originally developed under contract with the Maryland Department of Revenue to process Maryland state tax forms. IFP was later installed in five other states and territories (Wisconsin, North Carolina, Vermont, Maine, and the U.S. Virgin Islands). It was also used for medical claims, invoices, employment forms, and censuses. Today, this system is the basis for the integrated IBM document processing platform.
  • Automatic Parcel Sorting System - reads and decodes addresses located on a parcel while the parcel travels on a conveyor. The system sorts parcels in real time with virtually no errors.
  • Stockholm toll system - automatic license plate recognition based on adapted OCR technology. The results exceeded those of other OCR solutions.
  • Hearst Metrotone newsreel collection - this important historical record includes approximately 850 hours of newsreel footage. The collection is indexed on 675,000 cards made of aging paper with low-quality printing. Our solution for digitizing the index doubled the recognition rate.
  • Censuses - an adapted OCR technology combined with SmartKey achieved high accuracy results with minimal operator effort.