Automatic Parcel Sorting System
The Automatic Parcel Sorting System developed by IBM Haifa reads and decodes the addresses on a parcel while the parcel travels on a conveyor. Performing correct layout analysis proved to be a major challenge. Unlike on standard envelopes, the location of the address information on parcels is not strictly defined. Moreover, both the sender and the recipient addresses are located on the same side of the parcel.
This system was installed in the sorting centers at Swiss Post, where it has proven stable and delivers high-quality results.
The Automatic Parcel Sorting System decodes addresses from a grayscale image. The image is captured by a high resolution camera located above the parcel while it travels on a conveyor.
The address decoding process is composed of the following main steps:
- Acquire image (up to 40 MB)
- Find regions of interest (ROIs) for barcodes, labels, and addresses
- For each type of ROI:
  - Locate the objects within the ROI
  - Decode the object: OCR for addresses, barcode reading for barcodes (see Barcode Detection and Recognition section, above)
- Verify the addresses by running postal dictionary lookup
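The steps above can be sketched as a small driver loop. This is only an illustration of the control flow, not the actual IBM implementation: the `Roi` class, the `decode_parcel` function, and the callables `find_rois`, `decoders`, and `verify_address` are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Roi:
    """Hypothetical region of interest: its type and its pixel bounding box."""
    kind: str                       # "barcode", "label", or "address"
    box: Tuple[int, int, int, int]  # (top, left, bottom, right)


def decode_parcel(image,
                  find_rois: Callable,
                  decoders: Dict[str, Callable],
                  verify_address: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Run the decoding steps on one parcel image.

    find_rois(image) returns a list of Roi objects.
    decoders maps an ROI kind to a function (image, roi) -> decoded text.
    verify_address(text) stands in for the postal dictionary lookup.
    """
    results = []
    for roi in find_rois(image):
        decoder = decoders.get(roi.kind)
        if decoder is None:
            continue  # no decoder registered for this ROI type (e.g. a logo)
        text = decoder(image, roi)
        if roi.kind == "address":
            # Only addresses are checked against the postal dictionary.
            text = verify_address(text)
        results.append((roi.kind, text))
    return results
```

Passing the stages in as callables keeps the sketch independent of any particular OCR or barcode library.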
The input image is acquired by a high-resolution line CCD camera located above the conveyor. The images may include labels, logos, strips, graphics, and wrapping plastics: all the usual variation of parcels at a sorting site.
The following diagram presents a schematic of the system for parcel barcode recognition.

Figure 23 - System scheme
The Process in Detail
The process first finds ROIs that are deemed to include text. This step is carried out on the grayscale image and is efficient enough that images of up to 40 MB are processed in real time.
The address block grayscale image is extracted, rotated, and de-skewed. The input image has a perspective distortion in the x direction, where the distortion level is proportional to the parcel height. Therefore, a special method is used to extract and correct the address block image. The image is then enhanced to improve the quality of the OCR results.
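The text does not describe the correction method itself. As a rough illustration of why the distortion depends on parcel height, the sketch below assumes a simple pinhole model along each scan line: a surface at height h under a camera mounted at height H appears magnified by H / (H - h) about the optical center. The function names and parameters (`correct_x`, `resample_row`, `camera_height`, `center_x`) are assumptions made for this example.

```python
def correct_x(x_img, parcel_height, camera_height, center_x):
    """Map an x coordinate seen on top of a parcel to belt-level scale.

    Assumes a pinhole model along the scan line; heights share one unit,
    x_img and center_x are in pixels.
    """
    if not 0 <= parcel_height < camera_height:
        raise ValueError("parcel height must be in [0, camera_height)")
    scale = (camera_height - parcel_height) / camera_height
    return center_x + (x_img - center_x) * scale


def resample_row(row, parcel_height, camera_height, center_x):
    """Resample one scan line to belt-level scale (nearest neighbour)."""
    if not 0 <= parcel_height < camera_height:
        raise ValueError("parcel height must be in [0, camera_height)")
    scale = (camera_height - parcel_height) / camera_height
    out = []
    for x_out in range(len(row)):
        # Inverse mapping: which distorted pixel lands on this output pixel?
        x_src = center_x + (x_out - center_x) / scale
        i = int(round(x_src))
        out.append(row[i] if 0 <= i < len(row) else 0)
    return out
```

For example, with the camera 2.0 m above the belt and a 0.5 m tall parcel, a pixel 500 columns from the center moves to 375 columns from the center. A production system would more likely interpolate between pixels than use nearest-neighbour sampling.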
The images contain graphics, textured backgrounds, colored labels, and text of differing sizes and widths. Adaptive binarization is therefore a crucial step in improving the quality of the OCR input image.
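The specific binarization algorithm is not given here. One common form of adaptive binarization, shown below as a minimal sketch, compares each pixel with the mean of its local neighborhood instead of with a single global threshold. The function name `adaptive_binarize` and the parameters `radius` and `offset` are assumptions for this example, and the image is a plain list of rows to keep the code dependency-free.

```python
def adaptive_binarize(gray, radius=7, offset=10):
    """Local-mean thresholding of a grayscale image (list of rows of ints).

    A pixel becomes foreground (1) when it is more than `offset` gray levels
    darker than the mean of the (2*radius+1)-wide window around it.
    """
    h, w = len(gray), len(gray[0])
    # Integral image with an extra zero row and column for easy window sums.
    integ = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            integ[y + 1][x + 1] = integ[y][x + 1] + row_sum

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            area = (y1 - y0) * (x1 - x0)
            total = (integ[y1][x1] - integ[y0][x1]
                     - integ[y1][x0] + integ[y0][x0])
            # Equivalent to pixel < local_mean - offset, in integer arithmetic.
            if gray[y][x] * area < total - offset * area:
                out[y][x] = 1
    return out
```

Because the threshold follows the local background, dark strokes are still found when the background brightness changes across the image, as it does between a white label and brown cardboard.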
To overcome the large variance in address syntax and the OCR errors caused by poor image quality, the decoding is completed by a postal directory lookup process. Using this method, up to four "hard" OCR errors can be corrected with virtually no substitution errors.
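How the lookup works internally is not described. The sketch below illustrates the general idea with a plain edit-distance match against a directory of valid entries: an OCR result is accepted only if exactly one entry lies within the allowed number of errors. The function names, the `max_errors` parameter, and the sample directory are assumptions for illustration; a real postal directory would need an indexed search rather than a linear scan.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        prev = cur
    return prev[-1]


def lookup(ocr_text, directory, max_errors=4):
    """Return the unique directory entry within max_errors edits, else None."""
    query = ocr_text.upper()
    scored = sorted((edit_distance(query, entry.upper()), entry)
                    for entry in directory)
    if not scored or scored[0][0] > max_errors:
        return None  # nothing close enough: reject instead of guessing
    if len(scored) > 1 and scored[1][0] == scored[0][0]:
        return None  # two equally good candidates: ambiguous, reject
    return scored[0][1]


directory = ["8001 ZURICH", "3000 BERN", "1200 GENEVE"]
print(lookup("B00l ZUR1CH", directory))
```

Rejecting ambiguous or distant matches, rather than returning the nearest entry unconditionally, is one way to trade a lower read rate for fewer substitution errors.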