In this research, regions of an image that have similar color and texture are extracted automatically. The image can then be encoded in terms of the extracted regions, and each region can be manipulated as a separate object.
Region segmentation
The region segmentation method is designed to parallelize effectively, to run automatically without depending on an image-specific splitting threshold, to avoid oversplitting inside a textured region, and to represent texture properties with Gaussian Markov Random Fields (GMRF). The method is based on region merging. Using the GMRF model both for the initial conditions and for merging sub-regions along contours was made possible by:
- pseudo KL transform (transformation for less correlated color planes)
- multiple GMRF models (4 neighbors and 8 neighbors)
- hypothesize-and-verify scheme for merging small regions
As a result, the GMRF parameters can be processed linearly, and regions larger than 6 pixels are merged by testing their likelihood.
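To illustrate how GMRF parameters can be obtained with purely linear operations, here is a minimal sketch of a least-squares estimate for the 4-neighbor and 8-neighbor models. It is not the implementation used in this research: it assumes a single-channel rectangular patch, a symmetric neighborhood, and the standard least-squares estimator, and the names `estimate_gmrf`, `OFFSETS_4`, and `OFFSETS_8` are hypothetical.

```python
import numpy as np

# Each offset (row, col) stands for the symmetric pair (+offset, -offset).
OFFSETS_4 = [(0, 1), (1, 0)]                 # 4-neighbor model (assumed layout)
OFFSETS_8 = OFFSETS_4 + [(1, 1), (1, -1)]    # 8-neighbor model (assumed layout)


def estimate_gmrf(patch, offsets=OFFSETS_4):
    """Least-squares GMRF parameters and residual variance of a 2-D patch."""
    x = np.asarray(patch, dtype=float)
    x = x - x.mean()                          # the model describes zero-mean texture
    h, w = x.shape
    center = x[1:h - 1, 1:w - 1].ravel()      # interior pixels only
    columns = []
    for dr, dc in offsets:
        plus = x[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        minus = x[1 - dr:h - 1 - dr, 1 - dc:w - 1 - dc]
        columns.append((plus + minus).ravel())
    q = np.stack(columns, axis=1)             # one column per symmetric neighbor pair
    theta, *_ = np.linalg.lstsq(q, center, rcond=None)
    residual = center - q @ theta
    return theta, float(np.mean(residual ** 2))
```

In a merging test of the kind described above, the parameters and residual variance of two candidate regions could be compared with those of their union; how the likelihood is actually evaluated in this research is not specified here.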
[Figure panels: Quad-tree splitting; Region merging; Result from segmentation]
Image compression using hierarchical vector quantization
Images are decomposed into minimal blocks of two by two pixels (1st order block), and each 1st order block is approximated by one of the following patterns:
Four N-th order blocks, arranged 2x2, form an (N+1)-th order block.
The ID of the (N+1)-th order block is generated from the four IDs of N-th order blocks and is written into a codebook.
Data compression is achieved by putting such blocks together hierarchically.
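The hierarchical grouping described above can be sketched as follows. This is an illustration under stated assumptions, not the codec used in this research: the image is assumed to be square with a power-of-two side, and 1st-order blocks are keyed by their exact pixel values as a stand-in for the pattern approximation step. The names `build_hierarchy` and `decode` are hypothetical.

```python
import numpy as np


def build_hierarchy(image):
    """Return (root_id, codebooks); codebooks[n] maps four child values to an ID."""
    grid = np.asarray(image)
    side = grid.shape[0]
    if grid.shape != (side, side) or side < 2 or side & (side - 1):
        raise ValueError("expects a square image whose side is a power of two")
    codebooks = []
    while grid.shape[0] > 1:
        book = {}
        half = grid.shape[0] // 2
        next_grid = np.empty((half, half), dtype=np.int64)
        for i in range(half):
            for j in range(half):
                block = grid[2 * i:2 * i + 2, 2 * j:2 * j + 2]
                key = tuple(int(v) for v in block.ravel())
                # Identical blocks share one codebook entry and therefore one ID.
                next_grid[i, j] = book.setdefault(key, len(book))
        codebooks.append(book)
        grid = next_grid
    return int(grid[0, 0]), codebooks


def decode(root_id, codebooks):
    """Expand the root ID back into the full grid, one order at a time."""
    grid = np.array([[root_id]], dtype=np.int64)
    for book in reversed(codebooks):
        inverse = {block_id: key for key, block_id in book.items()}
        size = grid.shape[0]
        out = np.empty((2 * size, 2 * size), dtype=np.int64)
        for i in range(size):
            for j in range(size):
                children = np.array(inverse[int(grid[i, j])]).reshape(2, 2)
                out[2 * i:2 * i + 2, 2 * j:2 * j + 2] = children
        grid = out
    return grid
```

In this sketch the saving comes from repetition: an image with many identical blocks needs only a few codebook entries per order, while an image with no repeated blocks gains nothing.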
Image representation based on regions
A hierarchical structure is given to the image, with the "region layer" as the unit for manipulating each region. A region layer consists of an overlay order, shape data, an approximated region layer, and several data units of the residual region layer.
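One way to picture a region layer is as a small record plus a compositing step. The sketch below is only an illustration: the field names, the use of a boolean mask as shape data, and the convention that a larger overlay order is drawn on top are all assumptions, and `RegionLayer` and `compose` are hypothetical names.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class RegionLayer:
    overlay_order: int          # assumed: larger values are drawn later (on top)
    mask: np.ndarray            # shape data as a boolean mask, True inside the region
    approximation: np.ndarray   # approximated region layer, same size as the mask
    residuals: list = field(default_factory=list)  # residual data units

    def render(self, order=0):
        """Approximation plus the first `order` residual units."""
        out = self.approximation.astype(np.int64)
        for unit in self.residuals[:order]:
            out = out + unit
        return out


def compose(layers, background=0):
    """Paint layers from the bottom of the stack to the top."""
    ordered = sorted(layers, key=lambda layer: layer.overlay_order)
    canvas = np.full(ordered[0].mask.shape, background, dtype=np.int64)
    for layer in ordered:
        pixels = layer.render(len(layer.residuals))
        canvas[layer.mask] = pixels[layer.mask]
    return canvas
```

Changing only `overlay_order` moves a region between the top and the bottom of the stack without touching its shape or pixel data.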
[Figure panels: Original image; Tree region on the top of stack; Tree region at the bottom of stack]
Scalable image representation
The tradeoff between the amount of data and the image quality of a region, at levels ranging from the approximated region to the lossless image, can be chosen by selecting the order of the residual region layer.
Approximation methods include color averaging, triangular patches, orthogonal transformations, and fractal representations.
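The quality levels can be illustrated with a small sketch that uses color averaging as the approximation and refines it with residual units. The way the residual is split here (coarse-to-fine quantization with hypothetical step sizes 32 and 8, followed by an exact remainder) is an assumption made for illustration, as is the single-channel input; `psnr`, `encode_region`, and `decode_region` are hypothetical names. PSNR is computed with the usual definition, 10 log10(peak^2 / MSE).

```python
import numpy as np


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    diff = np.asarray(reference, dtype=float) - np.asarray(test, dtype=float)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))


def encode_region(pixels, steps=(32, 8)):
    """Split a region into a mean-value approximation plus residual units."""
    pixels = np.asarray(pixels, dtype=np.int64)
    approx = np.full_like(pixels, int(round(pixels.mean())))  # color averaging
    remaining = pixels - approx
    units = []
    for step in steps:
        # Coarse-to-fine quantization of what is still missing (assumed scheme).
        unit = np.rint(remaining / step).astype(np.int64) * step
        units.append(unit)
        remaining = remaining - unit
    units.append(remaining)  # the last unit makes the reconstruction lossless
    return approx, units


def decode_region(approx, units, order):
    """Reconstruct using the approximation and the first `order` residual units."""
    return approx + sum(units[:order])
```

With `order=0` only the approximation is decoded; each additional unit can only reduce the per-pixel error in this scheme, and using all units reproduces the region exactly.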
| Amount of data | 1% | 11% | 28% | 46% | 100% |
| PSNR | 24.3 dB | 26.6 dB | 34.2 dB | 37.5 dB | - |
| | Approximation | | | | Lossless image |
The image on the left, below, is an original image.
The image on the right was constructed by lowering the quality of each region to a slightly degraded level at which human eyes cannot detect the degradation.
The image on the right, which includes extra data such as shape data and overlay order, has 35% more data than the original image.
With this technology, it is possible to search for objects in an image by specifying shape, color, and texture, even if the image has no caption. Regions can also be manipulated separately, so they can easily be moved or combined. Another advantage is that a single image can be displayed at several levels of quality, from a coarse image with a small amount of data to the fully detailed original image. Example applications include image search in a digital library, fast data transfer over channels with limited bandwidth (such as the Internet), and support for creators who want to add value to their images.