Parallel Compression using a Cooperative Dictionary
- The general idea of LZ1 type compression algorithms is as follows: given an
input string, replace phrases (consecutive sequences of characters
in the input) with encodings of {pointer,length} pairs referring
to earlier occurrences of the same phrases.
- For example, the string 'ABCABCX' could be encoded (in symbolic
form) as 'ABC{0,3}X', where "{0,3}" refers to the phrase beginning
at position 0 of length 3.
- Phrases can overlap; for example the string 'ABABABAB' would be
encoded (symbolically) as 'AB{0,6}'; note that this means that not
only does LZ1 type compression give good results for runs of
characters, but for
any sequence consisting of repeated patterns of characters
(unlike most run-length encoding schemes).
- LZ1 type compression
lends itself to efficient hardware implementations using associative
(also called content addressable) memories, since in each successive cycle
the current character can be simultaneously compared with all previous characters
(for example see references [2], [4] below for more details).
- In order to achieve even faster compression, the input string may be
divided into n blocks, and each block compressed in parallel.
- Unfortunately dividing the input string into n blocks reduces
the average "dictionary size" (that is the size of the "already seen" data
for which phrase matches can potentially be found) by a factor of n,
with a resulting loss of compression.
- A solution to this problem is to allow phrases to match "already seen"
data not only in the block in which the phrase resides, but in the
"already seen" data in any of the other blocks being compressed in parallel.
- The result is parallel compression using a cooperative dictionary; using
this class of methods the average dictionary size is not reduced, and therefore
the amount of compression is roughly the same as that of sequential methods.
- The demo below illustrates 4-way parallel compression using a
cooperative dictionary.
Demo
References
- Bell, Cleary, and Witten.
Text Compression, Prentice Hall, 1990.
- Craft.
A fast hardware data compression algorithm and some algorithmic extensions.
IBM Jour. Research & Devlopment 42, 6 (Nov. 1998), 733-746.
- Franaszek, Robinson, and Thomas.
Parallel compression with cooperative dictionary construction.
In Proc. DCC'96 Data Compression Conf., pp. 200-209, IEEE, 1996.
- Franaszek, Robinson, and Thomas.
Parallel Compression and Decompression using a Cooperative Dictionary,
U.S. Patent No. 5,729,228, 1998.
[ Research Home |
Memory Xpansion Technologies ]