Cosc 2P03 Fall 2011
Assignment#4

(Due date for assignment -->TBA)

LZW compression

For this assignment you will be implementing the LZW compressor and de-compressor. To make this a realistic assignment, consider compressing a text file to a stream of binary short integers (16 bits).  The code book will need to be initialized with the standard ASCII alphabet, this will give 127 possible initial characters. To aid in the look-up of the strings we will use a hash table where an entry to the table will consist of a record (object) consisting of the string S  (String)  and the code C (short). This will allow efficient look-up of the character strings. To cover the range of a short int make the table 65537 which should be prime, open addressing can be used (k==+1). Design a hash function f(S) to map the string S into the hash table. Once possible function can be:

f(S) = [1*char(1) + 2*char(2) + 3*char(3) etc....] mod table_size.

Implement the compressor, reading from some text file character by character. Note that you will need also treat end of line and carriage returns as characters to be encoded. These special characters will need to be escaped when added into the hash table. For each S encoded, output C as a short to a binary file.


The decompressor will read shorts from the compressed file and build the original text file. The code book will be the ASCII char set. This code book will need to be 65536 in length to accommodate all possible codes.

Show that your solution works by downloading some text files from the internet, compressing them and decompressing them.

Submission Requirements:

Good Luck