Lossy Image Bisection


For my project, I applied using image bisection in a lossy manner for image and video compression. The basic idea is that each channel in the image is recursively split in half along its longest axis. If two child nodes in the tree differ by less than a certain tolerance, then they are pruned from the tree and their average is stored in the parent node. The result of this process is a binary tree with the image information stored in the leaves. This can be easily serialized to disk and further compressed using a general data compression algorithm such as huffman coding or deflate.

My data compression textbook only briefly mentioned image bisection before moving onto quadrisection for losslessly compressing b/w bitmaps. I’m not exactly certain why that is, as bisection is less complicated to implement, allows you to process non-square images, and uses the same amount of space to store the tree data. Originally I was going to use quadrisection, but the restriction that the image would have to be square made me realize the superiority of bisection.

My implementation consisted of a series of commands that I could chain together. I had commands for reading/writing raw RGB data from PNG files, splitting/merging channels, compressing/decompressing a single channel, transforming to YCbCr from RGB and vise versa, and performing frame differencing. By rearranging the commands in the pipeline, I could easily try out new possibilities. This ultimately proved quite handy once I tried video compression, as I only had to add commands for handling frame differencing. Lastly, I would always compress the resulting data with gzip (deflate).

For the impatient, my algorithm performed mildly well on image little high frequency noise. When used losslessly, it seemed to perform better than PNG compression. I assume that it had to do with a better entropy coding, but I don’t have enough data to back up either claim. It’s also fair to think of it as RLE on steroids. While image bisection is better in most cases than RLE for animation images, it is no match for DCT or wavelet based methods for photo data.