Saturday, September 17, 2005

Learning from domain specific compression

There are two compression techniques I know of basically - one compresses every character based on its frequency - if 'a' is used a lot, it could be '01' while if 'x' is rarely used, it would be represented as '100101' or something long. This is called huffman coding.

The other type is LZW encoding - it has to do with a fixed length code, but variable length of what the code represents - so if we've seen "aaaaaaa" a lot before, it could be encoded in only 8 bits.

So here's one for you - music encoding. Let's say we're encoding notes. So we might have "bs3q" or something like that - b sharp, 3rd octave above the lowest octave we'll encode, quarter note. We could use LZW for this - it might be nice if we see a lot of notes in one octave. Huffman might be a little naive - if all the notes are in one octave, there would be plenty of overhead to keep entering bits for the octave.

So think about music - sheet music is fairly sparse compared to the representation I described. At the beginning, it says which octave it is, which notes are sharp, and the time. If we could encode that only once, it might be useful.

But that's not the point - I could write the desnsist representation of music ever, and then run huffman and or lzw over it. But the beauty of XML is readable markup - so how do we compress this extremely densely while having the original being uber readable? How do we use smarter compression algorithms than Huffman & LZW?

I'd love to check into "smarter" compression techniques - not necessarily ones tuned for music or another format, but ones that take into account how humans like to represent data (ie: not referring to an external table, but having all information available at their point of focus) and seeing if there's a better way to compress data meant to be read by humans - based on how the mind really likes to work, and how humans actually write naturally.

Just for fun.


Anonymous said...

On my old Nokia, I could compose short ringtones using a similar method to the first one you described -- one letter each for pitch, octave, and length.

As far as human-readable visual representation of music, this is a pretty good example :)

Seriously, though, for something as aurally complicated as a full-orchestra piece, I would venture to say that sheet music is a brilliantly comprehensive and compact method of representing it. But then again, how do you measure the relative compactness of media designed for two different senses?

As far as making it even denser, I point you towards examples of figured bass, Schenker analysis and tabs.

And then, as far as uber-readable information compression goes, there's always, always Edward Tufte. But I'm guessing you knew that.


Martin Davidsson said...

Interesting post and comment.