Saturday, September 3, 2011

LZ4-HC : High Compression LZ4 version is now Open Sourced

It's been quite a while since my last post on this blog. Well, with summer ongoing, I was busy having a life. You know, things that happen outside...
Anyway, with September ringing in, it's time to get back to work.

I've spent my first few days back complying with a long-standing request : wtf, where is LZ4-HC ?

Well, as I replied a few times, LZ4-HC was not provided; that is, until now.

LZ4-HC is now Open Sourced, and provided as a sample demo for the MMC Search Algorithm, which it makes heavy use of.
That's the main difference with regular LZ4 : the HC version makes a full search, in contrast with the fast scan of the "regular" LZ4. This typically improves the compression ratio by 20%, but speed suffers quite a bit : the HC version is between 6x and 10x slower than the fast one.
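To make the "full search" versus "fast scan" difference concrete, here is a toy sketch in Python. It is not the LZ4 code and not the MMC algorithm : the function names, the 4-byte minimum match and the sample data are all assumptions made for illustration. The fast scan only looks at the most recent earlier position sharing the same 4-byte prefix, while the full search tries every earlier position and keeps the longest match.

```python
MIN_MATCH = 4  # toy minimum match length (assumption for this sketch)


def match_length(data, a, b):
    """Count how many bytes starting at a and b are equal (a < b)."""
    n = 0
    while b + n < len(data) and data[a + n] == data[b + n]:
        n += 1
    return n


def fast_scan(data, pos):
    """Toy fast scan: test only the most recent earlier position
    that starts with the same MIN_MATCH bytes as pos."""
    last_seen = {}
    for i in range(pos):
        last_seen[data[i:i + MIN_MATCH]] = i  # later positions overwrite earlier ones
    cand = last_seen.get(data[pos:pos + MIN_MATCH])
    if cand is None:
        return (0, 0)  # no match found
    return (pos - cand, match_length(data, cand, pos))


def full_search(data, pos):
    """Toy full search: try every earlier position, keep the longest match."""
    best = (0, 0)
    for cand in range(pos):
        n = match_length(data, cand, pos)
        if n >= MIN_MATCH and n > best[1]:
            best = (pos - cand, n)
    return best


# Hypothetical sample: the second "abcd" hides the better, older match.
data = b"abcdefgh--abcdXX--abcdefgh"
pos = 18  # start of the final "abcdefgh"

print(fast_scan(data, pos))    # (offset, length) -> (8, 4)
print(full_search(data, pos))  # (offset, length) -> (18, 8)
```

The brute-force loop above is only there to show what a full search buys : a longer match, hence fewer bytes to encode. Doing that search quickly is precisely the job of MMC, which this sketch does not attempt to reproduce.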

Nonetheless, the compression advantage can be worthwhile, especially in "offline compression" scenarios, where the time needed to compress a resource is not important, since the data only gets decompressed during execution.

There is also a second difference : since MMC is GPL, LZ4-HC is GPL too. Note that this does not change anything regarding the LZ4 license, which remains BSD.

Is that an issue ? Not necessarily. LZ4-HC and LZ4 use the very same format (and indeed, the decoding function is one and the same), so you can provide the output of LZ4-HC to LZ4, and the other way round.
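The idea of "two compressors, one format, one decoder" can be shown with a small analogy. The sketch below uses zlib from the Python standard library, not LZ4 or LZ4-HC, so it only illustrates the principle : a fast setting and a high-effort setting produce different streams, and the very same decoding function reads both. The payload is made up for the example.

```python
import zlib

# Hypothetical payload, repeated so there is something to compress.
payload = b"LZ4-HC and LZ4 share one format. " * 200

# Two compressors with different effort levels (zlib levels 1 and 9).
fast_stream = zlib.compress(payload, 1)
strong_stream = zlib.compress(payload, 9)

# One decoder handles both streams; it does not need to know
# which compressor produced the data.
restored_from_fast = zlib.decompress(fast_stream)
restored_from_strong = zlib.decompress(strong_stream)

print(restored_from_fast == payload, restored_from_strong == payload)
print(len(payload), len(fast_stream), len(strong_stream))
```

In the LZ4 case the situation is the same in spirit : the expensive work is all on the compression side, and the decoder never has to change.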

Therefore, the following scenario is valid : you can use LZ4-HC to compress your data, and ship your commercial product with the compressed data streams and the (BSD) LZ4 decoder. There is no problem with that.

You can also create your own private application using LZ4-HC to compress your resources, without ever disclosing your source code. This is valid as long as the binary is not distributed.

Only if you want to distribute your application will you need to comply with the GPL license in order to integrate the LZ4-HC compression routine. Nothing unusual here : either open-source your code or acquire a commercial license.

Hopefully, although legality is a boring thing, it's not too complex to understand for a change.

You can grab the LZ4-HC Source Code here; it has been compiled and tested successfully under Windows, 32-bit Linux and 64-bit Linux.

[Edit] : LZ4-HC is now hosted on its own web page : http://code.google.com/p/lz4hc/. The latest version also improves compression ratio by a few percent.