Posted to solr-user@lucene.apache.org by Gregg Donovan <gr...@gmail.com> on 2015/11/13 22:30:31 UTC

Compression for solrbin?

We've had success with LZ4 compression in a custom ShardHandler to reduce
network overhead, getting ~25% compression with low CPU impact. LZ4 or
Snappy seem like reasonable choices[1] for minimizing the combined
compression + transfer + decompression time in the data center.

Would it make sense to integrate compression into javabin itself? It seems
to make sense for the ShardHandler and transaction log uses of javabin. We
could flip on gzip in Jetty for HTTP, but gzip may add more CPU than is
desirable and wouldn't help with the transaction log.

If we did, it seems that incrementing the javabin version[2] and
compressing/decompressing inside of JavaBinCodec#marshal[3] and
JavaBinCodec#unmarshal[4] would allow us to retain backwards compatibility
with older clients or existing files.
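
To make the version-byte idea concrete, here is a rough sketch of the
framing only. It is not the real JavaBinCodec: the class name, the method
signatures, and the two version values are hypothetical, and it uses the
JDK's DeflaterOutputStream/InflaterInputStream purely as a stand-in so the
example has no external dependencies (an LZ4 or Snappy stream would take
its place in practice). The reader dispatches on the leading byte, so data
written in the older, uncompressed form still decodes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch of version-gated compression; not the real JavaBinCodec. */
class VersionedFraming {
    // Assumed values for illustration only, not taken from JavaBinCodec.
    static final byte VERSION_PLAIN = 2;
    static final byte VERSION_COMPRESSED = 3;

    /** Writes one version byte, then the payload (compressed if requested). */
    static byte[] marshal(byte[] payload, boolean compress) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(compress ? VERSION_COMPRESSED : VERSION_PLAIN);
        if (compress) {
            // Deflate is a stand-in here; swap in an LZ4/Snappy stream in practice.
            try (OutputStream body = new DeflaterOutputStream(out)) {
                body.write(payload);
            }
        } else {
            out.write(payload);
        }
        return out.toByteArray();
    }

    /** Reads the version byte and decompresses only when it says to. */
    static byte[] unmarshal(byte[] framed) throws IOException {
        InputStream in = new ByteArrayInputStream(framed);
        int version = in.read();
        if (version == VERSION_COMPRESSED) {
            in = new InflaterInputStream(in);
        } else if (version != VERSION_PLAIN) {
            throw new IOException("Unsupported version: " + version);
        }
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            result.write(buf, 0, n);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("id=").append(i).append(",score=1.0;");
        }
        byte[] payload = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] plain = marshal(payload, false);
        byte[] packed = marshal(payload, true);
        System.out.println("plain=" + plain.length + " bytes, compressed=" + packed.length + " bytes");
        System.out.println("round trip ok: "
                + java.util.Arrays.equals(payload, unmarshal(packed)));
    }
}
```

Note that this only covers the reading side of compatibility: an older
reader that does not know the new version value would still reject
compressed data, so writers would need to keep emitting the old version
when talking to older clients.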

Thoughts?

--Gregg

[1] http://cyan4973.github.io/lz4/#tab-2
[2]
https://github.com/apache/lucene-solr/blob/trunk/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L83
[3]
https://github.com/apache/lucene-solr/blob/trunk/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L112:L120
[4]
https://github.com/apache/lucene-solr/blob/trunk/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L129:L137