Posted to java-user@lucene.apache.org by Garrick Toubassi <gt...@yahoo.com> on 2011/04/04 23:50:41 UTC

Alternative field compression scheme

I am exploring alternative field compression schemes designed to perform more 
effectively on small documents, and in particular to better compress stored 
fields that show repetitive structure across fields but not necessarily within 
a field.  Working with a significant index from the ShopStyle search/shopping 
engine, I have achieved compression rates exceeding those of gzip.  The work 
related to this project is at 
https://github.com/gtoubassi/femtozip.
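
To make the small-document problem concrete, here is a minimal Python sketch. 
It is not femtozip's algorithm; it only uses zlib's standard preset-dictionary 
feature to illustrate the general idea that context shared across documents 
can help when each document is too short to contain much repetition of its 
own. The sample records, field names, and the choice of the first five 
documents as a dictionary are all made up for illustration.

```python
import json
import zlib


def deflate(data: bytes, zdict: bytes = b"") -> bytes:
    """Compress with zlib, optionally primed with a preset dictionary."""
    if zdict:
        comp = zlib.compressobj(level=9, zdict=zdict)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()


def inflate(blob: bytes, zdict: bytes = b"") -> bytes:
    """Inverse of deflate(); must be given the same dictionary."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


def make_doc(i: int) -> bytes:
    """Hypothetical small stored document; structure repeats across documents."""
    record = {
        "name": f"Item {i}",
        "brand": "ExampleBrand",
        "category": "shoes",
        "price": 10 + i,
        "currency": "USD",
        "inStock": True,
    }
    return json.dumps(record).encode("utf-8")


docs = [make_doc(i) for i in range(100)]

# Assumption: a handful of sample documents stands in for a trained dictionary.
shared_dict = b"".join(docs[:5])
test_docs = docs[5:]

raw_total = sum(len(d) for d in test_docs)
plain_total = sum(len(deflate(d)) for d in test_docs)
dict_total = sum(len(deflate(d, shared_dict)) for d in test_docs)

print("raw bytes:              ", raw_total)
print("zlib per document:      ", plain_total)
print("zlib + shared dictionary:", dict_total)
```

Each document is compressed independently in both cases, so random access to 
a single stored document is preserved; the only difference is whether the 
compressor starts with knowledge of structure seen in other documents.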

I'd like to connect with Lucene users who would benefit from more effective 
compression of their stored fields, to assess the applicability of femtozip.  
Note that, as part of the project, I have a simple tool that can analyze an 
index to compare the compression rates of femtozip and gzip 
(https://github.com/gtoubassi/femtozip/wiki/Indexanalyzer).

gt
