You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by bu...@apache.org on 2004/09/09 21:35:02 UTC
DO NOT REPLY [Bug 31149] New: -
[PATCH] to store binary fields with compression
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=31149>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=31149
[PATCH] to store binary fields with compression
Summary: [PATCH] to store binary fields with compression
Product: Lucene
Version: CVS Nightly - Specify date in submission
Platform: All
OS/Version: All
Status: NEW
Severity: Enhancement
Priority: Other
Component: Index
AssignedTo: lucene-dev@jakarta.apache.org
ReportedBy: bernhard.messer@intrafind.de
hi all,
as promised here is the enhancement for the binary field patch with optional
compression. The attachment includes all necessary diffs based on the latest
version from CVS. There is also a small junit test case to test the core
functionality for binary field compression. The base implementation for binary
fields where this patch relies on, can be found in patch #29370. The existing
unit tests pass fine.
For testing binary fields and compression, I'm creating an index from 2700 plain
text files (avg. 6kb per file) and store all file content within that index
without using compression. The test was created using the IndexFiles class from
the demo distribution. Setting up the index and storing all content without
compression took about 60 secs and the final index size was 21 MB. Running the
same test, switching compression on, the time to index increase to 75 secs, but
the final index size shrinks to 13 MB. This is less than the plain text files
them self need in the file system (15 MB)
Hopefully this patch helps people dealing with huge index and want to store more
than just 300 bytes per document to display a well formed summary.
regards
Bernhard
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org