Posted to java-user@lucene.apache.org by "Zhao, Gang" <gz...@ea.com> on 2014/06/14 02:09:47 UTC

Indexing size increase 20% after switching from lucene 4.4 to 4.5 or 4.8 with BinaryDocValuesField

I used Lucene 4.4 to create an index for some documents. One of the indexed fields is a BinaryDocValuesField. After I changed the dependency to Lucene 4.5, the index size for 1 million documents increased from 293MB to 357MB. Without the BinaryDocValuesField, the index size increases by only about 2%. I also tried Lucene 4.8; the index size is similar to the size with Lucene 4.5.

I am wondering what changed in the handling of BinaryDocValuesField between 4.4 and 4.5 or 4.8.

Gang Zhao
Software Engineer - EA Digital Platform
207 Redwood Shores Parkway
Redwood City, CA 94065
Direct Line: 650-628-3719


Re: Indexing size increase 20% after switching from lucene 4.4 to 4.5 or 4.8 with BinaryDocValuesField

Posted by Robert Muir <rc...@gmail.com>.
Again, because merging is based on byte size, you have to be careful how
you measure (hint: use LogDocMergePolicy).

Otherwise you are comparing apples and oranges.
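
For readers who want to try the hint, here is a minimal sketch of an indexer that uses LogDocMergePolicy, so that merges are selected by document count rather than byte size and both Lucene versions should end up with a comparable segment structure. It is written against the Lucene 4.8 API and assumes lucene-core and lucene-analyzers-common are on the classpath. The class name, the field name "payload", the generated values, and the use of StandardAnalyzer are placeholders, not the original poster's setup; when building against 4.4 or 4.5 the Version constant would need to match that release.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.BinaryDocValuesField;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogDocMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;

// Hypothetical test harness: builds an index whose merges depend on
// document counts, not on how many bytes each segment occupies.
public class DocCountMergeIndexer {
    public static void main(String[] args) throws IOException {
        // args[0] is the directory where the index should be written.
        Directory dir = FSDirectory.open(new File(args[0]));

        // Use the Version constant that matches the Lucene release under test.
        IndexWriterConfig config = new IndexWriterConfig(
                Version.LUCENE_48, new StandardAnalyzer(Version.LUCENE_48));

        // Select merges by number of documents instead of segment byte size.
        config.setMergePolicy(new LogDocMergePolicy());

        IndexWriter writer = new IndexWriter(dir, config);
        try {
            for (int i = 0; i < 1000000; i++) {
                Document doc = new Document();
                // Placeholder field name and value, for illustration only.
                doc.add(new BinaryDocValuesField("payload", new BytesRef("value-" + i)));
                writer.addDocument(doc);
            }
        } finally {
            writer.close();
        }
        dir.close();
    }
}
```

Running the same program once per Lucene version, each time into an empty directory, gives two indexes whose sizes can be compared more fairly.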

Separately, your configuration is using experimental codecs like
"disk"/"memory", which aren't as heavily benchmarked as the default index
format.
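
When comparing index sizes across versions or codecs, it can help to see which files account for the difference instead of looking only at the total. The following is a rough sketch using only the Java standard library: it sums file sizes in an index directory, grouped by file extension. The class and method names are made up for this example. If the index uses compound files, most data will be packed into a few files and the breakdown will be less informative.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: reports how many bytes each file extension
// contributes to the total size of an index directory.
public class IndexSizeByExtension {

    // Returns total bytes per extension; files without a dot
    // (for example segments_N) are keyed by their full name.
    static Map<String, Long> sizeByExtension(Path indexDir) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and special files
                }
                String name = file.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String ext = dot >= 0 ? name.substring(dot + 1) : name;
                totals.merge(ext, Files.size(file), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the index directory to inspect.
        Map<String, Long> totals = sizeByExtension(Paths.get(args[0]));
        long sum = 0;
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue() + " bytes");
            sum += e.getValue();
        }
        System.out.println("total\t" + sum + " bytes");
    }
}
```

Running it against the 4.4 index and the 4.5 index and comparing the two listings shows which file types grew.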


On Fri, Jun 13, 2014 at 8:09 PM, Zhao, Gang <gz...@ea.com> wrote:

> I used Lucene 4.4 to create an index for some documents. One of the
> indexed fields is a BinaryDocValuesField. After I changed the dependency to
> Lucene 4.5, the index size for 1 million documents increased from 293MB to
> 357MB. Without the BinaryDocValuesField, the index size increases by only
> about 2%. I also tried Lucene 4.8; the index size is similar to the size
> with Lucene 4.5.
>
> I am wondering what changed in the handling of BinaryDocValuesField between
> 4.4 and 4.5 or 4.8.