Posted to solr-user@lucene.apache.org by "Natarajan, Rajeswari" <ra...@sap.com> on 2017/12/07 20:27:59 UTC

Index size optimization between 4.5.1 and 4.10.4 Solr

Hi,

We have upgraded Solr from 4.5.1 to 4.10.4 and we see a reduction in index size. We tried to find out whether any
optimization was made to decrease index sizes, but couldn’t locate one. If anyone knows why, please share.


Thank you,
Rajeswari

Re: Index size optimization between 4.5.1 and 4.10.4 Solr

Posted by "Natarajan, Rajeswari" <ra...@sap.com>.
Thanks a lot for the response. We did not change schema or config. We simply opened 4.5 indexes with 4.10 libraries.
Thank you,
Rajeswari

On 12/7/17, 3:17 PM, "Shawn Heisey" <ap...@elyograg.org> wrote:

    On 12/7/2017 1:27 PM, Natarajan, Rajeswari wrote:
    > We have upgraded Solr from 4.5.1 to 4.10.4 and we see a reduction in index size. We tried to find out whether any optimization was made to decrease index sizes, but couldn’t locate one. If anyone knows why, please share.
    
    Here's a history where you can see a summary of the changes in
    Lucene's index format in various versions:
    
    https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/lucene70/package-summary.html#History
    
    Looking over the history, I would guess that the changes mentioned
    between 4.5 and 4.10 would make little difference in most indexes, but
    for some configurations, they might actually *increase* index size
    slightly.  Chances are that any such change would only happen after
    performing some kind of operation on the whole index, though.
    
    Did you do anything other than simply open the 4.5.1 index in 4.10.4
    with the same config/schema?  This would include things like running an
    optimize operation on the index, running IndexUpgrader on the index,
    completely reindexing from scratch rather than using the old index, or
    any number of other possibilities.  Operations like those I mentioned
    would have eliminated deleted documents from the index, which can result
    in a size reduction.  If you changed your schema at all, that can have
    an effect on index size -- in either direction.
    
    Thanks,
    Shawn
    
    


Re: Index size optimization between 4.5.1 and 4.10.4 Solr

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/7/2017 1:27 PM, Natarajan, Rajeswari wrote:
> We have upgraded Solr from 4.5.1 to 4.10.4 and we see a reduction in index size. We tried to find out whether any optimization was made to decrease index sizes, but couldn’t locate one. If anyone knows why, please share.

Here's a history where you can see a summary of the changes in
Lucene's index format in various versions:

https://lucene.apache.org/core/7_1_0/core/org/apache/lucene/codecs/lucene70/package-summary.html#History

Looking over the history, I would guess that the changes mentioned
between 4.5 and 4.10 would make little difference in most indexes, but
for some configurations, they might actually *increase* index size
slightly.  Chances are that any such change would only happen after
performing some kind of operation on the whole index, though.

Did you do anything other than simply open the 4.5.1 index in 4.10.4
with the same config/schema?  This would include things like running an
optimize operation on the index, running IndexUpgrader on the index,
completely reindexing from scratch rather than using the old index, or
any number of other possibilities.  Operations like those I mentioned
would have eliminated deleted documents from the index, which can result
in a size reduction.  If you changed your schema at all, that can have
an effect on index size -- in either direction.
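
One rough way to check whether deleted documents account for the
difference is to compare maxDoc (which still counts deleted documents)
with numDocs (live documents only) before and after the upgrade.  The
sketch below is only an illustration, not a definitive diagnostic: it
assumes you have saved the JSON output of the Luke request handler
(for example /solr/<core>/admin/luke?numTerms=0&wt=json, where the
host and core name depend on your setup), and the function name and
sample numbers are hypothetical.

```python
import json


def deleted_doc_stats(luke_response):
    """Return (deleted_count, deleted_fraction) from a Luke-style response.

    Assumes the response has an "index" section containing "numDocs"
    (live documents) and "maxDoc" (live plus deleted documents).
    """
    index = luke_response["index"]
    num_docs = index["numDocs"]
    max_doc = index["maxDoc"]
    deleted = max_doc - num_docs
    # Guard against an empty index to avoid dividing by zero.
    fraction = deleted / max_doc if max_doc else 0.0
    return deleted, fraction


# Hypothetical sample values, not taken from any real index.
sample = json.loads('{"index": {"numDocs": 750000, "maxDoc": 1000000}}')
deleted, fraction = deleted_doc_stats(sample)
print(f"{deleted} deleted docs, {fraction:.1%} of maxDoc")
```

If the old index shows a large deleted fraction and the new one shows
close to zero, merging away deleted documents is a likely explanation
for the smaller size.  This is only an estimate, since deleted
documents do not all take up the same amount of space.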

Thanks,
Shawn