Posted to solr-user@lucene.apache.org by Markus Jelsma <ma...@openindex.io> on 2019/06/13 10:19:28 UTC

Increased disk space usage 8.1.1 vs 7.7.1

Hello,

We are upgrading to Solr 8. One of our collections, reindexed on 8.1.1, takes about 1 GB more disk space than the production copy, which runs on 7.7.1. Production also contains deleted documents, so if anything it should be the larger of the two. This suggests Solr 8 somehow uses more disk space. I have checked both Solr's and Lucene's CHANGES, but no ticket stood out as an obvious cause.

Does anyone know what is going on?

Many thanks,
Markus

Re: Increased disk space usage 8.1.1 vs 7.7.1

Posted by Colvin Cowie <co...@gmail.com>.
Hello,

For context, it would help to know more about the collection. For example,
it's 1 GB bigger, but what percentage increase does that represent? Is it
0.5% or 50%?
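
A quick sketch of that calculation in Python. The sizes below are made-up
placeholders, not figures from this thread; they only show how the same
1 GB difference reads very differently on a small and a large index.

```python
def percent_increase(old_bytes: int, new_bytes: int) -> float:
    """Return the growth from old to new as a percentage of the old size."""
    if old_bytes <= 0:
        raise ValueError("old size must be positive")
    return (new_bytes - old_bytes) / old_bytes * 100.0


GB = 1024 ** 3

# Hypothetical sizes: the same 1 GB difference on a 2 GB and a 200 GB index.
small = percent_increase(2 * GB, 3 * GB)
large = percent_increase(200 * GB, 201 * GB)
print(f"{small:.1f}% vs {large:.1f}%")
```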

On Thu, 13 Jun 2019 at 11:19, Markus Jelsma <ma...@openindex.io>
wrote:

> Hello,
>
> We are upgrading to Solr 8. One of our reindexed collections takes a GB
> more than the production uses which is on 7.7.1. Production also has
> deleted documents. This means Solr 8 somehow uses more disk space. I have
> checked both Solr and Lucene's CHANGES but no ticket was immediately
> obvious.
>
> Does anyone know what is going on?
>
> Many thanks,
> Markus
>

Re: Increased disk space usage 8.1.1 vs 7.7.1

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/13/2019 4:19 AM, Markus Jelsma wrote:
> We are upgrading to Solr 8. One of our reindexed collections takes a GB more than the production uses which is on 7.7.1. Production also has deleted documents. This means Solr 8 somehow uses more disk space. I have checked both Solr and Lucene's CHANGES but no ticket was immediately obvious.

Did you index to a core with nothing in it, or reindex on an existing 
index without deleting everything first and letting Lucene erase all the 
segments?

If you reindexed into an existing index, you could simply have deleted 
documents taking up the extra space.  Full comparison would need to be 
done after optimizing both indexes to clear out deleted documents.
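
As a rough check before optimizing anything, the share of deleted documents
in each index can be estimated from its numDocs and maxDoc counts. This is
only a sketch: it assumes you have already looked up those two numbers for
each core (the admin UI shows them), and the document share is just a proxy
for the disk space deleted documents occupy.

```python
def deleted_fraction(num_docs: int, max_doc: int) -> float:
    """Fraction of the index's documents that are deleted but not yet merged away.

    maxDoc counts live documents plus deleted ones still present in segments,
    so the difference from numDocs is the number of deleted documents.
    """
    if num_docs < 0 or max_doc < num_docs:
        raise ValueError("expected 0 <= num_docs <= max_doc")
    if max_doc == 0:
        return 0.0
    return (max_doc - num_docs) / max_doc
```

If the fraction is near zero on the new index and noticeably higher on the
old one, deleted documents cannot explain the new index being larger.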

You're probably already aware that optimizing in production is 
discouraged, unless you're willing to do it frequently ... which gets 
expensive with large indexes.

If the size is 1GB larger after both indexes are optimized to clear 
deleted documents, then the other replies you've gotten will be important.

Thanks,
Shawn

Re: Increased disk space usage 8.1.1 vs 7.7.1

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
If you look at the data files, is any extension suddenly taking way more
space? That may give a clue.
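
One way to do that comparison is sketched below. It assumes you can read
both index directories from the same machine (or run the first function on
each host and compare the results by hand); the function names are my own,
not part of Solr or Lucene.

```python
import os
from collections import defaultdict
from typing import Dict, List, Tuple


def size_by_extension(index_dir: str) -> Dict[str, int]:
    """Sum file sizes in one index directory, grouped by file extension.

    Files without an extension (such as segments_N) are keyed by full name.
    """
    totals: Dict[str, int] = defaultdict(int)
    for name in os.listdir(index_dir):
        path = os.path.join(index_dir, name)
        if os.path.isfile(path):
            ext = os.path.splitext(name)[1] or name
            totals[ext] += os.path.getsize(path)
    return dict(totals)


def growth_by_extension(old: Dict[str, int], new: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (extension, byte difference) pairs, largest growth first."""
    diffs = [(ext, new.get(ext, 0) - old.get(ext, 0)) for ext in set(old) | set(new)]
    return sorted(diffs, key=lambda d: (-d[1], d[0]))
```

Calling size_by_extension on the data/index directory of a 7.7.1 core and
of the matching 8.1.1 core, then passing both results to
growth_by_extension, puts the extension that grew the most at the top.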

Also, is the schema the same? For example, you did not enable docValues on
string fields by default, or something similar?
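
A minimal sketch of such a check, assuming you have the field definitions
of both collections as lists of dicts with a "name" key and an optional
"docValues" key (the shape the Schema API's field listing uses). It only
catches settings made explicitly on a field; a value inherited from the
field type shows up as None here, so field types need the same comparison.

```python
from typing import Dict, List, Optional, Tuple


def docvalues_changes(
    old_fields: List[dict], new_fields: List[dict]
) -> Dict[str, Tuple[Optional[bool], Optional[bool]]]:
    """Map field name -> (old, new) explicit docValues setting where they differ."""
    old = {f["name"]: f.get("docValues") for f in old_fields}
    new = {f["name"]: f.get("docValues") for f in new_fields}
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```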

Regards,
    Alex

On Thu, Jun 13, 2019, 6:19 AM Markus Jelsma, <ma...@openindex.io>
wrote:

> Hello,
>
> We are upgrading to Solr 8. One of our reindexed collections takes a GB
> more than the production uses which is on 7.7.1. Production also has
> deleted documents. This means Solr 8 somehow uses more disk space. I have
> checked both Solr and Lucene's CHANGES but no ticket was immediately
> obvious.
>
> Does anyone know what is going on?
>
> Many thanks,
> Markus
>