Posted to solr-user@lucene.apache.org by "Scott M." <qm...@top-consulting.net> on 2018/04/23 17:13:16 UTC

Optimize question

I recently installed Solr 7.1 and configured it to work with Dovecot for full-text searching. It works great, but after about 2 days of indexing I pressed the 'Optimize' button. At that point it had collected about 17 million documents and was taking up about 60-70GB of space.

It completed once and the space dropped to 30-45GB, but since then it appears to be running Optimize again on its own. Each time, the total space used swells to roughly double, then shrinks again, stays that way for a bit, and then another optimize starts!

Logs show:
	4/22/2018, 11:04:22 PM  WARN  false  DirectUpdateHandler2  Starting optimize... Reading and rewriting the entire index! Use with care.
	4/23/2018, 3:18:35 AM  WARN  true  DirectUpdateHandler2  Starting optimize... Reading and rewriting the entire index! Use with care.
	4/23/2018, 7:33:46 AM  WARN  false  DirectUpdateHandler2  Starting optimize... Reading and rewriting the entire index! Use with care.
	4/23/2018, 9:48:32 AM  WARN  false  DirectUpdateHandler2  Starting optimize... Reading and rewriting the entire index! Use with care.
	4/23/2018, 11:25:13 AM  WARN  false  DirectUpdateHandler2  Starting optimize... Reading and rewriting the entire index! Use with care.
	4/23/2018, 1:00:42 PM  WARN  false  DirectUpdateHandler2  Starting optimize... Reading and rewriting the entire index! Use with care.
It's absolutely killing the computer this is running on. Now it just started another run...
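
To see how often the runs start, the gaps between the six logged start times can be computed directly. This is only a minimal sketch: the timestamps are copied from the log entries above, and the helper name gaps is made up for illustration.

```python
from datetime import datetime

# Start times of the "Starting optimize..." warnings, copied from the log.
STARTS = [
    "4/22/2018, 11:04:22 PM",
    "4/23/2018, 3:18:35 AM",
    "4/23/2018, 7:33:46 AM",
    "4/23/2018, 9:48:32 AM",
    "4/23/2018, 11:25:13 AM",
    "4/23/2018, 1:00:42 PM",
]


def gaps(stamps):
    """Return the timedelta between each consecutive pair of timestamps."""
    times = [datetime.strptime(s, "%m/%d/%Y, %I:%M:%S %p") for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in gaps(STARTS):
        print(gap)
```

The gaps come out at roughly 4h14m, 4h15m, 2h15m, 1h37m and 1h35m, so the runs are not evenly spaced.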

In the logs, all I see are entries like this one, and nowhere does it say optimize=true:

2018-04-23 17:12:31.995 INFO  (qtp947679291-17200) [   x:dovecot] o.a.s.u.DirectUpdateHandler2 start commit{_version_=1598557836536709120,optimize=false,openSearcher=true,waitSearcher=false,expungeDeletes=false,softCommit=true,prepareCommit=false}
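
Checking a large log for commits with optimize=true is easier with a small script than by eye. The sketch below assumes every commit entry has the same "start commit{key=value,...}" shape as the line above; the function names commit_flags and is_optimize are hypothetical helpers, not part of Solr.

```python
import re

# The sample entry from the log, split across string literals for readability.
LINE = (
    "2018-04-23 17:12:31.995 INFO  (qtp947679291-17200) [   x:dovecot] "
    "o.a.s.u.DirectUpdateHandler2 start commit{_version_=1598557836536709120,"
    "optimize=false,openSearcher=true,waitSearcher=false,expungeDeletes=false,"
    "softCommit=true,prepareCommit=false}"
)


def commit_flags(line):
    """Return the key=value pairs inside 'start commit{...}', or None if absent."""
    match = re.search(r"start commit\{([^}]*)\}", line)
    if match is None:
        return None
    return dict(pair.split("=", 1) for pair in match.group(1).split(","))


def is_optimize(line):
    """True only for a commit entry whose optimize flag is 'true'."""
    flags = commit_flags(line)
    return flags is not None and flags.get("optimize") == "true"


if __name__ == "__main__":
    print(commit_flags(LINE))
    print(is_optimize(LINE))
```

Running is_optimize over each line of the log file would list any commit that carried the flag. For the sample line it reports a soft commit with optimize=false.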

Re[2]: Optimize question

Posted by "Scott M." <qm...@top-consulting.net>.
I only have one core, 'dovecot'. This is a pretty standard config. How do I stop it from doing all these optimizes? Is there an automatic process that triggers them?

On Mon, Apr 23, 2018 at 01:25 PM, Shawn Heisey wrote:
> On 4/23/2018 11:13 AM, Scott M. wrote:
>> I recently installed Solr 7.1 and configured it to work with Dovecot for full-text searching. It works great but after about 2 days of indexing, I've pressed the 'Optimize' button. At that point it had collected about 17 million documents and it was taking up about 60-70GB of space.
>>
>> It completed once and the space dropped down to 30-45GB but since then it appears to be doing Optimize again on its own, regularly swelling up the total space used to double, then it shrinks again, stays a bit that way then it starts another optimize!
>
> Are you running in SolrCloud mode with multiple replicas and/or multiple
> shards?
>
> If so, SolrCloud does optimize a little differently than standalone
> mode.  It will optimize every core in the entire collection, one at a
> time, regardless of which actual core receives the optimize request.  In
> standalone mode, only the specific core you run the command on will be
> optimized.
>
> Thanks,
> Shawn

Re: Optimize question

Posted by Shawn Heisey <ap...@elyograg.org>.
On 4/23/2018 11:13 AM, Scott M. wrote:
> I recently installed Solr 7.1 and configured it to work with Dovecot for full-text searching. It works great but after about 2 days of indexing, I've pressed the 'Optimize' button. At that point it had collected about 17 million documents and it was taking up about 60-70GB of space.
>
> It completed once and the space dropped down to 30-45GB but since then it appears to be doing Optimize again on its own, regularly swelling up the total space used to double, then it shrinks again, stays a bit that way then it starts another optimize!

Are you running in SolrCloud mode with multiple replicas and/or multiple 
shards?

If so, SolrCloud does optimize a little differently than standalone 
mode.  It will optimize every core in the entire collection, one at a 
time, regardless of which actual core receives the optimize request.  In 
standalone mode, only the specific core you run the command on will be 
optimized.

Thanks,
Shawn
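
For reference, an explicit optimize in standalone mode is an update request sent to one core with optimize=true. The sketch below only builds that URL and does not send it. The base address http://localhost:8983/solr is an assumption (Solr's default port), the core name dovecot comes from this thread, and optimize_url is a hypothetical helper.

```python
from urllib.parse import urlencode


def optimize_url(core, base="http://localhost:8983/solr"):
    """Build the update URL that asks a single core to optimize.

    The base address is an assumption; adjust it for the actual install.
    """
    query = urlencode({"optimize": "true"})
    return f"{base}/{core}/update?{query}"


if __name__ == "__main__":
    print(optimize_url("dovecot"))
```

Any client that sends a request of this form to the core triggers a full optimize, so it is worth checking what else talks to this Solr instance.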


Re[2]: Optimize question

Posted by "Scott M." <qm...@top-consulting.net>.
So, basically I made the first mistake by optimizing? At this point, since it seems I can't stop these optimizations from running, should I just drop all the data and start fresh?

On Mon, Apr 23, 2018 at 01:23 PM, Erick Erickson wrote:
> No, it's not "optimizing on its own". At least it better not be.
>
> As far as your index growing after optimize, that's the little
> "gotcha" with optimize, see:
> https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
>
> This is being addressed in the 7.4 time frame (hopefully), see LUCENE-7976.
>
> Best,
> Erick


Re: Optimize question

Posted by Erick Erickson <er...@gmail.com>.
No, it's not "optimizing on its own". At least it better not be.

As far as your index growing after optimize, that's the little
"gotcha" with optimize, see:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/

This is being addressed in the 7.4 time frame (hopefully), see LUCENE-7976.

Best,
Erick
