Posted to solr-user@lucene.apache.org by Kayak28 <ka...@gmail.com> on 2020/03/10 12:28:20 UTC

Atomic Update and Optimization and segments

Hello, Community:

Currently, my index has grown to almost 1 TB, and I would like to shrink
it.

I know I have a few fields that are not used or rarely used, so I want to
delete them.
I have tried to delete these fields with an atomic update, sending the
following JSON, for example:
{
"id":"1",
"text":{"set": null }
}
As a result, Solr generated a new segment, so the segment count increased by 1,
the index size grew, and max doc increased by 1.
I expected this result, but my goal is to shrink the index, so I sent
an expungeDeletes request and an optimize request, expecting them to reduce
the index size and segment count.
However, the segment count did not decrease, and neither the index size nor
max doc changed.
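
For reference, the three requests described above could be built roughly as follows. This is only a sketch: the base URL and the collection name are assumptions, not taken from this thread, and the code prints the requests instead of sending them.

```python
import json
from urllib.parse import urlencode

# Assumed values for illustration; neither appears in the original post.
SOLR_URL = "http://localhost:8983/solr"
COLLECTION = "mycollection"


def update_url(**params):
    """Build the /update URL for the assumed collection with query parameters."""
    query = urlencode(params)
    return f"{SOLR_URL}/{COLLECTION}/update" + (f"?{query}" if query else "")


def atomic_unset_body(doc_id, field):
    """JSON body for an atomic update that removes `field` from one document."""
    return json.dumps([{"id": doc_id, field: {"set": None}}])


# 1. Atomic update removing the "text" field from document 1, with a commit.
atomic_request = (update_url(commit="true"), atomic_unset_body("1", "text"))

# 2. Commit that asks Solr to expunge deleted documents.
expunge_request = update_url(commit="true", expungeDeletes="true")

# 3. Optimize (force merge) down to a single segment.
optimize_request = update_url(optimize="true", maxSegments=1)

print(atomic_request)
print(expunge_request)
print(optimize_request)
```

The atomic update body would be sent as a POST with Content-Type application/json; the other two requests carry everything in the query string.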

As of Solr 8.4.1, is there any way to reduce the segment count, index size,
and max doc after an atomic update?

Sincerely,
Kaya Ota

Re: Atomic Update and Optimization and segments

Posted by Erick Erickson <er...@gmail.com>.
expungeDeletes only merges segments with more than 10% deleted documents. Since you’re using Solr 8.4, you can use an optimize instead, and that’ll get rid of all your deleted documents.

However, none of this is really relevant if you only change a single doc. Removing a field with an atomic update only affects that single document, or more precisely the new copy of that doc that Solr writes. The old copy is still there in an old segment, marked as deleted, until it’s merged away.
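
To make the bookkeeping concrete, here is a toy model of a single atomic update. It is not Solr or Lucene code: the Segment class and the segment size of one million documents are made up for illustration, and only the 10% figure comes from the explanation above.

```python
from dataclasses import dataclass

# Threshold quoted in the reply above; the segment sizes below are made up.
EXPUNGE_THRESHOLD_PCT = 10.0


@dataclass
class Segment:
    max_doc: int      # all documents in the segment, including deleted ones
    deleted: int = 0  # documents marked as deleted but not yet merged away

    @property
    def deleted_pct(self):
        return 100.0 * self.deleted / self.max_doc


def atomic_update(old_segment):
    """Model one atomic update: the old copy is marked deleted and the
    new copy of the document lands in a new one-document segment."""
    old_segment.deleted += 1
    return Segment(max_doc=1)


def totals(segments):
    """Return (max doc, live doc count) summed over all segments."""
    max_doc = sum(s.max_doc for s in segments)
    num_docs = max_doc - sum(s.deleted for s in segments)
    return max_doc, num_docs


segments = [Segment(max_doc=1_000_000)]
before = totals(segments)
segments.append(atomic_update(segments[0]))
after = totals(segments)

# Segments that exceed the deleted-documents threshold.
eligible = [s for s in segments if s.deleted_pct > EXPUNGE_THRESHOLD_PCT]

print(before, after, len(segments), eligible)
```

In this model max doc and the segment count each go up by one while the live document count stays the same, and the old segment sits at 0.0001% deleted documents, far below the threshold, so nothing qualifies for expunging.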

All that said, I’d strongly urge you to consider more sharding. A 1T index on a single replica is pushing the bounds of operational stability. Solr will handle this, and I’ve seen larger indexes. But consider a full recovery. To copy the entire index, you’ll have to push 1T to the new replica, which will just take time and consume bandwidth.
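
As a rough back-of-the-envelope check on that recovery cost, the sketch below estimates the raw copy time. The link speeds are assumptions, "1T" is read as a decimal terabyte, and protocol overhead and disk throughput are ignored, so real recoveries would likely take longer.

```python
def transfer_hours(index_bytes, link_bits_per_second):
    """Hours needed to push index_bytes over a link of the given speed."""
    return index_bytes * 8 / link_bits_per_second / 3600


INDEX_BYTES = 10**12  # "1T" read as 1 TB (decimal); an assumption

# Assumed link speeds, chosen only for illustration.
for label, bps in [("1 Gbit/s", 10**9), ("10 Gbit/s", 10**10)]:
    print(f"{label}: {transfer_hours(INDEX_BYTES, bps):.1f} h")
```

Splitting the index across more shards shrinks the amount each replica has to copy in proportion.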

FWIW,
Erick


Re: Atomic Update and Optimization and segments

Posted by Jörn Franke <jo...@gmail.com>.
How do you do the atomic updates? I discovered a bug when doing them via DIH or the ScriptUpdateProcessor (only this one! The atomic one is fine) that leads to infinite index growth.
