Posted to solr-user@lucene.apache.org by Brent <br...@gmail.com> on 2016/10/21 23:24:04 UTC

TTL: expungeDeletes=false when removing expired documents

I've got a DocExpirationUpdateProcessorFactory configured to periodically
remove expired documents from the Solr index, which is working in that the
documents no longer show up in queries once they've reached expiration date.
But the index size isn't reduced when they expire, and I'm wondering if it's
because of the expungeDeletes setting in this log line I'm seeing:

o.a.s.u.p.DocExpirationUpdateProcessorFactory Begining periodic deletion of expired docs
o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true,prepareCommit=false}

How can I set expungeDeletes=true, and what are the drawbacks to doing so?



--
View this message in context: http://lucene.472066.n3.nabble.com/TTL-expungeDeletes-false-when-removing-expired-documents-tp4302596.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: TTL: expungeDeletes=false when removing expired documents

Posted by Erick Erickson <er...@gmail.com>.
Are you still indexing to the collection? In the "usual" case,
as documents get added to the index, background merging
will eventually reclaim the space occupied by deleted docs. See
McCandless' excellent visualization here:
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

The third animation is the default TieredMergePolicy.

What I'm saying here is that you may not need to worry about
it; 10-15% deleted docs (see the admin screen) is pretty normal.
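
As a rough illustration of that 10-15% figure: the admin screen shows Num Docs
and Max Doc for a core, and the difference between them is the number of
deleted docs still occupying space. A minimal sketch of the arithmetic follows;
the function name and the sample numbers are made up for illustration.

```python
def deleted_pct(num_docs, max_doc):
    """Percentage of the index taken up by deleted docs.

    num_docs: live documents (Num Docs on the admin screen)
    max_doc:  live plus deleted documents (Max Doc on the admin screen)
    """
    if max_doc == 0:
        return 0.0  # empty index, nothing deleted
    return 100.0 * (max_doc - num_docs) / max_doc


if __name__ == "__main__":
    # Hypothetical numbers: 850 live docs out of 1000 total.
    print(deleted_pct(850, 1000))
```

If the result stays in that range while indexing continues, background merging
is probably keeping up and no manual step is needed.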

And you want to be cautious about expungeDeletes. Let's assume that
you have a few deleted docs in each segment. IIUC,
expungeDeletes=true will rewrite your entire index, which is
quite expensive.

I believe you can force this periodically by explicitly sending a curl
command like
.../update?commit=true&expungeDeletes=true
And, of course, you can also do this from SolrJ. That would be a cron
job or something similar, though: I've never seen this configured in,
say, solrconfig.xml, and on a quick scan of the Solr code it doesn't
look like that's possible.

Again, though, be really sure it's worthwhile. My challenge: if it's
actually a good thing to do this, then I'd guess your index is relatively
unchanging and optimizing is an option. I may be all wet, but...
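
For the cron-job route, here is a minimal sketch of sending that explicit
commit from a script instead of curl. The host, port, collection name and
timeout are assumptions for illustration only; the commit and expungeDeletes
parameters are the ones from the URL above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def expunge_deletes_url(base_url, collection):
    """Build the update URL for a commit with expungeDeletes=true."""
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


def expunge_deletes(base_url, collection, timeout=600):
    """Send the commit and return the HTTP status code.

    The long timeout is an assumption: the merge may take a while
    on a large index.
    """
    url = expunge_deletes_url(base_url, collection)
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical host and collection name; adjust to your installation.
    print(expunge_deletes_url("http://localhost:8983/solr", "mycollection"))
```

Calling expunge_deletes(...) from a scheduled job would do the same thing as
the curl command; run it off-peak, given the cost described above.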

Best,
Erick

