Posted to dev@lucene.apache.org by David Smiley <da...@gmail.com> on 2019/11/01 13:23:23 UTC

Re: forceMerge and unused metadata

To follow up in a more official channel than Slack: the JIRA issue I
suggested for this request is:
https://issues.apache.org/jira/browse/LUCENE-8551

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Oct 29, 2019 at 6:19 PM Shawn Heisey <ap...@elyograg.org> wrote:

> A question came across the #solr IRC channel, where the user was seeing
> information in their /admin/luke endpoint about a bunch of fields they
> used to use, but which are no longer in any current documents.  That URL
> endpoint provides information about the fields in the index, getting
> most of that info directly from Lucene.
>
> I asked them to run an optimize (forceMerge in Lucene) and see what that
> did.  It did not remove those fields.
>
> From a discussion with other Solr committers on the lucene-solr Slack
> channel, this is apparently known -- a forceMerge does not eliminate any
> field metadata, even if the field is not referenced by any non-deleted
> document.
>
> What I'm wondering is whether it would be possible to adjust merging so
> that it can determine what pieces of metadata (like field information)
> are unused in the index and remove them.  It would be fine if this were
> only an option on forceMerge, but nice if it were something that could
> happen on any merge.  That discussion on slack indicated that it might
> be prohibitively expensive to do this.  Can one of our experts on Lucene
> merging respond?
>
> This particular user has no option that I am aware of other than to
> rebuild their index.  They're running version 4.2.1.
>
> Thanks,
> Shawn
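
The merge behavior described in the quoted message can be illustrated with
a toy model. This is only a sketch under stated assumptions: it does not use
Lucene's API, and the names Segment, make_segment, force_merge and
prune_unused are hypothetical, invented for illustration. It shows field
metadata surviving a merge as the union of the input segments' metadata,
and what an opt-in pruning pass of the kind being asked about might do.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Toy segment (not a Lucene class): field metadata plus documents."""
    field_infos: set   # field names recorded in the segment's metadata
    docs: list         # list of (fields_dict, is_live) pairs


def make_segment(docs):
    """Build a segment whose metadata covers every field of every doc added."""
    infos = set()
    for fields, _live in docs:
        infos.update(fields)
    return Segment(field_infos=infos, docs=list(docs))


def force_merge(segments, prune_unused=False):
    """Merge all segments into one, dropping deleted docs.

    prune_unused=False models the behavior reported in the thread:
    the merged metadata is the union of the inputs' metadata.
    prune_unused=True models the hypothetical option being requested:
    keep metadata only for fields that a surviving doc still uses.
    """
    live_docs = [(f, True) for seg in segments for f, live in seg.docs if live]
    if prune_unused:
        infos = set()
        for fields, _live in live_docs:
            infos.update(fields)
    else:
        infos = set().union(*(seg.field_infos for seg in segments))
    return Segment(field_infos=infos, docs=live_docs)


# One segment holds only a deleted doc that used a now-retired field.
old = make_segment([({"id": "1", "legacy_tag": "x"}, False)])
new = make_segment([({"id": "2", "title": "hello"}, True)])

print(sorted(force_merge([old, new]).field_infos))
# ['id', 'legacy_tag', 'title']
print(sorted(force_merge([old, new], prune_unused=True).field_infos))
# ['id', 'title']
```

In this toy version the pruning pass has to look at every surviving
document's fields, which gives a rough sense of why such a check might add
cost to a merge; whether that cost is actually prohibitive in Lucene is the
open question in the thread.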