Posted to solr-user@lucene.apache.org by Marius Dumitru Florea <ma...@xwiki.com> on 2014/02/17 11:16:31 UTC

Facet cache issue when deleting documents from the index

Hi guys,

I'm using Solr 4.6.1 (embedded) and for some reason the facet cache is
not invalidated when documents are deleted from the index. Unfortunately,
I cannot reproduce this issue with an integration test like this:

----------8<----------
SolrInstance server = getSolrInstance();

SolrInputDocument document = new SolrInputDocument();
document.setField("id", "foo");
document.setField("locale", "en");
server.add(document);

server.commit();

document = new SolrInputDocument();
document.setField("id", "bar");
document.setField("locale", "en");
server.add(document);

server.commit();

SolrQuery query = new SolrQuery("*:*");
query.set("facet", "on");
query.set("facet.field", "locale");
QueryResponse response = server.query(query);

Assert.assertEquals(2, response.getResults().size());
FacetField localeFacet = response.getFacetField("locale");
Assert.assertEquals(1, localeFacet.getValues().size());
Count en = localeFacet.getValues().get(0);
Assert.assertEquals("en", en.getName());
Assert.assertEquals(2, en.getCount());

server.delete("foo");
server.commit();

response = server.query(query);

Assert.assertEquals(1, response.getResults().size());
localeFacet = response.getFacetField("locale");
Assert.assertEquals(1, localeFacet.getValues().size());
en = localeFacet.getValues().get(0);
Assert.assertEquals("en", en.getName());
Assert.assertEquals(1, en.getCount());
---------->8----------

Nevertheless, when I do the 'same' in my real environment, the count
for the locale facet remains 2 after one of the documents is deleted.
The search result count is fine, which is why I think it's a facet
cache issue. Note that the facet count remains 2 even after I restart
the server, so the cache seems to be persisted on the file system.

Strangely, the facet count is updated correctly if I modify the
document instead of deleting it (i.e. removing a keyword from the
content so that it isn't matched by the search query any more). So it
looks like only delete triggers the issue.

Now, an interesting fact is that if, in my real environment, I delete
one of the documents and then add a new one, the facet count becomes
3. So the last commit to the index, which inserts a new document,
doesn't trigger a re-computation of the facet cache. The previous
facet cache is simply incremented, so the error is perpetuated. At
this point I don't even know how to fix the facet cache without
deleting the Solr data folder so that the full index is rebuilt.

I'm still trying to figure out what the difference is between the
integration test and my real environment (as I used the same schema
and configuration). Do you know what might be wrong?

Thanks,
Marius

Re: Facet cache issue when deleting documents from the index

Posted by Marius Dumitru Florea <ma...@xwiki.com>.
In the end the problem was actually in my code, sorry for the noise.
The documents were deleted from my database but not from the Solr
index. I have a display filter that drops search results whose
documents no longer exist in the database, but this filter doesn't
update the facets.
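
The mismatch described above can be reproduced without Solr at all. The
following is only a minimal sketch in plain Java: the names StaleFacetDemo,
IndexedDoc, facetCounts and visibleResults are hypothetical stand-ins for the
index, its facet computation and the display filter, not part of SolrJ or of
the actual application code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StaleFacetDemo {

    /** Hypothetical stand-in for an indexed document: an id plus its "locale" field. */
    public static final class IndexedDoc {
        final String id;
        final String locale;

        public IndexedDoc(String id, String locale) {
            this.id = id;
            this.locale = locale;
        }
    }

    /** Facet counts as the index sees them: every document still in the index is counted. */
    public static Map<String, Integer> facetCounts(List<IndexedDoc> index) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (IndexedDoc doc : index) {
            counts.merge(doc.locale, 1, Integer::sum);
        }
        return counts;
    }

    /** Display filter: keep only the hits whose id still exists in the database. */
    public static List<IndexedDoc> visibleResults(List<IndexedDoc> index, Set<String> databaseIds) {
        List<IndexedDoc> visible = new ArrayList<>();
        for (IndexedDoc doc : index) {
            if (databaseIds.contains(doc.id)) {
                visible.add(doc);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<IndexedDoc> index = new ArrayList<>(Arrays.asList(
            new IndexedDoc("foo", "en"),
            new IndexedDoc("bar", "en")));

        // "foo" was deleted from the database but never from the index.
        Set<String> databaseIds = new HashSet<>(Arrays.asList("bar"));

        // The filtered result list looks right, the facet count does not.
        System.out.println("visible=" + visibleResults(index, databaseIds).size()
            + " facets=" + facetCounts(index));

        // Adding a new document bumps the already stale count from 2 to 3.
        index.add(new IndexedDoc("baz", "en"));
        databaseIds.add("baz");
        System.out.println("visible=" + visibleResults(index, databaseIds).size()
            + " facets=" + facetCounts(index));
    }
}
```

Because the filter runs after the index has already computed the facets, the
result count and the facet count drift apart as soon as the database and the
index disagree. This also matches the observation that adding a document
after a delete yields a count of 3: the index is behaving correctly for the
documents it still holds.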

Thanks for the help,
Marius


Re: Facet cache issue when deleting documents from the index

Posted by Marius Dumitru Florea <ma...@xwiki.com>.
I tried to set the expungeDeletes flag but it didn't fix the problem.
The SolrServer doesn't expose a way to set this flag so I had to use:

new UpdateRequest().setAction(UpdateRequest.ACTION.COMMIT, true, true,
1, true).process(solrServer);

Any other hints?

Note that I managed to run my test in my real environment at runtime
and it passed, so it seems the behaviour depends on the size of the
documents that are committed (added to or deleted from the index).

Thanks,
Marius


Re: Facet cache issue when deleting documents from the index

Posted by Marius Dumitru Florea <ma...@xwiki.com>.
On Mon, Feb 17, 2014 at 2:00 PM, Ahmet Arslan <io...@yahoo.com> wrote:
> Hi,
>

> Also, I noticed that your code snippet calls server.delete("foo"), which does not exist. SolrServer defines the deleteById and deleteByQuery methods.

Yes, sorry, I have a wrapper over the SolrInstance that doesn't do
much. In the case of delete it just forwards the call to deleteById.
I'll check the expungeDeletes=true flag and post back the results.

Thanks,
Marius


Re: Facet cache issue when deleting documents from the index

Posted by Ahmet Arslan <io...@yahoo.com>.
Hi,

Also, I noticed that your code snippet calls server.delete("foo"), which does not exist. SolrServer defines the deleteById and deleteByQuery methods.





Re: Facet cache issue when deleting documents from the index

Posted by Ahmet Arslan <io...@yahoo.com>.
Hi Marius,

Facets are computed from indexed terms. Can you commit with the expungeDeletes=true flag?

Ahmet


