Posted to users@solr.apache.org by Shawn Heisey <ap...@elyograg.org> on 2022/12/22 02:05:05 UTC

Problem with uniqueKey not resulting in existing document deletion

I have a little Solr index for Dovecot full text search.

While investigating a different problem, I encountered this.

I indexed the same document 4 times.  It should have resulted in exactly 
one copy being in the index, but all four copies are there.  Before I 
file an issue I thought I'd ask here and see if maybe I caused the 
problem somehow with my config.

https://www.dropbox.com/s/uu0a0efz7ly5g0j/solr_uniquekey_problem_2022-12_query_results.png?dl=0

Solr 9.2.0-SNAPSHOT compiled from a slightly modified branch_9x.

https://www.dropbox.com/s/vz1l7d149hz6plw/solr_uniquekey_problem_2022-12_dashboard.png?dl=0

Core stats:

https://www.dropbox.com/s/uvzwpzyoqid3jqv/solr_uniquekey_problem_2022-12_corestats.png?dl=0

solrconfig.xml and managed-schema.xml:

https://paste.elyograg.org/view/adc33eba
https://paste.elyograg.org/view/fdc7e726

/etc/default/solr.in.sh:

https://paste.elyograg.org/view/574ecf30

elyograg@bilbo:~$ java -version
openjdk version "17.0.5" 2022-10-18
OpenJDK Runtime Environment (build 17.0.5+8-Ubuntu-2ubuntu120.04)
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu120.04, mixed mode, sharing)

elyograg@bilbo:~$ uname -a
Linux bilbo 5.15.0-1026-aws #30~20.04.2-Ubuntu SMP Fri Nov 25 14:53:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

I do wonder whether the update request processor chain I have defined 
might be causing problems ... but that should not affect the ID string 
of "1" that I used when I indexed the doc.
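
For reference, a minimal sketch of the kind of update request involved,
in Solr's XML update format.  The uniqueKey field name "id" is an
assumption (the message only gives the ID value "1"), and the real
document would carry more fields than shown here.  Sending this same
request four times, each followed by a commit, would normally be
expected to leave exactly one document in the index:

```xml
<!-- Hypothetical minimal document; "id" is assumed to be the uniqueKey field. -->
<!-- overwrite="true" is the default and is shown only for emphasis. -->
<add overwrite="true">
  <doc>
    <field name="id">1</field>
  </doc>
</add>
```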

solr.xml:

https://paste.elyograg.org/view/0282fffd

Any ideas whether this is a bug or a config problem?

Thanks,
Shawn

Re: Problem with uniqueKey not resulting in existing document deletion

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/22/22 11:55, Shawn Heisey wrote:
> Which is different than what I had, but is still not going to work, 
> because stored is false and useDocValuesAsStored is not present, which 
> seems to make it false.  That is probably a bug.

I would like to create two tests that use the _default configset and 
test that uniqueKey is working, one for cloud mode and one for 
standalone.  Anyone know which of the thousands of existing tests I 
should use as models for that?

I did see that there are existing uniqueKey tests, but from the names, I 
don't think they test for this specific issue, and I'm going to see if I 
can figure out what's in the config that is used for those tests.  I bet 
that they either don't have a _root_ definition or that it is not the 
same as the one in _default.

Thanks,
Shawn

Re: Problem with uniqueKey not resulting in existing document deletion

Posted by Shawn Heisey <el...@elyograg.org>.
On 12/22/22 02:17, Eduardo Gomez wrote:
> As Jan mentioned, I experienced the same problem on Solr 8.11. I can see in
> your schema this commented out field:
> 
>   <!--
>   <dynamicField name="*" type="string" indexed="false" stored="true"/>
>   -->
> 
> Does removing the comment solve the issue? What about leaving that
> commented out and also commenting out the _root_ field? In my case, either
> including or excluding both fields seems to solve the issue. It is a bit
> concerning not really knowing what's going on though.

I actually wasn't aware that I had a wildcard dynamicField before I 
started working on this.  It's not something that I want in the config, 
so I commented it out.  The problem was happening while that definition 
was there.  But the commented-out dynamicField is not indexed.

I updated the _root_ field so indexed and docValues were true, and that 
didn't fix it.  But when I also set stored to true, it's magically 
better.  If I add useDocValuesAsStored="true" instead of setting 
stored to true, that also works.
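
The two variants described above can be sketched as schema fragments.
This is only an illustration: type="string" is an assumption carried
over from the _default configset, and stored="false" in the second
variant is assumed from "instead of setting stored to true".  Only one
of the two definitions would be present in a schema at a time:

```xml
<!-- Variant 1: indexed, docValues, and stored all true. -->
<field name="_root_" type="string" indexed="true" stored="true" docValues="true" />

<!-- Variant 2: stored left false, docValues exposed as stored instead. -->
<field name="_root_" type="string" indexed="true" stored="false" docValues="true" useDocValuesAsStored="true" />
```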

I'm going to look at the documentation about _root_ to verify what it 
says I should do.

The definition of _root_ in the _default configset is as follows:

     <field name="_root_" type="string" indexed="true" stored="false" docValues="false" />

Which is different than what I had, but is still not going to work, 
because stored is false and useDocValuesAsStored is not present, which 
seems to make it false.  That is probably a bug.

Thanks,
Shawn

Re: Problem with uniqueKey not resulting in existing document deletion

Posted by Eduardo Gomez <eg...@mintel.com.INVALID>.
As Jan mentioned, I experienced the same problem on Solr 8.11. I can see in
your schema this commented out field:

 <!--
 <dynamicField name="*" type="string" indexed="false" stored="true"/>
 -->

Does removing the comment solve the issue? What about leaving that
commented out and also commenting out the _root_ field? In my case, either
including or excluding both fields seems to solve the issue. It is a bit
concerning not really knowing what's going on though.

Eduardo


On Thu, Dec 22, 2022 at 9:06 AM Jan Høydahl <ja...@cominvent.com> wrote:

> Have you tried indexing your _root_ field? Or removing the _root_ field?
> There was a similar issue regarding this and atomic update, but perhaps it
> is a more general issue?
>
> Jan

Re: Problem with uniqueKey not resulting in existing document deletion

Posted by Jan Høydahl <ja...@cominvent.com>.
Have you tried indexing your _root_ field? Or removing the _root_ field? There was a similar issue regarding this and atomic update, but perhaps it is a more general issue?

Jan
