Posted to users@solr.apache.org by Shawn Heisey <ap...@elyograg.org> on 2022/12/22 02:05:05 UTC
Problem with uniqueKey not resulting in existing document deletion
I have a little Solr index for Dovecot full text search.
While investigating a different problem, I encountered this.
I indexed the same document 4 times. It should have resulted in exactly
one copy being in the index, but all four copies are there. Before I
file an issue I thought I'd ask here and see if maybe I caused the
problem somehow with my config.
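For anyone trying to reproduce this, a minimal sketch of an XML update message follows. It assumes the uniqueKey field is named id; only the ID value "1" comes from this report, and any other fields the schema requires would have to be added. Posting the same message four times to the core's /update handler, each followed by a commit, and then querying for id:1 should return a single document when uniqueKey replacement is working.

```xml
<!-- Sketch only: the field name "id" is an assumption; the value "1" is from the report. -->
<add>
  <doc>
    <field name="id">1</field>
  </doc>
</add>
```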
https://www.dropbox.com/s/uu0a0efz7ly5g0j/solr_uniquekey_problem_2022-12_query_results.png?dl=0
Solr 9.2.0-SNAPSHOT compiled from a slightly modified branch_9x.
https://www.dropbox.com/s/vz1l7d149hz6plw/solr_uniquekey_problem_2022-12_dashboard.png?dl=0
Core stats:
https://www.dropbox.com/s/uvzwpzyoqid3jqv/solr_uniquekey_problem_2022-12_corestats.png?dl=0
solrconfig.xml and managed-schema.xml:
https://paste.elyograg.org/view/adc33eba
https://paste.elyograg.org/view/fdc7e726
/etc/default/solr.in.sh:
https://paste.elyograg.org/view/574ecf30
elyograg@bilbo:~$ java -version
openjdk version "17.0.5" 2022-10-18
OpenJDK Runtime Environment (build 17.0.5+8-Ubuntu-2ubuntu120.04)
OpenJDK 64-Bit Server VM (build 17.0.5+8-Ubuntu-2ubuntu120.04, mixed mode, sharing)
elyograg@bilbo:~$ uname -a
Linux bilbo 5.15.0-1026-aws #30~20.04.2-Ubuntu SMP Fri Nov 25 14:53:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
I do wonder whether the update request processor chain I have defined
might be causing problems ... but that should not affect the ID string
of "1" that I used when I indexed the doc.
solr.xml:
https://paste.elyograg.org/view/0282fffd
Any ideas whether this is a bug or a config problem?
Thanks,
Shawn
Re: Problem with uniqueKey not resulting in existing document deletion
Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/22/22 11:55, Shawn Heisey wrote:
> Which is different than what I had, but is still not going to work,
> because stored is false and useDocValuesAsStored is not present, which
> seems to make it false. That is probably a bug.
I would like to create two tests that use the _default configset and
verify that uniqueKey is working, one for cloud mode and one for
standalone. Does anyone know which of the thousands of existing tests I
should use as models for that?
I did see that there are existing uniqueKey tests, but judging from
their names, I don't think they test for this specific issue. I'm going
to see if I can figure out what's in the config used for those tests. I
bet that they either don't have a _root_ definition or that it is not
the same as the one in _default.
Thanks,
Shawn
Re: Problem with uniqueKey not resulting in existing document deletion
Posted by Shawn Heisey <el...@elyograg.org>.
On 12/22/22 02:17, Eduardo Gomez wrote:
> As Jan mentioned, I experienced the same problem on Solr 8.11. I can see in
> your schema this commented out field:
>
> <!--
> <dynamicField name="*" type="string" indexed="false" stored="true"/>
> -->
>
> Does removing the comment solve the issue? What about leaving that
> commented out and also commenting out the _root_ field? In my case, either
> including or excluding both fields seems to solve the issue. It is a bit
> concerning not really knowing what's going on though.
I actually wasn't aware that I had a wildcard dynamicField before I
started working on this. It's not something that I want in the config,
so I commented it out. The problem was happening while that definition
was still active. Note that the commented-out dynamicField is not indexed.
I updated the _root_ field so that indexed and docValues were both true,
and that didn't fix it. But when I also set stored to true, the problem
went away. If I add useDocValuesAsStored="true" instead of setting
stored to true, that also works.
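The two variants described above can be sketched as the following schema fragment. The type="string" attribute is assumed from the _default definition, and stored="false" in the second variant is an assumption about what "instead of setting stored to true" looked like in practice. Only one of the two definitions would be present in a real schema.

```xml
<!-- Variant 1 (type="string" is assumed): indexed, docValues, and stored all true. -->
<field name="_root_" type="string" indexed="true" stored="true" docValues="true"/>

<!-- Variant 2 (stored="false" is assumed): docValues are returned as stored instead. -->
<field name="_root_" type="string" indexed="true" stored="false" docValues="true"
       useDocValuesAsStored="true"/>
```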
I'm going to look at the documentation about _root_ to verify what it
says I should do.
The definition of _root_ in the _default configset is as follows:
<field name="_root_" type="string" indexed="true" stored="false"
docValues="false" />
That is different from what I had, but it still is not going to work,
because stored is false and useDocValuesAsStored is not present, which
seems to make it false. That is probably a bug.
Thanks,
Shawn
Re: Problem with uniqueKey not resulting in existing document deletion
Posted by Eduardo Gomez <eg...@mintel.com.INVALID>.
As Jan mentioned, I experienced the same problem on Solr 8.11. I can see in
your schema this commented out field:
<!--
<dynamicField name="*" type="string" indexed="false" stored="true"/>
-->
Does removing the comment solve the issue? What about leaving that
commented out and also commenting out the _root_ field? In my case, either
including or excluding both fields seems to solve the issue. It is a bit
concerning not really knowing what's going on though.
Eduardo
On Thu, Dec 22, 2022 at 9:06 AM Jan Høydahl <ja...@cominvent.com> wrote:
> Have you tried indexing your _root_ field? Or removing the _root_ field?
> There was a similar issue regarding this and atomic update, but perhaps it
> is a more general issue?
>
> Jan
Re: Problem with uniqueKey not resulting in existing document deletion
Posted by Jan Høydahl <ja...@cominvent.com>.
Have you tried indexing your _root_ field? Or removing the _root_ field? There was a similar issue regarding this and atomic update, but perhaps it is a more general issue?
Jan