Posted to users@solr.apache.org by Thomas Corthals <th...@klascement.net> on 2022/10/19 22:41:55 UTC

Solr 9 possible analysis error with currency field and nested child

Hi,


I'm running into an exception with Solr 9.0.0 for a request that works fine
with Solr 8.11.2 and I have no idea why.


I've modified the techproducts example schema to store the _root_ field and
add a _nest_path_.


   <field name="_root_" type="string" indexed="true" stored="true" docValues="false" />
   <fieldType name="_nest_path_" class="solr.NestPathField" />
   <field name="_nest_path_" type="_nest_path_" />


This request works fine with Solr 8.11.2, but not with Solr 9.0.0. It does
an atomic update of a field on a parent document and somehow the price
field causes an issue.


curl -s -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/techproducts/update' --data-binary '
{
    "add":{
        "doc":{
            "id":"parent",
            "cat":["parent"],
            "child":{
                "id":"child",
                "cat":["child"],
                "price":1.5
            }
        }
    },
    "commit":{
        "softCommit":true,
        "waitSearcher":true
    },
    "add":{
        "doc":{
            "id":"parent",
            "cat":{"add":"updated"}
        }
    }
}'


Solr 8.11.2:


{
  "responseHeader":{
    "status":0,
    "QTime":1}}


Solr 9.0.0:


{
  "responseHeader":{
    "status":400,
    "QTime":41},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","java.lang.IllegalArgumentException"],
    "msg":"Exception writing document id parent to the index; possible analysis error: cannot change field \"price_c____l_ns\" from doc values type=NONE to inconsistent doc values type=NUMERIC",
    "code":400}}


However, if I don't do a commit between the two adds, I don't get the error.
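
For anyone who wants to compare the two variants side by side, here is a minimal Python sketch (not part of the original report) that builds both request bodies. The helper names `build_update_body` and `command_names` are hypothetical. A plain dict cannot hold the body because the "add" key repeats, so the commands are kept as an ordered list of pairs and serialised by hand.

```python
import json

# Hypothetical helper (not from the original report): serialise an ordered
# list of (command, payload) pairs into one JSON update body. A plain dict
# cannot be used because the command key "add" appears more than once.
def build_update_body(commands):
    parts = [f"{json.dumps(name)}:{json.dumps(payload)}" for name, payload in commands]
    return "{" + ",".join(parts) + "}"

initial_add = ("add", {"doc": {
    "id": "parent",
    "cat": ["parent"],
    "child": {"id": "child", "cat": ["child"], "price": 1.5},
}})
soft_commit = ("commit", {"softCommit": True, "waitSearcher": True})
atomic_update = ("add", {"doc": {"id": "parent", "cat": {"add": "updated"}}})

# Variant that fails on Solr 9.0.0 in the report: commit between the two adds.
body_with_commit = build_update_body([initial_add, soft_commit, atomic_update])

# Variant reported to succeed: no commit between the two adds.
body_without_commit = build_update_body([initial_add, atomic_update])

def command_names(body):
    # object_pairs_hook=list keeps repeated keys that a dict would merge.
    return [name for name, _ in json.loads(body, object_pairs_hook=list)]
```

Either string can be passed to curl's --data-binary in place of the inline JSON above; `command_names` is only there to check the order of commands in a body.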


Did something change between Solr 8 and 9 that I have to account for in my
schema or my update requests? Or is this a bug?


Kind regards,


Thomas Corthals

Re: Solr 9 possible analysis error with currency field and nested child

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/27/22 15:51, Thomas Corthals wrote:
> I can reproduce it on a pristine techproducts example index by unpacking
> the Solr 9.0.0 download and changing the schema in the sample configset
> *before* I even start that Solr instance for the very first time. That
> should rule out a conflict between the existing index and the schema as the
> index was built from scratch with that schema and I didn't change it
> afterwards.
>
> The error only occurs with _nest_path_ in the schema, changing _root_ to
> stored="true" alone doesn't cause it.
>
> The error still only occurs with a commit between the initial "add" of the
> doc and the atomic update. The error doesn't occur if I commit after both
> "add"'s.

Odd.

I have no idea what "_nest_path_" even does.  I have never actually used 
the nested documents feature.  I am basing all this on things I have 
read, not actual experience.

My best guess about this is that it is throwing an exception because it 
detects that you have split the parent and child doc into different 
segments.  If you update something for a parent or child document, you 
will need to update or reindex EVERY document that is connected to 
either the child or its parent, and it will need to be done without 
interleaved commits.  Ideally, all of those documents will be in the 
same indexing batch.

I think it is entirely possible that Solr 8 does not detect a problem, 
so it allows the indexing, where Solr 9 does detect it. If that's it, it 
REALLY sucks that the detection results in that particular exception, 
because that would mean that the exception has nothing to do with the 
actual problem.

Thanks,
Shawn

Re: Solr 9 possible analysis error with currency field and nested child

Posted by Thomas Corthals <th...@klascement.net>.
Hi Jan,

The same sequence of add/commit commands results in a successful atomic
update in Solr 8.11.2. And I don't see anything in the ref guide (for Solr
8 or 9) that makes me think this shouldn't be possible. So from a user
perspective, I'd argue that this is a bug introduced in Solr 9.0.0. Should
I add it to JIRA as such?

Thomas

On Fri, Oct 28, 2022 at 01:35, Jan Høydahl <ja...@cominvent.com> wrote:

> You are using three features together
> A) Nested docs
> B) Price field (which is really storing price and currency in two sub
> fields)
> C) Atomic updates
>
> I would not be surprised if these three are not really compatible.
> Since there is a requirement to ALWAYS index the entire block of parent and
> children for every update to nested docs, I think you violate that
> requirement by attempting an atomic update. Solr would take the PARENT
> document, read it from disk, construct a new version of it with the added
> value, and then store it in a different segment. But the children would
> be gone.
>
> So your safest bet is to treat the entire block as a unit, avoid atomic
> and re-send the entire block every time you need an update to even the
> smallest part of the parent/child structure.
>
> Jan

Re: Solr 9 possible analysis error with currency field and nested child

Posted by Jan Høydahl <ja...@cominvent.com>.
You are using three features together
A) Nested docs
B) Price field (which is really storing price and currency in two sub fields)
C) Atomic updates

I would not be surprised if these three are not really compatible.
Since there is a requirement to ALWAYS index the entire block of parent and children for every update to nested docs, I think you violate that requirement by attempting an atomic update. Solr would take the PARENT document, read it from disk, construct a new version of it with the added value, and then store it in a different segment. But the children would be gone.

So your safest bet is to treat the entire block as a unit, avoid atomic and re-send the entire block every time you need an update to even the smallest part of the parent/child structure.
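
As a rough illustration of that approach, here is a minimal Python sketch (an assumption-laden example, not something posted in the thread) that applies the change client-side and re-sends the complete block. It assumes the client still has the full parent/child block available, for example from the system of record. The helper name `with_added_value` is hypothetical.

```python
import copy
import json

# Hypothetical helper: return a copy of a full parent/child block with one
# extra value appended to a multi-valued field on the parent document.
def with_added_value(block, field, value):
    updated = copy.deepcopy(block)  # leave the caller's block untouched
    updated.setdefault(field, []).append(value)
    return updated

# The complete block from the original request (parent plus nested child).
original_block = {
    "id": "parent",
    "cat": ["parent"],
    "child": {"id": "child", "cat": ["child"], "price": 1.5},
}

updated_block = with_added_value(original_block, "cat", "updated")

# One full "add" of the whole block, followed by the same commit options as
# in the original request; no atomic-update syntax is involved.
request_body = json.dumps({
    "add": {"doc": updated_block},
    "commit": {"softCommit": True, "waitSearcher": True},
})
```

The resulting `request_body` can be posted to the same /update endpoint as before. The trade-off is that the client has to keep or rebuild the whole block for every change, however small.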

Jan



Re: Solr 9 possible analysis error with currency field and nested child

Posted by Thomas Corthals <th...@klascement.net>.
On Thu, Oct 27, 2022 at 22:04, Shawn Heisey <ap...@elyograg.org> wrote:

> On 10/19/22 16:41, Thomas Corthals wrote:
> > I'm running into an exception with Solr 9.0.0 for a request that works
> fine
> > with Solr 8.11.2 and I have no idea why.
> <snip>
> > However, if I don't do a commit between the two adds, I don't get the
> error.
> >
> >
> > Did something change between Solr 8 and 9 that I have to account for in
> my
> > schema or my update requests? Or is this a bug?
>
> The error seems to indicate that there is a conflict between the
> existing index and the schema with respect to docValues on the field
> named price_c____l_ns.  Usually when this happens you have to completely
> delete the data directory, reload the index or restart Solr, and reindex
> from scratch.
>

I can reproduce it on a pristine techproducts example index by unpacking
the Solr 9.0.0 download and changing the schema in the sample configset
*before* I even start that Solr instance for the very first time. That
should rule out a conflict between the existing index and the schema as the
index was built from scratch with that schema and I didn't change it
afterwards.

The error only occurs with _nest_path_ in the schema; changing _root_ to
stored="true" alone doesn't cause it.

The error still only occurs with a commit between the initial "add" of the
doc and the atomic update. The error doesn't occur if I commit after both
"add"'s.

Thomas

Re: Solr 9 possible analysis error with currency field and nested child

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/19/22 16:41, Thomas Corthals wrote:
> I'm running into an exception with Solr 9.0.0 for a request that works fine
> with Solr 8.11.2 and I have no idea why.
<snip>
> However, if I don't do a commit between the two adds, I don't get the error.
>
>
> Did something change between Solr 8 and 9 that I have to account for in my
> schema or my update requests? Or is this a bug?

The error seems to indicate that there is a conflict between the 
existing index and the schema with respect to docValues on the field 
named price_c____l_ns.  Usually when this happens you have to completely 
delete the data directory, reload the index or restart Solr, and reindex 
from scratch.

I wonder if that's a red herring, though.  Here's my thought process, 
and I would like someone with more internals knowledge to tell me if I 
have this all wrong:

One of the canonical rules of parent/child documents is that child 
documents must be in the same Lucene index segment as the parents.

When the two indexing requests are done without a commit in the middle, 
this requirement is almost certain to be satisfied.

But if you commit between the two indexing requests, then the updated 
document will be in a different Lucene segment than the document(s) it 
is tied to.  Maybe Solr 9 detects this problem and throws an exception, 
where Solr 8 didn't, and it is being misreported as a docValues problem.

Or maybe the error isn't being misreported.  If it is actually valid, 
then you will have to either fix the mismatch on the price_c____l_ns 
field and restart, or wipe the index and rebuild it from scratch.

But the statement about the updated document being in a different 
segment is still valid.  You might run into other problems with 
documents for a parent/child relationship being in different segments.

Thanks,
Shawn


Re: Solr 9 possible analysis error with currency field and nested child

Posted by Thomas Corthals <th...@klascement.net>.
Bumping this to the list again in case anyone has any insights before I
open an issue in JIRA for this.
