Posted to users@solr.apache.org by Thomas Corthals <th...@klascement.net> on 2022/05/01 14:02:36 UTC

Re: Indexing "single nested child" in XML

Hi Mikhail,

Adding a single nested child document isn't mentioned in the docs at all.
The ref guide for 8.11 has more elaborate examples, but doesn't show that
you can do this either. Even when only one child is added, the JSON
examples put it in an array. But it does work when you nest a JSON object
directly instead of in an array. And I do believe this behaviour is
intentional as it's part of the tests
in TestChildDocTransformerHierarchy.java.
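
For reference, here is a minimal Python sketch of the JSON side, using the same ids and field names as in this thread. It only builds and prints the request body. The function names are mine and purely illustrative, and I'm assuming the body is sent as a JSON array of documents to the collection's update handler with Content-Type application/json.

```python
import json


def build_nested_doc():
    """Return the parent document from this thread as a plain dict."""
    return {
        "id": "1",
        # A JSON object: one labelled child, nested directly.
        "single_child": {"id": "2"},
        # A JSON array: a labelled list of children.
        "children": [{"id": "3"}, {"id": "4"}],
    }


def to_update_body(docs):
    """Serialize documents as a JSON array, one element per document to add."""
    return json.dumps(docs, indent=2)


body = to_update_body([build_nested_doc()])
print(body)
```

The point is only that the object/array distinction is part of the JSON payload itself, so the loader has something to go on.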

What I'm looking for is a way to have it indexed identically with an XML
update request. Something like this seemed like a logical way to me (based
on how a single child is represented in the result with wt=xml):
<doc>
  <field name="id">1</field>
  <doc name="single_child">
    <field name="id">2</field>
  </doc>
  <field name="children">
    <doc>
      <field name="id">3</field>
    </doc>
    <doc>
      <field name="id">4</field>
    </doc>
  </field>
</doc>

When I do that, the name="single_child" is ignored (and a warning for an
invalid XML attr is logged). That doc is added as an anonymous child
instead.

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"id:1",
      "fl":"id,single_child,children,[child]"}},
  "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"1",
        "children":[
          {
            "id":"3"},

          {
            "id":"4"}]}]
  }}

{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"{!child of='id:1'}id:1",
      "fl":"id"}},
  "response":{"numFound":3,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"2"},
      {
        "id":"3"},
      {
        "id":"4"}]
  }}

When I put only one child document in a field with the same name, it's
indexed as a "multivalued" child document.

<doc>
  <field name="id">1</field>
  <field name="single_child">
    <doc>
      <field name="id">2</field>
    </doc>
  </field>
  <field name="children">
    <doc>
      <field name="id">3</field>
    </doc>
    <doc>
      <field name="id">4</field>
    </doc>
  </field>
</doc>

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "q":"id:1",
      "fl":"id,single_child,children,[child]"}},
  "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"1",
        "single_child":[
          {
            "id":"2"}],
        "children":[
          {
            "id":"3"},

          {
            "id":"4"}]}]
  }}
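
To make the gap concrete, here is a rough Python sketch that turns a JSON-style dict into the XML form used above (a field element wrapping one or more doc elements). The helper names are hypothetical, it only handles scalar field values and labelled child documents, and the add wrapper is my assumption about how the request would be sent.

```python
import xml.etree.ElementTree as ET


def add_fields(doc_el, fields):
    """Append <field> elements for each entry; dicts and lists of dicts become nested <doc>s."""
    for name, value in fields.items():
        field_el = ET.SubElement(doc_el, "field", name=name)
        if isinstance(value, dict):
            children = [value]  # a single child document
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            children = value  # a list of child documents
        else:
            field_el.text = str(value)  # scalar values only in this sketch
            continue
        for child in children:
            add_fields(ET.SubElement(field_el, "doc"), child)


def to_add_xml(doc):
    """Wrap one document in <add> and return the XML as a string."""
    add_el = ET.Element("add")
    add_fields(ET.SubElement(add_el, "doc"), doc)
    return ET.tostring(add_el, encoding="unicode")


single = to_add_xml({"id": "1", "single_child": {"id": "2"}})
as_list = to_add_xml({"id": "1", "single_child": [{"id": "2"}]})
print(single == as_list)
```

Both inputs serialize to the same XML, so with this syntax the information "this is a single child, not a list of one" never reaches Solr.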

What I'm looking for is a way to add a single labelled child document in
XML the way you can do in JSON.

Regards,

Thomas

On Sat, 30 Apr 2022 at 15:29, Mikhail Khludnev <mk...@apache.org> wrote:

> Hello Thomas.
>
> Isn't it covered here
>
> https://solr.apache.org/guide/8_4/indexing-nested-documents.html#xml-examples
> ?
>
> On Sat, 30 Apr 2022 at 0:18, Thomas Corthals <th...@klascement.net> wrote:
>
> > Hi,
> >
> >
> > In a JSON request, you can add nested child documents as a single document
> > or an array of documents.
> >
> >
> > JSON data:
> >
> > {
> >   "id": "1",
> >   "single_child": {
> >     "id": "2"
> >   },
> >   "children": [{
> >     "id": "3"
> >   },
> >   {
> >     "id": "4"
> >   }]
> > }
> >
> >
> > Response:
> >
> > {
> >   "responseHeader":{
> >     "status":0,
> >     "QTime":1,
> >     "params":{
> >       "q":"id:1",
> >       "fl":"id,single_child,children,[child]"}},
> >   "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
> >       {
> >         "id":"1",
> >         "single_child":
> >         {
> >           "id":"2"},
> >         "children":[
> >           {
> >             "id":"3"},
> >
> >           {
> >             "id":"4"}]}]
> >   }}
> >
> >
> > I'm trying to achieve the same with an XML request. What is the correct
> > syntax for adding that single child as a labelled nested document in XML?
> >
> >
> > Kind regards,
> >
> >
> > Thomas
> >
>

Re: Indexing "single nested child" in XML

Posted by Thomas Corthals <th...@klascement.net>.
Hi Mikhail,

The JSON Loader does distinguish a singleton from an array in
JsonLoader#buildDoc() and JsonLoader#mapEntryIsChildDoc(). I've opened an
issue in JIRA with a syntax proposal for XML:
https://issues.apache.org/jira/browse/SOLR-16183

It looks like it would be a small adjustment, but I only read Java and
don't really speak it.

Thomas

On Sun, 1 May 2022 at 22:44, Mikhail Khludnev <mk...@apache.org> wrote:

> Hello, Thomas.
>
> I think we never treat a singleton as a special case and never distinguish
> it from an array.
>
>
> On Sun, May 1, 2022 at 5:03 PM Thomas Corthals <th...@klascement.net>
> wrote:
>
> > When I put only one child document in a field with the same name, it's
> > indexed as a "multivalued" child document.
> >
> > <doc>
> >   <field name="id">1</field>
> >   <field name="single_child">
> >     <doc>
> >       <field name="id">2</field>
> >     </doc>
> >   </field>
> >   <field name="children">
> >     <doc>
> >       <field name="id">3</field>
> >     </doc>
> >     <doc>
> >       <field name="id">4</field>
> >     </doc>
> >   </field>
> > </doc>
> >
> > {
> >   "responseHeader":{
> >     "status":0,
> >     "QTime":2,
> >     "params":{
> >       "q":"id:1",
> >       "fl":"id,single_child,children,[child]"}},
> >   "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
> >       {
> >         "id":"1",
> >         "single_child":[
> >           {
> >             "id":"2"}],
> >         "children":[
> >           {
> >             "id":"3"},
> >
> >           {
> >             "id":"4"}]}]
> >   }}
> >
> >
> If the problem is the array
>         "single_child":*[*
>           {
>             "id":"2"}*]*,
> I believe it's inserted in ChildDocTransformer#addChildrenToParent(), and
> perhaps it can be hacked in a custom transformer.
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>

Re: Indexing "single nested child" in XML

Posted by Mikhail Khludnev <mk...@apache.org>.
Hello, Thomas.

I think we never treat a singleton as a special case and never distinguish
it from an array.


On Sun, May 1, 2022 at 5:03 PM Thomas Corthals <th...@klascement.net>
wrote:

> When I put only one child document in a field with the same name, it's
> indexed as a "multivalued" child document.
>
> <doc>
>   <field name="id">1</field>
>   <field name="single_child">
>     <doc>
>       <field name="id">2</field>
>     </doc>
>   </field>
>   <field name="children">
>     <doc>
>       <field name="id">3</field>
>     </doc>
>     <doc>
>       <field name="id">4</field>
>     </doc>
>   </field>
> </doc>
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":2,
>     "params":{
>       "q":"id:1",
>       "fl":"id,single_child,children,[child]"}},
>   "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
>       {
>         "id":"1",
>         "single_child":[
>           {
>             "id":"2"}],
>         "children":[
>           {
>             "id":"3"},
>
>           {
>             "id":"4"}]}]
>   }}
>
>
If the problem is the array
        "single_child":*[*
          {
            "id":"2"}*]*,
I believe it's inserted in ChildDocTransformer#addChildrenToParent(), and
perhaps it can be hacked in a custom transformer.
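
Short of a custom transformer, the array could also be unwrapped on the client after the response is parsed. The following is only a Python sketch of that idea, not Solr code: the function name is hypothetical, and the caller has to say which field names are meant to hold a single child.

```python
def unwrap_single_children(doc, single_fields):
    """Return a copy of doc where the named fields hold one child dict instead of a one-element list."""
    result = {}
    for key, value in doc.items():
        if isinstance(value, list):
            # Recurse into child documents; leave plain multivalued values alone.
            value = [
                unwrap_single_children(item, single_fields) if isinstance(item, dict) else item
                for item in value
            ]
            if key in single_fields and len(value) == 1 and isinstance(value[0], dict):
                value = value[0]
        elif isinstance(value, dict):
            value = unwrap_single_children(value, single_fields)
        result[key] = value
    return result


# The document from the response quoted above.
response_doc = {
    "id": "1",
    "single_child": [{"id": "2"}],
    "children": [{"id": "3"}, {"id": "4"}],
}
fixed = unwrap_single_children(response_doc, {"single_child"})
print(fixed)
```

This only changes how the response looks to the client. The child is still indexed the same way as a list of one.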


-- 
Sincerely yours
Mikhail Khludnev