Posted to solr-user@lucene.apache.org by Gajendra Dadheech <ga...@gmail.com> on 2020/02/24 10:21:58 UTC

Ordering in Nested Document

Hi

I want to ingest the documents below, which are a mix of nested and
un-nested documents:
<add>
  <doc>
      <field name="id">5</field>
      <field name="_root_">5</field>
      <field name="title">5Solr adds block join support</field>
      <field name="content_type">sparentDocument</field>
  </doc>
 <doc>
      <field name="id">1</field>
      <field name="_root_">1</field>
      <field name="title">Solr adds block join support</field>
      <field name="content_type">parentDocument</field>
      <doc>
          <field name="id">2</field>
          <field name="_root_">1</field>
          <field name="comments">SolrCloud supports it too!</field>
          <field name="content_type">childDocument</field>
      </doc>
  </doc>
  <doc>
      <field name="id">3</field>
      <field name="_root_">3</field>
      <field name="title">New Lucene and Solr release is out</field>
      <field name="content_type">parentDocument</field>
      <doc>
        <field name="id">4</field>
        <field name="_root_">4</field>
        <field name="comments">Lots of new features</field>
        <field name="content_type">childDocument</field>
      </doc>
  </doc>
</add>


Output of block join query after ingesting above docs:
[image: image.png]

So doc id 5 is getting linked to doc id 1. Is this expected behavior? I
believe id 5 should be a separate document tree.

Should I ingest them in a particular order?

Re: Ordering in Nested Document

Posted by Gajendra Dadheech <ga...@gmail.com>.
Thanks for the reply, Mikhail.
One more question here.
Let's say the XML is like this:
<add>
<doc>
<field name="id">4</field>
<field name="title">Regular Shirt</field>
<field name="doc_type">Parent</field>
<field name="color">Black</field>
</doc>
<doc>
<field name="id">8</field>
<field name="title">Solid Rug</field>
<field name="doc_type">Parent</field>
<field name="pattern">Solid</field>
</doc>
<doc>
<field name="id">1</field>
<field name="title">Regular color Shirts</field>
<field name="doc_type">Parent</field>
<field name="items">
<doc>
<field name="id">2</field>
<field name="doc_type">Child</field>
<field name="pcs_color">Red</field>
</doc>
<doc>
<field name="id">3</field>
<field name="doc_type">Child</field>
<field name="color">Blue</field>
</doc>
</field>
</doc>
<doc>
<field name="id">5</field>
<field name="title">Rugs</field>
<field name="doc_type">Parent</field>
<field name="items">
<doc>
<field name="id">6</field>
<field name="doc_type">Child</field>
<field name="pattern">Abstract</field>
</doc>
<doc>
<field name="id">7</field>
<field name="doc_type">Child</field>
<field name="pattern">Printed</field>
</doc>
</field>
</doc>
</add>
Now I want to write a query that fetches all items [with children -> 1, 5 and
without children -> 4, 8] with color Red and a title matching "shirt".
Here is the query:
/solr/test/select?q={!parent%20which=doc_type:Parent%20score=max}%20%20{!boost%20b=100.0%20}color:Red%20{!dismax%20qf=title%20v=%27Regular%27%20score=total}&fl=id,product_class_type,title,score,color&wt=json
<http://localhost:8983/solr/test/select?q=%7B!parent%20which=doc_type:Parent%20score=max%7D%20%20%7B!boost%20b=100.0%20%7Dpcs_color:Red%20%7B!dismax%20qf=title%20v=%27Regular%27%20score=total%7D&fl=id,product_class_type,title,score,pcs_color&wt=json>


Error:
"msg": "Child query must not match same docs with parent filter. Combine
them as must clauses (+) to find a problem doc. docId=4, class
org.apache.lucene.search.DisjunctionSumScorer",
I think docId 4 matches the query, but because it is a parent document,
Solr throws this error. Is this kind of schema [a mix of child-free and
nested documents] supported in Solr 7.6?
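
One possible reading of the error above: the title clause sits inside the child query of {!parent}, and it can match parent documents such as id 4 ("Regular Shirt"), which the block join does not allow. A minimal sketch of one way to separate the two levels follows, assuming the standard lucene query parser, the `_query_` nested-query syntax, and parameter dereferencing via `v=$childq`. The parameter name `childq` and the helper function are hypothetical, and this has not been verified against Solr 7.6.

```python
from urllib.parse import urlencode, parse_qs


def build_block_join_params(color, title_term):
    """Build request parameters that keep child-level and parent-level clauses apart.

    Hypothetical helper: field names come from the XML above, the parameter
    name "childq" is an assumption made for this sketch.
    """
    # Child-level query: the mandatory doc_type:Child clause means it cannot
    # match a parent document, which is what the error message complains about.
    child_q = f"+doc_type:Child +pcs_color:{color}"
    # Parent-level query: the title must match, and the color may come either
    # from the parent itself (child-free docs) or from one of its children.
    q = (
        f"+title:{title_term} "
        f'+(color:{color} _query_:"{{!parent which=doc_type:Parent score=max v=$childq}}")'
    )
    return {"q": q, "childq": child_q, "fl": "id,title,score,color", "wt": "json"}


params = build_block_join_params("Red", "Regular")
query_string = urlencode(params)
print("/solr/test/select?" + query_string)
```

The design choice here is only to keep every clause that can match a parent outside the child query; whether the resulting scores suit the use case would still need checking.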




Re: Ordering in Nested Document

Posted by Mikhail Khludnev <mk...@apache.org>.
You may try. The content type should be exactly the same for parents and
child-free documents. It may work now.
Earlier, mixing blocks and child-free documents in one index wasn't supported.
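
The advice above can be checked before indexing. The following is a rough sketch, not part of Solr: it parses an <add> payload with Python's standard library and lists top-level documents whose content_type differs from the expected parent marker. The function names and the shortened sample payload are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened version of the payload from the original question.
ADD_XML = """
<add>
  <doc>
    <field name="id">5</field>
    <field name="content_type">sparentDocument</field>
  </doc>
  <doc>
    <field name="id">1</field>
    <field name="content_type">parentDocument</field>
    <doc>
      <field name="id">2</field>
      <field name="content_type">childDocument</field>
    </doc>
  </doc>
</add>
"""


def field_value(doc, name):
    """Return the text of the first direct <field name=...> child, or None."""
    for field in doc.findall("field"):
        if field.get("name") == name:
            return field.text
    return None


def mismatched_roots(add_xml, marker_field, expected):
    """List ids of top-level docs whose marker field differs from the expected value."""
    root = ET.fromstring(add_xml.strip())
    return [
        field_value(doc, "id")
        for doc in root.findall("doc")  # direct children of <add> only
        if field_value(doc, marker_field) != expected
    ]


print(mismatched_roots(ADD_XML, "content_type", "parentDocument"))
```

With the sample payload, only the child-free document with the extra "s" is reported.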



-- 
Sincerely yours
Mikhail Khludnev

Re: Ordering in Nested Document

Posted by Gajendra Dadheech <ga...@gmail.com>.
That extra s was intentional; I should have chosen a better name.

So ideally we shouldn't have child-free documents and blocks together while
indexing? Or should they not be together anywhere in the index, i.e. should
every document have at least one child doc if any document has one?


Re: Ordering in Nested Document

Posted by Mikhail Khludnev <mk...@apache.org>.
Hello, Gajendra.
Pictures don't come through the mailing list.
Could it be caused by the unnecessary s in
<field name="content_type">*s*parentDocument</field>?
At least earlier, mixing child-free documents and blocks wasn't allowed and
caused some trouble. Usually, a child stub was used to keep child-free
documents in the index.
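
To illustrate why the extra "s" could link id 5 to id 1, here is a simplified model of how a block join groups documents. It is a toy simulation, not Solr or Lucene code. It assumes that children are stored immediately before their parent, that the documents landed in the index in the order shown, and that the parent filter is content_type:parentDocument. The function name and the document order are assumptions.

```python
def assign_blocks(docs_in_index_order, is_parent):
    """Map each parent id to the ids of the non-parent docs indexed just before it."""
    blocks = {}
    pending = []  # docs seen since the last parent that do not match the parent filter
    for doc in docs_in_index_order:
        if is_parent(doc):
            blocks[doc["id"]] = pending
            pending = []
        else:
            pending.append(doc["id"])
    return blocks


# Assumed index order: doc 5 first, then each block with its child before the parent.
docs = [
    {"id": "5", "content_type": "sparentDocument"},
    {"id": "2", "content_type": "childDocument"},
    {"id": "1", "content_type": "parentDocument"},
    {"id": "4", "content_type": "childDocument"},
    {"id": "3", "content_type": "parentDocument"},
]

blocks = assign_blocks(docs, lambda d: d["content_type"] == "parentDocument")
print(blocks)
```

In this model, id 5 does not match the parent filter, so it is grouped with the next matching parent, id 1, alongside the real child id 2.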


-- 
Sincerely yours
Mikhail Khludnev