Posted to dev@lucene.apache.org by Alexandre Rafalovitch <ar...@gmail.com> on 2020/10/02 03:37:31 UTC

Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

I am indexing a deeply nested structure and am trying to return it
with fl=*,[child].

The top element is supposed to have 5 children, but only 4 are
returned. Two hours of debugging later, I realized that the "limit"
parameter defaults to 10 and that this 10 seems to count children at
ANY level, visiting them depth-first. So it was far from obvious why
the children suddenly stopped showing up.

The documentation says:
> The maximum number of child documents to be returned per parent document.
> The default is `10`.

So, is that (all nested children count toward the limit) what we
actually mean? Or did we mean the maximum number of "immediate
children" for any specific document/level, in which case the code is
wrong?

I can update the doc to clarify the results, but I don't know whether
I am looking at a bug or a feature.
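
To make the two readings of "limit" concrete, here is a minimal sketch
in plain Java. It is not Solr's code: the Node class, the method names,
and the sample tree (5 children with 2 grandchildren each) are all
hypothetical, and the traversal is a simplified depth-first walk chosen
only to reproduce the "5 expected, 4 returned" symptom.

```java
import java.util.ArrayList;
import java.util.List;

public class ChildLimitSketch {
    /** Hypothetical stand-in for a nested document; not a Solr class. */
    static final class Node {
        final String id;
        final List<Node> children = new ArrayList<>();
        Node(String id) { this.id = id; }
        Node add(Node child) { children.add(child); return this; }
    }

    /** "Global" reading: one budget shared by ALL descendants, spent depth-first. */
    static Node copyWithGlobalLimit(Node parent, int limit) {
        int[] remaining = {limit}; // single counter shared across every level
        return copyGlobal(parent, remaining);
    }

    private static Node copyGlobal(Node source, int[] remaining) {
        Node copy = new Node(source.id);
        for (Node child : source.children) {
            if (remaining[0] == 0) break; // budget exhausted: later siblings vanish
            remaining[0]--;
            copy.add(copyGlobal(child, remaining));
        }
        return copy;
    }

    /** "Local" reading: every document keeps at most `limit` immediate children. */
    static Node copyWithLocalLimit(Node source, int limit) {
        Node copy = new Node(source.id);
        for (Node child : source.children) {
            if (copy.children.size() == limit) break;
            copy.add(copyWithLocalLimit(child, limit));
        }
        return copy;
    }

    /** Assumed sample data: a top element with 5 children, 2 grandchildren each. */
    static Node sampleTree() {
        Node top = new Node("top");
        for (int c = 1; c <= 5; c++) {
            Node child = new Node("child" + c);
            child.add(new Node("child" + c + "-a")).add(new Node("child" + c + "-b"));
            top.add(child);
        }
        return top;
    }

    public static void main(String[] args) {
        Node top = sampleTree();
        System.out.println("global: " + copyWithGlobalLimit(top, 10).children.size());
        System.out.println("local:  " + copyWithLocalLimit(top, 10).children.size());
    }
}
```

With this sample tree and a limit of 10, the global reading keeps 4
top-level children (the fourth one arrives without its own children,
and the fifth is dropped), while the local reading keeps all 5.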

Regards,
   Alex.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

Posted by David Smiley <ds...@apache.org>.
On Thu, Oct 8, 2020 at 9:13 AM Bar Rotstein <ba...@gmail.com> wrote:

> Hey David,
> long time no speak.
>
> I think I'll start working on SOLR-14869.
>
> Do you have any tips that might enable me to tackle it a little faster?
>
>
ChildDocTransformer loops over document IDs.  They should be in the same
segment.  You should get the LeafReader for that segment and call
getLiveDocs on it.  In the transformer when you loop the IDs, check to see
if the doc is "live".
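
The check described above can be sketched roughly as follows. This is
a standalone illustration, not the SOLR-14869 patch: LiveBits is a
local stand-in with the same shape as Lucene's Bits so the example
compiles without Lucene on the classpath, and liveChildIds along with
its parameters are hypothetical names. It assumes the doc IDs are
segment-local and that the children sit in the ID range just before
their parent. In real code the bits would come from
LeafReader.getLiveDocs(), which returns null when the segment has no
deletions.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class LiveDocsSketch {
    /** Local stand-in shaped like Lucene's Bits; declared here so the sketch compiles alone. */
    interface LiveBits {
        boolean get(int index);
    }

    /**
     * Collects the segment-local doc IDs in [firstChildId, parentId) that are still live.
     * A null liveDocs is treated as "no deletions in this segment".
     */
    static List<Integer> liveChildIds(int firstChildId, int parentId, LiveBits liveDocs) {
        List<Integer> result = new ArrayList<>();
        for (int docId = firstChildId; docId < parentId; docId++) {
            if (liveDocs != null && !liveDocs.get(docId)) {
                continue; // deleted child document: do not return it
            }
            result.add(docId);
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet live = new BitSet(6);
        live.set(0, 6);  // docs 0..5 start out live
        live.clear(2);   // doc 2 has been deleted
        System.out.println(liveChildIds(0, 5, live::get)); // [0, 1, 3, 4]
        System.out.println(liveChildIds(0, 5, null));      // [0, 1, 2, 3, 4]
    }
}
```

The null check matters: treating a null result as "everything is
deleted" would hide all children in segments that have no deletions.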

Re: Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

Posted by Bar Rotstein <ba...@gmail.com>.
Hey David,
long time no speak.

I think I'll start working on SOLR-14869.

Do you have any tips that might enable me to tackle it a little faster?

Thanks,
Bar.

On Sun, Oct 4, 2020 at 12:25 AM David Smiley <ds...@apache.org> wrote:

> Glad to hear from you again Bar!
> Also, FYI https://issues.apache.org/jira/browse/SOLR-14869 is a serious
> bug relating to child documents.  It returns deleted docs!
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Sat, Oct 3, 2020 at 3:23 PM Bar Rotstein <ba...@gmail.com> wrote:
>
>> Hey,
>> Was a ticket opened?
>>
>> I'd gladly tackle that one if it hasn't been assigned yet.
>>
>> Thanks in advance,
>> Bar
>> On Fri, Oct 2, 2020 at 3:13 PM David Smiley <ds...@apache.org> wrote:
>>
>>> I think that's a bug!  Good catch!
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley

Re: Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

Posted by David Smiley <ds...@apache.org>.
Glad to hear from you again Bar!
Also, FYI https://issues.apache.org/jira/browse/SOLR-14869 is a serious bug
relating to child documents.  It returns deleted docs!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sat, Oct 3, 2020 at 3:23 PM Bar Rotstein <ba...@gmail.com> wrote:

> Hey,
> Was a ticket opened?
>
> I'd gladly tackle that one if it hasn't been assigned yet.
>
> Thanks in advance,
> Bar
> On Fri, Oct 2, 2020 at 3:13 PM David Smiley <ds...@apache.org> wrote:
>
>> I think that's a bug!  Good catch!
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley

Re: Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
I did not create a ticket (got distracted). Feel free to make one and
add me to watchers. I will be happy to test it with my dataset.

Thanks,
   Alex.

On Sat, 3 Oct 2020 at 15:23, Bar Rotstein <ba...@gmail.com> wrote:
>
> Hey,
> Was a ticket opened?
>
> I'd gladly tackle that one if it hasn't been assigned yet.
>
> Thanks in advance,
> Bar
> On Fri, Oct 2, 2020 at 3:13 PM David Smiley <ds...@apache.org> wrote:
>>
>> I think that's a bug!  Good catch!
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley


Re: Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

Posted by Bar Rotstein <ba...@gmail.com>.
Hey,
Was a ticket opened?

I'd gladly tackle that one if it hasn't been assigned yet.

Thanks in advance,
Bar
On Fri, Oct 2, 2020 at 3:13 PM David Smiley <ds...@apache.org> wrote:

> I think that's a bug!  Good catch!
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley

Re: Should ChildDocTransformerFactory's limit be local or global for deep-nested documents?

Posted by David Smiley <ds...@apache.org>.
I think that's a bug!  Good catch!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

