Posted to dev@lucene.apache.org by Joel Bernstein <jo...@gmail.com> on 2020/12/27 16:59:18 UTC

LeafReaderContext ord is unexpectedly 0

I ran into this while writing some Solr code today.

List<LeafReaderContext> leaves =
req.getSearcher().getTopReaderContext().leaves();

The req is a SolrQueryRequest object.

Now if I do this:

leaves.get(5).reader().getContext().ord

I would expect *ord* in this scenario to be *5*.

But in my testing in master it's returning 0.

It seems like this is a bug. Not sure yet if this is a bug in Solr or
Lucene. Am I missing anything here that anyone can see?


Joel Bernstein
http://joelsolr.blogspot.com/

Re: LeafReaderContext ord is unexpectedly 0

Posted by Joel Bernstein <jo...@gmail.com>.
That's exactly how this problem occurred. I'll make sure this is fixed
before merging into the codebase.


Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Dec 27, 2020 at 6:12 PM Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi,
>
>
>
> just to add: Any public query API (weight, query, DocIdSetIterators, …)
> should always take a LeafReaderContext as parameter. If you have some Solr
> plugin that implements a method taking only a LeafReader, that method has
> lost the context, and it is impossible to restore it from the reader. So
> when passing IndexReader instances around (no matter what type), always use
> ReaderContexts, especially in public APIs.
>
>
>
> Uwe
>
>
>
> -----
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: uwe@thetaphi.de

RE: LeafReaderContext ord is unexpectedly 0

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

just to add: Any public query API (weight, query, DocIdSetIterators, …) should always take a LeafReaderContext as parameter. If you have some Solr plugin that implements a method taking only a LeafReader, that method has lost the context, and it is impossible to restore it from the reader. So when passing IndexReader instances around (no matter what type), always use ReaderContexts, especially in public APIs.
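
To illustrate why the context matters in a method signature, here is a minimal sketch that does not use Lucene at all. ToyLeafReader, ToyLeafContext, buildLeaves and globalDocId are hypothetical names invented for this example; the ord and docBase fields are only modelled on the idea of a per-leaf position and doc id offset. A method that receives only the toy reader has nothing to compute a top-level doc id from, while a method that receives the toy context does.

```java
import java.util.ArrayList;
import java.util.List;

public class ContextApiSketch {

    /** Toy stand-in for a leaf reader: knows its own size, nothing about its parent. */
    static final class ToyLeafReader {
        final int maxDoc;

        ToyLeafReader(int maxDoc) {
            this.maxDoc = maxDoc;
        }
    }

    /** Toy stand-in for a leaf context: position and doc id offset within the parent. */
    static final class ToyLeafContext {
        final int ord;
        final int docBase;
        private final ToyLeafReader reader;

        ToyLeafContext(int ord, int docBase, ToyLeafReader reader) {
            this.ord = ord;
            this.docBase = docBase;
            this.reader = reader;
        }

        ToyLeafReader reader() {
            return reader;
        }
    }

    /** Builds the top-level view: each leaf gets an ord and the sum of the preceding sizes. */
    static List<ToyLeafContext> buildLeaves(int... maxDocs) {
        List<ToyLeafContext> leaves = new ArrayList<>();
        int docBase = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            leaves.add(new ToyLeafContext(i, docBase, new ToyLeafReader(maxDocs[i])));
            docBase += maxDocs[i];
        }
        return leaves;
    }

    /** Context-taking method: can map a leaf-local doc id to a top-level one. */
    static int globalDocId(ToyLeafContext context, int localDoc) {
        if (localDoc < 0 || localDoc >= context.reader().maxDoc) {
            throw new IllegalArgumentException("localDoc out of range: " + localDoc);
        }
        return context.docBase + localDoc;
    }

    public static void main(String[] args) {
        List<ToyLeafContext> leaves = buildLeaves(10, 20, 5);
        // Leaf 2 starts after 10 + 20 documents, so local doc 3 maps to 33.
        System.out.println(globalDocId(leaves.get(2), 3)); // prints 33
    }
}
```

A variant of globalDocId that took only a ToyLeafReader could not be written, because the offset lives in the context and not in the reader.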

 

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: uwe@thetaphi.de

From: Joel Bernstein <jo...@gmail.com>
Sent: Sunday, December 27, 2020 7:36 PM
To: lucene dev <de...@lucene.apache.org>
Subject: Re: LeafReaderContext ord is unexpectedly 0

Ok this makes sense. I suspect I never ran across this before because I always accessed the ord through the context before getting the reader.

Joel Bernstein
http://joelsolr.blogspot.com/

 

 



Re: LeafReaderContext ord is unexpectedly 0

Posted by Joel Bernstein <jo...@gmail.com>.
Ok this makes sense. I suspect I never ran across this before because I
always accessed the ord through the context before getting the reader.


Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Dec 27, 2020 at 1:10 PM Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi,
>
>
>
> that behaviour is fully correct and was always like that. Just for info (I
> had some slides at Berlin Buzzwords about 8.5 years ago):
>
> https://youtu.be/iZZ1AbJ6dik?t=1975
>
> The problem is a classical “wrong point of view” problem!
>
> IndexReaders and their subclasses have no idea about their neighbours or
> parents; they can always be used on their own. They can also be in multiple
> contexts (!!!!): a LeafReader (in that talk we used AtomicReader) can be
> part of a DirectoryReader while, at the same time, somebody else has
> constructed another composite reader with LeafReaders from totally different
> directories (e.g., when merging different indexes together). So in short: a
> reader does not know anything about its own “where I am”.
>
> The method getContext() is only there as a helper method (it’s a bit
> misnamed) to create a *new* context that describes this reader as the
> only one in it, so inside this new context it has an ord of 0.
>
> The problem in your code is: you dive down through the correct context
> from top-level (the top context is from the point of view of the
> SolrIndexSearcher), but then you leave this hierarchy by calling reader().
> At that point you have lost the context information. After that you get a
> new context and this one returns 0, because it’s no longer from
> SolrIndexSearcher’s point of view, but from the reader’s own.
>
> Replace: leaves.get(5).reader().getContext().ord
>
> By: leaves.get(5).ord
>
> And you’re fine. The .reader().getContext() part (marked red in the
> original mail) leaves the top-level context and then creates a new one –
> and then you’re lost!
>
>
>
> Uwe
>
>
>
> -----
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: uwe@thetaphi.de

RE: LeafReaderContext ord is unexpectedly 0

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

that behaviour is fully correct and was always like that. Just for info (I had some slides at Berlin Buzzwords about 8.5 years ago):

https://youtu.be/iZZ1AbJ6dik?t=1975

The problem is a classical “wrong point of view” problem!

IndexReaders and their subclasses have no idea about their neighbours or parents; they can always be used on their own. They can also be in multiple contexts (!!!!): a LeafReader (in that talk we used AtomicReader) can be part of a DirectoryReader while, at the same time, somebody else has constructed another composite reader with LeafReaders from totally different directories (e.g., when merging different indexes together). So in short: a reader does not know anything about its own “where I am”.

The method getContext() is only there as a helper method (it’s a bit misnamed) to create a *new* context that describes this reader as the only one in it, so inside this new context it has an ord of 0.

The problem in your code is: you dive down through the correct context from top-level (the top context is from the point of view of the SolrIndexSearcher), but then you leave this hierarchy by calling reader(). At that point you have lost the context information. After that you get a new context and this one returns 0, because it’s no longer from SolrIndexSearcher’s point of view, but from the reader’s own.

Replace: leaves.get(5).reader().getContext().ord

By: leaves.get(5).ord

And you’re fine. The .reader().getContext() part (marked red in the original mail) leaves the top-level context and then creates a new one – and then you’re lost!
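
The difference between the two expressions can be reproduced without Lucene. The following is a minimal sketch using toy stand-ins: ToyLeafReader, ToyLeafContext and sampleLeaves are hypothetical names invented for illustration, not Lucene classes. The model assumes only what is described above, namely that a reader hands out a standalone context in which it is the only leaf.

```java
import java.util.ArrayList;
import java.util.List;

public class LeafOrdSketch {

    /** Toy stand-in for a leaf context: records the position a parent assigned to the reader. */
    static final class ToyLeafContext {
        final int ord;
        private final ToyLeafReader reader;

        ToyLeafContext(int ord, ToyLeafReader reader) {
            this.ord = ord;
            this.reader = reader;
        }

        ToyLeafReader reader() {
            return reader;
        }
    }

    /** Toy stand-in for a leaf reader: it holds no reference to any parent or sibling. */
    static final class ToyLeafReader {
        // Standalone context: this reader is the only leaf in it, so its ord is 0.
        private final ToyLeafContext ownContext = new ToyLeafContext(0, this);

        ToyLeafContext getContext() {
            return ownContext;
        }
    }

    /** Builds a top-level view over n readers; ord is the position in that view. */
    static List<ToyLeafContext> sampleLeaves(int n) {
        List<ToyLeafContext> leaves = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            leaves.add(new ToyLeafContext(i, new ToyLeafReader()));
        }
        return leaves;
    }

    public static void main(String[] args) {
        List<ToyLeafContext> leaves = sampleLeaves(6);
        // Asking the top-level context keeps the searcher's point of view.
        System.out.println(leaves.get(5).ord);                       // prints 5
        // Going through reader() drops that view; the reader only knows its own context.
        System.out.println(leaves.get(5).reader().getContext().ord); // prints 0
    }
}
```

The toy reader has no field pointing back to the list it came from, which is why the second lookup cannot return 5.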

 

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: uwe@thetaphi.de

From: Joel Bernstein <jo...@gmail.com>
Sent: Sunday, December 27, 2020 5:59 PM
To: lucene dev <de...@lucene.apache.org>
Subject: LeafReaderContext ord is unexpectedly 0

I ran into this while writing some Solr code today.

List<LeafReaderContext> leaves = req.getSearcher().getTopReaderContext().leaves();

The req is a SolrQueryRequest object.

Now if I do this:

leaves.get(5).reader().getContext().ord

I would expect ord in this scenario to be 5.

But in my testing in master it's returning 0.

It seems like this is a bug. Not sure yet if this is a bug in Solr or Lucene. Am I missing anything here that anyone can see?

Joel Bernstein
http://joelsolr.blogspot.com/


Re: LeafReaderContext ord is unexpectedly 0

Posted by Joel Bernstein <jo...@gmail.com>.
I'll have to dig around in some collector code. I could swear that you
could track the ord of the leaf this way at collection time. But there may
be different code paths used than the one I showed above.
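
One plausible explanation, sketched here with toy classes rather than Lucene itself, is that collector code is handed the context for each leaf directly (in Lucene, Collector.getLeafCollector takes a LeafReaderContext), so the ord is read before any reader() call. ToyLeafContext, OrdTrackingCollector, setNextLeaf and search are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OrdTrackingSketch {

    /** Toy stand-in for a leaf context: position in the parent plus some matching local doc ids. */
    static final class ToyLeafContext {
        final int ord;
        final int[] matchingDocs;

        ToyLeafContext(int ord, int... matchingDocs) {
            this.ord = ord;
            this.matchingDocs = matchingDocs;
        }
    }

    /** Hypothetical collector: it is told which leaf it is in before that leaf's docs arrive. */
    static final class OrdTrackingCollector {
        private int currentOrd = -1;
        final List<String> hits = new ArrayList<>();

        void setNextLeaf(ToyLeafContext context) {
            // The ord is taken from the context passed in, never from a reader.
            currentOrd = context.ord;
        }

        void collect(int localDoc) {
            hits.add(currentOrd + ":" + localDoc);
        }
    }

    /** Drives the collector leaf by leaf, the way a search loop would. */
    static List<String> search(List<ToyLeafContext> leaves) {
        OrdTrackingCollector collector = new OrdTrackingCollector();
        for (ToyLeafContext leaf : leaves) {
            collector.setNextLeaf(leaf);
            for (int doc : leaf.matchingDocs) {
                collector.collect(doc);
            }
        }
        return collector.hits;
    }

    static List<ToyLeafContext> sampleLeaves() {
        return Arrays.asList(
                new ToyLeafContext(0, 1, 4),
                new ToyLeafContext(1),
                new ToyLeafContext(2, 0));
    }

    public static void main(String[] args) {
        System.out.println(search(sampleLeaves())); // prints [0:1, 0:4, 2:0]
    }
}
```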

Joel Bernstein
http://joelsolr.blogspot.com/


On Sun, Dec 27, 2020 at 12:25 PM Haoyu Zhai <zh...@gmail.com> wrote:

> Hi Joel,
> LeafReader.getContext() is expected to return "the root IndexReaderContext
> <https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/IndexReaderContext.html> for
> this IndexReader
> <https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/IndexReader.html>'s
> sub-reader tree." (
> https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/LeafReader.html#getContext()
> )
> Which means it will return a context with ord 0 (a newly constructed one, not
> the previous one [1]) if it is already a leaf. So I think this is expected?
>
> [1]:
> https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/LeafReader.java#L43
>
> Best
> Patrick

Re: LeafReaderContext ord is unexpectedly 0

Posted by Haoyu Zhai <zh...@gmail.com>.
Hi Joel,
LeafReader.getContext() is expected to return "the root IndexReaderContext
<https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/IndexReaderContext.html>
for
this IndexReader
<https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/IndexReader.html>'s
sub-reader tree." (
https://lucene.apache.org/core/5_2_0/core/org/apache/lucene/index/LeafReader.html#getContext()
)
Which means it will return a context with ord 0 (a newly constructed one, not
the previous one [1]) if it is already a leaf. So I think this is expected?

[1]:
https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/index/LeafReader.java#L43

Best
Patrick

Joel Bernstein <jo...@gmail.com> wrote on Sun, Dec 27, 2020 at 8:59 AM:

> I ran into this while writing some Solr code today.
>
> List<LeafReaderContext> leaves =
> req.getSearcher().getTopReaderContext().leaves();
>
> The req is a SolrQueryRequest object.
>
> Now if I do this:
>
> leaves.get(5).reader().getContext().ord
>
> I would expect *ord* in this scenario to be *5*.
>
> But in my testing in master it's returning 0.
>
> It seems like this is a bug. Not sure yet if this is a bug in Solr or
> Lucene. Am I missing anything here that anyone can see?
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>