Posted to dev@lucene.apache.org by Alex <az...@gmail.com> on 2010/04/25 04:52:46 UTC

Small sorting problem in spatial contrib

Hi,

I've been playing around with the SpatialFilter that Chris Male and others
are developing.
I must say it seems to work fine and I'm really eager to see the final
release.

However, I think I found a small bug that occurs at scoring time in some
situations.

The SpatialFilter does its job pretty well but when

TestSpatialFilter.DistanceCustomScoreQuery.customScore(int doc, float
subQueryScore, float valSrcScore) {
Double docDistance = spatialFilter.getDistanceFilter().getDistance(doc);

 is called I get

docDistance = null

which prevents scoring from working correctly.

I've been reading the code a little to see what occurs under the hood and it
seems to me the problem is linked with the ThreadedDistanceFilter and
IndexSearcher.

The problem occurs when IndexSearcher's public void search(Weight
weight, Filter filter, Collector collector) is called on an index with more
than one subReader, near line 260 of IndexSearcher:

for (int i = 0; i < subReaders.length; i++) { // search each subreader
        collector.setNextReader(subReaders[i], docStarts[i]);
        searchWithFilter(subReaders[i], weight, filter, collector);
}



What seems to happen is that ThreadedDistanceFilter stores the correct
document id in its inner distances Map, but somehow the value that is passed
to
TestSpatialFilter.DistanceCustomScoreQuery.customScore(int doc, float
subQueryScore, float valSrcScore)
is not correct and probably corresponds to the bit position of the subReader
...
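
To make the suspected mismatch concrete, here is a minimal plain-Java sketch. It does not use any Lucene classes: the class name, the distances map, and the toGlobal helper are all hypothetical, and it assumes that the distances are keyed by index-wide document ids while the scorer is handed segment-relative ids.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the suspected mismatch; no Lucene classes are used
// and every name here is hypothetical.
class SegmentDocIdSketch {

    // Distances keyed by index-wide (global) document id (an assumption
    // about how the filter stores them).
    static final Map<Integer, Double> distances = new HashMap<>();

    // Looks up a distance with whatever id the caller passes in.
    static Double getDistance(int doc) {
        return distances.get(doc);
    }

    // Converts a segment-relative id to a global id using the segment's docBase.
    static int toGlobal(int docBase, int segmentDoc) {
        return docBase + segmentDoc;
    }

    public static void main(String[] args) {
        // Two segments: the first holds 3 documents, so the second starts at 3.
        int[] docStarts = {0, 3};
        distances.put(0, 1.5); // segment 0, local id 0
        distances.put(4, 2.5); // segment 1, local id 1 -> global id 4

        int segmentDoc = 1; // id seen by the scorer while searching segment 1
        System.out.println(getDistance(segmentDoc));                         // null
        System.out.println(getDistance(toGlobal(docStarts[1], segmentDoc))); // 2.5
    }
}
```

With a single segment docBase is 0, so both ids coincide, which would explain why the problem only shows up with more than one subReader.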

I hope I'm clear enough ...
:)

I have no idea how to fix this. I hope this helps you guys and you can
find a workaround.
Let me know what I can do to fix this problem.


Thanks for helping.

Cheers,


Alex

Re: Small sorting problem in spatial contrib

Posted by Alex <az...@gmail.com>.
Thank you Uwe.

I'll check out the latest version of Lucene and see if I can find a temporary
fix with the new CustomScoreQuery.

Cheers,

Alex

RE: Small sorting problem in spatial contrib

Posted by Uwe Schindler <uw...@thetaphi.de>.
Here are some links to issues that show the problem.

 

From your code it looks like you are still using 3.0.0, not 3.0.1 (CustomScoreQuery.customScore is deprecated and fixed in 3.0.1, as this is the root of the problem): https://issues.apache.org/jira/browse/LUCENE-2190
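
As I understand that issue, the fix lets the custom score be computed by an object that is created per segment reader, so it can know which segment it is scoring. The following is only a plain-Java sketch of that idea, not the Lucene API: DistanceScoreProvider, its constructor arguments, and the scoring formula are hypothetical, and the distances are assumed to be keyed by index-wide document id.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of a per-segment score provider; all names are hypothetical.
class PerSegmentProviderSketch {

    // Scores documents of one segment and remembers that segment's docBase.
    static final class DistanceScoreProvider {
        private final int docBase;
        private final Map<Integer, Double> distancesByGlobalDoc;

        DistanceScoreProvider(int docBase, Map<Integer, Double> distancesByGlobalDoc) {
            this.docBase = docBase;
            this.distancesByGlobalDoc = distancesByGlobalDoc;
        }

        // 'doc' is segment-relative; it is rebased before the lookup.
        // The formula is an arbitrary choice: closer documents score higher.
        float customScore(int doc, float subQueryScore) {
            Double distance = distancesByGlobalDoc.get(docBase + doc);
            if (distance == null) {
                return 0f; // no distance recorded for this document
            }
            return subQueryScore / (1f + distance.floatValue());
        }
    }

    public static void main(String[] args) {
        Map<Integer, Double> distances = new HashMap<>();
        distances.put(4, 1.0); // global id 4 = segment with docBase 3, local id 1

        // One provider per segment, mirroring the loop over subReaders.
        DistanceScoreProvider second = new DistanceScoreProvider(3, distances);
        System.out.println(second.customScore(1, 2.0f)); // 1.0
    }
}
```

The design point is that the segment offset is captured once, when the provider is created, so the scoring method never has to guess which segment a document id belongs to.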

 

The new DistanceQuery is planned here:

https://issues.apache.org/jira/browse/LUCENE-2395

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 



RE: Small sorting problem in spatial contrib

Posted by Uwe Schindler <uw...@thetaphi.de>.
If you look into the source code of this CustomScore Spatial Test, it is mentioned there that it is broken. The problem is that the spatial contrib sort does not work correctly with multiple segments. We are working on that; there is no solution yet. We will have DistanceQuery in the future, which does not need sorting, as it will score with the distance in addition to normal scoring.

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 
