You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by javaguy44 <ja...@yahoo.com> on 2009/08/21 16:00:37 UTC

Lucene SORT does a sort on entire index..how do I filter SORT?

Hi,

I'm currently looking at sorting in lucene, and to get started I took a look
at the distance sorting example from the Lucene in Action book.

Working through the test DistanceSortingTest, I've noticed that performing
the SORT ends up sorting the whole index!

To test this I did the following:
 - added a few more lines in setup()
    addPoint(writer, "Nico's Fish Shop", "fishmongerie", 10, 10);
    addPoint(writer, "Nico's Fish Shop deux", "fishmongerie", 10, 10);

 - I added a log statement to DistanceComparatorSource in the
while(termDocs.next()) loop
 - I ran DistanceSortingTest.testNearesRestaurantToHome and to my surprise I
had 6 sorts / log lines of output in while(termDocs.next()) loop

DistanceSortingTest.testNearesRestaurantToHome searches by the query term
new TermQuery(new Term("type", "restaurant")).  As such shouldn't the index
be filtered first (to 4 documents) before the DistanceSort occurs?

Obviously this is not ideal in a million+ document index and assuming you
had 100, 200 records that were hit based on the term.

Would appreciate someone's input / advice on how to filter first
-- 
View this message in context: http://www.nabble.com/Lucene-SORT-does-a-sort-on-entire-index..how-do-I-filter-SORT--tp25080365p25080365.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lucene SORT does a sort on entire index..how do I filter SORT?

Posted by javaguy44 <ja...@yahoo.com>.
Hi Jason,

Thanks v much for your replies + help.

I got it - all is swell.

Thx again


Jason Rutherglen-2 wrote:
> 
> Only the results from the query should be sorted. The field
> caches do get loaded for all values of a field though, is that
> what you're seeing?
> 
> On Fri, Aug 21, 2009 at 4:09 PM, javaguy44<ja...@yahoo.com> wrote:
>>
>> Hi Jason,
>>
>> Thanks for the advice.
>>
>> However I was just working through the example.  I actually don't want to
>> search on numbers / dates / geo etc and was looking at custom sorting.
>>
>> It appears that custom sorting, or even sorting for that matter is not
>> useful if every document will have sorting applied against it before the
>> document filter even hits.
>>
>> Is there no way to limit the sorting to only the documents that were
>> found
>> in the query?
>>
>> Thanks
>>
>>
>>
>> Jason Rutherglen-2 wrote:
>>>
>>> Take a look at contrib/spatial.
>>>
>>> On Fri, Aug 21, 2009 at 7:00 AM, javaguy44<ja...@yahoo.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm currently looking at sorting in lucene, and to get started I took a
>>>> look
>>>> at the distance sorting example from the Lucene in Action book.
>>>>
>>>> Working through the test DistanceSortingTest, I've noticed that
>>>> performing
>>>> the SORT ends up sorting the whole index!
>>>>
>>>> To test this I did the following:
>>>>  - added a few more lines in setup()
>>>>    addPoint(writer, "Nico's Fish Shop", "fishmongerie", 10, 10);
>>>>    addPoint(writer, "Nico's Fish Shop deux", "fishmongerie", 10, 10);
>>>>
>>>>  - I added a log statement to DistanceComparatorSource in the
>>>> while(termDocs.next()) loop
>>>>  - I ran DistanceSortingTest.testNearesRestaurantToHome and to my
>>>> surprise I
>>>> had 6 sorts / log lines of output in while(termDocs.next()) loop
>>>>
>>>> DistanceSortingTest.testNearesRestaurantToHome searches by the query
>>>> term
>>>> new TermQuery(new Term("type", "restaurant")).  As such shouldn't the
>>>> index
>>>> be filtered first (to 4 documents) before the DistanceSort occurs?
>>>>
>>>> Obviously this is not ideal in a million+ document index and assuming
>>>> you
>>>> had 100, 200 records that were hit based on the term.
>>>>
>>>> Would appreciate someone's input / advice on how to filter first
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Lucene-SORT-does-a-sort-on-entire-index..how-do-I-filter-SORT--tp25080365p25080365.html
>>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Lucene-SORT-does-a-sort-on-entire-index..how-do-I-filter-SORT--tp25080365p25088699.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Lucene-SORT-does-a-sort-on-entire-index..how-do-I-filter-SORT--tp25080365p25179313.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lucene SORT does a sort on entire index..how do I filter SORT?

Posted by Jason Rutherglen <ja...@gmail.com>.
Take a look at contrib/spatial.

On Fri, Aug 21, 2009 at 7:00 AM, javaguy44<ja...@yahoo.com> wrote:
>
> Hi,
>
> I'm currently looking at sorting in lucene, and to get started I took a look
> at the distance sorting example from the Lucene in Action book.
>
> Working through the test DistanceSortingTest, I've noticed that performing
> the SORT ends up sorting the whole index!
>
> To test this I did the following:
>  - added a few more lines in setup()
>    addPoint(writer, "Nico's Fish Shop", "fishmongerie", 10, 10);
>    addPoint(writer, "Nico's Fish Shop deux", "fishmongerie", 10, 10);
>
>  - I added a log statement to DistanceComparatorSource in the
> while(termDocs.next()) loop
>  - I ran DistanceSortingTest.testNearesRestaurantToHome and to my surprise I
> had 6 sorts / log lines of output in while(termDocs.next()) loop
>
> DistanceSortingTest.testNearesRestaurantToHome searches by the query term
> new TermQuery(new Term("type", "restaurant")).  As such shouldn't the index
> be filtered first (to 4 documents) before the DistanceSort occurs?
>
> Obviously this is not ideal in a million+ document index and assuming you
> had 100, 200 records that were hit based on the term.
>
> Would appreciate someone's input / advice on how to filter first
> --
> View this message in context: http://www.nabble.com/Lucene-SORT-does-a-sort-on-entire-index..how-do-I-filter-SORT--tp25080365p25080365.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org