Posted to user@lucenenet.apache.org by Artem Chereisky <a....@gmail.com> on 2010/03/08 05:26:40 UTC
new 2.9 search API
Hi,
I'm migrating my code to 2.9 and I'm having trouble understanding the new
search API.
Here is some sample 2.4 code:
var searcher = new IndexSearcher(someReader);
searcher.SetSimilarity(new MySimilarity());
var hitCollector = new MyHitCollector();
var query = MyBuildQueryMethod();
searcher.Search(query, hitCollector);
The Collect method of MyHitCollector gets called for every matching document
with a docId and a score based on the MySimilarity implementation.
In 2.9 HitCollector is replaced with Collector, which has 4 abstract methods to
implement. One of them is SetScorer(Scorer). This is where I'm getting lost.
Where do I get an instance of Scorer from? There are many, one for each
query type. I think I'm missing something fundamental. Please clarify.
Regards,
Art
Re: new 2.9 search API
Posted by Artem Chereisky <a....@gmail.com>.
Of course, thanks for spotting it, Michael. What was I thinking?!
Art
On 09/03/2010, at 15:55, Michael Garski <mg...@myspace-inc.com> wrote:
> Artem,
>
> You're on the right track, except for step 5. The score comes
> directly from the Scorer.Score() method, docBase is added to the
> document id passed into Collect to get the overall index document
> number. The number passed into Collect is the number of the document
> in the current segment being searched.
>
> Michael
Re: new 2.9 search API
Posted by Michael Garski <mg...@myspace-inc.com>.
Artem,
You're on the right track, except for step 5. The score comes directly
from the Scorer.Score() method; docBase is added to the document id
passed into Collect to get the overall index document number. The
number passed into Collect is the number of the document in the
current segment being searched.
Michael
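Putting that correction together, a collector along these lines might look as follows. This is only a sketch: it assumes the Lucene.Net 2.9 Collector class with the four abstract methods SetScorer, SetNextReader, Collect and AcceptsDocsOutOfOrder, and the class name ScoreRecordingCollector and its Hits list are made up for illustration.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical example class; not part of Lucene.Net.
public class ScoreRecordingCollector : Collector
{
    private int _docBase;
    private Scorer _scorer;

    // Index-wide document number paired with its score.
    public readonly List<KeyValuePair<int, float>> Hits =
        new List<KeyValuePair<int, float>>();

    // Called by the searcher when it moves to a new segment.
    public override void SetNextReader(IndexReader reader, int docBase)
    {
        _docBase = docBase;
    }

    // Called by the searcher with the scorer for the current segment.
    public override void SetScorer(Scorer scorer)
    {
        _scorer = scorer;
    }

    public override void Collect(int doc)
    {
        float score = _scorer.Score();   // score of the current document, unmodified
        int indexDoc = _docBase + doc;   // doc is relative to the current segment
        Hits.Add(new KeyValuePair<int, float>(indexDoc, score));
    }

    // Assumption for this sketch: the collector does not care about document order.
    public override bool AcceptsDocsOutOfOrder()
    {
        return true;
    }
}
```

An instance would then be handed to searcher.Search(query, collector), much as the HitCollector was in the 2.4 sample.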
On Mar 8, 2010, at 8:44 PM, "Artem Chereisky" <a....@gmail.com>
wrote:
> 5. and finally call _scorer.Score() in Collect method adding
> _docBase to the
> result of _scorer.Score().
Re: new 2.9 search API
Posted by Artem Chereisky <a....@gmail.com>.
Thanks Michael, you pointed me in the right direction.
To answer my own question and to close the thread, I didn't need to worry
about passing an instance of Scorer to the MyCollector instance. The SetScorer
method is called by the Searcher, so to enable scoring I needed to do 5
things:
1. have an instance of Scorer in MyCollector class
2. have an int member _docBase
    private int _docBase;
    private Scorer _scorer;
3. implement SetNextReader
    public override void SetNextReader(Lucene.Net.Index.IndexReader reader, int docBase)
    {
        _docBase = docBase;
    }
4. implement SetScorer
    public override void SetScorer(Scorer scorer)
    {
        _scorer = scorer;
    }
5. and finally call _scorer.Score() in Collect method adding _docBase to the
result of _scorer.Score().
Thanks,
Art
On Tue, Mar 9, 2010 at 3:11 AM, Michael Garski <mg...@myspace-inc.com> wrote:
> Artem,
>
> The four methods on the Collector abstract class are invoked by the
> searcher that is performing the search, and are up to you to implement
> in your collector as is necessary.
>
> With the change to segment-by-segment searching in 2.9, SetScorer and
> SetNextReader allow the searcher to pass the current Scorer and
> IndexReader to the collector.
RE: new 2.9 search API
Posted by Michael Garski <mg...@myspace-inc.com>.
Artem,
The four methods on the Collector abstract class are invoked by the
searcher that is performing the search, and are up to you to implement
in your collector as is necessary.
With the change to segment-by-segment searching in 2.9, SetScorer and
SetNextReader allow the searcher to pass the current Scorer and
IndexReader to the collector.
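To make the call order concrete, here is a small self-contained simulation. It does not use Lucene.Net at all: SketchCollector, TracingCollector and SketchSearcher are hypothetical stand-ins, and the loop only mirrors my understanding of how the 2.9 searcher walks the segments (SetNextReader first, then SetScorer, then one Collect per matching document).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; these are NOT the Lucene.Net types.
public abstract class SketchCollector
{
    public abstract void SetNextReader(string segmentName, int docBase);
    public abstract void SetScorer(object scorer);
    public abstract void Collect(int doc);
}

// Records every callback so the call order can be inspected afterwards.
public sealed class TracingCollector : SketchCollector
{
    private int _docBase;
    public readonly List<string> Trace = new List<string>();

    public override void SetNextReader(string segmentName, int docBase)
    {
        _docBase = docBase;
        Trace.Add("SetNextReader(" + segmentName + ", docBase=" + docBase + ")");
    }

    public override void SetScorer(object scorer)
    {
        Trace.Add("SetScorer");
    }

    public override void Collect(int doc)
    {
        // doc is segment-relative; adding _docBase gives the index-wide number.
        Trace.Add("Collect(doc=" + doc + ") -> index doc " + (_docBase + doc));
    }
}

public static class SketchSearcher
{
    // segmentSizes[i] is the document count of segment i;
    // matches[i] lists the segment-relative ids that match in segment i.
    public static void Search(int[] segmentSizes, int[][] matches, SketchCollector collector)
    {
        int docBase = 0;
        for (int i = 0; i < segmentSizes.Length; i++)
        {
            collector.SetNextReader("segment" + i, docBase); // new segment, new offset
            collector.SetScorer(new object());               // one scorer per segment
            foreach (int doc in matches[i])
            {
                collector.Collect(doc);
            }
            docBase += segmentSizes[i];                      // next segment starts here
        }
    }
}
```

Running Search over two segments of 3 and 2 documents shows the second segment's local document 0 being reported as index document 3, which is why a collector has to remember the docBase it was last given.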
The javadocs have the same content as the .NET documentation comments,
and the one for Collector is:
http://lucene.apache.org/java/2_9_2/api/all/org/apache/lucene/search/Collector.html
Additionally, take a look at the Test project file Search\QueryUtils.cs.
It contains two Collector implementations, AnonymousClassCollector and
AnonymousClassCollector1, that are good examples of how to implement a
concrete Collector.
Michael