Posted to java-user@lucene.apache.org by Paul Libbrecht <pa...@activemath.org> on 2009/03/22 21:30:18 UTC

How to know the matched field?

Hello list,

In an auto-completion task, I would like to show the user which field
of the found document matched the query.

Typically, my documents have multiple fields for each field name, and I
would like the search results to tell me which field was used. How can
I do that?

This looks to me like a task for the highlighter (or for QueryScorer?),
but I am not interested in extracting the matching fragment; I only
want to know exactly which field matched.

thanks in advance

paul

Re: How to know the matched field?

Posted by Paul Libbrecht <pa...@activemath.org>.
Here's my first approach. Note, however, that some of my fields are not
stored, and one of them may be the field that actually matched without
being the one I want to return.
For example, I have a field "names in all languages, analyzed with the
standard analyzer", which is not the one I want to "see as matched".

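         import java.io.StringReader;
         import java.util.List;
         import org.apache.lucene.analysis.Token;
         import org.apache.lucene.analysis.TokenStream;
         import org.apache.lucene.document.Field;
         import org.apache.lucene.search.highlight.QueryScorer;
         import org.apache.lucene.search.highlight.TextFragment;

         // Rewrite first so that prefix queries are expanded into plain
         // term queries before they are handed to QueryScorer.
         query = query.rewrite(this.getReader());
         QueryScorer scorer = new QueryScorer(query);
         String found = null;
         float maxScore = 0;
         for (Field f : (List<Field>) doc.getFields()) {
             String text = f.stringValue();
             if (text == null) continue; // e.g. binary fields have no string value
             // Treat the whole field value as a single fragment.
             scorer.startFragment(new TextFragment(new StringBuffer(text), 0, text.length()));
             TokenStream tok = analyzer.tokenStream(f.name(), new StringReader(text));
             System.out.println("Field: " + f + ":: " + f.name() + ": " + text);
             Token t = new Token();
             // The return value is not needed here: the call lets the scorer
             // update its running score for the current fragment.
             while (tok != null && (t = tok.next(t)) != null) {
                 scorer.getTokenScore(t);
             }

             // Keep the value of the best-scoring field seen so far.
             float score = scorer.getFragmentScore();
             if (score > maxScore) {
                 maxScore = score;
                 found = text;
             }
         }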


I still don't grasp why both the TextFragment(StringBuffer) and the
pass through the tokenizer are needed, but removing either of them
breaks my unit test. I guess this is the whole idea behind LUCENE-1522,
which I will take up later.
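
To make the idea of the loop above easier to follow, here is a minimal
sketch of it in plain Java, without Lucene. Everything in it is an
assumption made for illustration: StoredField stands in for a stored
Lucene field, tokenize() is a naive substitute for a real analyzer, and
the score is simply the number of distinct query terms (a trailing '*'
marks a prefix) found in a field value. The excludedNames parameter
covers the case of a field that may match but should never be reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class MatchedFieldSketch {

    /** Hypothetical stand-in for a stored Lucene field: a name and its text. */
    static final class StoredField {
        final String name;
        final String text;
        StoredField(String name, String text) { this.name = name; this.text = text; }
    }

    /** Naive analyzer substitute: lower-case, then split on non-letter/digit runs. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    /** Counts the distinct query terms (lower-case; '*' suffix = prefix) found in the text. */
    static int score(Set<String> queryTerms, String text) {
        Set<String> matched = new HashSet<>();
        for (String token : tokenize(text)) {
            for (String q : queryTerms) {
                boolean prefix = q.endsWith("*");
                String stem = prefix ? q.substring(0, q.length() - 1) : q;
                if (prefix ? token.startsWith(stem) : token.equals(stem)) matched.add(q);
            }
        }
        return matched.size();
    }

    /** Returns the best-scoring field whose name is not excluded, or null if nothing matches. */
    static StoredField bestField(List<StoredField> fields, Set<String> queryTerms,
                                 Set<String> excludedNames) {
        StoredField best = null;
        int bestScore = 0;
        for (StoredField f : fields) {
            if (f.text == null || excludedNames.contains(f.name)) continue;
            int s = score(queryTerms, f.text);
            if (s > bestScore) { bestScore = s; best = f; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<StoredField> fields = Arrays.asList(
                new StoredField("name", "Pythagorean theorem"),
                new StoredField("name", "Satz des Pythagoras"),
                new StoredField("all-names", "pythagorean theorem satz des pythagoras"));
        Set<String> query = new HashSet<>(Arrays.asList("satz", "pyth*"));
        StoredField best = bestField(fields, query, Collections.singleton("all-names"));
        System.out.println(best.name + ": " + best.text);
    }
}
```

With the sample data in main(), the second "name" value wins because it
contains both query terms, while the catch-all "all-names" field is
skipped even though it would match as well.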

paul


On 23 Mar 2009, at 11:35, Paul Libbrecht wrote:

> Thanks Erick,
>
> I browsed but no full answer yet.
>
> The closest seems to be the explain method with which I could find  
> the exact term-query or prefix-query that matched it, so I would be  
> able to find the name of the field. I am still left with iterating  
> through the (stored) fields and try to find the individual fields  
> that matched.
>
> I could also make a token-stream with all fields' contents and find  
> the field (the fragment) which gets the best score with  
> QueryScorer(query)?
> (provided query is "rewritten" so that no prefixquery appears  
> anymore, right?)
>
> Sounds doable but please confirm this is a correct usage of  
> QueryScorer, I am feeling a bit unsafe here.
>
> paul


Re: How to know the matched field?

Posted by Paul Libbrecht <pa...@activemath.org>.
Thanks Erick,

I browsed the archives but found no full answer yet.

The closest seems to be the explain method, with which I could find the
exact term query or prefix query that matched, so I would be able to
find the name of the field. I am still left with iterating through the
(stored) fields and trying to find the individual fields that matched.
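
As a rough sketch of that first step, field names could be pulled out
of the text form of an explanation. The code below is plain Java and
rests on an assumption: that term clauses in the explanation text look
like "weight(field:term in docId)". The sample string in main() is
hand-written for illustration, not real Lucene output, and the class
and method names are hypothetical; the pattern should be checked
against what the explain method actually prints in the Lucene version
in use.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExplainFieldNames {

    // Assumed shape of a term clause in the explanation text:
    //   weight(field:term in docId)
    private static final Pattern TERM_WEIGHT =
            Pattern.compile("\\bweight\\(([^:\\s()]+):([^\\s()]+) in \\d+\\)");

    /** Collects, in order of first appearance, the field names of all term clauses. */
    static Set<String> matchedFieldNames(String explanationText) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = TERM_WEIGHT.matcher(explanationText);
        while (m.find()) {
            names.add(m.group(1)); // group 1 is the field name, group 2 the term
        }
        return names;
    }

    public static void main(String[] args) {
        // Hand-written sample for illustration only, not real Lucene output.
        String sample =
                "0.8 = sum of:\n"
              + "  0.5 = weight(name:pythagoras in 7), product of:\n"
              + "  0.3 = weight(all-names:pythagoras in 7), product of:\n";
        System.out.println(matchedFieldNames(sample));
    }
}
```

This only yields field names. When a document holds several values
under the same name, the values themselves still have to be compared
one by one to find the one that matched.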

I could also build a token stream over all the fields' contents and
find the field (the fragment) that gets the best score with
QueryScorer(query)?
(Provided the query is "rewritten" so that no PrefixQuery remains,
right?)

Sounds doable, but please confirm that this is a correct use of
QueryScorer; I feel a bit unsure here.

paul

On 22 Mar 2009, at 22:22, Erick Erickson wrote:

> Try searching the mail archives, the searchable archive is linked to
> off the Wiki. This topic has been discussed multiple times but I  
> forget
> the solutions...


Re: How to know the matched field?

Posted by Erick Erickson <er...@gmail.com>.
Try searching the mail archives; the searchable archive is linked from
the Wiki. This topic has been discussed multiple times, but I forget
the solutions...

Best
Erick

On Sun, Mar 22, 2009 at 4:30 PM, Paul Libbrecht <pa...@activemath.org> wrote:

>
> Hello list,
>
> in an auto-completion task, I would like to show to the user the field
> that's been matched against the query in the found document.
>
> Typically, my documents have multiple fields for each field-name and I
> would like the index's findings to give me the field used. How can I do
> that?
>
> It seems to me a task of the highlighter (or of the QueryScorer?) but I am
> actually not interested into extracting the fragment found just to know the
> exact field found.
>
> thanks in advance
>
> paul