Posted to java-user@lucene.apache.org by damian2b <da...@o2.pl> on 2011/10/20 10:21:37 UTC

Return Lucene field name when a query is matched

Hi,

I was given a task to investigate whether it is possible to return the Lucene field name when a query is matched.
At the moment our application returns the usual matched docs, but the new requirement is to also know which field matched the query (e.g. found in title, header, etc.).
We use Lucene 3.0.3.

Is this doable? Any pointers please.
Is this supported in newer versions of Lucene? If so, upgrading might be my recommendation. I have googled a bit, but with no luck.
Any help or suggestions much appreciated.

Many thanks,

Damo

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Return Lucene field name when a query is matched

Posted by Ian Lea <ia...@gmail.com>.
Here's the output from a little test program on a 2 doc index.

Query: title:bends author:bends
Hit: title=the bends, author=radiohead
exp:  0.24439742 = (MATCH) product of:
  0.48879483 = (MATCH) sum of:
    0.48879483 = (MATCH) weight(title:bends in 1), product of:
      0.5564505 = queryWeight(title:bends), product of:
        1.4054651 = idf(docFreq=1, maxDocs=3)
        0.3959191 = queryNorm
      0.8784157 = (MATCH) fieldWeight(title:bends in 1), product of:
        1.0 = tf(termFreq(title:bends)=1)
        1.4054651 = idf(docFreq=1, maxDocs=3)
        0.625 = fieldNorm(field=title, doc=1)
  0.5 = coord(1/2)


Query: title:radiohead author:radiohead
Hit: title=the bends, author=radiohead
exp:  0.21508263 = (MATCH) product of:
  0.43016526 = (MATCH) sum of:
    0.43016526 = (MATCH) weight(author:radiohead in 1), product of:
      0.43016526 = queryWeight(author:radiohead), product of:
        1.0 = idf(docFreq=2, maxDocs=3)
        0.43016526 = queryNorm
      1.0 = (MATCH) fieldWeight(author:radiohead in 1), product of:
        1.0 = tf(termFreq(author:radiohead)=1)
        1.0 = idf(docFreq=2, maxDocs=3)
        1.0 = fieldNorm(field=author, doc=1)
  0.5 = coord(1/2)

Hit: title=ok computer, author=radiohead
exp:  0.21508263 = (MATCH) product of:
  0.43016526 = (MATCH) sum of:
    0.43016526 = (MATCH) weight(author:radiohead in 2), product of:
      0.43016526 = queryWeight(author:radiohead), product of:
        1.0 = idf(docFreq=2, maxDocs=3)
        0.43016526 = queryNorm
      1.0 = (MATCH) fieldWeight(author:radiohead in 2), product of:
        1.0 = tf(termFreq(author:radiohead)=1)
        1.0 = idf(docFreq=2, maxDocs=3)
        1.0 = fieldNorm(field=author, doc=2)
  0.5 = coord(1/2)

The "exp" is simply the result of toString() on the Explanation
returned by IndexSearcher.explain().

The output shows that the single hit for search term "bends" matched
on the title field and the 2 hits for search term "radiohead" matched
on the author field.
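
One way to turn that output into a list of field names is to scan the
explanation text for the matching weight(...) lines. What follows is only
a minimal sketch in plain Java, not the test program that produced the
output above. The class name ExplainFieldParser and the method
matchedFields are made up for illustration, and the sketch assumes the
text has the "(MATCH) weight(field:term in doc)" shape shown above, which
may differ between Lucene versions and query types.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls field names out of Explanation.toString() text.
class ExplainFieldParser {

    // Assumption: matching term clauses appear as
    // "<score> = (MATCH) weight(<field>:<term> in <doc>), product of:"
    private static final Pattern MATCHED_WEIGHT =
            Pattern.compile("\\(MATCH\\) weight\\(([^:()\\s]+):");

    // Returns the distinct field names of all matching weight(...) lines,
    // in the order they first appear in the text.
    static Set<String> matchedFields(String explanationText) {
        Set<String> fields = new LinkedHashSet<String>();
        Matcher m = MATCHED_WEIGHT.matcher(explanationText);
        while (m.find()) {
            fields.add(m.group(1));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Shortened copy of the first explanation shown above.
        String exp =
                "0.24439742 = (MATCH) product of:\n"
              + "  0.48879483 = (MATCH) sum of:\n"
              + "    0.48879483 = (MATCH) weight(title:bends in 1), product of:\n"
              + "      0.8784157 = (MATCH) fieldWeight(title:bends in 1), product of:\n"
              + "  0.5 = coord(1/2)\n";
        System.out.println(matchedFields(exp)); // prints [title]
    }
}
```

Parsing the text is brittle. The same idea should also work by walking the
Explanation objects directly (getDetails(), getDescription(), isMatch())
instead of their string form.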


Any clearer?


--
Ian.


Re: Return Lucene field name when a query is matched

Posted by Mead Lai <la...@gmail.com>.
Your description was not clear.

A query can return lots of results, so if you use a Boolean OR
query, the matched field name can be different for every hit.

If you use a Boolean AND query, then every field will match.

Regards,
Mead


Re: Return Lucene field name when a query is matched

Posted by Ian Lea <ia...@gmail.com>.
You can work it out from the Explanation returned by the
IndexSearcher.explain method.  Note the performance warning in the
javadocs.

--
Ian.


Re: Return Lucene field name when a query is matched

Posted by Mihai Caraman <ca...@gmail.com>.
So now you have something like query[title,content,header,...].
Evidently you can find out by running query[title], query[content] and query[header] separately.
But you'd have to merge the results. Maybe there's a collector for this.
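
The merge step could look roughly like the sketch below. It is plain Java
with no Lucene calls: it assumes you have already run one query per field
and collected the matching doc ids yourself. The class name
FieldMatchMerger, the method mergeByDoc and the sample doc ids are all
hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper: inverts "field -> matching doc ids" into
// "doc id -> fields that matched".
class FieldMatchMerger {

    static Map<Integer, Set<String>> mergeByDoc(Map<String, List<Integer>> hitsPerField) {
        // TreeMap/TreeSet keep doc ids and field names sorted for stable output.
        Map<Integer, Set<String>> fieldsByDoc = new TreeMap<Integer, Set<String>>();
        for (Map.Entry<String, List<Integer>> entry : hitsPerField.entrySet()) {
            for (Integer docId : entry.getValue()) {
                Set<String> fields = fieldsByDoc.get(docId);
                if (fields == null) {
                    fields = new TreeSet<String>();
                    fieldsByDoc.put(docId, fields);
                }
                fields.add(entry.getKey());
            }
        }
        return fieldsByDoc;
    }

    public static void main(String[] args) {
        // Made-up results of two single-field searches.
        Map<String, List<Integer>> hitsPerField = new LinkedHashMap<String, List<Integer>>();
        hitsPerField.put("title", Arrays.asList(1));
        hitsPerField.put("author", Arrays.asList(1, 2));
        System.out.println(mergeByDoc(hitsPerField)); // prints {1=[author, title], 2=[author]}
    }
}
```

This costs one search per field, so it may only be practical when the
number of fields is small.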
