Posted to solr-user@lucene.apache.org by Mysurf Mail <st...@gmail.com> on 2013/08/06 10:24:46 UTC

Knowing what field caused the retrieval of the document

I have two indexed fields in my document: Name and Comment.
The user searches for a phrase and I need to act differently depending on
whether it appeared in the comment or the name.
Is there a way to know why the document was retrieved?
Thanks.

Re: Knowing what field caused the retrieval of the document

Posted by Raymond Wiker <rw...@gmail.com>.

One option might be to run the query twice, once with fq set to +name:"whatever phrase" and once with fq set to +comment:"whatever phrase". The two result sets can then be annotated and merged (assuming that the hit scores depend only on the main query and the document content, i.e., no normalization and no score contribution from fq).
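
A minimal sketch of the two-query approach in Python, for illustration only. The endpoint URL, the core name "collection1", the "id" key field and the helper names are assumptions and not something from this thread; the phrase is also not escaped, so it must not contain double quotes.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; adjust host, port and core name for your setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_params(user_query, field, phrase, rows=10):
    """Build select parameters that restrict the results to one field via fq."""
    return {
        "q": user_query,
        "fq": '+{0}:"{1}"'.format(field, phrase),
        "fl": "id,score",
        "rows": rows,
        "wt": "json",
    }


def fetch_docs(params):
    """Run one query against Solr and return the list of matching documents."""
    url = SOLR_SELECT_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response)["response"]["docs"]


def merge_annotated(docs_by_field):
    """Merge per-field result lists, recording which fields matched each doc."""
    merged = {}
    for field, docs in docs_by_field.items():
        for doc in docs:
            entry = merged.setdefault(doc["id"], dict(doc, matched_fields=[]))
            entry["matched_fields"].append(field)
    # Highest score first; this only makes sense if fq does not affect scores.
    return sorted(merged.values(), key=lambda d: d["score"], reverse=True)


# Usage against a live server (not executed here):
# phrase = "whatever phrase"
# hits = merge_annotated({
#     f: fetch_docs(build_params(phrase, f, phrase)) for f in ("name", "comment")
# })
```

A document that appears in both result sets ends up with both field names in its "matched_fields" list.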

On Aug 6, 2013, at 18:27, Jeff Wartes <jw...@whitepages.com> wrote:
> 
> For what it's worth, I had the same question last year, and I never really
> got a good solution:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3C81E9A7879C550B42A767F0B86B2B81591A15BFDD@ex4.corp.w3data.com%3E
> 
> 
> I dug into the highlight component for a while, but it turned out I
> couldn't use that approach. I'm afraid I don't recall exactly why. The
> debugQuery method had a *huge* performance cost, so that was a non-starter.
> 
> I managed to solve a subset of my problem by writing a custom
> QueryComponent that (re)examines the documents being returned and
> annotates the response. This worked because I was able to reduce my
> problem space to just determining whether a document was a string-literal
> match with the query rather than a match via synonyms or other fuzzy expansion.
> It still required that I had stored and was returning the fields I wanted
> to (re)examine, so it was hardly ideal on several fronts.
> 
> If you can, I'd suggest just doing two queries.
> 
> 
> On 8/6/13 7:38 AM, "Jack Krupansky" <ja...@basetechnology.com> wrote:
> 
>> Add the debugQuery=true parameter and the "explain" section will detail
>> exactly what terms matched for each document.
>> 
>> You could also use the Solr term vectors component to get info on what
>> terms occur where in a document, but that adds more overhead to the index
>> for "stored term vectors".
>> 
>> -- Jack Krupansky
>> 
>> -----Original Message-----
>> From: Mysurf Mail
>> Sent: Tuesday, August 06, 2013 5:59 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Knowing what field caused the retrieval of the document
>> 
>> But what if the query has multiple words?
>> I am guessing Solr knows why the document is there, since I get to see the
>> paragraph in the highlighting (hl) section.
>> 
>> 
>> On Tue, Aug 6, 2013 at 11:36 AM, Raymond Wiker <rw...@gmail.com> wrote:
>> 
>>> If you were searching for single words (terms), you could use the 'tf'
>>> function, by adding something like
>>> 
>>> matchesinname:tf(name, "whatever")
>>> 
>>> to the 'fl' parameter. If the 'name' field contains "whatever", the
>>> (result) field 'matchesinname' will be greater than 0 (1 for a single
>>> occurrence).
>>> 
>>> 
>>> 
>>> 
>>> On Tue, Aug 6, 2013 at 10:24 AM, Mysurf Mail <st...@gmail.com>
>>> wrote:
>>> 
>>>> I have two indexed fields in my document: Name and Comment.
>>>> The user searches for a phrase and I need to act differently depending on
>>>> whether it appeared in the comment or the name.
>>>> Is there a way to know why the document was retrieved?
>>>> Thanks.
>>>> 
>>> 
>> 
>

Re: Knowing what field caused the retrieval of the document

Posted by Jeff Wartes <jw...@whitepages.com>.

For what it's worth, I had the same question last year, and I never really
got a good solution:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3C81E9A7879C550B42A767F0B86B2B81591A15BFDD@ex4.corp.w3data.com%3E

I dug into the highlight component for a while, but it turned out I
couldn't use that approach. I'm afraid I don't recall exactly why. The
debugQuery method had a *huge* performance cost, so that was a non-starter.

I managed to solve a subset of my problem by writing a custom
QueryComponent that (re)examines the documents being returned and
annotates the response. This worked because I was able to reduce my
problem space to just determining whether a document was a string-literal
match with the query rather than a match via synonyms or other fuzzy expansion.
It still required that I had stored and was returning the fields I wanted
to (re)examine, so it was hardly ideal on several fronts.
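
For illustration, the same kind of re-examination can be sketched on the client side in Python. This is not the custom QueryComponent described above (that would be Java code running inside Solr); it is only a rough analogue that assumes the fields are stored and returned, and the function name and default field names are made up.

```python
def literal_match_fields(doc, phrase, fields=("name", "comment")):
    """Return the fields whose stored value contains the phrase literally.

    An empty result for a returned document suggests that it matched through
    analysis (synonyms, stemming, fuzzy expansion) rather than literally.
    """
    needle = phrase.casefold()
    literal = []
    for field in fields:
        value = doc.get(field, "")
        # Multi-valued fields come back as lists; treat every value the same way.
        values = value if isinstance(value, list) else [value]
        if any(needle in str(v).casefold() for v in values):
            literal.append(field)
    return literal
```

The comparison is a plain case-insensitive substring test, so it knows nothing about the analyzers configured for each field.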

If you can, I'd suggest just doing two queries.


Re: Knowing what field caused the retrieval of the document

Posted by Jack Krupansky <ja...@basetechnology.com>.

Add the debugQuery=true parameter and the "explain" section will detail
exactly what terms matched for each document.
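
A rough Python sketch of reading field names out of the "explain" section, assuming the response was requested with debugQuery=true and wt=json. The explain text is meant for humans and its layout is not a stable API, so the "weight(field:term ..." pattern used here is an assumption that may not hold for every query type or Solr version; the function names are made up.

```python
import re

# Matches the "weight(field:term ..." fragments that appear in explain text.
_WEIGHT_RE = re.compile(r"weight\((\w+):")


def fields_from_explain(explain_text):
    """Return the set of field names mentioned in one explain entry."""
    return set(_WEIGHT_RE.findall(explain_text))


def matched_fields_by_doc(solr_response):
    """Map each document key to the fields found in its explain entry."""
    explain = solr_response.get("debug", {}).get("explain", {})
    return {doc_id: fields_from_explain(text) for doc_id, text in explain.items()}
```

Note that other posts in this thread report a large performance cost for debugQuery, so this is probably better suited to debugging than to production traffic.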

You could also use the Solr term vectors component to get info on what terms
occur where in a document, but that adds more overhead to the index for
"stored term vectors".

-- Jack Krupansky


Re: Knowing what field caused the retrieval of the document

Posted by Mysurf Mail <st...@gmail.com>.

But what if the query has multiple words?
I am guessing Solr knows why the document is there, since I get to see the
paragraph in the highlighting (hl) section.
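
The highlighting idea could be sketched like this in Python: request highlighting on both fields and treat any field that comes back with at least one snippet as a matching field. The function names are made up, and the sketch assumes that both fields are stored, that the query actually searches both of them, and that the response was requested with wt=json.

```python
def highlight_params(user_query, fields=("name", "comment")):
    """Build select parameters that ask for per-field highlighting."""
    return {
        "q": user_query,
        "hl": "true",
        "hl.fl": ",".join(fields),
        # Only highlight a field for terms that were queried against that field.
        "hl.requireFieldMatch": "true",
        "wt": "json",
    }


def highlighted_fields(solr_response):
    """Map each document key to the sorted list of fields that have snippets."""
    highlighting = solr_response.get("highlighting", {})
    return {
        doc_id: sorted(f for f, snippets in per_field.items() if snippets)
        for doc_id, per_field in highlighting.items()
    }
```

Another poster in this thread reports that the highlight component did not work out for a similar problem, so treat this as something to try rather than a known solution.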



Re: Knowing what field caused the retrieval of the document

Posted by Raymond Wiker <rw...@gmail.com>.

If you were searching for single words (terms), you could use the 'tf'
function, by adding something like

matchesinname:tf(name, "whatever")

to the 'fl' parameter. If the 'name' field contains "whatever", the
(result) field 'matchesinname' will be greater than 0 (1 for a single
occurrence).
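
As a small Python sketch of the same idea for both fields: build one 'fl' value per field and then check which pseudo-fields came back greater than zero. The helper names and the "matchesin" alias prefix are only illustrative, and as far as I know the term has to be given in its indexed form (for example lowercased, if the field's analyzer lowercases).

```python
def tf_field_list(term, fields=("name", "comment"), base=("id", "score")):
    """Return 'fl' values: the base fields plus one tf() alias per field.

    The values are meant to be sent as repeated fl parameters.
    """
    aliases = ['matchesin{0}:tf({0},"{1}")'.format(field, term) for field in fields]
    return list(base) + aliases


def fields_containing_term(doc, fields=("name", "comment")):
    """Return the fields whose tf() pseudo-field is greater than zero."""
    return [field for field in fields if doc.get("matchesin" + field, 0) > 0]
```

With urllib.parse.urlencode, passing {"q": ..., "fl": tf_field_list("whatever")} together with doseq=True produces the repeated fl parameters.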
