Posted to java-user@lucene.apache.org by Jeff <je...@gmail.com> on 2007/12/20 20:54:14 UTC

Hit Count per Document

I don't care about score, but I do care about the number of times a query was hit
within a document. Example:

the quick brown fox jumped over the lazy dog
 the quick brown fox jumped over the lazy dog
 the quick brown fox jumped over the lazy dog
 the quick brown fox jumped over the lazy dog
the slow brown fox jumped over the lazy dog

If I searched for "quick brown", is there a way I could see that it was hit
4 times within the document?

Thanks,
Jeff

Re: Hit Count per Document

Posted by Mark Miller <ma...@gmail.com>.
Gotcha. Well, if you want to check one doc at a time, you could call
getSpans on a SpanNearQuery and just count how many spans you get. No ideas
off the top of my head if you want the count reported like a score, that is,
for each hit in a search of the whole corpus.

- Mark
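
As a rough illustration of what counting spans amounts to for one document, here is a minimal sketch in plain Java. It is not Lucene code: the class name PhraseHitCounter, the whitespace tokenizer, and the exact in-order matching (the equivalent of zero slop) are all assumptions made for the example. A real SpanNearQuery would work on the analyzed terms and positions stored in the index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: counts in-order, adjacent
// occurrences of a phrase in a single document's text.
public class PhraseHitCounter {

    // Lowercase and split on whitespace; a stand-in for a real analyzer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Slide a window over the token positions and count exact matches.
    static int countPhrase(String document, String phrase) {
        List<String> doc = tokenize(document);
        List<String> terms = tokenize(phrase);
        if (terms.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int start = 0; start + terms.size() <= doc.size(); start++) {
            if (doc.subList(start, start + terms.size()).equals(terms)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String fox = "the quick brown fox jumped over the lazy dog";
        String document = String.join(" ", fox, fox, fox, fox,
                "the slow brown fox jumped over the lazy dog");
        System.out.println(countPhrase(document, "quick brown")); // 4
        System.out.println(countPhrase(document, "brown"));       // 5
    }
}
```

On the example document this separates the two numbers discussed in the thread: the phrase "quick brown" matches 4 times, while the single term "brown" occurs 5 times.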

Jeff wrote:
> If I am not mistaken, that is for a term. Is it possible for a query? In
> the example below, I don't want to know how many times brown is in the
> document; I want to know how many times "quick brown" is in the document.
>
> Thanks,
> Jeff

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Hit Count per Document

Posted by Jeff <je...@gmail.com>.
If I am not mistaken, that is for a term. Is it possible for a query? In
the example below, I don't want to know how many times brown is in the
document; I want to know how many times "quick brown" is in the document.

Thanks,
Jeff

On Dec 20, 2007 3:03 PM, Mark Miller <ma...@gmail.com> wrote:

> You can override the scoring system and only score by term frequency
> (use a 1 or whatever creates a no-op for the other factors). If you have
> indexed with norms, then you will have to use a Reader that ignores them
> to do this.
>
> - Mark

Re: Hit Count per Document

Posted by Mark Miller <ma...@gmail.com>.
You can override the scoring system and only score by term frequency
(use a 1 or whatever creates a no-op for the other factors). If you have
indexed with norms, then you will have to use a Reader that ignores them
to do this.

- Mark
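
To make the idea concrete, here is a toy model in plain Java of a score that multiplies a term-frequency factor by other factors, with every factor except term frequency neutralized to 1. It is only a sketch: the Factors interface, TF_ONLY, and score are hypothetical names invented for this example, not Lucene classes, and the real scoring formula has more terms than the three shown here.

```java
// Toy model, not Lucene code: shows how a multiplicative score collapses
// to the raw term frequency when every other factor is a no-op.
public class TfOnlyScoring {

    // Hypothetical stand-in for the factors a tf-idf style scorer combines.
    interface Factors {
        float tf(int freq);
        float idf(int docFreq, int numDocs);
        float lengthNorm(int numTerms);
    }

    // Raw frequency for tf, and 1 for everything else.
    static final Factors TF_ONLY = new Factors() {
        public float tf(int freq) { return freq; }
        public float idf(int docFreq, int numDocs) { return 1.0f; }
        public float lengthNorm(int numTerms) { return 1.0f; }
    };

    static float score(Factors f, int freq, int docFreq, int numDocs, int numTerms) {
        return f.tf(freq) * f.idf(docFreq, numDocs) * f.lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // "brown" occurs 5 times in the 45-token example document.
        System.out.println(score(TF_ONLY, 5, 1, 1, 45)); // 5.0
    }
}
```

With these factors the score for a single-term query is just that term's count in the document. Note that this yields a per-term number (5 for "brown" in the example), which is a different quantity from the number of times a whole phrase matches.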

Jeff wrote:
> I don't care about score, but I do care about the number of times a query was hit
> within a document. Example:
>
> the quick brown fox jumped over the lazy dog
>  the quick brown fox jumped over the lazy dog
>  the quick brown fox jumped over the lazy dog
>  the quick brown fox jumped over the lazy dog
> the slow brown fox jumped over the lazy dog
>
> If I searched for "quick brown", is there a way I could see that it was hit
> 4 times within the document?
>
> Thanks,
> Jeff
>
>   
