Posted to java-user@lucene.apache.org by teramera <te...@gmail.com> on 2007/03/07 22:04:35 UTC

Term Frequency within Hits

So after I execute a search I end up with a 'Hits' object. The number of Hits
is on the order of a million.
What I want to do is extract, from these Hits, term frequencies for a few
known fields. I don't have a global list of terms for any of the fields but
want to generate the term frequencies based on the terms in the Hits.

Iterating over the hits and computing this afterwards is, of course, turning
out to be very expensive.
Is there a known way in Lucene to solve this, so that the calculation
happens as the hits are being accumulated?
Appreciate any pointers,

-- 
View this message in context: http://www.nabble.com/Term-Frequency-within-Hits-tf3364987.html#a9362169
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Term Frequency within Hits

Posted by Erick Erickson <er...@gmail.com>.
See TermFreqVector, HitCollector, perhaps TopDocs, perhaps
TermEnum. Make sure you create your index such that frequencies
are stored (see the FAQ).

Erick
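
One way to read the pointers above, as a rough sketch rather than a
definitive implementation: sum each matching document's stored term vector
at the moment the document is collected, instead of walking the Hits
afterwards. The block below is plain Java with no Lucene dependency, and
every name in it (HitTermCounter, TermVector, collect, totals, demo) is a
hypothetical stand-in. In Lucene of this era, the collect step would
presumably live in a HitCollector subclass (collect(int doc, float score)),
and the per-document terms and counts would come from
IndexReader.getTermFreqVector(doc, field), which is only available when the
field was indexed with term vectors enabled.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Plain-Java sketch: aggregate term frequencies while hits are collected.
 * Nothing here is Lucene API; all names are hypothetical stand-ins.
 */
class HitTermCounter {

    /** Stand-in for one document's stored term vector for a single field. */
    static final class TermVector {
        final String[] terms; // distinct terms in the field
        final int[] freqs;    // freqs[i] is the count of terms[i] in this document

        TermVector(String[] terms, int[] freqs) {
            this.terms = terms;
            this.freqs = freqs;
        }
    }

    // Stand-in for the index: document id -> term vector (null if none stored).
    private final TermVector[] vectorsByDoc;

    // Running totals over all documents collected so far.
    private final Map<String, Integer> totals = new HashMap<>();

    HitTermCounter(TermVector[] vectorsByDoc) {
        this.vectorsByDoc = vectorsByDoc;
    }

    /** Called once per matching document; the score is ignored here. */
    void collect(int doc, float score) {
        TermVector tv = vectorsByDoc[doc];
        if (tv == null) {
            return; // no term vector stored for this document
        }
        for (int i = 0; i < tv.terms.length; i++) {
            totals.merge(tv.terms[i], tv.freqs[i], Integer::sum);
        }
    }

    Map<String, Integer> totals() {
        return totals;
    }

    /** Tiny made-up index of four documents, three of which match the query. */
    static Map<String, Integer> demo() {
        TermVector[] index = {
            new TermVector(new String[] {"lucene", "search"}, new int[] {2, 1}),
            new TermVector(new String[] {"index"}, new int[] {4}),
            new TermVector(new String[] {"index", "lucene"}, new int[] {1, 1}),
            null
        };
        HitTermCounter counter = new HitTermCounter(index);
        int[] matchingDocs = {0, 2, 3}; // pretend the search matched these ids
        for (int doc : matchingDocs) {
            counter.collect(doc, 1.0f);
        }
        return counter.totals();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With around a million hits, loading a term vector per document is still
real work; the sketch only avoids the extra cost of building and iterating
a Hits object, so it is worth measuring before relying on it.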

On 3/7/07, teramera <te...@gmail.com> wrote:
>
> So after I execute a search I end up with a 'Hits' object. The number of
> Hits is on the order of a million.
> What I want to do is extract, from these Hits, term frequencies for a few
> known fields. I don't have a global list of terms for any of the fields
> but want to generate the term frequencies based on the terms in the Hits.
>
> Iterating over the hits and computing this afterwards is, of course,
> turning out to be very expensive.
> Is there a known way in Lucene to solve this, so that the calculation
> happens as the hits are being accumulated?
> Appreciate any pointers,