
Posted to java-user@lucene.apache.org by to...@aim.com on 2009/05/27 19:32:39 UTC

Top N Phrases in subset of documents

Hi All,
I need to determine the top words/phrases in my documents, and I am currently using the ShingleAnalyzerWrapper for indexing.
Through Luke it seems the top terms are correct for the whole index.

Is it possible to determine the top terms for a subset of documents in the index? Or do I need to create a new index for the subset of documents?

Thus, a usage example would be:
a) A user searches and finds 1000 documents.
b) Based on these 1K documents, I need to recalculate the top words/phrases.


Thanks in advance for any assistance.

-tommy

Re: Top N Phrases in subset of documents

Posted by Preetham Kajekar <pr...@cisco.com>.

http://stackoverflow.com/questions/195434/how-can-i-get-top-terms-for-a-subset-of-documents-in-a-lucene-index 
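
For illustration only, here is a rough sketch of the idea of recounting terms over just the matching documents. It deliberately does not use any Lucene API: it assumes the text of each hit is already available as a plain String (for example from a stored field), and the class and method names (SubsetTopTerms, shingles, topTerms) are made up for this example. The shingling here is a simplified stand-in for what ShingleAnalyzerWrapper produces, not a reproduction of it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts words and word n-grams
// over the texts of a subset of documents and returns the most frequent ones.
final class SubsetTopTerms {

    // Emits every word n-gram of 1..maxSize words from whitespace-separated text.
    static List<String> shingles(String text, int maxSize) {
        List<String> out = new ArrayList<>();
        String trimmed = text.toLowerCase(Locale.ROOT).trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder sb = new StringBuilder();
            for (int len = 1; len <= maxSize && start + len <= words.length; len++) {
                if (len > 1) {
                    sb.append(' ');
                }
                sb.append(words[start + len - 1]);
                out.add(sb.toString());
            }
        }
        return out;
    }

    // Returns the n most frequent shingles across the given documents,
    // highest count first; ties are broken alphabetically.
    static Map<String, Integer> topTerms(List<String> subsetTexts, int maxSize, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : subsetTexts) {
            for (String term : shingles(text, maxSize)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> top = new LinkedHashMap<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.put(entries.get(i).getKey(), entries.get(i).getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        // Stand-in for the texts of the documents a search returned.
        List<String> hits = new ArrayList<>();
        hits.add("new york city");
        hits.add("new york times");
        hits.add("york city hall");
        System.out.println(topTerms(hits, 2, 3)); // prints {york=3, city=2, new=2}
    }
}
```

This counts total occurrences in the subset; counting each term at most once per document (a document frequency) would only need a per-document set before the merge step. For 1000 hits an in-memory map like this is probably fine, but that is a guess about your data sizes.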



