Posted to solr-user@lucene.apache.org by "Twomey, David" <da...@novartis.com> on 2011/07/28 03:12:14 UTC
colocated term stats
Given a query term, is it possible to get the top 10 collocated terms from the index?
i.e., return the top 10 terms that appear with this term, ranked by doc count.
A plus would be to add some constraints on how near the terms are in the docs.
Re: colocated term stats
Posted by Jonathan Rochkind <ro...@jhu.edu>.
Not sure if this will do what you want, but one way might be using facets.
Take the term you are interested in and apply it as an fq. The result
set will then include only documents that contain that term. Request
facets for that result set as well: the top 10 facet values are the top
10 terms that appear in that result set, which is to say the top 10
terms that appear in documents together with your fq constraint. (Okay,
you might need to look at 11, because one of the facet values will be
the same term you constrained on with fq.) You don't need to look at the
actual documents at all (&rows=0), just the facet response.
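Here is a rough sketch of that request in Python, in case it helps. The base URL, the field name "text", the term "kinase", and both function names are made up for illustration; substitute your own. It only builds the query string and shows how the flat facet list in Solr's default JSON response could be read; it does not talk to a server.

```python
from urllib.parse import urlencode


def cooccurrence_params(field, term, top_n=10):
    """Build Solr parameters for the fq + facet approach (hypothetical helper)."""
    return {
        "q": "*:*",                 # match everything; fq does the filtering
        "fq": f"{field}:{term}",    # assumes a simple term with no special characters
        "rows": 0,                  # no documents needed, only the facet counts
        "facet": "true",
        "facet.field": field,
        "facet.limit": top_n + 1,   # one extra, since the fq term itself shows up
        "facet.mincount": 1,
        "wt": "json",
    }


def top_cooccurring(facet_list, term, top_n=10):
    """Read a flat [term, count, term, count, ...] facet list, dropping the fq term."""
    pairs = zip(facet_list[0::2], facet_list[1::2])
    return [(t, c) for t, c in pairs if t != term][:top_n]


# Hypothetical Solr URL, field name, and term
base = "http://localhost:8983/solr/select"
url = base + "?" + urlencode(cooccurrence_params("text", "kinase"))
print(url)
```

The list you would pass to top_cooccurring is the value found under facet_counts, facet_fields, and then your field name in the JSON response.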
Make sense? Does that do what you want?
On 7/27/2011 9:12 PM, Twomey, David wrote:
> Given a query term, is it possible to get from the index the top 10 collocated terms in the index.
>
> ie: return the top 10 terms that appear with this term based on doc count.
>
> A plus would be to add some constraints on how near the terms are in the docs.