Posted to solr-user@lucene.apache.org by Edward Garrett <he...@gmail.com> on 2012/11/07 19:15:56 UTC

get a list of terms sorted by total term frequency

hi,

is there a simple way to get a list of all terms that occur in a field
sorted by their total term frequency within that field?

TermsComponent (http://wiki.apache.org/solr/TermsComponent) "provides
fast field faceting over the whole index", but as counts it gives the
number of documents that each term occurs in (given a field or set of
fields). in place of document counts, i want total term frequency
counts. the ttf function
(http://wiki.apache.org/solr/FunctionQuery#totaltermfreq) provides
this, but only if you know what term to pass to the function.

edward
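
To make the distinction in the question concrete, here is a minimal sketch on a toy in-memory corpus. It is not Solr or Lucene code: the `docs` list, the naive whitespace tokenization, and the `term_stats` helper are all assumptions made for illustration. Document frequency counts a term once per document, while total term frequency counts every occurrence.

```python
from collections import Counter

def term_stats(docs):
    """Return (doc_freq, total_term_freq) Counters for a list of field values.

    Hypothetical helper for illustration only; a real index would apply
    the field's analyzer instead of lowercasing and splitting on whitespace.
    """
    doc_freq = Counter()
    total_term_freq = Counter()
    for text in docs:
        tokens = text.lower().split()
        total_term_freq.update(tokens)   # every occurrence counts
        doc_freq.update(set(tokens))     # each document counts once per term
    return doc_freq, total_term_freq

# Toy "field values", one string per document (made-up data).
docs = ["solr solr lucene", "lucene index", "solr"]
doc_freq, ttf = term_stats(docs)

# Terms sorted by total term frequency, ties broken alphabetically.
by_ttf = sorted(ttf.items(), key=lambda kv: (-kv[1], kv[0]))
print(by_ttf)
```

In this toy data "solr" and "lucene" both occur in two documents, so a document-count ordering cannot separate them, while the total-term-frequency ordering puts "solr" first.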

Re: get a list of terms sorted by total term frequency

Posted by Edward Garrett <he...@gmail.com>.
i see... using HighFreqTerms with the -t flag

it would be cool if TermsComponent had an option to sort by total term
frequency, something like

terms.sort={count|index|ttf}

surely that's a common enough use case
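
A rough sketch of what the three orderings would mean, in plain Python rather than Solr code. The `ttf` mode is only the proposal above, not an option this thread shows to exist; the `sort_terms` helper and the per-term numbers are hypothetical.

```python
def sort_terms(stats, mode):
    """Order terms by the given mode.

    stats: dict mapping term -> (doc_freq, total_term_freq)
    mode:  'count' (document count), 'index' (term order),
           or 'ttf' (the proposed total term frequency ordering)
    """
    if mode == "index":
        return sorted(stats)
    if mode == "count":
        return sorted(stats, key=lambda t: (-stats[t][0], t))
    if mode == "ttf":
        return sorted(stats, key=lambda t: (-stats[t][1], t))
    raise ValueError("unknown sort mode: %r" % mode)

# Made-up statistics: "banana" is in one document but occurs seven times.
stats = {"apple": (3, 3), "banana": (1, 7), "cherry": (2, 2)}

for mode in ("count", "index", "ttf"):
    print(mode, sort_terms(stats, mode))
```

A term concentrated in a few documents ("banana" here) ranks last by document count but first by total term frequency, which is the case the proposed option is meant to cover.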


On Wed, Nov 7, 2012 at 6:17 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> Lucene's misc module has HighFreqTerms tool.
>
> Mike McCandless
>
> http://blog.mikemccandless.com



-- 
edge

Re: get a list of terms sorted by total term frequency

Posted by Michael McCandless <lu...@mikemccandless.com>.
Lucene's misc module has HighFreqTerms tool.

Mike McCandless

http://blog.mikemccandless.com

