Posted to java-user@lucene.apache.org by prashant ullegaddi <pr...@gmail.com> on 2009/07/30 20:41:40 UTC

Term's frequency

How do I get the number of times a term occurs in a Lucene index?

Regards,
Prashant.

Re: Term's frequency

Posted by prashant ullegaddi <pr...@gmail.com>.
Thanks Ahmet. This answers my question.

On Fri, Jul 31, 2009 at 1:30 PM, AHMET ARSLAN <io...@yahoo.com> wrote:

>
>
> > Given a term say "apache", I want to look up the lucene index
> > programmatically to find out its frequency in the corpus.
>
> I think you are asking collection frequency of a term. Term Frequency is
> defined between a document and a term which is printed in the loop in the
> following code. And at the end there is collection freq. which sum of tfs.
>
> String path = "E:\\ThesaurusSolrHome\\data\\index";
> String field = "contents";
> Term term = new Term(field, "apache");
>
> IndexReader indexReader = IndexReader.open(path);
>
> TermDocs termDocs = indexReader.termDocs(term);
> int collectionFreq = 0;
> while (termDocs.next()) {
>     System.out.print("Document " + termDocs.doc() + " contains the term " + term.text() + " ");
>     System.out.println(termDocs.freq() + " times");
>     collectionFreq += termDocs.freq();
> }
> indexReader.close();
> System.out.println("Collection frequency of " + term.text() + " = " + collectionFreq);
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Term's frequency

Posted by AHMET ARSLAN <io...@yahoo.com>.

> Given a term say "apache", I want to look up the lucene index
> programmatically to find out its frequency in the corpus.

I think you are asking about the collection frequency of a term. Term frequency is defined for a (document, term) pair; the loop in the following code prints it for each document that contains the term. The collection frequency, printed at the end, is the sum of those term frequencies.

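import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermDocs;

String path = "E:\\ThesaurusSolrHome\\data\\index";
String field = "contents";
Term term = new Term(field, "apache");

IndexReader indexReader = IndexReader.open(path);
try {
    // Enumerate every document that contains the term.
    TermDocs termDocs = indexReader.termDocs(term);
    int collectionFreq = 0;
    while (termDocs.next()) {
        // freq() is the term frequency within the current document.
        System.out.print("Document " + termDocs.doc() + " contains the term " + term.text() + " ");
        System.out.println(termDocs.freq() + " times");
        collectionFreq += termDocs.freq();
    }
    termDocs.close();
    // The sum of the per-document frequencies is the collection frequency.
    System.out.println("Collection frequency of " + term.text() + " = " + collectionFreq);
} finally {
    // Release the index files even if the enumeration fails.
    indexReader.close();
}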
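
To make the two definitions concrete without needing an index on disk, here is a minimal sketch that does not use Lucene at all. The class name FrequencyDemo, its methods, and the three-document corpus are hypothetical and exist only for illustration; tokenization is a plain whitespace split, which is much simpler than what a real analyzer does.

```java
import java.util.Arrays;
import java.util.List;

// Toy stand-in for an index: each string is one "document".
// Illustrates the definitions only; it does not use Lucene.
class FrequencyDemo {

    // Term frequency: occurrences of `term` within a single document.
    // Assumes `term` is already lowercase; tokens are split on whitespace.
    static int termFrequency(String document, String term) {
        int count = 0;
        for (String token : document.toLowerCase().split("\\s+")) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // Collection frequency: sum of the term frequencies over all documents.
    static int collectionFrequency(List<String> documents, String term) {
        int total = 0;
        for (String document : documents) {
            total += termFrequency(document, term);
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> corpus = Arrays.asList(
                "apache lucene is an apache project",
                "solr builds on lucene",
                "the apache software foundation");

        String term = "apache";
        for (int doc = 0; doc < corpus.size(); doc++) {
            int tf = termFrequency(corpus.get(doc), term);
            if (tf > 0) {
                System.out.println("Document " + doc + " contains the term " + term + " " + tf + " times");
            }
        }
        System.out.println("Collection frequency of " + term + " = " + collectionFrequency(corpus, term));
    }
}
```

With this corpus the term occurs twice in document 0 and once in document 2, so the collection frequency is 3.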



Re: Term's frequency

Posted by prashant ullegaddi <pr...@gmail.com>.
Given a term, say "apache", I want to look up the Lucene index
programmatically to find its frequency in the corpus.

On Fri, Jul 31, 2009 at 12:23 AM, <oh...@cox.net> wrote:

>
> ---- prashant ullegaddi <pr...@gmail.com> wrote:
> > How to get the number of times a term occurs in the Lucene index?
> >
> > Regards,
> > Prashant.
>
>
> Hi,
>
> You didn't mention if you were looking for something programmatic or not,
> but there's a tool called "Luke", and when you start that up and point it to
> your index dir, it shows the terms and frequency.
>
> Jim
>

Re: Term's frequency

Posted by oh...@cox.net.
---- prashant ullegaddi <pr...@gmail.com> wrote: 
> How to get the number of times a term occurs in the Lucene index?
> 
> Regards,
> Prashant.


Hi,

You didn't mention whether you were looking for something programmatic or not, but there's a tool called "Luke": when you start it up and point it at your index directory, it shows the terms and their frequencies.

Jim
