Posted to java-user@lucene.apache.org by Ravikumar Govindarajan <ra...@gmail.com> on 2013/05/10 14:24:50 UTC

TermsEnum.docFreq() returns 0

We have the following code:

SegmentInfos segments = new SegmentInfos();
segments.read(luceneDir);
for (SegmentInfoPerCommit sipc : segments)
{
    String name = sipc.info.name;
    SegmentReader reader = new SegmentReader(sipc, 1, new IOContext());
    Terms terms = reader.terms("content");
    TermsEnum tEnum = terms.iterator(null);
    tEnum.docFreq(); //VAL=0
    tEnum.totalTermFreq(); //VAL=-1
}

The field "content" is indexed with IndexOptions.DOCS_AND_FREQS_AND_POSITIONS.

Why is docFreq returned as 0 for all terms? Is this expected, or am I
doing something wrong?

--
Ravi

Re: TermsEnum.docFreq() returns 0

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
Thanks for the help, Mike. I was too quick to jump to the wrong conclusion.

My codec does not implement term vectors, payloads, DocValues, or norms.

It should be trivial to implement payloads, but I am not sure about the others.

Anyway, I can generate an HTML report and identify failures based on the
individual tests.

--
Ravi


On Tue, May 14, 2013 at 3:31 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> On Tue, May 14, 2013 at 3:03 AM, Ravikumar Govindarajan
> <ra...@gmail.com> wrote:
> > We ran CheckIndex and a simple test case, and both pass. I had assumed a
> > problem with Lucene, whereas it was actually an issue with our custom codec.
>
> Phew, thanks for bringing closure!
>
> > I do not know how to confirm whether a new codec works correctly. Are there
> > any tools or existing test cases available for validation?
>
> One really healthy way to test your new codec is to run all Lucene
> tests against it (assuming your codec is general, i.e. implements
> everything).
>
> You just need to 1) get your codec onto the test classpath and 2) pass
> -Dtests.codec=YourCodecName to force the tests to use it.
>
> I'm not certain about step 1) ... perhaps passing -lib to ant does
> that?  But I'm not sure that will propagate to the classpath when ant
> runs the tests ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com

Re: TermsEnum.docFreq() returns 0

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Tue, May 14, 2013 at 3:03 AM, Ravikumar Govindarajan
<ra...@gmail.com> wrote:
> We ran CheckIndex and a simple test case, and both pass. I had assumed a
> problem with Lucene, whereas it was actually an issue with our custom codec.

Phew, thanks for bringing closure!

> I do not know how to confirm whether a new codec works correctly. Are there
> any tools or existing test cases available for validation?

One really healthy way to test your new codec is to run all Lucene
tests against it (assuming your codec is general, i.e. implements
everything).

You just need to 1) get your codec onto the test classpath and 2) pass
-Dtests.codec=YourCodecName to force the tests to use it.

I'm not certain about step 1) ... perhaps passing -lib to ant does
that?  But I'm not sure that will propagate to the classpath when ant
runs the tests ...

Mike McCandless

http://blog.mikemccandless.com




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: TermsEnum.docFreq() returns 0

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
We ran CheckIndex and a simple test case, and both pass. I had assumed a
problem with Lucene, whereas it was actually an issue with our custom codec.

I do not know how to confirm whether a new codec works correctly. Are there
any tools or existing test cases available for validation?

--
Ravi



On Mon, May 13, 2013 at 9:19 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> That code looks correct.
>
> But can you tie it all together into a runnable test case?  I.e., add in
> the terms enum, call docFreq, and show it returning 0 when it should be 1.
>
> Also, if you run CheckIndex on the index produced by the code below,
> how many terms/freqs/positions does it report?
>
> Mike McCandless
>
> http://blog.mikemccandless.com

Re: TermsEnum.docFreq() returns 0

Posted by Michael McCandless <lu...@mikemccandless.com>.
That code looks correct.

But can you tie it all together into a runnable test case?  I.e., add in
the terms enum, call docFreq, and show it returning 0 when it should be 1.

Also, if you run CheckIndex on the index produced by the code below,
how many terms/freqs/positions does it report?

Mike McCandless

http://blog.mikemccandless.com


On Mon, May 13, 2013 at 9:25 AM, Ravikumar Govindarajan
<ra...@gmail.com> wrote:
> Indexing code below. It looks very simple. Is this correct?
>
>     IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_42,
>             new StandardAnalyzer(Version.LUCENE_42));
>     conf.setOpenMode(OpenMode.CREATE_OR_APPEND);
>     String indexPath = "<some-file-path>";
>     Directory dir = FSDirectory.open(new File(indexPath));
>     writer = new IndexWriter(dir, conf);
>     FieldType type = new FieldType();
>     type.setTokenized(true);
>     type.setIndexed(true);
>     type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
>     Field field = new Field("content", "one two two three", type);
>     luceneDoc.add(field);
>     writer.addDocument(luceneDoc);
>     writer.close();
>
> Reading docFreq and totalTermFreq through the TermsEnum returns 0 and -1,
> respectively, for all terms.
>
> --
> Ravi



Re: TermsEnum.docFreq() returns 0

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
Indexing code below. It looks very simple. Is this correct?

    IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_42,
            new StandardAnalyzer(Version.LUCENE_42));
    conf.setOpenMode(OpenMode.CREATE_OR_APPEND);
    String indexPath = "<some-file-path>";
    Directory dir = FSDirectory.open(new File(indexPath));
    writer = new IndexWriter(dir, conf);
    FieldType type = new FieldType();
    type.setTokenized(true);
    type.setIndexed(true);
    type.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
    Field field = new Field("content", "one two two three", type);
    luceneDoc.add(field);
    writer.addDocument(luceneDoc);
    writer.close();

Reading docFreq and totalTermFreq through the TermsEnum returns 0 and -1,
respectively, for all terms.
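
As a cross-check on what a reader should report for this index, the two
statistics can be worked out in plain Java without Lucene. This is only a
sketch: the TermStats class and its methods are hypothetical helpers, and
lowercasing plus whitespace splitting is an assumption standing in for
StandardAnalyzer, which yields the same tokens for this particular text.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Plain-Java illustration of the two statistics; no Lucene classes are used.
final class TermStats {

    // docFreq: number of documents that contain the term at least once.
    static Map<String, Integer> docFreq(List<String> docs) {
        Map<String, Integer> df = new TreeMap<>();
        for (String doc : docs) {
            // Deduplicate so a repeated term counts once per document.
            Set<String> unique = new HashSet<>(Arrays.asList(tokenize(doc)));
            for (String term : unique) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // totalTermFreq: number of occurrences of the term across all documents.
    static Map<String, Long> totalTermFreq(List<String> docs) {
        Map<String, Long> ttf = new TreeMap<>();
        for (String doc : docs) {
            for (String term : tokenize(doc)) {
                ttf.merge(term, 1L, Long::sum);
            }
        }
        return ttf;
    }

    // Assumption: lowercase + whitespace split stands in for the analyzer.
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("one two two three");
        System.out.println("docFreq       = " + docFreq(docs));
        System.out.println("totalTermFreq = " + totalTermFreq(docs));
    }
}
```

For the single document "one two two three", every term has docFreq 1, and
"two" has totalTermFreq 2 while "one" and "three" have totalTermFreq 1, so
0 and -1 cannot be right for a field indexed with frequencies.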

--
Ravi


On Fri, May 10, 2013 at 10:19 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> It should not be 0, as long as TermsEnum.next() does not return null
> ... can you make a small test case?  Thanks.
>
> Mike McCandless
>
> http://blog.mikemccandless.com

Re: TermsEnum.docFreq() returns 0

Posted by Michael McCandless <lu...@mikemccandless.com>.
It should not be 0, as long as TermsEnum.next() does not return null
... can you make a small test case?  Thanks.

Mike McCandless

http://blog.mikemccandless.com


On Fri, May 10, 2013 at 8:26 AM, Ravikumar Govindarajan
<ra...@gmail.com> wrote:
> I have to add that the above code is wrong.
>
> It has to be:
>
>     while ((ref = tEnum.next()) != null)
>     {
>         ref = tEnum.term();
>         tEnum.docFreq(); // Even here VAL=0
>     }
>
> Apologies for the mistake, but the problem remains.



Re: TermsEnum.docFreq() returns 0

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
I have to add that the above code is wrong.

It has to be:

    while ((ref = tEnum.next()) != null)
    {
        ref = tEnum.term();
        tEnum.docFreq(); // Even here VAL=0
    }

Apologies for the mistake, but the problem remains.
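
The corrected loop shape can be illustrated without Lucene. ToyTermsEnum
below is a hypothetical stand-in, not a Lucene class: it mimics only the
next()/docFreq() pattern, in which per-term statistics are meaningful once
next() has positioned the enum on a term. The stand-in throws while it is
unpositioned to make the misuse obvious; that behaviour is an assumption of
this sketch, not a statement about what a real implementation returns.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in that mimics the next()/docFreq() shape of a terms
// enumeration. It is not a Lucene class.
final class ToyTermsEnum {
    private final Iterator<Map.Entry<String, Integer>> it;
    private Map.Entry<String, Integer> current; // null while unpositioned

    ToyTermsEnum(Map<String, Integer> docFreqByTerm) {
        // Copy into a TreeMap so terms are visited in sorted order.
        this.it = new TreeMap<>(docFreqByTerm).entrySet().iterator();
    }

    // Advances to the next term, or returns null once the terms are exhausted.
    String next() {
        current = it.hasNext() ? it.next() : null;
        return current == null ? null : current.getKey();
    }

    // Statistic for the current term; only valid while positioned on a term.
    int docFreq() {
        if (current == null) {
            throw new IllegalStateException("enum is not positioned on a term");
        }
        return current.getValue();
    }

    // Sums docFreq over all terms using the corrected loop shape.
    static int sumDocFreq(ToyTermsEnum tEnum) {
        int sum = 0;
        while (tEnum.next() != null) {
            sum += tEnum.docFreq();
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> stats = new HashMap<>();
        stats.put("one", 1);
        stats.put("two", 1);
        stats.put("three", 1);

        ToyTermsEnum tEnum = new ToyTermsEnum(stats);
        String term;
        while ((term = tEnum.next()) != null) {
            System.out.println(term + " docFreq=" + tEnum.docFreq());
        }
    }
}
```

Calling docFreq() before the first next(), as in the first version of the
code, is the case the stand-in rejects.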



On Fri, May 10, 2013 at 5:54 PM, Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

> We have the following code:
>
> SegmentInfos segments = new SegmentInfos();
> segments.read(luceneDir);
> for (SegmentInfoPerCommit sipc : segments)
> {
>     String name = sipc.info.name;
>     SegmentReader reader = new SegmentReader(sipc, 1, new IOContext());
>     Terms terms = reader.terms("content");
>     TermsEnum tEnum = terms.iterator(null);
>     tEnum.docFreq(); //VAL=0
>     tEnum.totalTermFreq(); //VAL=-1
> }
>
> The field "content" is indexed with IndexOptions.DOCS_AND_FREQS_AND_POSITIONS.
>
> Why is docFreq returned as 0 for all terms? Is this expected, or am I
> doing something wrong?
>
> --
> Ravi
>
>
>