You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Bhavin Pandya <bh...@rediff.co.in> on 2007/04/16 12:41:58 UTC

How to get list of all terms from given doc id?

Hi guys,

In my existing index, i have not stored my index with TermVector.YES.
is there any way to fetch all the terms from the given document id.

Thanks.
Bhavin pandya

Re: How to get list of all terms from given doc id?

Posted by Erick Erickson <er...@gmail.com>.
You can still recover all the terms. However,
depending upon whether you've stored the field values, the reconstruction
may or may not be completely accurate.

Luke has to do something like this because you can use the
"Reconstruct & Edit" button, you can download that source and see
how it's done there.

Conceptually, you would do something like
Document.getFields. For each field, enumerate all
the terms in the index using TermEnum (use a value of
"" to enumerate all the terms). For each such term, use
a TermDocs to skipTo the id of the doc you're working
on. Use a TermPositions object to find the relative offsets
of each term.

Eventually, you'll have several lists of all the terms and their
positions that you'll have to merge.

This is a complex and time-consuming process, though. Why do
you need to do this anyway?

Erick

On 4/16/07, Bhavin Pandya <bh...@rediff.co.in> wrote:
>
> Hi guys,
>
> In my existing index, i have not stored my index with TermVector.YES.
> is there any way to fetch all the terms from the given document id.
>
> Thanks.
> Bhavin pandya