Posted to dev@nutch.apache.org by praveen pathiyil <pa...@gmail.com> on 2005/04/27 20:42:08 UTC

Creating an index (as in books!) from TermFreqVector

Hi,

I need some advice on creating an index of words (like the one in
the last few pages of a book) along with the documents that contain
each word. I know an index might not seem very useful when you can
search for the terms; however, users who are used to seeing an index
for a website seem to prefer one.

Is it possible to create a full listing of words and the documents
that contain them from the TermFreqVector? I was searching for
information on this and found a couple of links which said that it's
not used in Nutch.
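
To make the goal concrete, here is a minimal sketch in plain Java of
the kind of listing I have in mind. It does not use any Nutch or
Lucene API: the class name, the method names, and the sample document
names are all made up for illustration, and the per-document term
lists only stand in for whatever a TermFreqVector (or another source
of terms) would supply.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedMap;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Nutch or Lucene.
public class BookIndexSketch {

    // Inverts "document -> terms" into "term -> documents".
    // Assumption: the caller has already extracted the terms of each document.
    public static SortedMap<String, SortedSet<String>> buildIndex(
            Map<String, List<String>> termsByDocument) {
        // TreeMap and TreeSet keep terms and document names in alphabetical order,
        // which is how a back-of-the-book index is usually presented.
        SortedMap<String, SortedSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : termsByDocument.entrySet()) {
            String document = entry.getKey();
            for (String term : entry.getValue()) {
                // Lower-case so that "Nutch" and "nutch" share one entry.
                String key = term.toLowerCase(Locale.ROOT);
                index.computeIfAbsent(key, k -> new TreeSet<>()).add(document);
            }
        }
        return index;
    }

    // Renders one line per term: "term: doc1, doc2".
    public static String format(SortedMap<String, SortedSet<String>> index) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, SortedSet<String>> entry : index.entrySet()) {
            out.append(entry.getKey())
               .append(": ")
               .append(String.join(", ", entry.getValue()))
               .append('\n');
        }
        return out.toString();
    }
}
```

Calling buildIndex with each document's terms and passing the result
to format would give one alphabetized line per word, followed by the
documents that contain it. What I don't know is how best to get the
per-document terms out of Nutch in the first place.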

TIA,
Praveen.