You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Slavisa Radic <Ko...@gmx.de> on 2002/04/09 20:42:45 UTC
Getting Terms sequentially without using TermEnumeration
Hi,
I have to build a Terms-Document-Matrix to be able to do some Matrix
operations on it.
the Matrix should look like (I hope this will be displayed correctly):
term1 term2 term3 term4 ...
----------------------------------
Doc1 freq freq freq freq
Doc2 freq...................
Doc3 .......................
Doc4 .......................
.
.
.
I tried that by using IndexReader.terms() and IndexReader.TermDocs(term)
to get all the terms and the document-numbers which contain them.
As you can imagine, the TermEnumeration-Object has become to big to fit
into memory.
Is there an "easy" way to get the Terms one by one (without writing my
own IndexReader)?
--
To unsubscribe, e-mail: <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>