Posted to user@nutch.apache.org by Mathijs Homminga <ma...@knowlogy.nl> on 2007/06/02 22:15:45 UTC
Cleaning up segments after indexing
Hi all,
Is there a way to clean up my segments to only include those documents
that are part of my index?
Should I use the SegmentMerger (to make slices) and apply a filter? I guess
I would have to reindex afterwards.
I ask because my final index contains only 1% to 5% of all
the documents I have crawled.
Thanks in advance,
Mathijs
--
Knowlogy
Helperpark 290 C
9723 ZA Groningen
mathijs.homminga@knowlogy.nl
+31 (0)6 15312977
http://www.knowlogy.nl