Posted to user@nutch.apache.org by Mathijs Homminga <ma...@knowlogy.nl> on 2007/06/02 22:15:45 UTC

Cleaning up segments after indexing

Hi all,

Is there a way to clean up my segments so that they only include those
documents that are part of my index?
Should I use the SegmentMerger (to make slices) and apply a filter? I guess
I would have to reindex afterwards.

I ask because my final index contains only 1% to 5% of all the documents
I have crawled.

Thanks in advance,
Mathijs

-- 
Knowlogy
Helperpark 290 C
9723 ZA Groningen

mathijs.homminga@knowlogy.nl
+31 (0)6 15312977
http://www.knowlogy.nl