Posted to user@nutch.apache.org by charlie w <sp...@gmail.com> on 2008/05/14 16:40:08 UTC

large content/parse segments

This is in reference to the Nutch "content" segments
(segments/<timestamp>/parse_text, etc.), not the segments of a Lucene
index.

I am considering using SegmentMerger to combine a large number of
fetch segments into a single huge segment.  Will doing so create a
performance problem when generating page summaries at search time?  If
so, is there a recommended maximum size for one of these segments?

Thanks,
Charlie