Posted to user@nutch.apache.org by charlie w <sp...@gmail.com> on 2008/05/14 16:40:08 UTC
large content/parse segments
This is in reference to the Nutch "content" segments
(segments/<timestamp>/parse_text, etc.), not the segments of a Lucene
index.
I am considering using SegmentMerger to combine a large number of
fetch segments into a single huge segment. Will doing so create a
performance problem when generating page summaries at search time? If
so, is there a recommended maximum size for one of these segments?
Thanks,
Charlie