Posted to user@nutch.apache.org by Ravi Chintakunta <ra...@gmail.com> on 2006/02/07 02:59:34 UTC

Dynamic merging of indices

I have multiple indices for the crawls across various intranet sites
stored in separate folders. My search application should support
searching across one or more of these indices dynamically - by way of
checkboxes on the web page.  For this, I have modified NutchBean to
create the IndexSearcher and FetchedSegments from the segments
directory (not the merged index directory) in these folders.  Based on
the selected intranet sites, a NutchBean is instantiated for the
indices of the selected sites and the results are displayed.

With this I ran into the "Too many open files" error and have increased
the open file limit.

This seems to work well now. But if I have 5 such sites, then I am
opening 2^5 = 32 times as many files as I would otherwise have opened.

My question is: Is there a better way of doing this? Like:

- Can I open an IndexReader on each of the merged index directories and
dynamically create an IndexSearcher by merging these readers using
MultiReader?

- Is an IndexReader thread-safe, and can it be used simultaneously in
different IndexSearchers?

- Can I create the IndexReader on the merged index directory and
create the matching FetchedSegments on the corresponding
non-merged segments directory?
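
To make the first two questions concrete, here is a rough sketch of the
sharing pattern I have in mind: open one reader per site, keep it, and
build each request's view from the already-open readers instead of
opening everything again. SiteReader, readerFor and select are
hypothetical names used only for illustration; SiteReader is a stand-in
and not Lucene's IndexReader, so this shows the bookkeeping only.
Whether real IndexReaders can safely be shared like this is exactly
what I am asking.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: SiteReader is a hypothetical stand-in for an
// opened index reader, not a Lucene or Nutch class.
class SharedReaderSketch {

    static final class SiteReader {
        final String site;
        SiteReader(String site) { this.site = site; }
    }

    // One cached reader per site, keyed by a (hypothetical) site name.
    private final Map<String, SiteReader> readers = new HashMap<>();
    private int opens = 0;

    // Open a site's reader at most once and reuse it on later calls.
    synchronized SiteReader readerFor(String site) {
        return readers.computeIfAbsent(site, s -> {
            opens++;                 // counts real "open" operations
            return new SiteReader(s);
        });
    }

    // Build the view for one request from the shared readers,
    // e.g. the sites ticked in the checkboxes.
    List<SiteReader> select(List<String> sites) {
        List<SiteReader> selected = new ArrayList<>();
        for (String site : sites) {
            selected.add(readerFor(site));
        }
        return selected;
    }

    // Number of readers actually opened so far.
    synchronized int openCount() {
        return opens;
    }
}
```

With this layout the number of opens grows with the number of sites
rather than with the number of checkbox combinations. In the real
application the selected readers would then have to be combined into
one searcher, which is the MultiReader part of the question.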

Thanks
Ravi Chintakunta