Posted to dev@nutch.apache.org by Marko Bauhardt <mb...@media-style.com> on 2005/11/08 13:40:02 UTC
index folder structure
Hello all,
I use Nutch from the mapred branch. I have a small problem with the
folder structure of the generated index. If I index a segment into a
folder named "indexes", the folder structure is:
$NUTCH_HOME/indexes/part-00000
$NUTCH_HOME/indexes/part-00001
$NUTCH_HOME/indexes/part-00002
....
So if I index my next segment into this folder, the old index is
overwritten. As a result, I now index every segment into
indexes/SEGMENT_NAME. My folder structure is now:
$NUTCH_HOME/indexes/SEGMENT_NAME/part-00000
$NUTCH_HOME/indexes/SEGMENT_NAME/part-00001
$NUTCH_HOME/indexes/SEGMENT_NAME/part-00002
$NUTCH_HOME/indexes/OTHER_SEGMENT_NAME/part-00000
$NUTCH_HOME/indexes/OTHER_SEGMENT_NAME/part-00001
$NUTCH_HOME/indexes/OTHER_SEGMENT_NAME/part-00002
...
With this folder structure the searcher finds no index. I looked at the
code, and I think NutchBean (method init) only looks one level deep
below the index folder to find the index.done file.
It would be nice if NutchBean searched for the index.done file down to
a depth of 2.
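
To make the request concrete, here is a rough sketch of a depth-limited
search for index.done. It is not the NutchBean code: the class name
IndexDirFinder and the method findIndexDirs are made up for illustration,
and it uses plain java.nio.file instead of Nutch's own filesystem layer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Nutch.
final class IndexDirFinder {
    // Marker file name taken from the post.
    static final String DONE_MARKER = "index.done";

    /**
     * Returns every directory that contains an index.done file and lies
     * at most maxDirDepth directory levels below root.
     * Depth 1 matches indexes/part-00000,
     * depth 2 also matches indexes/SEGMENT_NAME/part-00000.
     */
    static List<Path> findIndexDirs(Path root, int maxDirDepth) throws IOException {
        // The marker file sits one level below its directory, hence the +1.
        try (Stream<Path> paths = Files.walk(root, maxDirDepth + 1)) {
            return paths
                .filter(p -> p.getFileName() != null
                        && p.getFileName().toString().equals(DONE_MARKER))
                .filter(Files::isRegularFile)
                .map(Path::getParent)   // the directory holding the marker
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

With maxDirDepth set to 2, both the flat layout and the per-segment layout
shown above would be found, so existing index folders would keep working.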
Another solution is to merge my generated indexes, but I think
this is only a workaround.
Or does anybody have another solution?
Thanks and bye, Marko