Posted to user@nutch.apache.org by Alexis Votta <al...@gmail.com> on 2007/09/19 19:34:13 UTC

Nutch recrawl script for 0.9 doesn't work with trunk. Help

The recrawl script for 0.9 that I found at
http://wiki.apache.org/nutch/IntranetRecrawl is not working. It runs
successfully the first time. The second time, it fails with this error:

merging indexes to: crawl/index
IndexMerger: org.apache.hadoop.mapred.FileAlreadyExistsException:
Output directory crawl/index already exists!
        at org.apache.nutch.indexer.IndexMerger.merge(IndexMerger.java:74)
        at org.apache.nutch.indexer.IndexMerger.run(IndexMerger.java:148)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
        at org.apache.nutch.indexer.IndexMerger.main(IndexMerger.java:111)

I am running it against the latest version available in trunk. Please
help me fix this.