Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2007/08/15 19:28:17 UTC
[Nutch Wiki] Update of "IntranetRecrawl" by JamesVictor
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by JamesVictor:
http://wiki.apache.org/nutch/IntranetRecrawl
The comment on the change is:
added instructions for Nutch 0.9.0 script
------------------------------------------------------------------------------
}}}
- == Version 0.8.0 ==
+ == Version 0.8.0 and 0.9.0 ==
Place the script in the bin sub-directory within Nutch and run it.
** MUST CALL SCRIPT USING THE FULL PATH TO THE SCRIPT OR IT WON'T WORK **
+
=== Example Usage ===
- ./usr/local/nutch/bin/recrawl /usr/local/tomcat/webapps/ROOT /usr/local/nutch/crawl 10 31
+ `./usr/local/nutch/bin/recrawl /usr/local/tomcat/webapps/ROOT /usr/local/nutch/crawl 10 31`
(with adddays being '31', all pages will be recrawled)
+
+ === Changes for 0.9.0 ===
+
+ Change line 76 to read
+ {{{
+ #Sets the path to bin
+ nutch_dir=`dirname $0`/bin
+ }}}
+
+ in order for the proper path to be built. Everything else may remain the same.
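
As a rough illustration of why the invocation path matters, the sketch below evaluates the same expression as the replacement line for two different ways of calling the script. The helper name `resolve_nutch_dir` is hypothetical and not part of the recrawl script; its argument `$1` stands in for `$0`. Treat it as a way to check the result against your own layout, not as a statement of how the script must be installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the recrawl script): reproduces the
# expression `dirname $0`/bin, with "$1" standing in for $0.
resolve_nutch_dir() {
  echo "$(dirname "$1")/bin"
}

# Script called by its full path: the result is an absolute directory.
resolve_nutch_dir /usr/local/nutch/bin/recrawl   # prints /usr/local/nutch/bin/bin

# Script called by a bare relative name: the result depends on the
# current working directory.
resolve_nutch_dir recrawl                        # prints ./bin
```

With a full path the computed directory is absolute and stays valid wherever the script is run from; with a relative name it only resolves correctly from one particular working directory.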
=== Code ===