Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2006/07/21 23:10:47 UTC
[Nutch Wiki] Trivial Update of "IntranetRecrawl" by RenaudRichardet
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.
The following page has been changed by RenaudRichardet:
http://wiki.apache.org/nutch/IntranetRecrawl
------------------------------------------------------------------------------
./recrawl crawl 10 31
=== Script ===
-
+ {{{
#!/bin/bash
# A simple script to run a Nutch re-crawl
@@ -74, +74 @@
# Merge indexes
ls -d $segments_dir/* | xargs bin/nutch merge $index_dir
+ }}}
== Version 0.8.0 ==
Place in the bin sub-directory within Nutch and run.
@@ -82, +83 @@
./usr/local/nutch/bin/recrawl /usr/local/tomcat/webapps/ROOT /usr/local/nutch/crawl 10 30
=== Code ===
+
+ {{{
#!/bin/bash
# Nutch recrawl script.
@@ -164, +167 @@
# Clean up
rm -rf $new_indexes
+ }}}
+