Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2006/07/21 23:10:47 UTC

[Nutch Wiki] Trivial Update of "IntranetRecrawl" by RenaudRichardet

The following page has been changed by RenaudRichardet:
http://wiki.apache.org/nutch/IntranetRecrawl

------------------------------------------------------------------------------
  ./recrawl crawl 10 31
  
  === Script ===
- 
+ {{{
  #!/bin/bash
  
  # A simple script to run a Nutch re-crawl
@@ -74, +74 @@

  # Merge indexes
  ls -d $segments_dir/* | xargs bin/nutch merge $index_dir
  
+ }}}
  
  == Version 0.8.0 ==
  Place in the bin sub-directory within Nutch and run.
@@ -82, +83 @@

  ./usr/local/nutch/bin/recrawl /usr/local/tomcat/webapps/ROOT /usr/local/nutch/crawl 10 30
  
  === Code ===
+ 
+ {{{
  #!/bin/bash
  
  # Nutch recrawl script.
@@ -164, +167 @@

  # Clean up
  rm -rf $new_indexes
  
+ }}}
+