Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2007/08/20 21:08:21 UTC

[Nutch Wiki] Trivial Update of "Crawl" by susam

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The following page has been changed by susam:
http://wiki.apache.org/nutch/Crawl

------------------------------------------------------------------------------
  == Steps ==
  The complete job of this script has been divided broadly into 8 steps.
  
-  # Inject URLs
+  1. Inject URLs
-  # Generate, Fetch, Parse, Update Loop
+  2. Generate, Fetch, Parse, Update Loop
-  # Merge Segments
+  3. Merge Segments
-  # Invert Links
+  4. Invert Links
-  # Index
+  5. Index
-  # Dedup
+  6. Dedup
-  # Merge Indexes
+  7. Merge Indexes
-  # Reload index
+  8. Reload index
  
  == Modes of Execution ==
  The script can be executed in two modes:-