Posted to commits@nutch.apache.org by Apache Wiki <wi...@apache.org> on 2005/09/20 17:16:27 UTC

[Nutch Wiki] Update of "FAQ" by JakeVanderdray

The following page has been changed by JakeVanderdray:
http://wiki.apache.org/nutch/FAQ

The comment on the change is:
Just some formatting.

------------------------------------------------------------------------------
       % cp nutch-0.7.war $CATALINA_HOME/webapps/ROOT.war
  
    * After building your first index, start Tomcat from the index folder.
-     Assuming your index is located at /index/db/
+     Assuming your index is located at /index/db/:
-     % cd /index/db/
+     {{{% cd /index/db/
-     % $CATALINA_HOME/bin/startup.sh
+ % $CATALINA_HOME/bin/startup.sh}}}
    * After building your first index, start Tomcat from the index folder.
      Start Tomcat
-      % $CATALINA_HOME/bin/startup.sh
+       % $CATALINA_HOME/bin/startup.sh
      Stop Tomcat
-      % $CATALINA_HOME/bin/shutdown.sh
+       % $CATALINA_HOME/bin/shutdown.sh
      Tomcat has extracted the contents of the ROOT.war file.
      Edit the nutch-default.xml, which is located at:
         $CATALINA_HOME/webapps/ROOT/WEB-INF/classes/
@@ -59, +59 @@

  ==== How can I recover an aborted fetch process? ====
  
  You have two choices:
-    1) Use the aborted output. You'll need to touch the file fetcher.done in the segment directory. All the pages that were not crawled will be re-generated for fetch pretty soon. If you fetched lots of pages and don't want to re-fetch them, this is the best way.
+    1. Use the aborted output. You'll need to touch the file fetcher.done in the segment directory. All the pages that were not crawled will be re-generated for fetch pretty soon. If you fetched lots of pages and don't want to re-fetch them, this is the best way.
-    2) Discard the aborted output. To do this, just delete the fetcher* directories in the segment and restart the fetcher.
+    2. Discard the aborted output. To do this, just delete the fetcher* directories in the segment and restart the fetcher.
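
The two choices above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the segment path passed in is an assumption about your layout, and the command for restarting the fetcher is deliberately left out.

```shell
#!/bin/sh
# Hypothetical helpers for the two recovery choices.
# The argument is the path to one segment directory (an assumption;
# use whatever path your crawl actually wrote).

# Choice 1: keep the aborted output by marking the fetch as done.
keep_aborted_fetch() {
    segment="$1"
    # Refuse to run on a missing or empty path.
    [ -n "$segment" ] && [ -d "$segment" ] || return 1
    touch "$segment/fetcher.done"
}

# Choice 2: discard the aborted output so the fetcher can be restarted.
discard_aborted_fetch() {
    segment="$1"
    [ -n "$segment" ] && [ -d "$segment" ] || return 1
    # Delete only directories whose names start with "fetcher".
    for entry in "$segment"/fetcher*; do
        if [ -d "$entry" ]; then
            rm -rf "$entry"
        fi
    done
    return 0
}
```

Call one of the two with the segment path, for example `keep_aborted_fetch /path/to/segment`. After `discard_aborted_fetch`, restart the fetcher on that segment as you normally would.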
  
  ==== Who changes the next fetch date? ====
    * After injecting a new URL, the next fetch date is set to the current time.