Posted to commits@nutch.apache.org by pk...@apache.org on 2005/08/08 19:59:41 UTC

svn commit: r230828 - /lucene/nutch/trunk/src/site/src/documentation/content/xdocs/tutorial.xml

Author: pkosiorowski
Date: Mon Aug  8 10:59:34 2005
New Revision: 230828

URL: http://svn.apache.org/viewcvs?rev=230828&view=rev
Log:
Changed URLs in tutorial to point to apache.

Modified:
    lucene/nutch/trunk/src/site/src/documentation/content/xdocs/tutorial.xml

Modified: lucene/nutch/trunk/src/site/src/documentation/content/xdocs/tutorial.xml
URL: http://svn.apache.org/viewcvs/lucene/nutch/trunk/src/site/src/documentation/content/xdocs/tutorial.xml?rev=230828&r1=230827&r2=230828&view=diff
==============================================================================
--- lucene/nutch/trunk/src/site/src/documentation/content/xdocs/tutorial.xml (original)
+++ lucene/nutch/trunk/src/site/src/documentation/content/xdocs/tutorial.xml Mon Aug  8 10:59:34 2005
@@ -34,7 +34,7 @@
 
 <p>First, you need to get a copy of the Nutch code.  You can download
 a release from <a
-href="http://www.nutch.org/release/">http://www.nutch.org/release/</a>.
+href="http://lucene.apache.org/nutch/release/">http://lucene.apache.org/nutch/release/</a>.
 Unpack the release and connect to its top-level directory.  Or, check
 out the latest source code from <a
 href="version_control.html">subversion</a> and build it
@@ -67,23 +67,23 @@
 <ol>
 
 <li>Create a flat file of root urls.  For example, to crawl the
-<code>nutch.org</code> site you might start with a file named
+<code>nutch</code> site you might start with a file named
 <code>urls</code> containing just the Nutch home page.  All other
 Nutch pages should be reachable from this page.  The <code>urls</code>
 file would thus look like:
 <source>
-http://www.nutch.org/
+http://lucene.apache.org/nutch/
 </source>
 </li>
 
 <li>Edit the file <code>conf/crawl-urlfilter.txt</code> and replace
 <code>MY.DOMAIN.NAME</code> with the name of the domain you wish to
 crawl.  For example, if you wished to limit the crawl to the
-<code>nutch.org</code> domain, the line should read:
+<code>apache.org</code> domain, the line should read:
 <source>
-+^http://([a-z0-9]*\.)*nutch.org/
++^http://([a-z0-9]*\.)*apache.org/
 </source>
-This will include any url in the domain <code>nutch.org</code>.
+This will include any url in the domain <code>apache.org</code>.
 </li>
 
 </ol>
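
Note on the url filter line changed above: the following is a minimal sketch,
in Python rather than Nutch's own Java, of which urls the new pattern would
accept. It assumes the leading "+" is only the include marker of
conf/crawl-urlfilter.txt and not part of the regular expression, and that a
url is accepted when the pattern matches at its start. The helper name
is_included is hypothetical and exists only for this illustration.

```python
import re

# Pattern body copied from the "+" line in the diff above, with the
# leading "+" (assumed to be the include marker) stripped.
PATTERN = re.compile(r"^http://([a-z0-9]*\.)*apache.org/")


def is_included(url: str) -> bool:
    """Return True if the url matches the filter pattern at its start."""
    return PATTERN.match(url) is not None


if __name__ == "__main__":
    for url in (
        "http://lucene.apache.org/nutch/",
        "http://apache.org/",
        "http://www.nutch.org/",
        "https://lucene.apache.org/nutch/",
        "http://apacheXorg/",
    ):
        print(url, is_included(url))
```

Under these assumptions the pattern accepts any http url whose host is
apache.org or one of its lowercase subdomains, and rejects the old
www.nutch.org address as well as https urls. Because the dot in "apache.org"
is not escaped, it matches any single character, so a host such as
"apacheXorg" would also pass; writing "apache\.org" would be the stricter
form.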