Posted to user@nutch.apache.org by ilango gurusamy <il...@yahoo.com> on 2006/03/07 05:34:43 UTC
running Nutch
Hi
I am trying to run Nutch by following the instructions given in the tutorial.
The environment is SUSE Linux 10, JDK 1.4.2, and Nutch 0.7.1, plus Tomcat 5.
I get the following errors:
linux:/usr/local/nutch/nutch071 # bin/nutch crawl urls -dir crawl depth 3 -topN 50
060307 200146 parsing file:/usr/local/nutch/nutch071/conf/nutch-default.xml
060307 200147 parsing file:/usr/local/nutch/nutch071/conf/crawl-tool.xml
060307 200147 parsing file:/usr/local/nutch/nutch071/conf/nutch-site.xml
060307 200147 No FS indicated, using default:local
Exception in thread "main" java.lang.RuntimeException: crawl already exists.
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:121)
linux:/usr/local/nutch/nutch071 # export JAVA_HOME
linux:/usr/local/nutch/nutch071 # bin/nutch crawl urls -dir crawl depth 3 -topN 50
060307 200325 parsing file:/usr/local/nutch/nutch071/conf/nutch-default.xml
060307 200325 parsing file:/usr/local/nutch/nutch071/conf/crawl-tool.xml
060307 200325 parsing file:/usr/local/nutch/nutch071/conf/nutch-site.xml
060307 200325 No FS indicated, using default:local
Exception in thread "main" java.lang.RuntimeException: crawl already exists.
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:121)
linux:/usr/local/nutch/nutch071 # NUTCH_JAVA_HOME=//usr/lib/jvm/java-1.4.2
linux:/usr/local/nutch/nutch071 # NUTCH_JAVA_HOME=/usr/lib/jvm/java-1.4.2
linux:/usr/local/nutch/nutch071 # export NUTCH_JAVA_HOME
linux:/usr/local/nutch/nutch071 # echo $NUTCH_JAVA_HOME
/usr/lib/jvm/java-1.4.2
linux:/usr/local/nutch/nutch071 # bin/nutch crawl urls -dir crawl depth 3
run java in /usr/lib/jvm/java-1.4.2
060307 201624 parsing file:/usr/local/nutch/nutch071/conf/nutch-default.xml
060307 201625 parsing file:/usr/local/nutch/nutch071/conf/crawl-tool.xml
060307 201625 parsing file:/usr/local/nutch/nutch071/conf/nutch-site.xml
060307 201625 No FS indicated, using default:local
Exception in thread "main" java.lang.RuntimeException: crawl already exists.
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:121)
I set JAVA_HOME and NUTCH_JAVA_HOME to the base of my JVM installation, but I am not sure what is going on.
I would really appreciate any help I can get. Thanks a lot.
ilango
Re: running Nutch
Posted by ilango gurusamy <il...@yahoo.com>.
Hi
I successfully ran Nutch. Thanks for the tip. Strangely, I remember deleting the crawl directory before, but anyway, you worked the magic for me.
By the way, Saravanaraj, are you from TN? What are your research interests with Nutch?
ilango
"D.Saravanaraj" <sa...@gmail.com> wrote:
> Delete the crawl folder which would have been created in the previous crawl.
Re: running Nutch
Posted by "D.Saravanaraj" <sa...@gmail.com>.
Delete the crawl folder, which would have been created by the previous crawl.
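
For anyone hitting the same "crawl already exists" exception: judging by the log, the crawl tool appears to refuse to start when the directory passed with -dir is already present. A minimal sketch of the cleanup step follows. The function name remove_old_crawl is made up for illustration and is not part of Nutch; the default directory name "crawl" is taken from the -dir argument in the original command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): delete a leftover crawl output
# directory so the next crawl can create it afresh.
# The default name "crawl" matches the "-dir crawl" argument used in this thread.
remove_old_crawl() {
    crawl_dir="${1:-crawl}"
    if [ -d "$crawl_dir" ]; then
        # "--" stops option parsing in case the name starts with a dash.
        rm -rf -- "$crawl_dir"
        echo "removed $crawl_dir"
    fi
}

# Usage, run from the Nutch install directory, then repeat the crawl
# command with the same options as before:
#   remove_old_crawl crawl && bin/nutch crawl urls -dir crawl
```

This deletes the previous crawl data outright, so rename the directory instead if the old results still matter. The function returns success when there is nothing to remove, so chaining the crawl command with && works on a first run too.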