Posted to dev@nutch.apache.org by Richard Braman <rb...@bramantax.com> on 2006/09/06 01:11:19 UTC
problem with hadoop
I am using nutch 0.9 dev, latest from svn.
I have been running a crawl successfully for about a week now. I have over 100K
documents in my index. I have 21 segments. I just finished a segment and
when going to updatedb I get an error like this:
CrawlDb update: starting
CrawlDb update: db: taxcrawl/crawldb
CrawlDb update: segment: taxcrawl/segments/20060902205223
CrawlDb update: Merging segment data into db.
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:363)
at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:62)
at org.apache.nutch.crawl.CrawlDb.main(CrawlDb.java:116)
I noticed that in this version the shell script isn't nearly as verbose as
it once was, even though I have verbose logging turned on everywhere.
This is the entire message I get.
I have a simple install: one machine doing everything.
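Since the console shows only the top-level "Job failed!" wrapper, the underlying cause would have to come from a log file, if one is being written. What follows is a minimal sketch, not anything that ships with Nutch or Hadoop: a small Python helper (the name extract_traces is hypothetical) that pulls Java exception lines and their stack frames out of log text, so that any earlier exception is easier to spot.

```python
import re

# Matches a dotted Java class name ending in Exception or Error.
EXCEPTION_RE = re.compile(r"\b(?:[\w$]+\.)+[\w$]*(?:Exception|Error)\b")
# Matches a Java stack frame such as "at pkg.Class.method(File.java:12)".
FRAME_RE = re.compile(r"^\s*at\s+[\w$.<>]+\(.*\)\s*$")


def extract_traces(log_text):
    """Return a list of (exception_line, [frame_lines]) found in log_text."""
    traces = []
    current = None  # the trace currently being collected, if any
    for line in log_text.splitlines():
        if FRAME_RE.match(line):
            # A frame only counts if it follows an exception line.
            if current is not None:
                current[1].append(line.strip())
        elif EXCEPTION_RE.search(line):
            # A new exception line starts a new trace.
            current = (line.strip(), [])
            traces.append(current)
        else:
            # Any other line ends the current trace.
            current = None
    return traces
```

To use it, read the log into a string and pass it in, for example extract_traces(open("logs/hadoop.log").read()). The path logs/hadoop.log is an assumption about where this install writes its log; substitute the real location.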
RE: problem with hadoop
Posted by Richard Braman <rb...@bramantax.com>.
I found the bug
https://issues.apache.org/jira/browse/NUTCH-266 in Jira.
This is the problem I am having. I downloaded hadoop from trunk and built
hadoop 0.5.1-dev. I put this in my nutch lib and rebuilt; same error. This
sounds like the same error I was having on another machine running 0.8 just a
few weeks ago. On my 0.8 machine the console output was much more verbose and
I remember the error occurring after I had a good number of segments as
well. Either way, how can one not use hadoop? I am running Windows XP on
this machine, but I can also confirm the same problem on Windows 2000.
-----Original Message-----
From: Richard Braman [mailto:rbraman@bramantax.com]
Sent: Tuesday, September 05, 2006 8:38 PM
To: nutch-dev@lucene.apache.org; rbraman@bramantax.com
Subject: RE: problem with hadoop
No matter what command I run, I get this error: index, updatedb, addurl,
every class.
RE: problem with hadoop
Posted by Richard Braman <rb...@bramantax.com>.
No matter what command I run, I get this error: index, updatedb, addurl,
every class.