Posted to user@nutch.apache.org by andy2005cst <an...@gmail.com> on 2009/04/03 11:06:26 UTC

Re: Hadoop java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232) while indexing.

I have run into the same problem. Have you found a way to solve it?


dealmaker wrote:
> 
> I am using the nutch nightly build #741 (Mar 3, 2009 4:01:53 AM).  I am at
> the final phase of crawling, following the tutorial on the Nutch.org website.
> I ran the following command and got an exception from Hadoop.  I double
> checked the folder paths in nutch-site.xml, and they are correct.  I tried
> multiple times and got the same problem each time.  I didn't have this
> problem in 0.9.  What's wrong?
> 
> $ bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb
> crawl/segments/*
> Indexer: starting
> Indexer: java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
> 	at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
> 	at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
> 
> 
> Log from Hadoop:
> 2009-03-04 14:30:31,531 WARN  mapred.LocalJobRunner - job_local_0001
> java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
> 	at org.apache.lucene.document.Field.<init>(Field.java:279)
> 	at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:133)
> 	at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:239)
> 	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
> 	at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:40)
> 	at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:410)
> 	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:158)
> 	at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
> 	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
> 2009-03-04 14:30:31,668 FATAL indexer.Indexer - Indexer: java.io.IOException: Job failed!
> 	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
> 	at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
> 	at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
> 
> 

-- 
View this message in context: http://www.nabble.com/Hadoop--java.io.IOException%3A-Job-failed%21-at-org.apache.hadoop.mapred.JobClient.runJob%28JobClient.java%3A1232%29-while-indexing.-tp22341554p22864662.html
Sent from the Nutch - User mailing list archive at Nabble.com.


Re: Hadoop java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232) while indexing.

Posted by dealmaker <vi...@gmail.com>.
No, this problem was supposed to have been fixed by NUTCH-711 in 1.0-dev.  I
tried it in 1.0, and somehow it is broken again.  Maybe some change after
NUTCH-711 reintroduced it?  Does anyone know how to fix this?  Thanks.
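For what it's worth, the IllegalArgumentException in the log above comes from Lucene's Field constructor, which rejects any field marked as neither stored nor indexed (in the Lucene 2.x API that Nutch 1.0 ships, that would be a call like new Field(name, value, Field.Store.NO, Field.Index.NO)). The class below is only a simplified stand-in for that check, not the real Lucene code, to show why the indexer dies when a plugin emits such a field:

```java
// Simplified stand-in for the validation in Lucene's Field constructor
// (the real class is org.apache.lucene.document.Field; this only
// illustrates the combination that triggers the exception in the log).
public class FieldCheckDemo {
    enum Store { YES, NO }
    enum Index { TOKENIZED, UN_TOKENIZED, NO }

    // A field makes sense only if it is stored, indexed, or both.
    static boolean makesSense(Store store, Index index) {
        return store == Store.YES || index != Index.NO;
    }

    static void checkField(String name, Store store, Index index) {
        if (!makesSense(store, index)) {
            throw new IllegalArgumentException(
                "it doesn't make sense to have a field that is neither indexed nor stored");
        }
        System.out.println("field '" + name + "' is valid");
    }

    public static void main(String[] args) {
        checkField("url", Store.YES, Index.NO);       // fine: stored only
        try {
            checkField("junk", Store.NO, Index.NO);   // the failing combination
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

So the failure usually means that one of the enabled indexing plugins (or a custom one) is handing LuceneWriter a field whose options resolve to store=NO and index=NO; checking which indexing plugins are listed in plugin.includes in nutch-site.xml is a reasonable first step.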


andy2005cst wrote:
> 
> I have run into the same problem. Have you found a way to solve it?
> 
> 

-- 
View this message in context: http://www.nabble.com/Hadoop--java.io.IOException%3A-Job-failed%21-at-org.apache.hadoop.mapred.JobClient.runJob%28JobClient.java%3A1232%29-while-indexing.-tp22341554p22983483.html
Sent from the Nutch - User mailing list archive at Nabble.com.