Posted to user@nutch.apache.org by zhao <25...@qq.com> on 2011/08/29 04:24:52 UTC
a question about job failed
Dear all,
I am using Nutch 0.9 and have run into a problem. The detailed description
is:
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
Thank you for your help.
zhao
--
View this message in context: http://lucene.472066.n3.nabble.com/a-question-about-job-failed-tp3291669p3291669.html
Sent from the Nutch - User mailing list archive at Nabble.com.
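For what it's worth, the generic message in the trace above comes from the old mapred API: JobClient.runJob throws a bare IOException("Job failed!") when the job finishes unsuccessfully, so the real cause only shows up in the task logs, never in this stack trace. A minimal self-contained sketch of that control flow (stub names are illustrative; this is not the actual Hadoop source):

```java
import java.io.IOException;

public class RunJobSketch {
    // Stand-in for a submitted job's final state.
    static class RunningJobStub {
        private final boolean successful;
        RunningJobStub(boolean successful) { this.successful = successful; }
        boolean isSuccessful() { return successful; }
    }

    // Mirrors the control flow of the old JobClient.runJob: on failure it
    // throws a generic IOException, discarding any task-level diagnostics,
    // which is why the stack trace alone tells you nothing about the cause.
    static RunningJobStub runJob(RunningJobStub job) throws IOException {
        if (!job.isSuccessful()) {
            throw new IOException("Job failed!");
        }
        return job;
    }

    public static void main(String[] args) {
        try {
            runJob(new RunningJobStub(false));
        } catch (IOException e) {
            System.out.println(e.getMessage()); // prints: Job failed!
        }
    }
}
```

This is why the usual next step is to look at logs/hadoop.log rather than the console output.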
Re: a question about job failed
Posted by Markus Jelsma <ma...@openindex.io>.
Yes, I have this less-than-descriptive exception too:
https://issues.apache.org/jira/browse/NUTCH-1100
On Tuesday 30 August 2011 02:28:18 Markus Jelsma wrote:
> Thanks for the reminder as i believe this is an actual issue! I've got some
> indices that cannot be deduplicated from Nutch and die without giving a
> proper clue.
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: a question about job failed
Posted by Markus Jelsma <ma...@openindex.io>.
Thanks for the reminder, as I believe this is an actual issue! I've got some
indices that cannot be deduplicated from Nutch; the job dies without giving a
proper clue.
I'll reproduce and report back on it. I know it's not a problem of fields not
being marked as STORED, since that one index has all fields used by dedup
marked as STORED.
Strange...
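For context on the STORED remark: index-based deduplication can only compare field values that the index actually stores, so the fields the dedup job reads must be marked as stored. In a Solr schema that looks something like the fragment below (field names are illustrative; check which fields your dedup implementation actually reads):

```xml
<!-- Illustrative only: fields a dedup job compares must be stored -->
<field name="url"    type="string" indexed="true" stored="true"/>
<field name="digest" type="string" indexed="true" stored="true"/>
```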
Re: a question about job failed
Posted by lewis john mcgibbney <le...@gmail.com>.
Hi Zhao,
Do you have any more verbose log info from hadoop.log? I have never worked
with Nutch 0.9, but could you at least indicate whether you get something
like

LOG: info Dedup: starting ... blah blah blah

Taking this to a larger context, I am not particularly happy with the
verbosity of logging when there are errors in the indexing commands. When we
hit an error during any of the index-related commands, all we get back is
"Job failed!". It would be nice to get a reason for the job failing that is
clearer than a stack trace.
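The kind of clearer reporting described here could be as simple as wrapping the generic failure with the job name and a pointer to the logs before re-throwing. A hypothetical sketch (not actual Nutch code; names are invented for illustration):

```java
import java.io.IOException;

public class ClearerJobError {
    // Hypothetical wrapper around an index-related job run: instead of
    // letting a bare "Job failed!" propagate, add context the operator
    // can act on, and chain the original exception as the cause.
    static void runIndexJob(String jobName, boolean succeeded) throws IOException {
        try {
            if (!succeeded) {
                throw new IOException("Job failed!"); // what the old API gives us
            }
        } catch (IOException e) {
            throw new IOException(jobName + " failed; see logs/hadoop.log"
                    + " for the underlying task error", e);
        }
    }

    public static void main(String[] args) {
        try {
            runIndexJob("dedup", false);
        } catch (IOException e) {
            System.out.println(e.getMessage());
            System.out.println("caused by: " + e.getCause().getMessage());
        }
    }
}
```

The chained cause keeps the original stack trace available for debugging while the top-level message tells you which job died and where to look.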
Finally, from a personal point of view, I would highly recommend upgrading
to a newer (1.3) version of Nutch if you are using this in production. There
are significant improvements in functionality.
Lewis
--
*Lewis*