Posted to user@nutch.apache.org by Vishal Sharma <vi...@grazitti.com> on 2015/07/06 07:54:48 UTC
Getting IO exception on crawl with Apache nutch 1.9
Hi,
I am getting the following error when trying to crawl just one URL:
./bin/crawl ./urls/ ./CrawlData/ "." 5
Injector: starting at 2015-07-06 05:52:45
Injector: crawlDb: CrawlData/crawldb
Injector: urlDir: urls
Injector: Converting injected urls to crawl db entries.
Injector: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
This used to work earlier. Can someone please help?
*Vishal Sharma*
*Team Leader, SFDC*
T: +1 304 636 7373
E: vishals@grazitti.com
www.grazitti.com
Re: Getting IO exception on crawl with Apache nutch 1.9
Posted by Imtiaz Shakil Siddique <sh...@gmail.com>.
Hi Vishal,
Nutch maintains a log file, which is located at
$Nutch_Home/logs/hadoop.log
This log file holds much more detailed information about the crash.
Can you please post the log file so that we can inspect it?
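If the full log is too large to post, a rough sketch like the one below may help pull out just the relevant part. It assumes NUTCH_HOME is set to your Nutch runtime directory and that your grep supports the -A option; the function name show_nutch_errors is only illustrative and is not part of Nutch.

```shell
# Hypothetical helper (not part of Nutch): print error lines from hadoop.log
# together with the stack-trace lines that follow them.
# Assumption: NUTCH_HOME points at the Nutch runtime directory.
show_nutch_errors() {
  log_file="${1:-${NUTCH_HOME:-.}/logs/hadoop.log}"
  if [ ! -f "$log_file" ]; then
    echo "log file not found: $log_file" >&2
    return 1
  fi
  # -n prefixes line numbers, -A 15 keeps 15 lines of context after each match;
  # tail limits the output to the most recent 200 lines.
  grep -n -A 15 -E 'ERROR|Exception' "$log_file" | tail -n 200
}
```

Calling show_nutch_errors with no argument reads the default log location; you can also pass an explicit path to a log file.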
Thank you.