Posted to user@nutch.apache.org by lewis john mcgibbney <le...@gmail.com> on 2011/08/29 17:56:59 UTC

Injector hanging on Hadoop 0.20.6

Hi list,

Has anyone else experienced the injector hanging when running Nutch 1.4
on a Hadoop 0.20.6 cluster? I am not getting any log output and have to
kill the command, as it hangs as follows:

lewis@lewis-01:~/ASF/branch/runtime/deploy$ hadoop jar nutch-1.4-snapshot.job org.apache.nutch.crawl.Crawl urls -depth 10 -topN 20 -threads 6
11/08/29 16:27:16 WARN crawl.Crawl: solrUrl is not set, indexing will be skipped...
11/08/29 16:27:16 INFO crawl.Crawl: crawl started in: crawl-20110829162716
11/08/29 16:27:16 INFO crawl.Crawl: rootUrlDir = urls
11/08/29 16:27:16 INFO crawl.Crawl: threads = 6
11/08/29 16:27:16 INFO crawl.Crawl: depth = 10
11/08/29 16:27:16 INFO crawl.Crawl: solrUrl=null
11/08/29 16:27:16 INFO crawl.Crawl: topN = 20
11/08/29 16:27:16 INFO crawl.Injector: Injector: starting at 2011-08-29 16:27:16
11/08/29 16:27:16 INFO crawl.Injector: Injector: crawlDb: crawl-20110829162716/crawldb
11/08/29 16:27:16 INFO crawl.Injector: Injector: urlDir: urls
11/08/29 16:27:16 INFO crawl.Injector: Injector: Converting injected urls to crawl db entries. <<<<< HERE
^C
lewis@lewis-01:~/ASF/branch/runtime/deploy$



*Lewis*

Re: Injector hanging on Hadoop 0.20.6

Posted by lewis john mcgibbney <le...@gmail.com>.
Hi Markus,

Thanks for getting back to me on this one. Please give me until tomorrow to
try this and get back to you; I got slightly delayed today.

On Mon, Aug 29, 2011 at 5:06 PM, Markus Jelsma <ma...@openindex.io> wrote:

> Can you try with debug logging and using only the inject command? What do
> you see in the tasktracker GUI?



-- 
*Lewis*

Re: Injector hanging on Hadoop 0.20.6

Posted by Markus Jelsma <ma...@openindex.io>.
Can you try with debug logging and using only the inject command? What do
you see in the tasktracker GUI?
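
As a rough sketch of running the inject step on its own: the job file name and the "urls" directory are taken from the original post, while the crawldb path "crawl-test/crawldb" and the helper name inject_cmd are placeholders made up for illustration. The helper only prints the command so it can be reviewed before it is run on the cluster. Debug logging is normally raised in conf/log4j.properties, which is an assumption about this particular setup.

```shell
#!/bin/sh
# Values from the original post; CRAWLDB is a hypothetical placeholder path.
JOB=nutch-1.4-snapshot.job
CRAWLDB=crawl-test/crawldb
URLS=urls

# inject_cmd (hypothetical helper): print the command that runs only the
# Injector class (arguments: <crawldb> <url_dir>) instead of the full Crawl.
inject_cmd() {
  printf 'hadoop jar %s org.apache.nutch.crawl.Injector %s %s\n' "$1" "$2" "$3"
}

# Print the command for review; paste it into the shell to actually run it.
inject_cmd "$JOB" "$CRAWLDB" "$URLS"
```

If this step also hangs, the job and task pages in the Hadoop web interface should show whether any map tasks were ever scheduled.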

