Posted to user@nutch.apache.org by "Meraj A. Khan" <me...@gmail.com> on 2014/08/24 19:03:22 UTC

Nutch 1.7 on Hadoop Yarn 2.3.0 performing only 3 rounds of fetching.

Hi All,

After spending some time on this, I have narrowed the problem down to the
following: when I submit the Nutch 1.7 job to a Hadoop YARN cluster, the
Hadoop UI, which lists the tasks being executed, shows that only 3 rounds of
fetching happen, even though I have given a depth of 100 and my seed list
has 10 URLs.

Any idea why this is happening? Please note that when I run the same Nutch
configuration in local mode, i.e. in Eclipse, it performs the expected number
of fetch rounds and also fetches all the URLs from all the domains.

Thanks in advance!