Posted to user@nutch.apache.org by Gal Nitzan <gn...@usa.net> on 2006/03/01 12:07:24 UTC

Bad behavior of Fetcher with Hadoop

Hi,

Since Hadoop was split out of Nutch, the fetcher has been misbehaving.

All task trackers are accessing the same sites at the same time.

Adding the following call to Fetcher.java did not change this behavior:

    job.setBoolean("mapred.speculative.execution", false);


Here is what I found in the Job Tracker log, at the point where I think
speculative execution happens during the fetch:


060301 130053 Task 'task_m_349fnm' has completed.
060301 130053 Adding task 'task_m_42lyfs' to tip tip_67i2wa, for tracker 'tracker_77986'
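
To check a longer log for the same pattern, the "Adding task" entries can be grouped by tip. The following is only a rough sketch: the class and method names (TipAttemptCounter, attemptsPerTip) are made up for illustration and are not part of Nutch or Hadoop, and it assumes each log entry sits on a single unwrapped line in the format shown above. A tip with more than one task attempt may indicate speculative execution, but it could equally be a retry after a failure, so it is only a hint.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch or Hadoop.
public class TipAttemptCounter {
    // Matches Job Tracker lines of the form quoted above:
    // Adding task 'task_m_42lyfs' to tip tip_67i2wa, for tracker 'tracker_77986'
    private static final Pattern ADDING = Pattern.compile(
        "Adding task '([^']+)' to tip (\\w+), for tracker '([^']+)'");

    // Returns, for each tip id, the "task@tracker" attempts assigned to it,
    // in the order they appear in the log.
    public static Map<String, List<String>> attemptsPerTip(List<String> lines) {
        Map<String, List<String>> byTip = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = ADDING.matcher(line);
            if (m.find()) {
                byTip.computeIfAbsent(m.group(2), k -> new ArrayList<>())
                     .add(m.group(1) + "@" + m.group(3));
            }
        }
        return byTip;
    }

    public static void main(String[] args) {
        // The two entries quoted above, unwrapped onto single lines.
        List<String> log = List.of(
            "060301 130053 Task 'task_m_349fnm' has completed.",
            "060301 130053 Adding task 'task_m_42lyfs' to tip tip_67i2wa, for tracker 'tracker_77986'");
        attemptsPerTip(log).forEach((tip, attempts) -> {
            // Assumption: several attempts on one tip are worth a closer look.
            String flag = attempts.size() > 1 ? "  <-- more than one attempt" : "";
            System.out.println(tip + " " + attempts + flag);
        });
    }
}
```

Running it over the complete Job Tracker log, rather than the two lines above, would show whether any tip was handed to more than one tracker during the fetch.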


I do not know whether it is related, but the task ended prematurely
without any error:

060301 130134 task_r_2c181b 0.0% reduce > copy >
060301 130135 task_m_349fnm 0.031490237% 10 pages, 0 errors, 0.1 pages/s, 47 kb/s,
060301 130135 task_m_349fnm 0.031490237% 10 pages, 0 errors, 0.1 pages/s, 47 kb/s,
060301 130135 Task task_m_349fnm is done.



Gal