Posted to common-user@hadoop.apache.org by Ken Krugler <kk...@transpac.com> on 2010/04/16 19:00:17 UTC
Issue with getTaskTrackers
Hi all,
I'm running 0.19.2 in EC2, and running into an occasional problem with
ClusterStatus.getTaskTrackers().
The call to getTaskTrackers() is made in the job jar's main
function, before the job starts running. I need to control some aspects
of my job, for example setting the number of reduce tasks to be
exactly equal to the number of servers, which should be equal to the
number of task trackers.
Every so often (currently < 5% of the time) the call to getTaskTrackers()
will return a value less than expected, e.g. 2 instead of 6. This happens
even when ClusterStatus.getJobTrackerState() returns State.RUNNING.
I'm assuming the problem is that some of the task trackers are taking
extra time to spin up. I saw HADOOP-5337
(https://issues.apache.org/jira/browse/HADOOP-5337), which seems
related, though that's for restarts vs. initial startup.
Given that the JobTracker waits for slaves to self-report, there
doesn't seem to be a totally reliable, automatic solution to this
issue, but I thought I'd ask to see if there's something I'm missing.
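
One possible workaround, sketched here rather than tested on a real
cluster, is to poll the tracker count until it reaches the expected
value or a timeout expires, and only then configure the job. The class
name TaskTrackerWait, the TrackerCounter interface, and the timeout and
poll values are all hypothetical names chosen for illustration; the
sketch assumes the expected number of servers is known up front. It also
uses a small interface of its own instead of calling Hadoop directly, so
the wiring to JobClient shown in the comments is an assumption about how
it would be hooked up.

```java
import java.io.IOException;

// Hypothetical helper: waits for the reported task tracker count to
// reach an expected value before the job is configured.
class TaskTrackerWait {

    // Hypothetical abstraction over "ask the JobTracker how many task
    // trackers it currently knows about". In a real job this would
    // presumably wrap something like:
    //   new JobClient(conf).getClusterStatus().getTaskTrackers()
    interface TrackerCounter {
        int count() throws IOException;
    }

    // Polls until at least `expected` trackers are reported or
    // `timeoutMs` has elapsed. Returns the last observed count, so the
    // caller can decide what to do if the cluster never fully came up.
    static int waitForTrackers(TrackerCounter counter, int expected,
                               long timeoutMs, long pollMs)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        int seen = counter.count();
        while (seen < expected && System.currentTimeMillis() < deadline) {
            Thread.sleep(pollMs);
            seen = counter.count();
        }
        return seen;
    }
}
```

The returned value could then be passed to JobConf.setNumReduceTasks(),
or the job could be aborted if it is still below the expected count.
This does not remove the underlying race, since the JobTracker still
only knows about trackers that have reported in; it just gives slow
trackers a bounded amount of extra time.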
Thanks,
-- Ken
--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c w e b m i n i n g