Posted to common-dev@hadoop.apache.org by Alex Zheng <al...@gmail.com> on 2009/03/15 12:45:32 UTC
what is the relation between the classes at the very beginning?
I am new to Hadoop and have been reading the Hadoop code for a week.
Now I am very puzzled by the relationship between so many classes after I run:
bin/start-all.sh
I know there are JobTrackerInstrumentation, JobTracker, NameNode, etc., so what
is the order of their initialization?
And after bin/start-all.sh and before I run any job, what exists in the
system?
Thanks for your reply!
Re: what is the relation between the classes at the very beginning?
Posted by Steve Loughran <st...@apache.org>.
Alex Zheng wrote:
> I am new to Hadoop and have been reading the Hadoop code for a week.
> Now I am very puzzled by the relationship between so many classes after I run:
> bin/start-all.sh
>
> I know there are JobTrackerInstrumentation, JobTracker, NameNode, etc., so what
> is the order of their initialization?
> And after bin/start-all.sh and before I run any job, what exists in the
> system?
>
Run jps -v to see what's up and about, and netstat -p to list the ports in
use by the different processes.
The nodes are all designed to spin for a while waiting for their
dependencies to come up, so you don't need to bring them up in a strict
order (which would be namenode, datanode(s), jobtracker, tasktracker(s)
for a full MR cluster).
I have tests that poll for the various ports to be open before
submitting work, and they sometimes get unhappy if you try submitting
jobs straight after the job tracker appears live. If you are going to
spin waiting for a job tracker to be visible, I would sleep a few
seconds after its IPC port opens up before sending in work. This is
clearly some race condition, but not one I've sat down to look at,
as it only shows up at startup and a 10s sleep makes it go away.
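
The poll-then-sleep approach described above could be sketched roughly as follows. This is only an illustration in Python, not the actual test code mentioned in this thread: the function names, the default timeouts, and the host/port values in the usage comment are all assumptions, and the real ports depend on what your cluster configuration sets for the namenode and job tracker.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout expires.

    Returns True if the port accepted a connection, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves the port is open,
            # not that the service behind it is ready for work.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


def wait_for_cluster(services, settle_seconds=10.0, timeout=60.0):
    """Wait for each (name, host, port) entry in order, then pause.

    The final sleep is the workaround from the thread: give the job
    tracker a few seconds after its port opens before submitting jobs.
    """
    for name, host, port in services:
        if not wait_for_port(host, port, timeout=timeout):
            raise TimeoutError(f"{name} not reachable at {host}:{port}")
    time.sleep(settle_seconds)


# Hypothetical usage; substitute the hosts and ports from your own config:
# wait_for_cluster([
#     ("namenode", "localhost", 9000),
#     ("jobtracker", "localhost", 9001),
# ])
```

The 10 second default for settle_seconds simply mirrors the sleep mentioned above; it papers over the startup race rather than fixing it.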