Posted to mapreduce-user@hadoop.apache.org by Karthik K <ka...@gmail.com> on 2014/01/01 03:00:50 UTC

Hadoop cluster setup problems

Hi

I am Karthik from India. We have been working on a temperature-aware YARN
scheduler, where we want to stop the scheduler from assigning new jobs to a
node once its temperature crosses a certain threshold. We figured out a
simple way to do this: we wrote a health checker script and configured it in
yarn-site.xml. As suggested in the documentation, the script prints a line
beginning with 'ERROR' to stdout whenever a node crosses the threshold
temperature, and the scheduler takes care of the rest. But on setting up the
cluster, we found two anomalies where I would like your help. Our cluster
consists of two nodes, one of which also runs the NameNode and the
ResourceManager. Let us call this node the master node and the other one the
slave node.
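For reference, here is a minimal sketch of such a health script in Python
(the 75 C threshold, sensor path, and the choice to report OK when the
sensor is unreadable are all assumptions, not details from our setup). The
NodeManager treats any output line beginning with "ERROR" as an unhealthy
report:

```python
#!/usr/bin/env python3
# Hypothetical NodeManager health script: prints a line starting with
# "ERROR" when CPU temperature exceeds a threshold, "OK" otherwise.
import sys

THRESHOLD_MILLIC = 75000  # assumed threshold: 75 C, in millidegrees


def check(temp_millic, threshold=THRESHOLD_MILLIC):
    """Return the health-script output line for a given temperature."""
    if temp_millic > threshold:
        return f"ERROR node temperature {temp_millic} exceeds {threshold}"
    return "OK"


def read_temp(path="/sys/class/thermal/thermal_zone0/temp"):
    """Read the temperature in millidegrees from a sysfs sensor (assumed path)."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    try:
        print(check(read_temp()))
    except (OSError, ValueError):
        # Assumption: an unreadable sensor should not mark the node unhealthy.
        print("OK")
```

The script path is registered via yarn.nodemanager.health-checker.script.path
in yarn-site.xml, and yarn.nodemanager.health-checker.interval-ms controls
how often the NodeManager runs it.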

(a) I tried setting up a health script which always outputs ERROR for both
nodes. The web UI lists the master node as unhealthy, whereas the other node
is not listed at all. So if I run a MapReduce job now, it should not run, as
there are no healthy nodes available. But any MapReduce job I run completes
without any problems. Why?

(b) Similarly, in the web UI for the ResourceManager, only the master
machine is listed as a node, although the web UI for HDFS lists both nodes.
Here also, the health script is set to always output ERROR for both nodes.
Shouldn't the ResourceManager show 0 nodes?
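To see which NodeManagers have actually registered with the ResourceManager
(healthy or not), one option is its REST API. Below is a small sketch
against the standard /ws/v1/cluster/nodes endpoint; the host name "master"
and port 8088 are assumptions for this cluster. A node that never appears in
this list has not registered with the ResourceManager at all, which is a
different condition from being marked unhealthy:

```python
import json
import urllib.request


def parse_nodes(cluster_nodes_json):
    """Return (id, state, healthReport) for each NodeManager the RM knows about.

    The RM REST API wraps the list as {"nodes": {"node": [...]}} and uses
    null/missing fields when no nodes are registered.
    """
    wrapper = cluster_nodes_json.get("nodes") or {}
    nodes = wrapper.get("node") or []
    return [(n["id"], n.get("state"), n.get("healthReport", "")) for n in nodes]


def fetch_cluster_nodes(rm_host="master", rm_port=8088):
    # ResourceManager REST endpoint listing all NodeManagers,
    # including those in the UNHEALTHY state.
    url = f"http://{rm_host}:{rm_port}/ws/v1/cluster/nodes"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Comparing this list against the HDFS web UI would show whether the slave's
NodeManager ever registered, independent of what the health script reports.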

Thanks in advance.

With Regards
Karthik K