Posted to user@hadoop.apache.org by "Zaki SEc." <za...@gmail.com> on 2017/09/20 18:03:58 UTC

Hadoop "managed" setup basic question (Ambari, CDH?)

Hi!

I'm fairly new to Hadoop, but I've been browsing the documentation and
how-tos for some time now.

My question is this: how can one set up a cluster where the nodes aren't
static?
What I mean is, I want to be able to run a cluster of, say, 20 machines,
where each node has Hadoop installed and the nodes 'recognize' each other,
saving me from having to manually set their hostnames and configure their
'/etc/hosts' files.

I did look into Apache Ambari, hoping that it would give me an easy
solution to the above problem, but it does not support Ubuntu 16.04, which
I have to work with, and it failed to build for various reasons.
I have also looked into Cloudera's CDH distribution (the manual
installation), but that has the same problem: it asks me to manually
configure these settings for each node.

It seemed to me that "Rack Awareness" could potentially solve my problem,
but after some reading, I realized that it serves a different purpose
entirely.
So now it looks like I'm out of options.

Lately, I have been thinking about writing an external script that would
update the settings on each node automatically, based on one central
'list' hosted on, for example, the NameNode. While this is nowhere near a
real dynamic setup, it would make my job significantly easier.
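
To make the idea concrete, here is a rough sketch of what such a script
might look like. Everything in it is an assumption for illustration: the
central list is taken to be plain text with one "ip hostname" pair per
line, the marker comments and the function names are made up, and fetching
the list or writing '/etc/hosts' is left out on purpose. It only shows how
a managed block inside a hosts file could be regenerated without touching
the other entries.

```python
# Hypothetical marker comments that delimit the script-managed block.
BEGIN = "# BEGIN hadoop-cluster"
END = "# END hadoop-cluster"


def parse_node_list(text):
    """Parse 'ip hostname' lines, skipping blank lines and '#' comments."""
    nodes = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, hostname = line.split()[:2]
        nodes.append((ip, hostname))
    return nodes


def render_block(nodes):
    """Render the managed block, one 'ip<TAB>hostname' entry per node."""
    lines = [BEGIN]
    lines += [f"{ip}\t{host}" for ip, host in nodes]
    lines.append(END)
    return "\n".join(lines)


def merge_hosts(existing, nodes):
    """Return hosts-file text with the managed block replaced.

    Lines outside the BEGIN/END markers are kept as they are, so
    running this repeatedly with the same list gives the same result.
    """
    kept = []
    inside = False
    for line in existing.splitlines():
        if line.strip() == BEGIN:
            inside = True
            continue
        if line.strip() == END:
            inside = False
            continue
        if not inside:
            kept.append(line)
    # Drop trailing blank lines so repeated runs do not accumulate them.
    while kept and not kept[-1].strip():
        kept.pop()
    return "\n".join(kept + [render_block(nodes)]) + "\n"
```

Each node could run something like this periodically (from cron, say):
download the list, call merge_hosts() on the current contents of
'/etc/hosts', and write the result back as root. Keeping the entries
between two marker lines means the script never has to guess which lines
belong to it.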

Thanks in advance,
Zaki