Posted to hdfs-user@hadoop.apache.org by "MrAsanjar ." <af...@gmail.com> on 2014/06/19 18:17:24 UTC

Deploying Hadoop 2.x test cluster with Juju in minutes (local LXC)

Hi all,
My name is Amir Sanjar, team lead at the Canonical Big Data Solution Center.
Those of us who have had the luxury of deploying Hadoop across any type of
cluster, bare-metal or cloud, are well aware that it is a science that can
consume many precious happy hours :).
A few weeks ago, I finally bowed to the Juju hype and developed a Juju charm
deployer for Hadoop 2.2.0.
Am I glad I did: it took us less time to deploy 200 nodes on
EC2/Ubuntu than to watch a World Cup football match (poor Spain).
As a Hadoop developer, though, what I love most is being able to
build a fully functional Hadoop cluster (6 nodes, seamlessly using LXC)
on my laptop in 10 minutes. So I would like to share my Juju Hadoop charm
with you. If you don't know Canonical Juju, no worries, you can become a Juju
master in half an hour:

*What is Juju?*
https://juju.ubuntu.com/
*How to install and set up Juju (use local):*
https://juju.ubuntu.com/docs/getting-started.html


*How to install the Hadoop 2.2.0 charm from the command line:*
* type the command "juju bootstrap"

Simple Usage: Combined HDFS and YARN RM

In this configuration, the YARN ResourceManager is deployed on the same
service unit as the HDFS NameNode, and the HDFS DataNodes also run the YARN
NodeManager::

juju deploy hadoop hadoop-master
juju deploy hadoop hadoop-slavecluster
juju add-unit -n 2 hadoop-slavecluster
juju add-relation hadoop-master:namenode hadoop-slavecluster:datanode
juju add-relation hadoop-master:resourcemanager hadoop-slavecluster:nodemanager
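
The five commands above can be wrapped in a small shell function so that the
number of extra slave units becomes a parameter. This is only a sketch: the
function name deploy_combined and the JUJU variable are made up for
illustration, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Dry run by default: JUJU expands to "echo juju", so each command is
# printed instead of run. Export JUJU=juju to execute them for real.
# (JUJU and deploy_combined are hypothetical names, not part of the charm.)
JUJU="${JUJU:-echo juju}"

# deploy_combined [EXTRA_UNITS]
# Deploys one master service and one slave service, then adds EXTRA_UNITS
# (default 2) further units to the slave service and relates the two.
deploy_combined() {
    extra_units="${1:-2}"
    $JUJU deploy hadoop hadoop-master
    $JUJU deploy hadoop hadoop-slavecluster
    $JUJU add-unit -n "$extra_units" hadoop-slavecluster
    $JUJU add-relation hadoop-master:namenode hadoop-slavecluster:datanode
    $JUJU add-relation hadoop-master:resourcemanager hadoop-slavecluster:nodemanager
}
```

Calling "deploy_combined 4" in a dry run prints the same command sequence
with "-n 4", which is a cheap way to review what would be sent to Juju
before running it with JUJU=juju.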

Scale Out Usage: Separate HDFS and YARN RM

In this configuration the HDFS and YARN deployments operate on different
service units as separate services::

juju deploy hadoop hdfs-namenode
juju deploy hadoop hdfs-datacluster
juju add-unit -n 2 hdfs-datacluster
juju add-relation hdfs-namenode:namenode hdfs-datacluster:datanode

juju deploy hadoop mapred-resourcemanager
juju deploy hadoop mapred-taskcluster
juju add-unit -n 2 mapred-taskcluster
juju add-relation mapred-resourcemanager:mapred-namenode hdfs-namenode:namenode
juju add-relation mapred-taskcluster:mapred-namenode hdfs-namenode:namenode
juju add-relation mapred-resourcemanager:resourcemanager mapred-taskcluster:nodemanager
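
With either layout, the units take a few minutes to come up, and "juju status"
shows their progress. The helper below is a rough sketch for counting agents
that are not up yet. It assumes the Juju 1.x YAML status output, where each
machine and unit carries an "agent-state: <state>" line; the function name
count_not_started is hypothetical.

```shell
#!/bin/sh
# count_not_started: read `juju status` text on stdin and print how many
# agent-state lines report anything other than "started".
# Assumption: Juju 1.x status format ("agent-state: started", "pending", ...).
count_not_started() {
    awk '/agent-state:/ && !/agent-state: started/ { n++ } END { print n + 0 }'
}
```

Typical use would be "juju status | count_not_started", which prints 0 once
every machine and unit agent reports started. Note that it also prints 0 for
empty input, so check that "juju status" itself succeeded before trusting it.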


Connecting to the hadoop-master node:
* "juju ssh hadoop-master/0"
The cluster is then ready to go.
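
A quick way to check that HDFS sees its datanodes is to run the standard
Hadoop 2.x command "hdfs dfsadmin -report" on the master unit. The sketch
below passes it through "juju ssh". It assumes that the hdfs binary is on the
PATH of the login user on that unit, which this charm may or may not arrange;
hdfs_report and the JUJU variable are illustrative names only.

```shell
#!/bin/sh
# Dry run by default: prints the command. Export JUJU=juju to run it.
JUJU="${JUJU:-echo juju}"

# hdfs_report [UNIT]
# Asks UNIT (default hadoop-master/0) for an HDFS cluster summary.
# Assumption: `hdfs` is on the login user's PATH on that unit.
hdfs_report() {
    unit="${1:-hadoop-master/0}"
    $JUJU ssh "$unit" "hdfs dfsadmin -report"
}
```

In the scale-out layout the namenode lives on a different service, so the
call would be "hdfs_report hdfs-namenode/0" instead.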