Posted to dev@whirr.apache.org by "Tibor Kiss (JIRA)" <ji...@apache.org> on 2011/06/18 16:08:47 UTC

[jira] [Issue Comment Edited] (WHIRR-88) Support image creation

    [ https://issues.apache.org/jira/browse/WHIRR-88?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051517#comment-13051517 ] 

Tibor Kiss edited comment on WHIRR-88 at 6/18/11 2:06 PM:
----------------------------------------------------------

I've played with some image creation methods. There is no straightforward approach currently, so I started to experiment with my CDH service.
At first I just used a manual image creation method. Having a newly started minimal cluster, I used only the master node, where I manually installed all the software packages that may exist on each type of instance template in that cluster. Then I disabled all of the services, such as hadoop-0.20-*, using chkconfig ... off. I also manually cleaned up the instance:
{quote}
# stop any Hadoop services that are still running
for service in /etc/init.d/hadoop-0.20-*; do sudo $service stop; done
# install the remaining role packages so the image covers every instance template
yum install hadoop-0.20-datanode
yum install hadoop-0.20-tasktracker
# make sure none of the Hadoop daemons start automatically on boot
chkconfig hadoop-0.20-datanode off
chkconfig hadoop-0.20-jobtracker off
chkconfig hadoop-0.20-namenode off
chkconfig hadoop-0.20-tasktracker off
# clean shell history and logs before bundling the image
rm -f /root/.*hist* $HOME/.*hist*
rm -f /var/log/*.gz
find /var/log -name mysql -prune -o -type f -print | while read i; do sudo cp /dev/null $i; done
rm -f /var/log/oozie/*
rm -f /var/log/hadoop/*
rm -rf /var/log/hadoop/history
rm -rf /var/log/hadoop/userlogs
{quote}

Then I made some changes to /etc/sudoers, to allow login with ec2-user (personally I used Amazon Linux). Note that the Whirr scripts overwrite that, so we need to add it again at this step.
{quote}
# keep SSH access for ec2-user by copying over the authorized keys of the web user
cat /home/users/web/.ssh/authorized_keys >> /home/ec2-user/.ssh/authorized_keys
# re-add passwordless sudo for ec2-user, since the Whirr scripts overwrite /etc/sudoers
echo "ec2-user ALL = (ALL) NOPASSWD: ALL" >> /etc/sudoers
{quote}

Then I log in as ec2-user and run sudo su - root:
{quote}
# remove the cluster user that Whirr created at startup (it will be recreated on launch)
userdel --remove web
# remove cluster-specific configuration and data directories
rm -rf /etc/hadoop/conf.dist
rm -rf /mnt/perf
rm -rf /data/perf
rm -f /home/ec2-user/setup-web
# clear temporary files left behind by Jetty, the JVM, jclouds and Whirr
rm -rf /tmp/Jetty*
rm -rf /tmp/hsperf*
rm -rf /tmp/jclouds*
rm -rf /tmp/logs
rm -f /tmp/*
{quote}

While creating the AMI, I excluded /root/.ssh, /home/ec2-user/.ssh, /data, /data0 and /data1 (in fact all /data*).
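
For reference, the exclusion could be expressed with the EC2 AMI tools roughly as below. This is only a hedged sketch: the comment does not say which bundling tool was used, and the key, certificate, account id and bundle directory are placeholders.
{quote}
# hypothetical ec2-bundle-vol invocation; key, cert, account id and destination are placeholders
ec2-bundle-vol -k pk.pem -c cert.pem -u 123456789012 -r x86_64 -d /mnt/bundle \
  -e /root/.ssh,/home/ec2-user/.ssh,/data,/data0,/data1
{quote}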

Then I planned to switch from 
{quote}
whirr.hadoop-install-function=install_cdh_hadoop
whirr.hadoop-configure-function=configure_cdh_hadoop
{quote}
to
{quote}
whirr.hadoop-install-function=prepare_cdh_hadoop
whirr.hadoop-configure-function=reconfigure_cdh_hadoop
whirr.image-id=us-east-1/ami-12345678
jclouds.ec2.ami-owners=123456789012
whirr.login-user=ec2-user
{quote}

Of course, that change required rewriting most of the Hadoop-related scripts in order to have such a separation (install/prepare and configure/reconfigure).
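
Just to illustrate the intent of the split (this is only my sketch, not the actual script): the prepare_* variant assumes the packages are already baked into the image and therefore installs nothing, leaving only per-instance preparation.
{quote}
# illustrative sketch only; the real prepare_cdh_hadoop.sh is not reproduced in this comment
function prepare_cdh_hadoop() {
  # the CDH packages are already on the private AMI, so nothing is downloaded or installed;
  # only per-instance steps remain, e.g. recreating directories excluded from the image
  mkdir -p /data/hadoop
}
{quote}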

Unfortunately I had to modify not only the functions of the service in question, but also some from core. In core I made core/src/main/resources/functions/install_java.sh idempotent: both functions, install_java_rpm and install_java_deb, are wrapped in an {quote}if [ ! -e $JDK_INSTALL_PATH ]{quote} and {quote}if [ ! -e /usr/lib/jvm/java-6-sun ]{quote} respectively, closed just before the {quote}echo "export JAVA_HOME{quote} line.
In the same way, services/hadoop/src/main/resources/functions/install_hadoop.sh also gets a condition {quote}if [ ! -e $HADOOP_HOME ]{quote} immediately after the line {quote}HADOOP_HOME=/usr/local/$(basename $HADOOP_TAR_URL .tar.gz){quote}. (Note that I don't really understand why the Apache version is unpacked while I am installing the CDH-based services. Maybe it is just a bug!)
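
The install_hadoop.sh guard then looks roughly like this (a simplified sketch; HADOOP_TAR_URL is defined earlier in the script, and the command inside the if stands in for the real install steps):
{quote}
HADOOP_HOME=/usr/local/$(basename $HADOOP_TAR_URL .tar.gz)
# skip download and unpack entirely when the directory is already baked into the image
if [ ! -e $HADOOP_HOME ]; then
  curl -sL "$HADOOP_TAR_URL" | tar xz -C /usr/local
fi
{quote}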

In the end, my functions directory contained the following files:
{quote}configure_cdh_hadoop.sh  install_cdh_hadoop.sh  install_hadoop.sh  install_java.sh  prepare_cdh_hadoop.sh  reconfigure_cdh_hadoop.sh{quote}

With the configuration switched as described earlier, I was able to fire up my CDH-based Hadoop cluster from my manually created private AMI.
I have not described all of the changes in detail here, but in the end I got a cluster up and running without reinstalling everything from scratch.

In my opinion this is not really a solid approach, and not just because the image was created manually. For example, I removed the web user just to be able to cope with the tricky steps Whirr performs at startup: it starts with the default ec2-user and then runs a setup-web script which creates the web user. Without rewriting this tricky startup, which is implemented in core, my only option was to remove the web user, so that it gets recreated when the instance is started. I don't really understand all the motivations behind this initialization, but I'm sure it complicates any attempt to create private images.

From a higher perspective, my opinion regarding image creation is that we do not need a new from-scratch image creation command analogous to cluster creation, but a different approach: fire up the cluster as it is now, do the tests with it, and if you like it, say "persist the cluster". Later on you may choose to start your cluster from your own image or by building it from scratch. Maybe we need to add a whirr.hadoop-cleanup-function which would be used when cleaning up (see the sketch below). Most of the changes can be made in the scripts: either each entry function gets two entry points, or all of them are made idempotent with respect to whether the install and configure steps have already been applied.
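
For example, the manual cleanup steps above could move into such a function. The name cleanup_cdh_hadoop and the body below are only a hypothetical sketch assembled from commands already listed in this comment.
{quote}
# hypothetical cleanup function, built from the manual steps above
function cleanup_cdh_hadoop() {
  for service in /etc/init.d/hadoop-0.20-*; do $service stop; done
  for role in namenode jobtracker datanode tasktracker; do
    chkconfig hadoop-0.20-$role off
  done
  rm -f /root/.*hist* /var/log/*.gz
  rm -rf /var/log/hadoop/* /tmp/jclouds* /tmp/logs
}
{quote}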

The process of producing a single image that contains the software for every template combination is also getting complicated: in the case of Hadoop, for instance, we would like to persist all of the Hadoop packages preinstalled but not set to auto-start. Who knows what rules may arise? And there may be scenarios where more than a single common private image needs to be created.
Anyway, this request raises several questions.

> Support image creation
> ----------------------
>
>                 Key: WHIRR-88
>                 URL: https://issues.apache.org/jira/browse/WHIRR-88
>             Project: Whirr
>          Issue Type: New Feature
>          Components: core
>            Reporter: Tom White
>
> Much of the time taken to start a cluster is in installing the software on the instances. By allowing users to build their own images it would make cluster launches faster. The way this could work is by having a create image step that brings up an instance and runs the install scripts on it before creating an image from it. The resulting image would then be used in subsequent launches.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira