Posted to common-dev@hadoop.apache.org by "Tom White (JIRA)" <ji...@apache.org> on 2008/09/08 19:13:44 UTC

[jira] Created: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Improve configurability of Hadoop EC2 instances
-----------------------------------------------

                 Key: HADOOP-4117
                 URL: https://issues.apache.org/jira/browse/HADOOP-4117
             Project: Hadoop Core
          Issue Type: Improvement
          Components: contrib/ec2
            Reporter: Tom White
            Assignee: Tom White
             Fix For: 0.19.0


Currently hadoop-site.xml for EC2 instances is stored as a part of the image and only a few properties can be controlled from the user scripts (compression, number of map/reduce tasks). Furthermore, it is not possible to rsync the configuration around the EC2 cluster with the current image, so the only way to customize the hadoop-site.xml file is to rebuild the image, which is time-consuming.

It would be much better to pass the initialization script for nodes at boot time, so that it is easy to edit the configuration before starting a cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631846#action_12631846 ] 

Tom White commented on HADOOP-4117:
-----------------------------------

Chris,

Thanks for pointing that out. I can set the execute bit when the scripts are committed. Apart from that, do the changes look OK?



[jira] Commented: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Chris K Wensel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631849#action_12631849 ] 

Chris K Wensel commented on HADOOP-4117:
----------------------------------------

Well, it might be safer to have the script chmod the file on the server when it is pushed up to init.d.

I'll chmod the file locally, test the scripts again tonight, and see whether the mode carries over scp reliably.



[jira] Commented: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Chris K Wensel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631641#action_12631641 ] 

Chris K Wensel commented on HADOOP-4117:
----------------------------------------

ec2-run-user-data is not made executable, so it isn't getting run on startup.

[root@ip-10-251-203-243 init.d]# ls -la ec2-run-user-data 
-rw-r--r-- 1 root root 1763 Sep 16 23:17 ec2-run-user-data



[jira] Commented: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Chris K Wensel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631989#action_12631989 ] 

Chris K Wensel commented on HADOOP-4117:
----------------------------------------

+1. Everything booted up this time. Looks great.



[jira] Updated: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4117:
------------------------------

    Attachment: hadoop-4117.patch

This patch passes the boot script as user data to the EC2 instance on launch. This makes it easy to change the configuration by editing the boot script. You can change other settings in the script too. For example, it would be the obvious place to add setup code for HBase (see HBASE-838). I've also changed the script to fix HADOOP-3783.
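
To illustrate the idea of a boot script that writes the configuration at startup, here is a minimal sketch. It is not the script from the patch: the file name init-remote.sh and the variables CONF_DIR and MAX_MAP_TASKS are made up for illustration, and the script is run locally against a temporary directory instead of being passed to an instance as user data.

```shell
workdir="$(mktemp -d)"
boot_script="$workdir/init-remote.sh"   # hypothetical name, not from the patch

# Write a tiny boot script. In a real cluster this file would be sent as EC2
# user data; here it is only run locally against a temp directory.
cat > "$boot_script" <<'EOF'
#!/bin/sh
# CONF_DIR and MAX_MAP_TASKS are illustrative names, not taken from the patch.
CONF_DIR="${CONF_DIR:-/tmp/hadoop-conf}"
MAX_MAP_TASKS="${MAX_MAP_TASKS:-2}"
mkdir -p "$CONF_DIR"
# Generate hadoop-site.xml at boot instead of baking it into the image.
cat > "$CONF_DIR/hadoop-site.xml" <<XML
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>$MAX_MAP_TASKS</value>
  </property>
</configuration>
XML
EOF

# Changing a setting means editing (or parameterizing) the script, not the image.
CONF_DIR="$workdir/conf" MAX_MAP_TASKS=4 sh "$boot_script"

# Read the value back from the generated file.
value="$(sed -n 's:.*<value>\(.*\)</value>.*:\1:p' "$workdir/conf/hadoop-site.xml")"
echo "map tasks maximum: $value"
rm -rf "$workdir"
```

Because the file is generated when the script runs, a different value only requires a different script or environment, with no image rebuild.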



[jira] Updated: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated HADOOP-4117:
------------------------------

    Attachment: hadoop-4117-v2.patch

I did a local chmod and it worked - I've used these scripts a few times. But here's a new patch where the file's mode is changed on the instance, which should be more reliable (although scp -p would be an alternative).
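
The approach of setting the mode at the destination can be sketched as follows. This is a local stand-in, not the code in hadoop-4117-v2.patch: cp takes the place of the remote copy, and the function name install_init_script and the paths are hypothetical.

```shell
install_init_script() {
  # Copy $1 to $2, then set the execute bits explicitly so the result does
  # not depend on whether the copy step preserved the source file's mode.
  cp "$1" "$2" && chmod 755 "$2"
}

workdir="$(mktemp -d)"
src="$workdir/ec2-run-user-data"          # local copy of the init script
dst="$workdir/init.d-ec2-run-user-data"   # stand-in for the file under init.d

printf '#!/bin/sh\necho started\n' > "$src"
chmod 755 "$src"

# Simulate a destination that already exists without the execute bit.
: > "$dst"
chmod 644 "$dst"

install_init_script "$src" "$dst"

if [ -x "$dst" ]; then result="executable"; else result="not executable"; fi
echo "$dst is $result"
rm -rf "$workdir"
```

The explicit chmod makes the outcome independent of the mode of the local file and of any file already present at the destination.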



[jira] Resolved: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White resolved HADOOP-4117.
-------------------------------

      Resolution: Fixed
    Release Note: Changed scripts to pass initialization script for EC2 instances at boot time (as EC2 user data) rather than embedding initialization information in the EC2 image. This change makes it easy to customize the hadoop-site.xml file for your cluster before launch, by editing the hadoop-ec2-init-remote.sh script, or by setting the environment variable USER_DATA_FILE in hadoop-ec2-env.sh to run a script of your choice.
    Hadoop Flags: [Reviewed]

I've just committed this.
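
As a rough sketch of the choice described in the release note, the following shows one way a launch script could fall back to the stock init script when USER_DATA_FILE is not set. The function name resolve_user_data_file and the custom file name are hypothetical, and the committed scripts may handle the default differently.

```shell
resolve_user_data_file() {
  # Print the script to send as user data: the caller's choice if
  # USER_DATA_FILE is set and non-empty, otherwise the stock init script.
  echo "${USER_DATA_FILE:-hadoop-ec2-init-remote.sh}"
}

unset USER_DATA_FILE
default_choice="$(resolve_user_data_file)"

USER_DATA_FILE="my-cluster-init.sh"   # hypothetical custom script
custom_choice="$(resolve_user_data_file)"

echo "default: $default_choice"
echo "custom:  $custom_choice"
```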



[jira] Commented: (HADOOP-4117) Improve configurability of Hadoop EC2 instances

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633328#action_12633328 ] 

Hudson commented on HADOOP-4117:
--------------------------------

Integrated in Hadoop-trunk #611 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/611/])
