Posted to common-dev@hadoop.apache.org by "Karl Anderson (JIRA)" <ji...@apache.org> on 2008/11/04 02:11:44 UTC
[jira] Created: (HADOOP-4585) unused and misleading configuration in hadoop-init
unused and misleading configuration in hadoop-init
--------------------------------------------------
Key: HADOOP-4585
URL: https://issues.apache.org/jira/browse/HADOOP-4585
Project: Hadoop Core
Issue Type: Improvement
Components: contrib/ec2
Affects Versions: 0.18.1
Reporter: Karl Anderson
Priority: Minor
src/contrib/ec2/bin/image/hadoop-init is appended to rc.local on all
ec2 cluster boxes. This shell script generates the hadoop-site.xml
configuration file. It starts with some default settings, which are
used to populate the file. These defaults are then overwritten by the
user data (from hadoop-ec2-env.sh) passed to the EC2 instance by
launch-hadoop-master and launch-hadoop-slaves.

This isn't a bug; setting variables in hadoop-ec2-env.sh does the
right thing. However, it's dead and misleading code (well, it misled
me) and running a test Hadoop job to figure out what's going on takes
a little effort.

Suggested change to hadoop-init:

Remove these lines:

    # set defaults
    MAX_TASKS=3
    [ "$INSTANCE_TYPE" == "m1.large" ] && MAX_TASKS=6
    [ "$INSTANCE_TYPE" == "m1.xlarge" ] && MAX_TASKS=12
    MAX_MAP_TASKS=$MAX_TASKS
    MAX_REDUCE_TASKS=$MAX_TASKS

Add a comment before the lines which access the user data:

    # get user data passed in by the ec2 instance launch
    wget -q -O - http://169.254.169.254/latest/user-data | tr ',' '\n' > /tmp/user-data
    source /tmp/user-data
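
The ordering described in this report can be reproduced without an EC2
instance. The following is only a sketch: the USER_DATA string and its
values are made up for illustration and stand in for whatever the
metadata service would return, and a temporary file replaces
/tmp/user-data. It assumes the user data sets MAX_MAP_TASKS and
MAX_REDUCE_TASKS, as the report says hadoop-ec2-env.sh does.

```shell
#!/usr/bin/env bash
# Sketch: shows that defaults set first are overwritten once the
# user data is sourced. No network access is performed.

# Defaults, copied from the lines proposed for removal.
INSTANCE_TYPE="m1.large"   # hypothetical value for this demo
MAX_TASKS=3
[ "$INSTANCE_TYPE" == "m1.large" ] && MAX_TASKS=6
[ "$INSTANCE_TYPE" == "m1.xlarge" ] && MAX_TASKS=12
MAX_MAP_TASKS=$MAX_TASKS
MAX_REDUCE_TASKS=$MAX_TASKS

# Hypothetical user data: comma-separated assignments, standing in for
# the response of http://169.254.169.254/latest/user-data.
USER_DATA="MAX_MAP_TASKS=2,MAX_REDUCE_TASKS=1"

# Same transformation as hadoop-init: one assignment per line, then source.
tmpfile=$(mktemp)
printf '%s\n' "$USER_DATA" | tr ',' '\n' > "$tmpfile"
source "$tmpfile"
rm -f "$tmpfile"

# The sourced values win; the instance-type defaults are never used.
echo "MAX_MAP_TASKS=$MAX_MAP_TASKS MAX_REDUCE_TASKS=$MAX_REDUCE_TASKS"
# prints: MAX_MAP_TASKS=2 MAX_REDUCE_TASKS=1
```

MAX_TASKS still holds 6 at the end, but nothing reads it once the user
data has supplied the two values derived from it, which is why the
defaults look meaningful yet have no effect.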
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HADOOP-4585) unused and misleading configuration in hadoop-init
Posted by "Karl Anderson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Anderson updated HADOOP-4585:
----------------------------------
Description:
src/contrib/ec2/bin/image/hadoop-init is appended to rc.local on all
ec2 cluster boxes. This shell script generates the hadoop-site.xml
configuration file. It starts with some default settings, which are
used to populate the file. These defaults are then overwritten by the
user data (from hadoop-ec2-env.sh) passed to the EC2 instance by
launch-hadoop-master and launch-hadoop-slaves.
This isn't a bug; setting variables in hadoop-ec2-env.sh does the
right thing. However, it's dead and misleading code (well, it misled
me) and running a test Hadoop job to figure out what's going on takes
a little effort.
Suggested change to hadoop-init:
Remove these lines:
{noformat}
# set defaults
MAX_TASKS=3
[ "$INSTANCE_TYPE" == "m1.large" ] && MAX_TASKS=6
[ "$INSTANCE_TYPE" == "m1.xlarge" ] && MAX_TASKS=12
MAX_MAP_TASKS=$MAX_TASKS
MAX_REDUCE_TASKS=$MAX_TASKS
{noformat}
Add a comment before the lines which access the user data:
{noformat}
# get user data passed in by the ec2 instance launch
wget -q -O - http://169.254.169.254/latest/user-data | tr ',' '\n' > /tmp/user-data
source /tmp/user-data
{noformat}
was:
src/contrib/ec2/bin/image/hadoop-init is appended to rc.local on all
ec2 cluster boxes. This shell script generates the hadoop-site.xml
configuration file. It starts with some default settings, which are
used to populate the file. These defaults are then overwritten by the
user data (from hadoop-ec2-env.sh) passed to the EC2 instance by
launch-hadoop-master and launch-hadoop-slaves.
This isn't a bug; setting variables in hadoop-ec2-env.sh does the
right thing. However, it's dead and misleading code (well, it misled
me) and running a test Hadoop job to figure out what's going on takes
a little effort.
Suggested change to hadoop-init:
Remove these lines:
# set defaults
MAX_TASKS=3
[ "$INSTANCE_TYPE" == "m1.large" ] && MAX_TASKS=6
[ "$INSTANCE_TYPE" == "m1.xlarge" ] && MAX_TASKS=12
MAX_MAP_TASKS=$MAX_TASKS
MAX_REDUCE_TASKS=$MAX_TASKS
Add a comment before the lines which access the user data:
# get user data passed in by the ec2 instance launch
wget -q -O - http://169.254.169.254/latest/user-data | tr ',' '\n' > /tmp/user-data
source /tmp/user-data
[jira] Resolved: (HADOOP-4585) unused and misleading configuration in hadoop-init
Posted by "Tom White (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HADOOP-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White resolved HADOOP-4585.
-------------------------------
Resolution: Won't Fix
This was fixed in HADOOP-4117.