Posted to user@whirr.apache.org by Jonathan Rhone <rh...@tinyco.com> on 2011/10/20 05:55:09 UTC

Whirr 0.6 + Cassandra 1.0

Has anyone used Whirr 0.6 with Cassandra 1.0 and a custom EC2 AMI
successfully? The properties file below consistently fails for me.

whirr.cluster-name=cassandraDev
whirr.instance-templates=3 cassandra
whirr.provider=aws-ec2
whirr.identity=[...]
whirr.credential=[...]
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
whirr.hardware-id=m1.large
whirr.image-id=us-east-1/ami-1136fb78
whirr.location-id=us-east-1
whirr.cassandra.version.major=1.0
whirr.cassandra.tarball.url=http://archive.apache.org/dist/cassandra/1.0.0/apache-cassandra-1.0.0-bin.tar.gz


Whirr launch-cluster output:
Bootstrapping cluster
Configuring template
Starting 3 node(s) with roles [cassandra]
Dying because - java.net.SocketException: Connection reset
Dying because - java.net.SocketException: Connection reset
<<kex done>> woke to: net.schmizz.sshj.transport.TransportException:
Connection reset
<< (ubuntu@50.17.156.68:22) error acquiring SSHClient(ubuntu@50.17.156.68:22):
Connection reset
net.schmizz.sshj.transport.TransportException: Connection reset
at
net.schmizz.sshj.transport.TransportException$1.chain(TransportException.java:33)
..
..
..
Dying because - java.net.SocketTimeoutException: Read timed out
<<authenticated>> woke to: net.schmizz.sshj.userauth.UserAuthException:
publickey auth failed
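
For reference, a minimal manual check of the same SSH path (a sketch, not
Whirr output; the IP and login user come from the errors above and the key
path from the properties file):

    ssh -i ~/.ssh/id_rsa -o StrictHostKeyChecking=no ubuntu@50.17.156.68 'echo ok'

If this also fails, the custom AMI's login user or authorized key setup is
the more likely culprit than Whirr itself.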


Whirr.log: (a bunch of the messages warn that 'some action has already been
attempted')
..
..
2011-10-20 03:25:09,465 INFO
 [org.apache.whirr.actions.ConfigureClusterAction] (main) Running
configuration script on nodes: [us-east-1/i-4e30222e, us-east-1/i-50302230,
us-east-1/i-52302232]
2011-10-20 03:25:09,467 DEBUG
[org.apache.whirr.actions.ConfigureClusterAction] (main) script:
#!/bin/bash
set +u
shopt -s xpg_echo
..
..
..
configure_cassandra -c aws-ec2 10.46.38.219 || exit 1
start_cassandra || exit 1
exit 0

2011-10-20 03:25:10,142 DEBUG [jclouds.compute] (user thread 8) >> blocking
on socket [address=50.17.156.68, port=22] for 600000 seconds
2011-10-20 03:25:10,143 DEBUG [jclouds.compute] (user thread 7) >> blocking
on socket [address=107.20.113.254, port=22] for 600000 seconds
2011-10-20 03:25:10,144 DEBUG [jclouds.compute] (user thread 1) >> blocking
on socket [address=107.22.52.0, port=22] for 600000 seconds
2011-10-20 03:25:10,145 DEBUG [jclouds.compute] (user thread 7) << socket
[address=107.20.113.254, port=22] opened
2011-10-20 03:25:10,145 DEBUG [jclouds.compute] (user thread 8) << socket
[address=50.17.156.68, port=22] opened
2011-10-20 03:25:10,145 DEBUG [jclouds.compute] (user thread 1) << socket
[address=107.22.52.0, port=22] opened
2011-10-20 03:29:19,993 DEBUG [org.apache.whirr.service.ComputeCache]
(Thread-1) closing ComputeServiceContext  [id=aws-ec2, endpoint=
https://ec2.us-east-1.amazonaws.com, apiVersion=2010-06-15,
identity=AKIAJFI6GBE4GISI4PJQ, iso3166Codes=[US-VA, US-CA, IE, SG, JP-13]]

Thanks,
-- 

jon

Re: whirr.hadoop.tarball.url property or specifying exact releases

Posted by Andrei Savu <sa...@gmail.com>.
I have created a new JIRA issue for this:

https://issues.apache.org/jira/browse/WHIRR-415

It seems like the scripts automatically go for the latest release.

-- Andrei Savu

On Fri, Oct 28, 2011 at 12:48 AM, Andrei Savu <sa...@gmail.com> wrote:

> [...]

Re: whirr.hadoop.tarball.url property or specifying exact releases

Posted by Andrei Savu <sa...@gmail.com>.
Paul,

Thanks for sharing this impressive recipe. The problem you are seeing is not
really a problem; it's more like a "feature". The tarball URLs are completely
ignored if you are deploying CDH. All the binaries are deployed from the
Cloudera CDH repos (check the install_cdh_* / configure_cdh_* functions for
zookeeper, hbase and hadoop).
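
As a rough illustration of why the tarball URL never matters on the CDH path
(a sketch only, not the actual install_cdh_hadoop source; the repo line and
package name are the usual CDH3 ones and may differ for your image):

    # add the Cloudera CDH3 apt repo and install from packages
    echo 'deb http://archive.cloudera.com/debian lucid-cdh3 contrib' \
      > /etc/apt/sources.list.d/cloudera.list
    curl -s http://archive.cloudera.com/debian/archive.key | apt-key add -
    apt-get update
    apt-get install -y hadoop-0.20   # whatever release the repo currently serves

So whichever cdh3uX is current in the repo is what ends up on the nodes.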

I will start to look for a way of deploying cdh3u1 - we should be able to
specify this as a version.

-- Andrei

On Thu, Oct 27, 2011 at 10:48 PM, Paul Baclace <pa...@gmail.com> wrote:

> [...]

Re: whirr.hadoop.tarball.url property or specifying exact releases

Posted by Paul Baclace <pa...@gmail.com>.
On 20111027 2:20, Andrei Savu wrote:
>
> On Thu, Oct 27, 2011 at 12:05 PM, Paul Baclace <paul.baclace@gmail.com> wrote:
>
>     I don't expect that the cdh3u2 files came from a cdh3u1 tarball. 
>
>
> I see no cdh3u2 files inside that tarball. Can you share the full 
> .properties file?
My best guess is that some other installation specification 
(*.install-function prop) has the side effect of overriding the tarball 
property, if there is a more recent CDH release.  If that is the case, 
then either the tarball.url props need to be documented as "set all 
tarballs or none" (a dicey feature) or the installation logic must allow 
a single tarball.url to override implied installations (be they tarball 
or packages).
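
One quick way to confirm which path was actually taken (a sketch, assuming
the DEBUG "script:" dump shown earlier in this thread also appears in your
whirr.log) is to grep the generated script for the installer functions:

    grep -n -E 'install_cdh_hadoop|install_hadoop|tarball' whirr.log | head -40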

The actual, full whirr.config from the particular run is below (with 
sensitive bits removed).  Some values are supplied by env on the 
launching host.


Paul

--------------------whirr.config----------------
hadoop-common.fs.checkpoint.dir=/mnt/hadoop/dfs/namesecondary
hadoop-common.fs.s3.awsAccessKeyId=XXXXXXXXXXXXXX
hadoop-common.fs.s3.awsSecretAccessKey=YYYYYYYYYYYYYYYYYYY
hadoop-common.fs.s3bfs.awsAccessKeyId=XXXXXXXXXXXXXX
hadoop-common.fs.s3bfs.awsSecretAccessKey=YYYYYYYYYYYYYYYYYYY
hadoop-common.fs.s3.block.size=${env:BLOCK_SIZE}
hadoop-common.fs.s3.maxRetries=20
hadoop-common.fs.s3n.awsAccessKeyId=XXXXXXXXXXXXXX
hadoop-common.fs.s3n.awsSecretAccessKey=YYYYYYYYYYYYYYYYYYY
hadoop-common.fs.s3.sleepTimeSeconds=4
hadoop-common.hadoop.tmp.dir=/mnt/hadoop/tmp/user_${user.name}
hadoop-common.io.file.buffer.size=65536
hadoop-common.io.sort.factor=25
hadoop-common.io.sort.mb=100
hadoop-common.webinterface.private.actions=true
hadoop-hdfs.dfs.block.size=${env:BLOCK_SIZE}
hadoop-hdfs.dfs.data.dir=/mnt/hadoop/dfs/data
hadoop-hdfs.dfs.datanode.du.reserved=500000000
hadoop-hdfs.dfs.datanode.max.xcievers=1000
hadoop-hdfs.dfs.heartbeat.interval=1
hadoop-hdfs.dfs.name.dir=/mnt/hadoop/dfs/name
hadoop-hdfs.dfs.permissions=false
hadoop-hdfs.dfs.replication=${env:REPLICATION_FACTOR}
hadoop-hdfs.dfs.support.append=true
hadoop-mapreduce.keep.failed.task.file=true
hadoop-mapreduce.mapred.child.java.opts=-Xmx550m -Xms200m -Djava.net.preferIPv4Stack=true
hadoop-mapreduce.mapred.child.ulimit=1126400
hadoop-mapreduce.mapred.compress.map.output=true
hadoop-mapreduce.mapred.job.reuse.jvm.num.tasks=1
hadoop-mapreduce.mapred.jobtracker.completeuserjobs.maximum=1000
hadoop-mapreduce.mapred.local.dir=/mnt/hadoop/mapred/local/user_${user.name}
hadoop-mapreduce.mapred.map.max.attempts=2
hadoop-mapreduce.mapred.map.tasks=${env:N_MAP_TASKS_JOB_DEFAULT}
hadoop-mapreduce.mapred.map.tasks.speculative.execution=false
hadoop-mapreduce.mapred.min.split.size=${env:BLOCK_SIZE}
hadoop-mapreduce.mapred.output.compression.type=BLOCK
hadoop-mapreduce.mapred.reduce.max.attempts=2
hadoop-mapreduce.mapred.reduce.tasks=${env:N_REDUCE_TASKS_JOB_DEFAULT}
hadoop-mapreduce.mapred.reduce.tasks.speculative.execution=false
hadoop-mapreduce.mapred.system.dir=/hadoop/system/mapred
hadoop-mapreduce.mapred.tasktracker.map.tasks.maximum=${env:N_MAP_TASKS_PER_TRACKER}
hadoop-mapreduce.mapred.tasktracker.reduce.tasks.maximum=${env:N_REDUCE_TASKS_PER_TRACKER}
hadoop-mapreduce.mapred.temp.dir=/mnt/hadoop/mapred/temp/user_${user.name}
hadoop-mapreduce.mapreduce.jobtracker.staging.root.dir=/user
hbase-site.dfs.datanode.max.xcievers=1000
hbase-site.dfs.replication=2
hbase-site.dfs.support.append=true
hbase-site.hbase.client.pause=3000
hbase-site.hbase.cluster.distributed=true
hbase-site.hbase.rootdir=${fs.default.name}user/hbase
hbase-site.hbase.tmp.dir=/mnt/hbase/tmp
hbase-site.hbase.zookeeper.property.dataDir=/mnt/zookeeper/snapshot
hbase-site.hbase.zookeeper.property.initLimit=30
hbase-site.hbase.zookeeper.property.maxClientCnxns=2000
hbase-site.hbase.zookeeper.property.syncLimit=10
hbase-site.hbase.zookeeper.property.tickTime=6000
hbase-site.hbase.zookeeper.quorum=${fs.default.name}
hbase-site.zookeeper.session.timeout=120000
jclouds.aws-s3.endpoint=us-west-1
jclouds.ec2.ami-query=owner-id=999999999999;state=available;image-type=machine;root-device-type=instance-store;architecture=x86_32
jclouds.ec2.cc-regions=us-west-1
jclouds.ec2.timeout.securitygroup-present=1500
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
whirr.hadoop.configure-function=configure_cdh_hadoop
whirr.hadoop.install-function=install_cdh_hadoop
whirr.hadoop.tarball.url=http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u1.tar.gz
whirr.hadoop.version=0.20.2
whirr.hardware-id=c1.medium
whirr.hbase.configure-function=configure_cdh_hbase
whirr.hbase.install-function=install_cdh_hbase
whirr.hbase.tarball.url=http://apache.cs.utah.edu/hbase/hbase-0.90.3/hbase-0.90.3.tar.gz
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.image-id=us-west-1/ami-ffffffff
whirr.instance-templates=1 hadoop-jobtracker+hadoop-namenode+hbase-master+zookeeper+ganglia-metad,2 hadoop-datanode+hadoop-tasktracker+hbase-regionserver+ganglia-monitor
whirr.instance-templates-minimum-number-of-instances=1 hadoop-jobtracker+hadoop-namenode+hbase-master+zookeeper+ganglia-metad,2 hadoop-datanode+hadoop-tasktracker+hbase-regionserver+ganglia-monitor
whirr.location-id=us-west-1
whirr.login-user=ubuntu
whirr.max-startup-retries=4
whirr.private-key-file=${sys:user.home}/.ssh/id_rsa
whirr.provider=aws-ec2
whirr.public-key-file=${sys:user.home}/.ssh/id_rsa.pub
whirr.zookeeper.configure-function=configure_cdh_zookeeper
whirr.zookeeper.install-function=install_cdh_zookeeper
--------------------

Re: whirr.hadoop.tarball.url property or specifying exact releases

Posted by Andrei Savu <sa...@gmail.com>.
On Thu, Oct 27, 2011 at 12:05 PM, Paul Baclace <pa...@gmail.com> wrote:

> I don't expect that the cdh3u2 files came from a cdh3u1 tarball.


I see no cdh3u2 files inside that tarball. Can you share the full
.properties file?

Re: whirr.hadoop.tarball.url property or specifying exact releases

Posted by Paul Baclace <pa...@gmail.com>.
On 20111023 5:37, Andrei Savu wrote:
>
> On Sun, Oct 23, 2011 at 1:34 AM, Paul Baclace <paul.baclace@gmail.com> wrote:
>
>     Is the above property insufficient for specifying exact releases?
>      What else should I do or not do?
>
>
> It should be enough
I guess that means it is a bug.  I don't expect that the cdh3u2 files 
came from a cdh3u1 tarball.

> but if you need to be sure that you are never affected by external 
> changes you
> should probably consider hosting the required artefacts somewhere else 
> (e.g. S3 public bucket).
>
I want to avoid downloads by using *.tarball.url settings with the remote://
prefix to get files from my private image; however, that does require that
the tarball.url setting works.


Paul


Re: whirr.hadoop.tarball.url property or specifying exact releases

Posted by Andrei Savu <sa...@gmail.com>.
On Sun, Oct 23, 2011 at 1:34 AM, Paul Baclace <pa...@gmail.com> wrote:

> Is the above property insufficient for specifying exact releases?  What
> else should I do or not do?


It should be enough, but if you need to be sure that you are never affected
by external changes you should probably consider hosting the required
artefacts somewhere else (e.g. an S3 public bucket).
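
A minimal sketch of that approach (assuming s3cmd is configured; the bucket
name is just a placeholder):

    # mirror the exact tarball once, then point whirr.hadoop.tarball.url at it
    wget http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u1.tar.gz
    s3cmd mb s3://my-whirr-artifacts
    s3cmd put --acl-public hadoop-0.20.2-cdh3u1.tar.gz s3://my-whirr-artifacts/
    # whirr.hadoop.tarball.url=http://my-whirr-artifacts.s3.amazonaws.com/hadoop-0.20.2-cdh3u1.tar.gz

That way an upstream release (cdh3u1 to cdh3u2) cannot change what the
cluster downloads.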

whirr.hadoop.tarball.url property or specifying exact releases

Posted by Paul Baclace <pa...@gmail.com>.
I almost always want to pick the exact release of what is installed.  I 
have been using  whirr.hadoop.tarball.url (and similar props) to specify 
versions (with the side-effect that I also specify from where it is 
downloaded).  Building a cluster for interactive, exploratory use might 
work fine with obtaining the latest of everything, but I want Whirr to 
help me spawn something that I previously characterized so that it has 
predictable performance.

I thought that using:

     whirr.hadoop.tarball.url=http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u1.tar.gz

would ensure that I would get cdh3u1, but since the release of 
hadoop-0.20.2-cdh3u2.tar.gz on Oct. 20, I see lots of cdh3u2 components 
installed, as seen in the names of jar files.  Aside from being untested 
for my purposes, this also breaks some of my post-processing scripts for 
setting up Hive + HBase.
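
The mismatch shows up directly in the jar names on a node. A quick check (the
install locations here are the usual CDH/tarball defaults and may differ on
your image):

    ssh ubuntu@<some-node> 'ls /usr/lib/hadoop*/hadoop-core-*.jar /usr/local/hadoop*/hadoop-core-*.jar 2>/dev/null'
    # e.g. hadoop-core-0.20.2-cdh3u2.jar where -cdh3u1 was expected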

Is the above property insufficient for specifying exact releases?  What 
else should I do or not do?


Paul


Re: Whirr 0.6 + Cassandra 1.0

Posted by Andrei Savu <sa...@gmail.com>.
Jonathan,

We've tested the 0.6.0 release against Cassandra 0.7. I think we should
consider upgrading to 1.0 in Whirr 0.7.0.

Can you open an issue for this? You can give it a try if you want. It's not
difficult to develop / modify a Whirr service.
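
For anyone wanting to try, a rough pointer (the URL and paths are from memory
and may differ): each service is driven by a handful of bash functions, so a
Cassandra 1.0 attempt mostly means adjusting those scripts and the default
version/tarball.

    svn co http://svn.apache.org/repos/asf/whirr/trunk whirr-trunk
    ls whirr-trunk/services/cassandra/src/main/resources/functions/
    # expect something like install_cassandra.sh / configure_cassandra.sh here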

Cheers,

-- Andrei Savu

On Thu, Oct 20, 2011 at 6:55 AM, Jonathan Rhone <rh...@tinyco.com> wrote:

> [...]