Posted to dev@whirr.apache.org by "Andrei Savu (JIRA)" <ji...@apache.org> on 2011/06/23 15:39:47 UTC

[jira] [Created] (WHIRR-334) Support for CDH3u0 HBase

Support for CDH3u0 HBase
------------------------

                 Key: WHIRR-334
                 URL: https://issues.apache.org/jira/browse/WHIRR-334
             Project: Whirr
          Issue Type: Bug
          Components: service/hbase
    Affects Versions: 0.5.0
            Reporter: Andrei Savu
             Fix For: 0.6.0, 0.5.0


Message from Geoff Black on the Github pull request [1]:

{quote}
I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.

The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
{quote}

[1] https://github.com/apache/whirr/pull/1 



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13055635#comment-13055635 ] 

Tom White commented on WHIRR-334:
---------------------------------

> I believe we should commit this one before WHIRR-294.

OK

> Should we consider fixing WHIRR-221 so that we can have a more predictable cluster launch process? 

It sounds like that would be useful in this case. Really we want a DAG of dependencies, not just a list, but a list is probably good enough to start with.
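
To make the list-versus-DAG point concrete, here is a minimal sketch using the standard tsort utility. The role names and the edges between them are assumptions chosen for illustration; they are not taken from Whirr's code.

```shell
#!/bin/sh
# Each input line is "A B", meaning role A must be started before role B.
# Role names and edges are illustrative assumptions, not Whirr's real model.
launch_order() {
  tsort <<'EOF'
namenode datanode
datanode hbase-master
zookeeper hbase-master
hbase-master hbase-regionserver
EOF
}

# Prints one valid start order, one role per line.
launch_order
```

A plain list fixes a single order up front, while the DAG only records the constraints, so independent roles (here namenode and zookeeper) could in principle be started in parallel.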

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Resolved] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu resolved WHIRR-334.
-------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.5.0)
         Assignee: Bruno Dumon  (was: Andrei Savu)

I've just committed this. Thanks guys! 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Bruno Dumon
>             Fix For: 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334-4.patch

Patch update: added comment explaining the wait-for-namenode loop.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070157#comment-13070157 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

I updated the patch for CDH3u1:
 * configure_cdh_hbase: removed delayed_restart trick
 * configure_cdh_hbase: add hbase.zookeeper.recoverable.waittime
 * configure_cdh_hbase: install daemon package after configuration is performed (otherwise it first starts against default conf), don't restart on debian as it's not necessary.
 * install_cdh_zookeeper/configure_cdh_zookeeper: install daemon package after configuration is performed. This aligns with how things are done in general, and otherwise this will expire clients' ZK sessions due to changed ZK server identities.
 * configure_cdh_zookeeper: the service is now also called 'hadoop-zookeeper-server' on rpm systems
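
The "configure first, then install the daemon package" ordering from the list above can be sketched roughly as follows. This is not the code from the patch: the run helper, the DRY_RUN switch, the config path and the package name used in the example call are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: put the configuration in place first, then install the daemon
# package, so the daemon's first start already sees the final configuration.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# $1 = package manager (apt-get or yum), $2 = config file, $3 = daemon package
configure_then_install() {
  run cp "$2.new" "$2"        # 1. put the final configuration in place
  run "$1" install -y "$3"    # 2. only then install the daemon package
}

# Hypothetical path and package name, shown in dry-run mode.
configure_then_install apt-get /etc/zookeeper/zoo.cfg hadoop-zookeeper-server
```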

I ran the integration tests (of CDH Hadoop, ZooKeeper and HBase) with everything default except for whirr.hardware-id=m1.large. These all ran successfully.

Then I also ran the integration tests on Amazon's own Linux images, which are rpm based, using these properties:
whirr.image-id=us-east-1/ami-221fec4b
jclouds.ec2.ami-owners=137112412989
whirr.login-user=ec2-user
whirr.hardware-id=m1.large

These tests also ran successfully.

IMHO this patch is ready now. Once this is in, I can look into adjusting my other patches to trunk. Probably WHIRR-240 first?

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068199#comment-13068199 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

I'm unable to get the integration tests to work just by applying the patch. The region server was not running. I'm testing using the default image selected by Whirr (imageId=us-east-1/ami-aef607c7, description=411009282317/RightImage_Ubuntu_10.04_x64_v5.6.8.1_EBS) and maybe this is the problem because I see no other difference. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334.patch

Created patch from the pull request. Not tested yet. It looks like it applies cleanly on branch-0.5 and trunk. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063426#comment-13063426 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

I found another issue with the configure_cdh_zookeeper.sh script: the service name for zookeeper is actually different in the RPM and Debian packages, therefore I changed it like this:

  if [ -f /etc/init.d/hadoop-zookeeper ] ; then
    service hadoop-zookeeper restart
  else
    service hadoop-zookeeper-server restart
  fi

Otherwise the patch is the same as the one from June 24.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065823#comment-13065823 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

I realized the wait loop could be easily adjusted to wait for at least one datanode to be up. Since we're wget'ing the hadoop web ui we can as well grep it for the number of datanodes. I've adjusted the patch in this regard (WHIRR-334-5.patch).

Tested on EC2.

In summary, this patch contains the following:
 - the changes from the patch of June 24
 - fix for ZK service script name: https://github.com/bdumon/whirr/commit/0f9910439c2025240828b34af6442ebedd72bca2
 - HDFS-wait-loop before starting HBase: https://github.com/bdumon/whirr/commit/b160a0f8345524fcb9ddd5301550d9fa48b0b865

Here's an example of the output of the HDFS wait loop:

{noformat}
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:38 UTC 2011
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:41 UTC 2011
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:44 UTC 2011
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:47 UTC 2011
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:50 UTC 2011
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:53 UTC 2011
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:56 UTC 2011
hadoop-hbase-master restart: waiting for HDFS to be available -- Fri Jul 15 08:32:59 UTC 2011
Live Datanodes : 2
Restarting Hadoop HBase master daemon: no master to stop because no pid file /var/run/hbase/hbase-hbase-master.pid
Starting Hadoop HBase master daemon: starting master, logging to /var/log/hbase/logs/hbase-hbase-master-ip-10-50-37-175.out
hbase-master.
{noformat}
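
A loop that produces output like the above could look roughly like the following sketch. It is not the code from WHIRR-334-5.patch: the function names, the URL in the usage comment, the 3-second interval and the assumption that the web UI contains the text "Live Datanodes : N" are all illustrative.

```shell
#!/bin/sh
# Sketch of a wait-for-HDFS loop. The page format ("Live Datanodes : N")
# and the polling interval are assumptions for illustration.

# Reads an HTML page on stdin; prints the number of live datanodes (0 if absent).
live_datanodes() {
  n=$(sed -e 's/<[^>]*>//g' \
      | sed -n 's/.*Live Datanodes[^0-9]*\([0-9][0-9]*\).*/\1/p' \
      | head -n 1)
  echo "${n:-0}"
}

# $1 = namenode web UI URL; polls every 3 seconds until at least one
# datanode is reported as live. A failed wget counts as zero datanodes.
wait_for_hdfs() {
  until [ "$(wget -q -O - "$1" | live_datanodes)" -ge 1 ]; do
    echo "waiting for HDFS to be available -- $(date)"
    sleep 3
  done
}

# Example (not run here; host variable and page name are hypothetical):
#   wait_for_hdfs "http://$NAMENODE_HOST:50070/dfshealth.jsp"
```

Keeping the parsing in its own function means it can be checked against a saved copy of the page without a running cluster.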

What I said in my previous comment about double Java processes can be ignored: the duplicate was not Java itself but "su mapred -s /usr/lib/jvm/java-6-sun/bin/java".

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066107#comment-13066107 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

The CDH HBase integration test still hangs for me when using the automatically selected Rightscale Ubuntu 10.04 AMI. Am I the only one seeing this? I will update the default image selection strategy to use a plain Canonical AMI. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068231#comment-13068231 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

I'll give it a try with the default image as well and report back (might not be before Friday).

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065345#comment-13065345 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

I tried to launch a CDH hbase cluster on EC2 using this patch, and I had the opposite problem: the master was not running, the region servers were running.

The master seemed to have exited because of this:

{noformat}
2011-07-14 15:48:26,912 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/hbase.version" - Aborting...
2011-07-14 15:48:26,913 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1469)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:649)
{noformat}

I had the same problem over at WHIRR-240, but there the master survived this (possibly due to improved handling in newer HBase, or maybe due to timing differences).

Also strange: all Hadoop Java processes (datanode, tasktracker) appeared twice, as if they had been started twice.

I'll look some further into this tomorrow and will report back.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070245#comment-13070245 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

+1

Thanks Bruno for taking the time to fix the remaining issues. I'm planning to commit this tomorrow. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063939#comment-13063939 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

With the current state of the patch, I'm able to successfully launch a CDH HBase cluster.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334.patch

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch

[jira] [Assigned] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu reassigned WHIRR-334:
---------------------------------

    Assignee: Andrei Savu

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch, WHIRR-334.patch

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053914#comment-13053914 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

Nit: {{hbase-ec-cdh.properties}} needs to specify a {{whirr.cluster-name}}, and {{location-id}} should probably be {{us-east-1}} (testing on trunk with jclouds 1.0.0). 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334.patch

Updated patch to fix the following issues:
* naming inconsistency as discussed on the email list (whirr.hadoop-install-function renamed to whirr.hadoop.install-function etc.)
* fixed CDH test .properties files
* fixed ZooKeeper install / configure scripts

Unfortunately it's not ready yet. It seems that sometimes the region server does not start as expected. Any ideas? I will keep debugging this. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334-test.patch

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334.patch

Attached an updated version of the patch that addresses some of the minor issues I've noticed. Unfortunately the CDH integration tests are still failing. I will investigate more later today. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064036#comment-13064036 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

Looks good to me! Thanks Bruno for taking the time to work on this. I will test it now and if everything works I will commit. Tom, Adrian any feedback? 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064083#comment-13064083 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

The CDH HBase integration test is still failing for me (getting stuck). I took a look at the servers and it seems the HMaster is running but the regionserver is not. Should we make the regionserver check that the master is running before starting (I guess this is the problem)? 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066901#comment-13066901 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

Hi Andrei, thanks for trying out my patches, much appreciated.

I added a patch to make the tests work. It has two changes:
 * the delay loop in configure_cdh_hbase.sh also needs to be added before starting the thrift server. Otherwise the thrift server might start quite a bit earlier than the master, and it only tries to establish a connection with the master for a limited amount of time.
 * the instance-templates in whirr-hbase-test.properties: changed in the same way as for WHIRR-240. This change is not strictly necessary, since the test also runs successfully without it, so you might want to leave it out. It will be needed, though, once WHIRR-339 gets in (will add a comment there).
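
For readers following along, a delay loop of this kind could look roughly like the sketch below. This is not the code from the patch: retry_until is a hypothetical helper, and the check command, port, and service name in the trailing comment are assumptions.

```shell
# retry_until ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds after each failure. Returns 0 on success, 1 if every attempt failed.
# (Hypothetical helper, not taken from configure_cdh_hbase.sh.)
retry_until() {
  ru_attempts=$1
  ru_delay=$2
  shift 2
  ru_n=1
  while [ "$ru_n" -le "$ru_attempts" ]; do
    if "$@"; then
      return 0
    fi
    ru_n=$((ru_n + 1))
    sleep "$ru_delay"
  done
  return 1
}

# Assumed usage: wait until the master port accepts connections, then start
# the thrift server. MASTER_HOST, port 60000 and the service name are guesses.
#   retry_until 60 5 nc -z "$MASTER_HOST" 60000 && service hadoop-hbase-thrift start
```

Bounding the number of attempts keeps a node whose master never comes up from hanging the configure step forever.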

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Issue Comment Edited] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065345#comment-13065345 ] 

Bruno Dumon edited comment on WHIRR-334 at 7/14/11 4:16 PM:
------------------------------------------------------------

I tried to launch a CDH hbase cluster on EC2 using this patch, and I had the opposite problem: the master was not running, the region servers were running.

The master seemed to have exited because of this:

{noformat}
2011-07-14 15:48:26,912 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/hbase.version" - Aborting...
2011-07-14 15:48:26,913 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1469)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:649)
{noformat}

I had the same problem over at WHIRR-240, but there the master survived this (possibly due to improved handling in newer HBase, or maybe due to timing differences). Of course, this was the original topic of this issue ("HBase Master shuts down after only one attempt to access DFS"), but it shows that it's not enough for the namenode to be up: there need to be actual datanodes as well. Maybe I'll go for the ordered role startup after all.
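
A rough sketch of what "wait for actual datanodes" could mean in a start script follows. It assumes the report printed by "hadoop dfsadmin -report" contains a line of the form "Datanodes available: N (...)", as Hadoop 0.20-era releases do; both function names are hypothetical and this is not what the patch does.

```shell
# live_datanodes: read a dfsadmin report on stdin and print the number of
# available datanodes, or nothing if the expected line is missing.
# (The "Datanodes available:" line format is an assumption.)
live_datanodes() {
  sed -n 's/^Datanodes available: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# wait_for_datanodes [WANT] [ATTEMPTS]
# Polls the namenode every 5 seconds until at least WANT datanodes have
# registered. Returns 1 if that never happens within ATTEMPTS polls.
wait_for_datanodes() {
  wd_want=${1:-1}
  wd_attempts=${2:-60}
  wd_n=0
  while [ "$wd_n" -lt "$wd_attempts" ]; do
    wd_have=$(hadoop dfsadmin -report 2>/dev/null | live_datanodes)
    if [ "${wd_have:-0}" -ge "$wd_want" ]; then
      return 0
    fi
    wd_n=$((wd_n + 1))
    sleep 5
  done
  return 1
}
```

Calling wait_for_datanodes 1 before starting the HBase master would cover the "replicated to 0 nodes" case in the log above, at the cost of a slower start.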

Also strange was that all hadoop Java processes (datanode, tasktracker) appeared twice, as if they had been started twice. Will look into this more tomorrow.

      was (Author: bruno):
    I tried to launch a CDH hbase cluster on EC2 using this patch, and I had the opposite problem: the master was not running, the region servers were running.

The master seemed to have exited because of this:

{noformat}
2011-07-14 15:48:26,912 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/hbase.version" - Aborting...
2011-07-14 15:48:26,913 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1469)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:649)
{noformat}

I had the same problem over at WHIRR-240, but there the master survived this (possibly due to improved handling in newer HBase, or maybe due to timing differences).

Also strange was that all hadoop Java processes (datanode, tasktracker) appeared double, as if they were started twice.

I'll look some further into this tomorrow and will report back.
  
> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334-2.patch

Adds fix for zookeeper service name

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334.patch

I've updated the patch so that we can apply it using the {{patch}} command (Bruno please use {{git diff --no-prefix}} next time). 
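
To illustrate the difference, here is a small self-contained demo in a scratch repository. The function name, file name, and commit identity are all made up for the example.

```shell
# demo_no_prefix_patch: build a throwaway repository, make one change, and
# print the file header lines of the patch that "git diff --no-prefix" writes.
# Runs in a subshell so the caller's working directory is left alone.
demo_no_prefix_patch() (
  set -e
  work=$(mktemp -d)
  cd "$work"
  git init -q repo
  cd repo
  echo "one" > notes.txt
  git add notes.txt
  git -c user.name=demo -c user.email=demo@example.invalid commit -q -m "initial"
  echo "two" >> notes.txt
  # --no-prefix drops the a/ and b/ path prefixes from the diff headers.
  git diff --no-prefix > ../change.patch
  # Show only the "--- file" and "+++ file" header lines.
  grep '^[-+][-+][-+] ' ../change.patch
  rm -rf "$work"
)
```

Because the paths carry no a/ or b/ prefix, such a patch applies from the top of the source tree with "patch -p0 < WHIRR-334.patch". A default "git diff" would need "patch -p1" instead.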

+1 for committing it. It seems to work most of the time, and in the long term we should find a better way of handling the order in which services start.


> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070020#comment-13070020 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

CDH has done a new release and is now on HBase 0.90.3. The part of this patch which does the wait-on-hdfs should in theory not be needed anymore. I'll try this out tomorrow.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053929#comment-13053929 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

I've started a cluster using the provided recipe (an updated version) and everything seems to be working as expected: I've been able to create a table. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334-5.patch

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069592#comment-13069592 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

bq. I've updated the patch so that we can apply it using the patch command (Bruno please use git diff --no-prefix next time). 

ok, thanks for the tip.

I've just done some more tests and noticed that even though all HBase processes started, there was (sometimes, especially with more nodes) a problem with actually using HBase. The configure script is such that it first installs the CDH daemon package (which starts HBase), then changes the configuration, and then restarts HBase. It appears HBase is confused by this configuration change. If I move the daemon package installation after the conf file changes, then it works.

I'm now able to run the integration tests with the default selected images in the default (us) zone. I've run them three times in a row successfully. I did use whirr.hardware-id=m1.large. With the default selected t1.micro, results seem to be less consistent (one run succeeded and one failed).

I'll provide an updated patch tomorrow.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053897#comment-13053897 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

I believe that there is still some work that needs to be done in order to make the integration tests pass. 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054520#comment-13054520 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

Lars, is it possible to make it retry forever? How about using a watchdog process? 

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054519#comment-13054519 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

I've been able to track this down and it's also related to HBASE-1960: it seems the region server shuts down if it's unable to connect to HDFS. I will look for a workaround.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066916#comment-13066916 ] 

Bruno Dumon commented on WHIRR-334:
-----------------------------------

I just learned that the first point in my previous comment should not be necessary thanks to WHIRR-314 (unless WHIRR-339 is applied too, since that patch dropped that setting). So basically the tests should have run with just the original patch applied.

If you still experience any failures, it would be helpful to have:

/tmp/setup-*.sh
/tmp/logs/*
/tmp/jclouds*/*
/var/logs/hbase/**

and a 'ps aux | grep java'


I provided the following extra properties when running the test:

whirr.image-id=eu-west-1/ami-619ea915 (canonical 11.04 instance store EU)
whirr.hardware-id=m1.large
whirr.location-id=eu-west-1a

and I ran this with a jclouds 1.1 snapshot, since jclouds 1.0 has a bug that prevents specifying the zone.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334-test.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Updated] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-334:
------------------------------

    Attachment: WHIRR-334-3.patch

Patch update: added waiting for namenode availability before starting hbase master/regionserver.
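
As an illustration of the general idea (not the code in the attached patch), waiting for the namenode can be sketched as polling its RPC port until a TCP connection succeeds. The helper name wait_for_port, the timeout values, and the port 8020 used in the usage note are assumptions made for this sketch.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Poll host:port until a TCP connection succeeds or timeout expires.

    Hypothetical helper. Returns True once the port accepts a connection
    and False if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)  # wait before the next attempt
```

A start script could call something like wait_for_port(namenode_host, 8020) and launch the HBase master or region server only if it returns True. An open port shows that the namenode process is listening, not that HDFS has left safe mode, so a stricter check may still be needed.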

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 


        

[jira] [Commented] (WHIRR-334) Support for CDH3u0 HBase

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066767#comment-13066767 ] 

Andrei Savu commented on WHIRR-334:
-----------------------------------

Bruno, are the integration tests working for you? I also tried to run them today using the AMI packaged by Canonical with Ubuntu 10.04 LTS, and I had to restart the daemons by hand.

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334-5.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 
