Posted to dev@whirr.apache.org by "Lars George (JIRA)" <ji...@apache.org> on 2011/02/20 12:00:38 UTC

[jira] Created: (WHIRR-240) [HBase] Enable support for HBase 0.90.x

[HBase] Enable support for HBase 0.90.x
---------------------------------------

                 Key: WHIRR-240
                 URL: https://issues.apache.org/jira/browse/WHIRR-240
             Project: Whirr
          Issue Type: New Feature
          Components: service/hbase
    Affects Versions: 0.3.0
            Reporter: Lars George
             Fix For: 0.4.0


HBase 0.90.0 is a difficult release as it needs either CDH or a patched Hadoop (with append) to work. The stock Apache Hadoop tarballs won't do: HBase will not start on them.

One possible approach is to deploy the Apache Hadoop 0.20.2 tarball and then overwrite its core jar with the one supplied by HBase. Since HBase relies on Hadoop being set up by the Whirr service, this would need some surgery that implies service ordering.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078598#comment-13078598 ] 

Andrei Savu commented on WHIRR-240:
-----------------------------------

+1 tested on EC2. Thanks Bruno!


[jira] [Commented] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064639#comment-13064639 ] 

Bruno Dumon commented on WHIRR-240:
-----------------------------------

I added a patch which should solve the problem.

It copies (actually links) the hadoop-core jar of the installed Hadoop version into the HBase lib dir, which is the opposite of what was suggested in the description of this issue. Cloudera does exactly the same in their Linux packages.

This solution requires that Hadoop is installed on each node where HBase is installed. Usually this is the case (hadoop-namenode+hbase-master and hadoop-datanode+hbase-regionserver).

The usual suggestion in the HBase community is to use the CDH Hadoop version. You can combine stock HBase with CDH Hadoop by using the following properties:

whirr.hbase.tarball.url=http://apache.cu.be//hbase/hbase-0.90.3/hbase-0.90.3.tar.gz
whirr.hadoop.tarball.url=http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u0.tar.gz

Note that this uses Cloudera's .tar.gz release, and not Whirr's special CDH support, which uses the Linux packages instead. I do it that way because the CDH Linux packages use different conventions from Whirr (e.g. different Linux users).

I tried this on a small byon cluster and it seems to work fine. It doesn't seem to need the 'wait for namenode' loop before starting hbase as suggested in WHIRR-334, though it might be that this is due to byon timing differences.
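
For readers who want to picture the jar swap described above, here is a minimal shell sketch. It is not the code from the patch: the function name link_hadoop_core, the two directory arguments and the jar name patterns (hadoop-core-*.jar and hadoop-*-core.jar) are assumptions made for illustration only.

```shell
# Sketch only, not the actual configure_hbase.sh change.
# usage: link_hadoop_core HADOOP_HOME HBASE_HOME   (both paths are assumptions)
link_hadoop_core() {
  local hadoop_home="$1" hbase_home="$2" jar found=0
  # Jar naming differs between distributions, so try both assumed patterns.
  for jar in "$hadoop_home"/hadoop-core-*.jar "$hadoop_home"/hadoop-*-core.jar; do
    [ -f "$jar" ] || continue          # an unmatched glob stays literal: skip it
    if [ "$found" -eq 0 ]; then
      # Only drop the jar bundled with HBase once a replacement is known to exist.
      rm -f "$hbase_home"/lib/hadoop-core-*.jar "$hbase_home"/lib/hadoop-*-core.jar
      found=1
    fi
    # Link rather than copy, so HBase always sees the installed Hadoop's jar.
    ln -sf "$jar" "$hbase_home/lib/$(basename "$jar")"
  done
  [ "$found" -eq 1 ]                   # non-zero exit if no Hadoop jar was found
}
```

It would be called as "link_hadoop_core /usr/local/hadoop /usr/local/hbase" (hypothetical paths). The non-zero exit status when no jar is found reflects the limitation mentioned above: the node must have Hadoop installed.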


[jira] [Commented] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066888#comment-13066888 ] 

Bruno Dumon commented on WHIRR-240:
-----------------------------------

I had not gotten around to trying the integration tests before (sorry); I'm looking into it now.

I identified the following problems:

* whirr.instance-templates: the thriftserver ran on a node without Hadoop. This does not work, because there is then no hadoop-core jar to copy over. This will also be a problem for WHIRR-339 (hbase-site.xml generation); I'll add a comment over there. The 'solution' is to adjust the template. I think this is fair enough, since there are other limitations on how templates must be structured too.

* Whirr decides to configure the node with the regionserver first, and the one with ZooKeeper after that. This fails because on startup the regionserver only waits/retries for a limited time to connect to ZooKeeper (20s; I checked in the logs that ZooKeeper actually started about 10s after the regionserver exited). Luckily, there is a property to control this: hbase.zookeeper.recoverable.waittime. I set it to 5 minutes (this property only has an effect on startup, AFAICS).

With these changes, the test runs successfully for me.
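
As an illustration of the ZooKeeper wait-time change, the following shell sketch prints the corresponding entry in the standard Hadoop/HBase *-site.xml property format. The helper name hbase_site_property and the variable ZK_WAIT_MS are hypothetical, and the value assumes the property is interpreted in milliseconds; this is not taken from the patch itself.

```shell
# Print one <property> element in the Hadoop/HBase *-site.xml format.
# usage: hbase_site_property NAME VALUE   (helper name is hypothetical)
hbase_site_property() {
  printf '<property>\n  <name>%s</name>\n  <value>%s</value>\n</property>\n' "$1" "$2"
}

# 5 minutes, assuming the property is read as a number of milliseconds.
ZK_WAIT_MS=$((5 * 60 * 1000))

# Example call; the output would go inside <configuration> in hbase-site.xml:
#   hbase_site_property hbase.zookeeper.recoverable.waittime "$ZK_WAIT_MS"
```

The point of a generous value is only to let the regionserver outlast a late ZooKeeper start; it does not change the order in which Whirr configures nodes.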

To make the tests work for both 0.89 and 0.90, I had to add an additional test class and properties file; the name of the properties file is passed to HBaseServiceController.getInstance(). Let me know if you prefer to have it done some other way.

Note that, in services/hbase/pom.xml, I left the hbase.version property on 0.89etcetera which does work for both since it uses the thrift interface which seems to be compatible.

The tests for 0.89 still run, but since copying hadoop-core is done regardless of the HBase version, the above mentioned limitation of needing to have Hadoop installed on the node with thrift now also applies to 0.89.

The hbase.zookeeper.recoverable.waittime property is specified in the cluster configuration, which depends on WHIRR-339. Without WHIRR-339, it needs to be hardcoded in configure_(cdh_)hbase.sh.

Through the properties passed with -Dconfig to the integration test, I supplied the following:

whirr.image-id=eu-west-1/ami-619ea915 (canonical 11.04 instance store EU)
whirr.hardware-id=m1.large
whirr.location-id=eu-west-1a


[jira] Updated: (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-240:
------------------------------

    Fix Version/s:     (was: 0.4.0)


[jira] [Updated] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-240:
------------------------------

    Attachment: hbase-ec2-090.properties


[jira] [Commented] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065336#comment-13065336 ] 

Bruno Dumon commented on WHIRR-240:
-----------------------------------

Added hbase-ec2-090 recipe. As EC2 is working smoothly for me now, I was able to test it. The only difference is that I tested with Ubuntu 11.04 in eu-west (shouldn't make a difference).

HBase master & region server are running, hbase master web ui shows the region servers.

Initially during startup there were some errors of the following kind (in the master's log):

{noformat}
2011-07-14 15:16:51,115 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1469)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:649)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
{noformat}

presumably because the datanodes were not available yet (seems like a reason to have the ordered service startup after all).
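
To make the "ordered startup" idea concrete, here is a hedged shell sketch of a generic retry helper that could guard the HBase start until HDFS accepts writes. Nothing like this is in the patch: retry_until_ok and hdfs_write_probe are hypothetical names, the probe path /whirr-probe is made up, and the probe itself is an untested assumption about how to detect a live datanode.

```shell
# Retry a command until it succeeds or the attempts run out.
# usage: retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_until_ok() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0                       # success: stop retrying
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"                       # pause before the next attempt
    fi
    i=$((i + 1))
  done
  return 1                                 # every attempt failed
}

# Hypothetical probe: writing a non-empty file should need a live datanode.
hdfs_write_probe() {
  hadoop fs -rm /whirr-probe >/dev/null 2>&1
  echo probe | hadoop fs -put - /whirr-probe
}

# Example (not run here): wait up to roughly 5 minutes before starting HBase.
#   retry_until_ok 30 10 hdfs_write_probe
```

Passing the probe as a command keeps the retry logic independent of how readiness is checked, so a different check can be swapped in without touching the loop.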

I didn't put an actual workload against it, but created a table and saw no further errors in the logs (checked on the actual regionserver that hosted the table, it was able to create its dfs files fine).


[jira] [Resolved] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu resolved WHIRR-240.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 0.6.0
         Assignee: Bruno Dumon

I've just committed this. Thanks Bruno! Integration tests also work on cloudservers. We can fix any remaining issues or make changes in new JIRAs.


[jira] [Commented] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064882#comment-13064882 ] 

Andrei Savu commented on WHIRR-240:
-----------------------------------

Bruno, thanks for taking the time to contribute this patch.

We should probably also create a recipe for this that would contain the relevant URLs:

{noformat}
whirr.hbase.tarball.url=http://apache.cu.be//hbase/hbase-0.90.3/hbase-0.90.3.tar.gz
whirr.hadoop.tarball.url=http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u0.tar.gz
{noformat}

I will look more into that timing issue tonight. 


[jira] [Commented] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065105#comment-13065105 ] 

Bruno Dumon commented on WHIRR-240:
-----------------------------------

Indeed, we need a recipe, something like "hbase-ec2-090.properties". It should be trivial to make, but maybe someone else can double-check this patch (I had trouble testing it on EC2 yesterday, cfr. the bootstrap phase timeout problems).

Since there was some discussion on the users list as to whether this jar-replacing is a good approach, I'd like to throw in my arguments for adding this:

 * most importantly, HBase users currently looking into Whirr will be disappointed as they can't run any recent version

 * from Andrew's comment, I take it that even with HBase 0.92, people might want to run different Hadoop versions (e.g. CDH), which will likely still require replacing the hadoop-core jar in HBase, for example because of differences in the RPC protocol.

Besides this, it is 'the' approach in HBase land, cfr http://hbase.apache.org/book/hadoop.html: 'Replace the hadoop jar found in the HBase lib directory with the hadoop jar you are running on your cluster to avoid version mismatch issues'.


[jira] [Updated] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-240:
------------------------------

    Attachment: WHIRR-240-tests.patch


[jira] [Updated] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-240:
------------------------------

    Attachment: WHIRR-240.patch


[jira] [Updated] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Bruno Dumon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruno Dumon updated WHIRR-240:
------------------------------

    Attachment: WHIRR-240.patch

Merged the various aspects of the previous patches and tested with current trunk.

This patch includes:
 * a change to configure_hbase.sh to replace HBase's hadoop-core jar with the one from the actually installed Hadoop
 * a sample recipe
 * separate integration tests for HBase 0.89 and HBase 0.90

Ran integration tests with all settings default (except whirr.hardware-id=m1.large) and tested the recipe manually.


[jira] [Commented] (WHIRR-240) [HBase] Enable support for HBase 0.90.x

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066119#comment-13066119 ] 

Tom White commented on WHIRR-240:
---------------------------------

Does this still support 0.89 HBase?
