Posted to dev@whirr.apache.org by "Bruno Dumon (JIRA)" <ji...@apache.org> on 2011/07/14 18:17:00 UTC

[jira] [Issue Comment Edited] (WHIRR-334) Support for CDH3u0 HBase

    [ https://issues.apache.org/jira/browse/WHIRR-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065345#comment-13065345 ] 

Bruno Dumon edited comment on WHIRR-334 at 7/14/11 4:16 PM:
------------------------------------------------------------

I tried to launch a CDH HBase cluster on EC2 using this patch and hit the opposite problem: the master was not running, while the region servers were.

The master seemed to have exited because of this:

{noformat}
2011-07-14 15:48:26,912 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/hbase.version" - Aborting...
2011-07-14 15:48:26,913 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1469)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:649)
{noformat}

I had the same problem over at WHIRR-240, but there the master survived it (possibly due to improved handling in newer HBase, or maybe due to timing differences). Of course, this was the original topic of this issue ("HBase Master shuts down after only one attempt to access DFS"), and it shows that it is not enough for the namenode to be up: actual datanodes need to be running as well. Maybe I'll go for the ordered role startup after all.
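
To illustrate the idea of waiting for datanodes before starting the master, here is a minimal sketch in Python. It is not part of the patch and not how Whirr implements role ordering. The function name `wait_for_datanodes`, its parameters, and the `count_live_datanodes` probe are all hypothetical; the probe stands in for whatever mechanism reports the number of live datanodes.

```python
import time
from typing import Callable


def wait_for_datanodes(
    count_live_datanodes: Callable[[], int],  # hypothetical probe, supplied by the caller
    minimum: int = 1,
    attempts: int = 30,
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until at least `minimum` datanodes are live, or give up.

    Returns True as soon as the probe reports enough datanodes, and
    False if every attempt is used up first.
    """
    for attempt in range(attempts):
        if count_live_datanodes() >= minimum:
            return True
        # Do not sleep after the final attempt; there is nothing left to wait for.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

A startup script could call this before launching the HBase master and only proceed when it returns True, so that the first write of /hbase/hbase.version has somewhere to be replicated.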

Also strange: all Hadoop Java processes (datanode, tasktracker) appeared twice, as if they had been started twice. I will look into this more tomorrow.
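
One way to check for doubled processes is to count main-class names in a process listing. The sketch below is only an illustration, assuming input in the "<pid> <main class>" form that the JDK's `jps` tool prints; the helper name `duplicated_processes` and the sample PIDs are made up for the example.

```python
from collections import Counter
from typing import Dict


def duplicated_processes(jps_output: str) -> Dict[str, int]:
    """Return main-class names that appear more than once, with their counts.

    Expects lines of the form "<pid> <main class>"; lines that do not
    start with a numeric pid followed by a name are ignored.
    """
    names = []
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].isdigit():
            names.append(parts[1])
    return {name: n for name, n in Counter(names).items() if n > 1}


# Made-up listing for illustration only; these PIDs are not from the cluster above.
sample = "101 DataNode\n102 TaskTracker\n205 DataNode\n206 TaskTracker\n300 Jps\n"
print(duplicated_processes(sample))
```

An empty result would mean no Java main class is running more than once on that node.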

> Support for CDH3u0 HBase
> ------------------------
>
>                 Key: WHIRR-334
>                 URL: https://issues.apache.org/jira/browse/WHIRR-334
>             Project: Whirr
>          Issue Type: Bug
>          Components: service/hbase
>    Affects Versions: 0.5.0
>            Reporter: Andrei Savu
>            Assignee: Andrei Savu
>             Fix For: 0.5.0, 0.6.0
>
>         Attachments: WHIRR-334-2.patch, WHIRR-334-3.patch, WHIRR-334-4.patch, WHIRR-334.patch, WHIRR-334.patch, WHIRR-334.patch
>
>
> Message from Geoff Black on the Github pull request [1]:
> {quote}
> I've updated the cdh services scripts and ZooKeeperClusterActionHandler.java to properly work with CDH3 when setting up an HBase cluster. Tested multiple times with 1 master + 1 region and also 1 master + 5 region on EC2.
> The only issue I ran into was previously documented in https://issues.apache.org/jira/browse/HBASE-1960 where the HBase Master shuts down after only one attempt to access DFS. This is something that should be addressed by the HBase team or a fix integrated by Cloudera into CDH.
> {quote}
> [1] https://github.com/apache/whirr/pull/1 
