Posted to dev@hbase.apache.org by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org> on 2009/12/18 20:57:18 UTC

[jira] Created: (HBASE-2057) Cluster won't stop

Cluster won't stop
------------------

                 Key: HBASE-2057
                 URL: https://issues.apache.org/jira/browse/HBASE-2057
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.21.0
            Reporter: Jean-Daniel Cryans
             Fix For: 0.21.0


It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Gary Helmling (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-2057:
---------------------------------

    Attachment: HBASE-2057-2_0.20.patch

Sorry, please use this version of the patch.  This changes the final "exec" line from:

{code}
exec "$JAVA" $JAVA_HEAP_MAX $HBASE_OPTS -classpath "$CLASSPATH" $CLASS "$ACTION" "$@"
{code}

to

{code}
exec "$JAVA" $JAVA_HEAP_MAX $HBASE_OPTS -classpath "$CLASSPATH" $CLASS $ACTION "$@"
{code}

The quoting of $ACTION in the first version causes "hbase shell" to not work.
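
The difference between the two lines comes down to ordinary shell word splitting. As a minimal sketch (the count_args helper is hypothetical, and it assumes $ACTION is empty when running "hbase shell", which this thread does not state explicitly): a quoted empty variable is still passed as one empty argument, while an unquoted one disappears from the command line.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() {
  echo "$#"
}

# Assumption: ACTION is empty for commands such as "hbase shell".
ACTION=""

quoted=$(count_args "$ACTION")    # the empty string is still passed as one argument
unquoted=$(count_args $ACTION)    # the empty expansion is dropped entirely

echo "quoted: $quoted, unquoted: $unquoted"
```

Under that assumption, the quoted form would hand the target class an extra empty argument, which is consistent with the breakage described above.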

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Gary Helmling (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-2057:
---------------------------------

    Attachment: HBASE-2057_0.20.patch

Here's a patch to the bin/hbase script that adds the service-specific options (used for JMX, among other things) only when the action passed is "start".

This works well in my case, but I would definitely appreciate more eyes.  Is anyone using these env variables for other options that might be needed in the case of a "stop"?
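
A minimal sketch of the approach described here, not the attached patch itself. The function name build_opts and the option values are hypothetical; the only point is that the service-specific options are appended when the action is "start" and left out otherwise.

```shell
# Hypothetical sketch: append service-specific options only on "start".
# Usage: build_opts BASE_OPTS SERVICE_OPTS ACTION
build_opts() {
  base_opts=$1
  service_opts=$2
  action=$3
  if [ "$action" = "start" ]; then
    echo "$base_opts $service_opts"
  else
    echo "$base_opts"
  fi
}

# Illustrative values only; the JMX port number is an assumption.
start_opts=$(build_opts "-Xmx1000m" "-Dcom.sun.management.jmxremote.port=10101" start)
stop_opts=$(build_opts "-Xmx1000m" "-Dcom.sun.management.jmxremote.port=10101" stop)

echo "start: $start_opts"
echo "stop:  $stop_opts"
```

With this shape, the JVM launched for "stop" never receives the JMX port setting, so it has no port to bind.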

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057_0.20.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Gary Helmling (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-2057:
---------------------------------

    Attachment: HBASE-2057-2_trunk.patch

Patch for bin/hbase script against trunk.  Same as v2 of 0.20 branch patch.

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch, HBASE-2057-2_trunk.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Commented: (HBASE-2057) Cluster won't stop

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803391#action_12803391 ] 

Lars George commented on HBASE-2057:
------------------------------------

Just looked at the v3 patch and it looks good to me.

+1

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch, HBASE-2057-2_trunk.patch, HBASE-2057-3_0.20.patch, HBASE-2057-3_trunk.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2057:
--------------------------------------

    Fix Version/s:     (was: 0.20.3)
                   0.20.4

Searching on the intertubes, I found that Tomcat had the same problem years ago (JMX binding). They fixed it by having a special env variable for just "start" that isn't used on "stop". See http://svn.apache.org/viewvc/tomcat/tc6.0.x/trunk/bin/catalina.sh?diff_format=h&r1=558523&r2=558522&pathrev=558523

We should probably do the same. Punting to 0.20.4

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Gary Helmling (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-2057:
---------------------------------

    Attachment: HBASE-2057-3_trunk.patch

Corresponding v3 of patch against trunk.

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch, HBASE-2057-2_trunk.patch, HBASE-2057-3_0.20.patch, HBASE-2057-3_trunk.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2057:
--------------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.20.4)
                   0.20.3
         Assignee: Gary Helmling  (was: Jean-Daniel Cryans)
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed to branch and trunk. Thanks Gary!

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Gary Helmling
>             Fix For: 0.20.3, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch, HBASE-2057-2_trunk.patch, HBASE-2057-3_0.20.patch, HBASE-2057-3_trunk.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Gary Helmling (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-2057:
---------------------------------

    Attachment:     (was: HBASE-2057_0.20.patch)

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-2057:
--------------------------------------

    Affects Version/s: 0.20.3
        Fix Version/s: 0.20.3
             Assignee: Jean-Daniel Cryans

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.3, 0.21.0
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Commented: (HBASE-2057) Cluster won't stop

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12792682#action_12792682 ] 

Jean-Daniel Cryans commented on HBASE-2057:
-------------------------------------------

The first thing I will commit: if the shutdown znode exists, we should not print a huge exception when starting a Master.

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.21.0
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Gary Helmling (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-2057:
---------------------------------

    Attachment: HBASE-2057-3_0.20.patch

After reading some of the dev list messages on the Thrift changes, I believe that the current patch (HBASE-2057-2) would not work well with the current Thrift server usage.  For example, starting the Thrift server with either:

{code}
./bin/hbase thrift --port=1234
./bin/hbase thrift --port=1234 start
{code}

would not pass the HBASE_THRIFT_OPTS env variable to the exec line.

So here is a new, more conservative patch that excludes the service-specific env variables only if the "stop" argument is used.  This also drops the unnecessary ACTION variable and the corresponding change to the final "exec" line.
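
The more conservative rule can be sketched as follows. This is an illustration rather than the v3 patch: service_opts_for is a hypothetical helper, the option value is made up, and checking only the first argument after the command name is an assumption.

```shell
# Hypothetical sketch: drop service-specific options only when the
# first argument after the command name is "stop".
# Usage: service_opts_for SERVICE_OPTS [ARGS...]
service_opts_for() {
  service_opts=$1
  shift
  if [ "${1:-}" = "stop" ]; then
    echo ""
  else
    echo "$service_opts"
  fi
}

# Illustrative value standing in for HBASE_THRIFT_OPTS.
THRIFT_OPTS="-Dexample.thrift.opt=1"

with_port=$(service_opts_for "$THRIFT_OPTS" --port=1234)
with_port_start=$(service_opts_for "$THRIFT_OPTS" --port=1234 start)
on_stop=$(service_opts_for "$THRIFT_OPTS" stop)

echo "thrift --port=1234       -> '$with_port'"
echo "thrift --port=1234 start -> '$with_port_start'"
echo "stop                     -> '$on_stop'"
```

Because the options are kept by default and removed only on "stop", both Thrift invocations shown earlier would still receive them in this sketch.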

Sorry for the continuous changes.  This version should be it.  Promise.

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch, HBASE-2057-2_trunk.patch, HBASE-2057-3_0.20.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Commented: (HBASE-2057) Cluster won't stop

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803434#action_12803434 ] 

stack commented on HBASE-2057:
------------------------------

Patch looks good to me. 

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch, HBASE-2057-2_trunk.patch, HBASE-2057-3_0.20.patch, HBASE-2057-3_trunk.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Updated: (HBASE-2057) Cluster won't stop

Posted by "Gary Helmling (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary Helmling updated HBASE-2057:
---------------------------------

    Status: Patch Available  (was: Open)

Please use v3 of patches:

HBASE-2057-3_0.20.patch
HBASE-2057-3_trunk.patch

Should be safer for commands that accept extra CLI params on startup.

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057-2_0.20.patch, HBASE-2057-2_trunk.patch, HBASE-2057-3_0.20.patch, HBASE-2057-3_trunk.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Commented: (HBASE-2057) Cluster won't stop

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794736#action_12794736 ] 

Jean-Daniel Cryans commented on HBASE-2057:
-------------------------------------------

I investigated the problem on our cluster, and it seems that, since we use JMX, the process started to stop the master fails to bind to the JMX port and then exits. One thing that would help is if the first master in a cluster watched its own cluster state znode, so I'm going to commit this change in ZKMasterAddressWatcher.writeAddressToZooKeeper:

{code}
       if(this.zookeeper.writeMasterAddress(address)) {
         this.zookeeper.setClusterState(true);
+        this.zookeeper.setClusterStateWatch(this);
{code}

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.3, 0.21.0
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.



[jira] Commented: (HBASE-2057) Cluster won't stop

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12803009#action_12803009 ] 

Jean-Daniel Cryans commented on HBASE-2057:
-------------------------------------------

Thanks Gary, looks good. I don't think those are used for anything else. I will wait until the release of 0.20.3 to commit.

> Cluster won't stop
> ------------------
>
>                 Key: HBASE-2057
>                 URL: https://issues.apache.org/jira/browse/HBASE-2057
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.21.0
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.20.4, 0.21.0
>
>         Attachments: HBASE-2057_0.20.patch
>
>
> It seems that clusters on trunk have some trouble stopping. Even manually deleting the shutdown file in ZK doesn't always help. Investigate.
