Posted to issues@hbase.apache.org by "Lars Hofhansl (JIRA)" <ji...@apache.org> on 2011/08/30 00:29:37 UTC

[jira] [Created] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

HBaseAdmin never recovers from restarted cluster
------------------------------------------------

                 Key: HBASE-4283
                 URL: https://issues.apache.org/jira/browse/HBASE-4283
             Project: HBase
          Issue Type: Bug
            Reporter: Lars Hofhansl
            Priority: Minor


While testing common scenarios that we might encounter I found that HBaseAdmin does not recover from a restarted cluster.

It turns out HBaseClient.Connection.stop() is sent into an endless loop here:
{code}
    // wait until all connections are closed
    while (!connections.isEmpty()) {
      try {
        Thread.sleep(100);
      } catch (InterruptedException ignored) {
      }
    }
{code}
The reason is that PoolMap.remove(k,v) does not remove empty pools, and hence connections.isEmpty() is never true if there ever was any connection in there.
My fix is to remove the pool from the poolMap when it is empty. (Alternatively one could change PoolMap.isEmpty() to also look inside of all pools and see if their size is 0).
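
To illustrate the fix, here is a minimal sketch using a hypothetical, simplified SimplePoolMap. It is not the actual PoolMap source; the class and all method names are assumptions made for this example. It contrasts a remove that leaves the empty pool registered with one that drops it, and also shows the alternative of looking inside every pool.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a pool map; NOT the real PoolMap.
class SimplePoolMap<K, V> {
  private final Map<K, List<V>> pools = new HashMap<K, List<V>>();

  public void put(K key, V value) {
    List<V> pool = pools.get(key);
    if (pool == null) {
      pool = new ArrayList<V>();
      pools.put(key, pool);
    }
    pool.add(value);
  }

  // Buggy variant: the pool stays registered after its last value is removed.
  public boolean removeKeepingEmptyPool(K key, V value) {
    List<V> pool = pools.get(key);
    return pool != null && pool.remove(value);
  }

  // Fixed variant: drop the pool once it holds no more values.
  public boolean remove(K key, V value) {
    List<V> pool = pools.get(key);
    if (pool == null) {
      return false;
    }
    boolean removed = pool.remove(value);
    if (removed && pool.isEmpty()) {
      pools.remove(key);
    }
    return removed;
  }

  // True only when no pool is registered at all.
  public boolean isEmpty() {
    return pools.isEmpty();
  }

  // Alternative check: true when every registered pool has size 0.
  public boolean hasNoValues() {
    for (List<V> pool : pools.values()) {
      if (!pool.isEmpty()) {
        return false;
      }
    }
    return true;
  }
}
```

With the buggy variant, isEmpty() stays false after the last value is removed, which is the condition that keeps the wait loop above spinning. Either the fixed remove or the deeper check lets the loop terminate.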


After fixing that, I noticed that HBaseAdmin also fails to recover if the master was not running when the HBaseAdmin was created.
Even creating a new HBaseAdmin from the same Configuration will still use the old stale HConnection.

In that case a MasterNotRunningException is thrown, which is not handled in HBaseAdmin's constructor.

The HConnection handling in HConnectionManager is funky. There should never be a closed connection in the HBASE_INSTANCES.
I might look at that as well but in a separate issue.
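
As a rough sketch of the invariant that a cache should never hand out a closed connection, the following uses hypothetical CachedConnection and ConnectionCache types. These are assumptions for illustration only and do not reflect the HConnectionManager code or the eventual fix. The idea is that a lookup replaces a closed entry instead of returning it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical connection type for illustration; NOT HConnection.
class CachedConnection {
  private volatile boolean closed;

  boolean isClosed() {
    return closed;
  }

  void close() {
    closed = true;
  }
}

// Hypothetical cache keyed by a configuration identifier.
class ConnectionCache {
  private final Map<String, CachedConnection> instances =
      new HashMap<String, CachedConnection>();

  // Never returns a closed connection: a stale entry is replaced on lookup.
  synchronized CachedConnection get(String configKey) {
    CachedConnection conn = instances.get(configKey);
    if (conn == null || conn.isClosed()) {
      conn = new CachedConnection();
      instances.put(configKey, conn);
    }
    return conn;
  }

  // Closing through the cache also evicts the entry.
  synchronized void close(String configKey) {
    CachedConnection conn = instances.remove(configKey);
    if (conn != null) {
      conn.close();
    }
  }
}
```

In this sketch, a caller that asks for a connection after the old one was closed gets a fresh instance, so a newly created client object would not inherit stale state.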

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-4283:
---------------------------------

          Component/s: client
    Affects Version/s: 0.92.0
        Fix Version/s: 0.92.0


        

[jira] [Commented] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094724#comment-13094724 ] 

Ted Yu commented on HBASE-4283:
-------------------------------

Integrated to TRUNK.

Thanks for the patch, Lars.


        

[jira] [Commented] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094860#comment-13094860 ] 

Hudson commented on HBASE-4283:
-------------------------------

Integrated in HBase-TRUNK #2166 (See [https://builds.apache.org/job/HBase-TRUNK/2166/])
    HBASE-4283  HBaseAdmin never recovers from restarted cluster (Lars Hofhansl)

tedyu : 
Files : 
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/util/PoolMap.java



        

[jira] [Assigned] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl reassigned HBASE-4283:
------------------------------------

    Assignee: Lars Hofhansl


        

[jira] [Resolved] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved HBASE-4283.
---------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]


        

[jira] [Commented] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094622#comment-13094622 ] 

Lars Hofhansl commented on HBASE-4283:
--------------------------------------

Feel like committing this, Ted? :)


        

[jira] [Commented] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094645#comment-13094645 ] 

Ted Yu commented on HBASE-4283:
-------------------------------

I can commit if there is no -1 for the patch coming up today.


        

[jira] [Commented] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094008#comment-13094008 ] 

Lars Hofhansl commented on HBASE-4283:
--------------------------------------

Actually I cannot create the same failure scenario with a MiniCluster inside a test.

The patch is pretty innocuous anyway.



        

[jira] [Commented] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093761#comment-13093761 ] 

Ted Yu commented on HBASE-4283:
-------------------------------

+1 on patch.


        

[jira] [Commented] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093984#comment-13093984 ] 

Lars Hofhansl commented on HBASE-4283:
--------------------------------------

Unit test coming.


        

[jira] [Updated] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-4283:
---------------------------------

    Attachment: 4283.txt

Stopgap patch.
I'll also look at the connection pooling in HConnectionManager.


        

[jira] [Updated] (HBASE-4283) HBaseAdmin never recovers from restarted cluster

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-4283:
---------------------------------

    Attachment: HBasePing.java

The MiniCluster does not behave exactly like a real cluster when stopped and restarted.

Instead, I extracted the minimum client needed to cause this problem.

o Running HBasePing first, then starting the cluster, shows that HBaseAdmin never recovers from a MasterNotRunningException.
o Running the cluster first, then HBasePing, then restarting the cluster leads to the endless loop described in the issue.

