
Posted to dev@hbase.apache.org by "Barney Frank (JIRA)" <ji...@apache.org> on 2009/09/16 02:08:57 UTC

[jira] Created: (HBASE-1843) HBase Client Configuration -- Pause and Retries

HBase Client Configuration -- Pause and Retries
-----------------------------------------------

Key: HBASE-1843
URL: https://issues.apache.org/jira/browse/HBASE-1843
Project: Hadoop HBase
Issue Type: Improvement
Components: client
Affects Versions: 0.20.0
Reporter: Barney Frank
Priority: Critical

The ability to set the number of retries and the pause length between them on the HBaseClient through HBaseConfiguration.

I am working on an application that uses HBase for real-time queries. The dependency on HBase is not critical, so if HBase is unavailable for any reason, the application should continue doing its job without the data from HBase. Essentially, I want the HBase client to fail quickly if a request to HBase is going to fail or take a long time to respond. I have tested various scenarios with ZooKeeper running/not running and the master running/not running.

Configuration:
HBase 0.20.0 & Hadoop 0.20.1
Pseudo distributed mode
Java client using HTablePool

When ZK, Master, RegionServer and my app are running, I stop the HBase master/regionserver. The HBaseClient then begins to complain:
16:26:51,273 INFO [HBaseClient] Retrying connect to server: /192.168.1.55:44808. Already tried 0 time(s).
16:26:53,274 INFO [HBaseClient] Retrying connect to server: /192.168.1.55:44808. Already tried 1 time(s).
...already tried 9 time(s)....
16:27:10,294 INFO [HbaseRPC] Server at /192.168.1.55:44808 not available yet, Zzzzz...

**** This is despite the fact that I set hbase.client.pause to 25 ms and hbase.client.retries.number to 2. ****
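
For reference, the gaps in the log excerpt above can be worked out from the timestamps. The sketch below is only an illustration: the class name LogGap is hypothetical and not part of HBase, and the "configured budget" line assumes the pause is simply multiplied by the number of retries, which the client may not do exactly.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for reading the log excerpt; not part of HBase.
class LogGap {
    // Matches the timestamp layout in the log lines, e.g. 16:26:51,273
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    // Milliseconds elapsed between two log timestamps on the same day.
    static long millisBetween(String earlier, String later) {
        LocalTime start = LocalTime.parse(earlier, LOG_TIME);
        LocalTime end = LocalTime.parse(later, LOG_TIME);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        // Gap between "Already tried 0" and "Already tried 1": about 2 seconds.
        System.out.println(millisBetween("16:26:51,273", "16:26:53,274")); // 2001
        // Gap between the first retry line and the "not available yet" line.
        System.out.println(millisBetween("16:26:51,273", "16:27:10,294")); // 19021
        // Assumed budget if a 25 ms pause and 2 retries were honored: 2 x 25 ms.
        System.out.println(2 * 25L); // 50
    }
}
```

Roughly 19 seconds elapsed where something on the order of 50 ms was expected, which matches the impression of a 2-second pause and 10 retries.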

I restart the Master and RegionServer and then send more client requests through HTablePool. It produces the same "Retrying connect to server:" messages. I noticed that the port number it is using is the old port for the region server and not the new one assigned after the restart. The HBaseClient does not seem to recover unless I restart the client app. When I do not use HTablePool and use only HTable, it works fine.

Issue:
Setting and using the hbase.client.pause and hbase.client.retries.number parameters. I have rarely gotten them to work. The client seems to default to a 2-second pause and 10 retries even when I override the defaults on both the client and the server. Yes, I made sure my client doesn't have anything in the classpath that it might pick up.
<property>
<name>hbase.client.pause</name>
<value>20</value>
</property>
<property>
<name>hbase.client.retries.number</name>
<value>2</value>
</property>
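
Independent of whether the client honors these settings, one possible way to get the fail-fast behavior described above is an application-level deadline around each lookup. The following is a minimal sketch using only java.util.concurrent; FailFast and callWithDeadline are hypothetical names, not part of the HBase API, and the lambdas stand in for a real HTable read. It does not stop the underlying client from retrying in the background; it only keeps the caller from waiting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of HBase.
class FailFast {
    // Daemon threads so a stuck lookup never keeps the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "fail-fast-lookup");
        t.setDaemon(true);
        return t;
    });

    // Runs the lookup, returning the fallback if it fails or exceeds the deadline.
    static <T> T callWithDeadline(Callable<T> lookup, long timeoutMillis, T fallback) {
        Future<T> future = POOL.submit(lookup);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker; the caller moves on
            return fallback;
        } catch (ExecutionException e) {
            return fallback;     // the lookup itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for a real table read: one fast, one far slower than the deadline.
        String fast = callWithDeadline(() -> "row-data", 500, "no-data");
        String slow = callWithDeadline(() -> {
            Thread.sleep(5000);
            return "row-data";
        }, 50, "no-data");
        System.out.println(fast + " / " + slow);
    }
}
```

The deadline here is wall-clock time for the whole call, so it also covers time spent in connection retries, which is the part the two properties above did not appear to control.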

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Updated: (HBASE-1843) HBase Client Configuration -- Pause and Retries

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1843:
-------------------------

    Fix Version/s: 0.20.1

Look at this in 0.20.1.



[jira] Commented: (HBASE-1843) HBase Client Configuration -- Millisecond Pauses and very few Retries (s)

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757538#action_12757538 ] 

stack commented on HBASE-1843:
------------------------------

For example, if what is wanted is a 25ms maximum pause, we'd need to account for GC in the client. We'd also need to account for the lookups of -ROOT- and .META., each of which could happen within 2 retries and within the 25ms pause; overall, though, the get could take 150ms even though the maximum asked for was 50ms (2x25ms).
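
One way to read the 50ms and 150ms figures is as a per-stage budget multiplied by the number of stages. The sketch below assumes three stages (the -ROOT- lookup, the .META. lookup, and the get itself), each with its own budget of 2 retries at 25ms; that breakdown is an interpretation of the numbers above, and RetryBudget is a hypothetical class, not part of HBase.

```java
// Hypothetical helper, not part of HBase: back-of-the-envelope pause budget.
class RetryBudget {
    // Worst-case pause time for a single stage: retries x pause.
    static long stageMillis(int retries, long pauseMillis) {
        return retries * pauseMillis;
    }

    // Worst-case pause time when every stage spends its full budget.
    static long requestMillis(int stages, int retries, long pauseMillis) {
        return stages * stageMillis(retries, pauseMillis);
    }

    public static void main(String[] args) {
        // What was asked for: 2 retries x 25 ms.
        System.out.println(stageMillis(2, 25));      // 50
        // Assumed stages: -ROOT- lookup, .META. lookup, then the get itself.
        System.out.println(requestMillis(3, 2, 25)); // 150
    }
}
```

GC pauses in the client would come on top of this figure.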



[jira] Commented: (HBASE-1843) HBase Client Configuration -- Pause and Retries

Posted by "stack (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757533#action_12757533 ] 

stack commented on HBASE-1843:
------------------------------

HBASE-1815 should get us this roughly. It removes the ipc retries that worked independently of the hbase.client retries and pause settings. 25ms, though, is a bit too fine of a hair trigger for our clumsy client. I'll leave this issue open but move it out of 0.20.1.



[jira] Commented: (HBASE-1843) HBase Client Configuration -- Millisecond Pauses and very few Retries (s)

Posted by "ryan rawson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HBASE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757550#action_12757550 ] 

ryan rawson commented on HBASE-1843:
------------------------------------

Server GC pauses can very easily be 25-80ms, and they can happen every second too.

With a 25ms hard limit, you are likely to see a large number of spurious failures.



[jira] Updated: (HBASE-1843) HBase Client Configuration -- Millisecond Pauses and very few Retries (s)

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HBASE-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1843:
-------------------------

    Fix Version/s:     (was: 0.20.1)
          Summary: HBase Client Configuration -- Millisecond Pauses and very few Retries (s)  (was: HBase Client Configuration -- Pause and Retries)

Moved out of 0.20.1.  Changed title to be explicit that what is wanted is very fine-grained pauses and few retries.

