You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "Guanghao Zhang (JIRA)" <ji...@apache.org> on 2019/02/21 10:00:00 UTC

[jira] [Updated] (HBASE-21941) Use operation timeout instead of scanner timeout for scan call

     [ https://issues.apache.org/jira/browse/HBASE-21941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-21941:
-----------------------------------
    Description: 
There are hbase.rpc.timeout and hbase.client.operation.timeout for client operation expect scan. And there is a special config hbase.client.scanner.timeout.period for scan. If I am not wrong, this should rpc timeout of scan call. But now we use this as operation timeout of scan call. The scan callable is complicated as we need handle the replica case. The real call with retry is called in [https://github.com/apache/hbase/blob/9a55cbb2c1dfe5a13a6ceb323ac7edd23532f4b5/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultBoundedCompletionService.java#L80|https://github.com/apache/hbase/blob/9a55cbb2c1dfe5a13a6ceb323ac7edd23532f4b5/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultBoundedCompletionService.java#L80.] . And the callTimeout is configed by hbase.client.scanner.timeout.period. So I thought this is not right.

 

I meet this problem when run ITBLL for branch-2.2. The verify map task failed when scan.
{noformat}
2019-02-21 03:47:20,287 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions:
2019-02-21 03:47:20,287 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: Closing the previously opened scanner object. 2019-02-21 03:47:20,331 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: Current scan={"loadColumnFamiliesOnDemand":null,"startRow":"\\xE1\\x9B\\xB4\\xF0\\xB3(JT\\xDC\\x86pf|y\\xF3\\xE9","stopRow":"","batch":-1,"cacheBlocks":false,"totalColumns":3,"maxResultSize":4194304,"families":{"big":["big"],"meta":["prev"],"tiny":["tiny"]},"caching":10000,"maxVersions":1,"timeRange":[0,9223372036854775807]} 2019-02-21 03:47:20,335 INFO [hconnection-0x7b44b63d-metaLookup-shared--pool4-t36] org.apache.hadoop.hbase.client.ScannerCallable: Open scanner=-4916858472898750097 for scan={"loadColumnFamiliesOnDemand":null,"startRow":"IntegrationTestBigLinkedList,\\xE1\\x9B\\xB4\\xF0\\xB3(JT\\xDC\\x86pf|y\\xF3\\xE9,99999999999999","stopRow":"IntegrationTestBigLinkedList,,","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":-1,"families":{"info":["ALL"]},"caching":5,"maxVersions":1,"timeRange":[0,9223372036854775807]} on region region=hbase:meta,,1.1588230740, hostname=c4-hadoop-tst-st26.bj,29100,1550660298519, seqNum=-1 2019-02-21 03:48:20,354 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: Mapper took 60023ms to process 0 rows 2019-02-21 03:48:20,355 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions: Thu Feb 21 03:48:20 CST 2019, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60215: Call to c4-hadoop-tst-st30.bj/10.132.2.41:29100 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=7102, waitTime=60006, rpcTimeout=60000 row 'ᛴ�(JT܆pf|y��' on table 'IntegrationTestBigLinkedList' at region=IntegrationTestBigLinkedList,\xDD\xDD\xDD\xDD\xDD\xDD\xDD\xDD,1550661322522.d5d29d2f1e8fee42d666c117709c3a46., hostname=c4-hadoop-tst-st30.bj,29100,1550652984371, seqNum=1007960 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions: Thu Feb 21 03:48:20 CST 2019, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60215: Call to c4-hadoop-tst-st30.bj/10.132.2.41:29100 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=7102, waitTime=60006, rpcTimeout=60000 row 'ᛴ�(JT܆pf|y��' on table 'IntegrationTestBigLinkedList' at region=IntegrationTestBigLinkedList,\xDD\xDD\xDD\xDD\xDD\xDD\xDD\xDD,1550661322522.d5d29d2f1e8fee42d666c117709c3a46., hostname=c4-hadoop-tst-st30.bj,29100,1550652984371, seqNum=1007960 at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:299) at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:242) at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58) at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192) at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:266) at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:434) at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:309) at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:594) at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:237){noformat}
 

 

 

 

> Use operation timeout instead of scanner timeout for scan call
> --------------------------------------------------------------
>
>                 Key: HBASE-21941
>                 URL: https://issues.apache.org/jira/browse/HBASE-21941
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Guanghao Zhang
>            Priority: Major
>
> There are hbase.rpc.timeout and hbase.client.operation.timeout for client operation expect scan. And there is a special config hbase.client.scanner.timeout.period for scan. If I am not wrong, this should rpc timeout of scan call. But now we use this as operation timeout of scan call. The scan callable is complicated as we need handle the replica case. The real call with retry is called in [https://github.com/apache/hbase/blob/9a55cbb2c1dfe5a13a6ceb323ac7edd23532f4b5/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultBoundedCompletionService.java#L80|https://github.com/apache/hbase/blob/9a55cbb2c1dfe5a13a6ceb323ac7edd23532f4b5/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultBoundedCompletionService.java#L80.] . And the callTimeout is configed by hbase.client.scanner.timeout.period. So I thought this is not right.
>  
> I meet this problem when run ITBLL for branch-2.2. The verify map task failed when scan.
> {noformat}
> 2019-02-21 03:47:20,287 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions:
> 2019-02-21 03:47:20,287 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: Closing the previously opened scanner object. 2019-02-21 03:47:20,331 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: Current scan={"loadColumnFamiliesOnDemand":null,"startRow":"\\xE1\\x9B\\xB4\\xF0\\xB3(JT\\xDC\\x86pf|y\\xF3\\xE9","stopRow":"","batch":-1,"cacheBlocks":false,"totalColumns":3,"maxResultSize":4194304,"families":{"big":["big"],"meta":["prev"],"tiny":["tiny"]},"caching":10000,"maxVersions":1,"timeRange":[0,9223372036854775807]} 2019-02-21 03:47:20,335 INFO [hconnection-0x7b44b63d-metaLookup-shared--pool4-t36] org.apache.hadoop.hbase.client.ScannerCallable: Open scanner=-4916858472898750097 for scan={"loadColumnFamiliesOnDemand":null,"startRow":"IntegrationTestBigLinkedList,\\xE1\\x9B\\xB4\\xF0\\xB3(JT\\xDC\\x86pf|y\\xF3\\xE9,99999999999999","stopRow":"IntegrationTestBigLinkedList,,","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":-1,"families":{"info":["ALL"]},"caching":5,"maxVersions":1,"timeRange":[0,9223372036854775807]} on region region=hbase:meta,,1.1588230740, hostname=c4-hadoop-tst-st26.bj,29100,1550660298519, seqNum=-1 2019-02-21 03:48:20,354 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: Mapper took 60023ms to process 0 rows 2019-02-21 03:48:20,355 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions: Thu Feb 21 03:48:20 CST 2019, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60215: Call to c4-hadoop-tst-st30.bj/10.132.2.41:29100 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=7102, waitTime=60006, rpcTimeout=60000 row 'ᛴ�(JT܆pf|y��' on table 'IntegrationTestBigLinkedList' at region=IntegrationTestBigLinkedList,\xDD\xDD\xDD\xDD\xDD\xDD\xDD\xDD,1550661322522.d5d29d2f1e8fee42d666c117709c3a46., hostname=c4-hadoop-tst-st30.bj,29100,1550652984371, seqNum=1007960 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=16, exceptions: Thu Feb 21 03:48:20 CST 2019, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60215: Call to c4-hadoop-tst-st30.bj/10.132.2.41:29100 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=7102, waitTime=60006, rpcTimeout=60000 row 'ᛴ�(JT܆pf|y��' on table 'IntegrationTestBigLinkedList' at region=IntegrationTestBigLinkedList,\xDD\xDD\xDD\xDD\xDD\xDD\xDD\xDD,1550661322522.d5d29d2f1e8fee42d666c117709c3a46., hostname=c4-hadoop-tst-st30.bj,29100,1550652984371, seqNum=1007960 at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:299) at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:242) at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58) at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192) at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:266) at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:434) at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:309) at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:594) at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:237){noformat}
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)