Posted to dev@phoenix.apache.org by "Vikas Vishwakarma (JIRA)" <ji...@apache.org> on 2014/07/10 07:51:04 UTC

[jira] [Commented] (PHOENIX-998) SocketTimeoutException under high concurrent write access to phoenix indexed table

    [ https://issues.apache.org/jira/browse/PHOENIX-998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057160#comment-14057160 ] 

Vikas Vishwakarma commented on PHOENIX-998:
-------------------------------------------

I saw similar issues with a native HBase API based client loader, and I could correlate them to GC pauses in the RegionServers. Log traces are given below.

From the client logs, a read batch failed on the RegionServer machine at 11:13:29:
===========
[Wed Jul 09 11:13:29 GMT 2014 com.salesforce.hbase.opthreads.FetchScanRunTimeThread run SEVERE] FetchScanRunTimeThread: java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=1, exceptions:
Wed Jul 09 11:13:29 GMT 2014, org.apache.hadoop.hbase.client.RpcRetryingCaller@2c109a0a, java.net.SocketTimeoutException: Call to ---hostname masked---  failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[---hostname masked---]

From the RegionServer logs at the same time: two consecutive GC pauses totaling about 3.4 seconds, followed by an RpcServer.responder asyncWrite failure:
===========
2014-07-09 11:13:21,372 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1552ms
2014-07-09 11:13:29,981 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1846ms
2014-07-09 11:13:30,040 WARN org.apache.hadoop.ipc.RpcServer: RpcServer.listener,port=60020: count of bytes read: 0
2014-07-09 11:13:30,041 WARN org.apache.hadoop.ipc.RpcServer: RpcServer.respondercallId: 30901 service: ClientService methodName: Scan size: 24 connection: 10.230.229.22:57115: output error
2014-07-09 11:13:30,042 INFO org.apache.hadoop.ipc.RpcServer: RpcServer.responder: asyncWrite
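The two pauses above (1552 ms + 1846 ms, about 3.4 s) exceed the 2000 ms read timeout reported on the client, which is enough to explain the SocketTimeoutException. As an illustration only (plain sockets, not the HBase RPC stack), a minimal sketch of this failure mode:

```python
import socket
import threading
import time

# Sketch, not HBase code: a server accepts a connection but stalls for
# 3 seconds before replying, standing in for a ~3.4 s stop-the-world GC
# pause on the RegionServer. The client uses a 2-second read timeout,
# matching the "2000 millis timeout while waiting for channel to be
# ready for read" in the trace above.

def stalled_server(sock):
    conn, _ = sock.accept()
    time.sleep(3.0)          # simulated GC pause, longer than the client timeout
    conn.sendall(b"too late")
    conn.close()

server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=stalled_server, args=(server,), daemon=True).start()

client = socket.create_connection(("127.0.0.1", port))
client.settimeout(2.0)       # analogous to a 2000 ms RPC read timeout
try:
    client.recv(16)          # blocks until the timeout fires
    timed_out = False
except socket.timeout:
    timed_out = True
client.close()

print("timed out:", timed_out)
```

On the tuning side, the usual mitigations are raising `hbase.rpc.timeout` above the worst observed pause, or shrinking the pauses themselves via heap sizing and GC tuning; `hbase.client.retries.number` and `hbase.client.pause` control how long the client keeps retrying before giving up.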



> SocketTimeoutException under high concurrent write access to phoenix indexed table
> ----------------------------------------------------------------------------------
>
>                 Key: PHOENIX-998
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-998
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>         Environment: HBase 0.98.1-SNAPSHOT, Hadoop 2.3.0-cdh5.0.0
>            Reporter: wangxianbin
>            Priority: Critical
>
> We have a small HBase cluster with one master and six slaves. We tested concurrent write performance against a Phoenix indexed table with four write clients, each client running 100 threads, each thread holding one Phoenix JDBC connection. We encountered the SocketTimeoutException below, and it retries for a very long time. How can I deal with such an issue?
> 2014-05-22 17:22:58,490 INFO  [storm4.org,60020,1400750242045-index-writer--pool3-t10] client.AsyncProcess: #16016, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M
> 2014-05-22 17:23:00,436 INFO  [storm4.org,60020,1400750242045-index-writer--pool3-t6] client.AsyncProcess: #16027, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M
> 2014-05-22 17:23:00,440 INFO  [storm4.org,60020,1400750242045-index-writer--pool3-t1] client.AsyncProcess: #16013, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M
> 2014-05-22 17:23:00,449 INFO  [storm4.org,60020,1400750242045-index-writer--pool3-t7] client.AsyncProcess: #16028, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M
> 2014-05-22 17:23:00,473 INFO  [storm4.org,60020,1400750242045-index-writer--pool3-t8] client.AsyncProcess: #16020, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M
> 2014-05-22 17:23:00,494 INFO  [htable-pool20-t13] client.AsyncProcess: #16016, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20189 ms, replay 1 ops.
> 2014-05-22 17:23:02,439 INFO  [storm4.org,60020,1400750242045-index-writer--pool3-t4] client.AsyncProcess: #16022, waiting for some tasks to finish. Expected max=0, tasksSent=13, tasksDone=12, currentTasksDone=12, retries=11 hasError=false, tableName=IPHOENIX10M
> 2014-05-22 17:23:02,496 INFO  [htable-pool20-t3] client.AsyncProcess: #16013, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:32 CST 2014, retrying after 20001 ms, replay 1 ops.
> 2014-05-22 17:23:02,496 INFO  [htable-pool20-t16] client.AsyncProcess: #16028, table=IPHOENIX10M, attempt=12/350 failed 1 ops, last exception: java.net.SocketTimeoutException: Call to storm3.org/172.16.2.23:60020 failed because java.net.SocketTimeoutException: 2000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.16.2.24:52017 remote=storm3.org/172.16.2.23:60020] on storm3.org,60020,1400750242156, tracking started Thu May 22 17:21:37 CST 2014, retrying after 20095 ms, replay 1 ops.



--
This message was sent by Atlassian JIRA
(v6.2#6252)