Posted to dev@phoenix.apache.org by "Rajeshbabu Chintaguntla (JIRA)" <ji...@apache.org> on 2018/05/09 11:43:00 UTC

[jira] [Comment Edited] (PHOENIX-4685) Properly handle connection caching for Phoenix inside RegionServers

    [ https://issues.apache.org/jira/browse/PHOENIX-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468730#comment-16468730 ] 

Rajeshbabu Chintaguntla edited comment on PHOENIX-4685 at 5/9/18 11:42 AM:
---------------------------------------------------------------------------

[~jamestaylor] 
PHOENIX-4021 is the main reason for this issue. I am able to reproduce it with a simple case: create a salted table with 50 salt buckets and continuously ingest data with 30 threads. Currently we create a connection for each region, and that for specific paths like stats collection or index writes, and each connection creates a thread pool with a maximum of 256 threads. At some point we cross the maximum number of threads the process is allowed to create, and writes fail with an OutOfMemoryError ("unable to create new native thread").
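To illustrate the direction a fix could take, here is a minimal sketch of caching one server-wide connection instead of creating one per region. This is illustrative only, not the actual patch; ServerConnectionCache and getCachedConnection are made-up names, and any real fix would need to decide where to hang the cache (for example near ServerUtil.CoprocessorHConnectionTableFactory, which the stack trace below shows creating connections).
{noformat}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public final class ServerConnectionCache {
    // One shared connection (and therefore one shared batch/meta thread pool)
    // per RegionServer process, instead of one per region.
    private static volatile Connection cachedConnection;

    private ServerConnectionCache() {}

    public static Connection getCachedConnection(Configuration conf) throws IOException {
        Connection conn = cachedConnection;
        if (conn == null || conn.isClosed()) {
            synchronized (ServerConnectionCache.class) {
                conn = cachedConnection;
                if (conn == null || conn.isClosed()) {
                    // Creating the connection is the expensive part: it starts a
                    // ZooKeeper client and lazily grows a pool of up to 256 threads.
                    conn = ConnectionFactory.createConnection(conf);
                    cachedConnection = conn;
                }
            }
        }
        return conn;
    }
}
{noformat}
Keying the cache at the RegionServer level means index writes and stats collection share a single pool instead of every region growing its own 256-thread pool.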

I am able to reproduce the same scenario on the 4.14 branch as well.
{noformat}
2018-05-09 13:23:39,703 WARN  [192.168.1.3,16020,1525852241017-index-writer--pool8-t8] client.AsyncProcess: Caught unexpected exception/error:
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:714)
        at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1357)
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:1028)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:934)
        at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$100(AsyncProcess.java:615)
        at org.apache.hadoop.hbase.client.AsyncProcess.submitAll(AsyncProcess.java:597)
        at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:910)
        at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:927)
        at org.apache.phoenix.hbase.index.write.TrackingParallelWriterIndexCommitter$1.call(TrackingParallelWriterIndexCommitter.java:185)
        at org.apache.phoenix.hbase.index.write.TrackingParallelWriterIndexCommitter$1.call(TrackingParallelWriterIndexCommitter.java:149)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2018-05-09 13:23:39,703 WARN  [192.168.1.3,16020,1525852241017-index-writer--pool8-t8] client.AsyncProcess: #176, table=I2, attempt=1/1 failed=1ops, last exception: java.lang.OutOfMemoryError: unable to create new native thread on 192.168.1.3,16020,1525852241017, tracking started Wed May 09 13:23:39 IST 2018; not retrying 1 - final failure
{noformat}
{noformat}
Caused by: java.lang.Exception: java.lang.OutOfMemoryError: unable to create new native thread
        at org.apache.phoenix.index.PhoenixIndexFailurePolicy$2.run(PhoenixIndexFailurePolicy.java:288)
        at org.apache.phoenix.index.PhoenixIndexFailurePolicy$2.run(PhoenixIndexFailurePolicy.java:242)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
        ... 32 more
Caused by: java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:714)
        at org.apache.zookeeper.ClientCnxn.start(ClientCnxn.java:405)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:450)
        at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:141)
        at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:128)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:137)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:185)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:153)
        at org.apache.hadoop.hbase.client.ZooKeeperKeepAliveConnection.<init>(ZooKeeperKeepAliveConnection.java:43)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveZooKeeperWatcher(ConnectionManager.java:1690)
        at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:104)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:905)
        at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:648)
        at org.apache.hadoop.hbase.client.CoprocessorHConnection.<init>(CoprocessorHConnection.java:99)
        at org.apache.phoenix.util.ServerUtil$CoprocessorHConnectionTableFactory.getConnection(ServerUtil.java:301)
        at org.apache.phoenix.util.ServerUtil$CoprocessorHConnectionTableFactory.getTable(ServerUtil.java:308)
        at org.apache.phoenix.coprocessor.DelegateRegionCoprocessorEnvironment.getTable(DelegateRegionCoprocessorEnvironment.java:85)
        at org.apache.phoenix.index.PhoenixIndexFailurePolicy$2.run(PhoenixIndexFailurePolicy.java:259)
        ... 36 more
{noformat}
I have already uploaded a jstack showing the huge number of threads:
https://issues.apache.org/jira/secure/attachment/12918170/PHOENIX-4685_jstack



> Properly handle connection caching for Phoenix inside RegionServers
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-4685
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4685
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Rajeshbabu Chintaguntla
>            Priority: Blocker
>             Fix For: 5.0.0
>
>         Attachments: PHOENIX-4685.patch, PHOENIX-4685_5.x-HBase-2.0.patch, PHOENIX-4685_jstack, PHOENIX-4685_v2.patch, PHOENIX-4685_v3.patch, PHOENIX-4685_v4.patch, PHOENIX-4685_v5.patch
>
>
> Currently, trying to write data to an indexed table fails with an OOME because the JVM is unable to create new native threads, while the same workload works fine on the 4.7.x branches. The thread dump shows many threads created for meta lookups and the shared connection pools, leaving no room for new threads. This happens even with short-circuit writes enabled.
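> To put a rough number on it (using the 256-thread-per-connection pool maximum described in the comment above): with the 50-bucket salted table from the repro, 50 per-region connections can demand up to 50 * 256 = 12,800 pool threads on a single RegionServer, on top of the meta lookup threads, which easily exceeds what the OS allows one process to create.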
> {noformat}
> 2018-04-08 13:06:04,747 WARN  [RpcServer.default.FPBQ.Fifo.handler=9,queue=0,port=16020] index.PhoenixIndexFailurePolicy: handleFailure failed
> java.io.IOException: java.lang.reflect.UndeclaredThrowableException
>         at org.apache.hadoop.hbase.security.User.runAsLoginUser(User.java:185)
>         at org.apache.phoenix.index.PhoenixIndexFailurePolicy.handleFailureWithExceptions(PhoenixIndexFailurePolicy.java:217)
>         at org.apache.phoenix.index.PhoenixIndexFailurePolicy.handleFailure(PhoenixIndexFailurePolicy.java:143)
>         at org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:160)
>         at org.apache.phoenix.hbase.index.write.IndexWriter.writeAndKillYourselfOnFailure(IndexWriter.java:144)
>         at org.apache.phoenix.hbase.index.Indexer.doPostWithExceptions(Indexer.java:632)
>         at org.apache.phoenix.hbase.index.Indexer.doPost(Indexer.java:607)
>         at org.apache.phoenix.hbase.index.Indexer.postBatchMutateIndispensably(Indexer.java:590)
>         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1037)
>         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$30.call(RegionCoprocessorHost.java:1034)
>         at org.apache.hadoop.hbase.coprocessor.CoprocessorHost$ObserverOperationWithoutResult.callObserver(CoprocessorHost.java:540)
>         at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:614)
>         at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postBatchMutateIndispensably(RegionCoprocessorHost.java:1034)
>         at org.apache.hadoop.hbase.regionserver.HRegion$MutationBatchOperation.doPostOpCleanupForMiniBatch(HRegion.java:3533)
>         at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3914)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3822)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3753)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:1027)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicBatchOp(RSRpcServices.java:959)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:922)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2666)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:42014)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
> Caused by: java.lang.reflect.UndeclaredThrowableException
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
>         at org.apache.hadoop.security.SecurityUtil.doAsUser(SecurityUtil.java:448)
>         at org.apache.hadoop.security.SecurityUtil.doAsLoginUser(SecurityUtil.java:429)
>         at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.hbase.util.Methods.call(Methods.java:40)
>         at org.apache.hadoop.hbase.security.User.runAsLoginUser(User.java:183)
>          ... 25 more
> Caused by: java.lang.Exception: java.lang.OutOfMemoryError: unable to create new native thread
>         at org.apache.phoenix.index.PhoenixIndexFailurePolicy$1.run(PhoenixIndexFailurePolicy.java:266)
>         at org.apache.phoenix.index.PhoenixIndexFailurePolicy$1.run(PhoenixIndexFailurePolicy.java:217)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>         ... 32 more
> Caused by: java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:714)
>         at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:950)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1357)
>         at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
>         at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1007)
>         at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:986)
>         at org.apache.phoenix.util.IndexUtil.updateIndexState(IndexUtil.java:724)
>         at org.apache.phoenix.util.IndexUtil.updateIndexState(IndexUtil.java:709)
>         at org.apache.phoenix.index.PhoenixIndexFailurePolicy$1.run(PhoenixIndexFailurePolicy.java:236)
>         ... 36 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)