You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Yichi Zhang (Jira)" <ji...@apache.org> on 2021/04/21 23:52:00 UTC

[jira] [Commented] (BEAM-9629) JdbcIO seems to run out of connections in the connection pool and freezes pipeline

    [ https://issues.apache.org/jira/browse/BEAM-9629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17327012#comment-17327012 ] 

Yichi Zhang commented on BEAM-9629:
-----------------------------------

I think the issue is probably not fully resolved, we are seeing same issues from 2.25.0 and 2.27.0.

 

My guess is that there can still be WriteFn or Read that hold the connection and haven't had the chance to invoke the finishBundle() function to close the connection, and all the executing threads are blocked on getConnection.

 

sample thread dump:
{noformat}
--- Threads (56): [Thread[pool-3-thread-21,5,main], Thread[pool-3-thread-29,5,main], Thread[pool-3-thread-22,5,main], Thread[pool-3-thread-13,5,main], Thread[pool-3-thread-16,5,main], Thread[pool-3-thread-39,5,main], Thread[pool-3-thread-33,5,main], Thread[pool-3-thread-18,5,main], Thread[pool-3-thread-43,5,main], Thread[pool-3-thread-10,5,main], Thread[pool-3-thread-58,5,main], Thread[pool-3-thread-27,5,main], Thread[pool-3-thread-28,5,main], Thread[pool-3-thread-40,5,main], Thread[pool-3-thread-45,5,main], Thread[pool-3-thread-57,5,main], Thread[pool-3-thread-61,5,main], Thread[pool-3-thread-55,5,main], Thread[pool-3-thread-8,5,main], Thread[pool-3-thread-5,5,main], Thread[pool-3-thread-12,5,main], Thread[pool-3-thread-42,5,main], Thread[pool-3-thread-46,5,main], Thread[pool-3-thread-44,5,main], Thread[pool-3-thread-24,5,main], Thread[pool-3-thread-9,5,main], Thread[pool-3-thread-49,5,main], Thread[pool-3-thread-37,5,main], Thread[pool-3-thread-41,5,main], Thread[pool-3-thread-6,5,main], Thread[pool-3-thread-3,5,main], Thread[pool-3-thread-20,5,main], Thread[pool-3-thread-56,5,main], Thread[pool-3-thread-25,5,main], Thread[pool-3-thread-52,5,main], Thread[pool-3-thread-30,5,main], Thread[pool-3-thread-50,5,main], Thread[pool-3-thread-64,5,main], Thread[pool-3-thread-1,5,main], Thread[pool-3-thread-48,5,main], Thread[pool-3-thread-31,5,main], Thread[pool-3-thread-23,5,main], Thread[pool-3-thread-14,5,main], Thread[pool-3-thread-34,5,main], Thread[pool-3-thread-7,5,main], Thread[pool-3-thread-51,5,main], Thread[pool-3-thread-4,5,main], Thread[pool-3-thread-11,5,main], Thread[pool-3-thread-32,5,main], Thread[pool-3-thread-59,5,main], Thread[pool-3-thread-62,5,main], Thread[pool-3-thread-54,5,main], Thread[pool-3-thread-38,5,main], Thread[pool-3-thread-19,5,main], Thread[pool-3-thread-36,5,main], Thread[pool-3-thread-26,5,main]] State: WAITING stack: ---
  sun.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
  org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:590)
  org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:425)
  org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:346)
  org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
  org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:809)
  org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn.executeBatch(JdbcIO.java:1434)
  org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn.processElement(JdbcIO.java:1383)
  org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn$DoFnInvoker.invokeProcessElement(Unknown Source)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:227)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:186)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:335)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
  org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:280)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:267)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:79)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:413)
  org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:73)
  org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:139)
  org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:227)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:183)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:335)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
  org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:280)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:267)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:79)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:413)
  com.producthunter.dataenrichment.BlacklistCompletePostsFilter.process(BlacklistCompletePostsFilter.java:59)
  com.producthunter.dataenrichment.BlacklistCompletePostsFilter$DoFnInvoker.invokeProcessElement(Unknown Source)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:227)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:186)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:335)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
  org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:280)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:267)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:79)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:413)
  org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:73)
  org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:139)
  org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:227)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:186)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:335)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
  org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn$1.output(SimpleParDoFn.java:280)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:267)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:79)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:413)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:401)
  org.apache.beam.runners.dataflow.ReshuffleOverrideFactory$ReshuffleWithOnlyTrigger$1.processElement(ReshuffleOverrideFactory.java:86)
  org.apache.beam.runners.dataflow.ReshuffleOverrideFactory$ReshuffleWithOnlyTrigger$1$DoFnInvoker.invokeProcessElement(Unknown Source)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:227)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:186)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:335)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
  org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
  org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:182)
  org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:108)
  org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:56)
  org.apache.beam.runners.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:39)
  org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:121)
  org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
  org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:114)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
  org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:201)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:159)
  org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:77)
  org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:417)
  org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:386)
  org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:311)
  org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:140)
  org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:120)
  org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:107)
  java.util.concurrent.FutureTask.run(FutureTask.java:266)
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  java.lang.Thread.run(Thread.java:748)

--- Threads (8): [Thread[pool-3-thread-35,5,main], Thread[pool-3-thread-2,5,main], Thread[pool-3-thread-63,5,main], Thread[pool-3-thread-53,5,main], Thread[pool-3-thread-60,5,main], Thread[pool-3-thread-47,5,main], Thread[pool-3-thread-15,5,main], Thread[pool-3-thread-17,5,main]] State: WAITING stack: ---
  sun.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
  org.apache.commons.pool2.impl.LinkedBlockingDeque.takeFirst(LinkedBlockingDeque.java:590)
  org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:425)
  org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:346)
  org.apache.commons.dbcp2.PoolingDataSource.getConnection(PoolingDataSource.java:134)
  org.apache.commons.dbcp2.BasicDataSource.getConnection(BasicDataSource.java:809)
  org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn.executeBatch(JdbcIO.java:1434)
  org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn.finishBundle(JdbcIO.java:1399)
  org.apache.beam.sdk.io.jdbc.JdbcIO$WriteVoid$WriteFn$DoFnInvoker.invokeFinishBundle(Unknown Source)
  org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.finishBundle(SimpleDoFnRunner.java:237)
  org.apache.beam.runners.dataflow.worker.SimpleParDoFn.finishBundle(SimpleParDoFn.java:428)
  org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.finish(ParDoOperation.java:56)
  org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:85)
  org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:417)
  org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:386)
  org.apache.beam.runners.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:311)
  org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:140)
  org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:120)
  org.apache.beam.runners.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:107)
  java.util.concurrent.FutureTask.run(FutureTask.java:266)
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  java.lang.Thread.run(Thread.java:748){noformat}

> JdbcIO seems to run out of connections in the connection pool and freezes pipeline
> ----------------------------------------------------------------------------------
>
>                 Key: BEAM-9629
>                 URL: https://issues.apache.org/jira/browse/BEAM-9629
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-jdbc
>    Affects Versions: 2.18.0, 2.19.0, 2.21.0, 2.22.0, 2.23.0
>         Environment: Dataflow, Direct Runner on macOS Catalina.
>            Reporter: Boris Shilov
>            Assignee: Luke Cwik
>            Priority: P2
>              Labels: performance
>             Fix For: 2.24.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Greetings,
> I am using JdbcIO via the Scala wrappers provided in the Scio project. I am trying to read a few dozen tables in parallel from MySQL, but above 8 concurrent SELECT operations the pipeline freezes. With help of the Scio maintainers we've been able to isolate the issue as likely originating in JdbcIO running out of connections in the connection pool and idling indefinitely. The issue occurs both on the Direct Runner and Dataflow.
> Please see linked issue for more context: https://github.com/spotify/scio/issues/2774



--
This message was sent by Atlassian Jira
(v8.3.4#803005)