You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@geode.apache.org by "Bruce J Schuchardt (Jira)" <ji...@apache.org> on 2019/10/16 15:57:00 UTC

[jira] [Updated] (GEODE-7310) CI failure: hang in IncrementalBackupDistributedTest.testIncompleteInBaseline

     [ https://issues.apache.org/jira/browse/GEODE-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bruce J Schuchardt updated GEODE-7310:
--------------------------------------
    Description: 
At least three distributed unit test runs have recently hung in this test.

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1158
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1164
and
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1173

Here are some threads that might be of interest in the latest hang:

{noformat}
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(java.base@11.0.4/Native Method)
	at java.net.SocketInputStream.socketRead(java.base@11.0.4/SocketInputStream.java:115)
	at java.net.SocketInputStream.read(java.base@11.0.4/SocketInputStream.java:168)
	at java.net.SocketInputStream.read(java.base@11.0.4/SocketInputStream.java:140)
	at java.io.BufferedInputStream.fill(java.base@11.0.4/BufferedInputStream.java:252)
	at java.io.BufferedInputStream.read(java.base@11.0.4/BufferedInputStream.java:271)
	- locked <0x00000000e3c4e9b8> (a java.io.BufferedInputStream)
	at java.io.DataInputStream.readByte(java.base@11.0.4/DataInputStream.java:270)
	at sun.rmi.transport.StreamRemoteCall.executeCall(java.rmi@11.0.4/StreamRemoteCall.java:222)
	at sun.rmi.server.UnicastRef.invoke(java.rmi@11.0.4/UnicastRef.java:161)
	at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(java.rmi@11.0.4/RemoteObjectInvocationHandler.java:209)
	at java.rmi.server.RemoteObjectInvocationHandler.invoke(java.rmi@11.0.4/RemoteObjectInvocationHandler.java:161)
	at com.sun.proxy.$Proxy55.executeMethodOnObject(Unknown Source)
	at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:576)
	at org.apache.geode.test.dunit.VM.invoke(VM.java:431)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.testIncompleteInBaseline(IncrementalBackupDistributedTest.java:327)

"RMI TCP Connection(4)-172.17.0.9" #40 daemon prio=5 os_prio=0 cpu=1139.47ms elapsed=4212.37s tid=0x00007f2f94004800 nid=0x1f7 waiting on condition  [0x00007f301d826000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000ecf14698> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.4/LockSupport.java:234)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(java.base@11.0.4/AbstractQueuedSynchronizer.java:1079)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(java.base@11.0.4/AbstractQueuedSynchronizer.java:1369)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:278)
	at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.awaitWithCheck(StoppableCountDownLatch.java:120)
	at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:93)
	at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:692)
	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:639)
	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:620)
	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:534)
	at org.apache.geode.internal.cache.backup.BackupStep.send(BackupStep.java:66)
	at org.apache.geode.internal.cache.backup.BackupOperation.performBackupSteps(BackupOperation.java:121)
	at org.apache.geode.internal.cache.backup.BackupOperation.performBackupUnderLock(BackupOperation.java:92)
	at org.apache.geode.internal.cache.backup.BackupOperation.performBackup(BackupOperation.java:77)
	at org.apache.geode.internal.cache.backup.BackupOperation.backupAllMembers(BackupOperation.java:71)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.performBackup(IncrementalBackupDistributedTest.java:565)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.performBackup(IncrementalBackupDistributedTest.java:560)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.lambda$testIncompleteInBaseline$515fd116$1(IncrementalBackupDistributedTest.java:327)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40.call(Unknown Source)

"BackupServiceThread1" #276 daemon prio=10 os_prio=0 cpu=0.33ms elapsed=4152.24s tid=0x00007f377021b000 nid=0x50c waiting on condition  [0x00007f38391d6000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000e3ce6438> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
	at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:72)
	at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
	at org.apache.geode.internal.cache.backup.BackupService$$Lambda$406/0x0000000840c48040.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(java.base@11.0.4/FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)

"BackupServiceThread1" #266 daemon prio=5 os_prio=0 cpu=0.43ms elapsed=4156.14s tid=0x00007f2fac631800 nid=0x50e waiting on condition  [0x00007f2f455c7000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000ecf518b8> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
	at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:72)
	at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
	at org.apache.geode.internal.cache.backup.BackupService$$Lambda$390/0x0000000840be0c40.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(java.base@11.0.4/FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)

   Locked ownable synchronizers:
	- <0x00000000ece9fc10> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	- <0x00000000eceb37b0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	- <0x00000000ecf51990> (a java.util.concurrent.ThreadPoolExecutor$Worker)


"Pooled Message Processor 3" #86 daemon prio=10 os_prio=0 cpu=41.38ms elapsed=4176.70s tid=0x00007fcfb403f800 nid=0x307 waiting on condition  [0x00007fd0348d4000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000ecf68520> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
	at org.apache.geode.internal.cache.backup.BackupTask.getPreparedDiskStores(BackupTask.java:60)
	at org.apache.geode.internal.cache.backup.BackupService.prepareBackup(BackupService.java:63)
	at org.apache.geode.internal.cache.backup.PrepareBackup.run(PrepareBackup.java:41)
	at org.apache.geode.internal.cache.backup.PrepareBackupRequest.createResponse(PrepareBackupRequest.java:66)
	at org.apache.geode.internal.admin.remote.CliLegacyMessage.process(CliLegacyMessage.java:37)
	at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:372)
	at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:436)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
	at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:475)
	at org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:406)
	at org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$174/0x0000000840a39840.invoke(Unknown Source)
	at org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
	at org.apache.geode.internal.logging.LoggingThreadFactory$$Lambda$175/0x0000000840a39c40.run(Unknown Source)
	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)

{noformat}

Test output shows these as the last log entries written to logs & captured by gradle:

{noformat}
[vm0] [info 2019/10/15 22:40:23.697 GMT <Pooled Waiting Message Processor 1> tid=0xd0] Configured redundancy of 1 copies has been restored to /IncrementalBackupDistributedTest_testIncompleteInBaseline_region-2

[info 2019/10/15 22:40:23.698 GMT <Pooled Waiting Message Processor 1> tid=0xf1] Configured redundancy of 1 copies has been restored to /IncrementalBackupDistributedTest_testIncompleteInBaseline_region-2

[vm1] [info 2019/10/15 22:40:26.230 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840@2aba62dc
[vm1] [info 2019/10/15 22:40:26.230 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840@2aba62dc
[vm1] [info 2019/10/15 22:40:26.240 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40@6eaf6978
{noformat}



  was:
At least three distributed unit test runs have recently hung in this test.

https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1158
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1164
and
https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1172

Here are some threads that might be of interest in the latest hang:

{noformat}
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(java.base@11.0.4/Native Method)
	at java.net.SocketInputStream.socketRead(java.base@11.0.4/SocketInputStream.java:115)
	at java.net.SocketInputStream.read(java.base@11.0.4/SocketInputStream.java:168)
	at java.net.SocketInputStream.read(java.base@11.0.4/SocketInputStream.java:140)
	at java.io.BufferedInputStream.fill(java.base@11.0.4/BufferedInputStream.java:252)
	at java.io.BufferedInputStream.read(java.base@11.0.4/BufferedInputStream.java:271)
	- locked <0x00000000e3c4e9b8> (a java.io.BufferedInputStream)
	at java.io.DataInputStream.readByte(java.base@11.0.4/DataInputStream.java:270)
	at sun.rmi.transport.StreamRemoteCall.executeCall(java.rmi@11.0.4/StreamRemoteCall.java:222)
	at sun.rmi.server.UnicastRef.invoke(java.rmi@11.0.4/UnicastRef.java:161)
	at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(java.rmi@11.0.4/RemoteObjectInvocationHandler.java:209)
	at java.rmi.server.RemoteObjectInvocationHandler.invoke(java.rmi@11.0.4/RemoteObjectInvocationHandler.java:161)
	at com.sun.proxy.$Proxy55.executeMethodOnObject(Unknown Source)
	at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:576)
	at org.apache.geode.test.dunit.VM.invoke(VM.java:431)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.testIncompleteInBaseline(IncrementalBackupDistributedTest.java:327)

"RMI TCP Connection(4)-172.17.0.9" #40 daemon prio=5 os_prio=0 cpu=1139.47ms elapsed=4212.37s tid=0x00007f2f94004800 nid=0x1f7 waiting on condition  [0x00007f301d826000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000ecf14698> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.4/LockSupport.java:234)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(java.base@11.0.4/AbstractQueuedSynchronizer.java:1079)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(java.base@11.0.4/AbstractQueuedSynchronizer.java:1369)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:278)
	at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.awaitWithCheck(StoppableCountDownLatch.java:120)
	at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:93)
	at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:692)
	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:639)
	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:620)
	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:534)
	at org.apache.geode.internal.cache.backup.BackupStep.send(BackupStep.java:66)
	at org.apache.geode.internal.cache.backup.BackupOperation.performBackupSteps(BackupOperation.java:121)
	at org.apache.geode.internal.cache.backup.BackupOperation.performBackupUnderLock(BackupOperation.java:92)
	at org.apache.geode.internal.cache.backup.BackupOperation.performBackup(BackupOperation.java:77)
	at org.apache.geode.internal.cache.backup.BackupOperation.backupAllMembers(BackupOperation.java:71)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.performBackup(IncrementalBackupDistributedTest.java:565)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.performBackup(IncrementalBackupDistributedTest.java:560)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.lambda$testIncompleteInBaseline$515fd116$1(IncrementalBackupDistributedTest.java:327)
	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40.call(Unknown Source)

"BackupServiceThread1" #276 daemon prio=10 os_prio=0 cpu=0.33ms elapsed=4152.24s tid=0x00007f377021b000 nid=0x50c waiting on condition  [0x00007f38391d6000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000e3ce6438> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
	at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:72)
	at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
	at org.apache.geode.internal.cache.backup.BackupService$$Lambda$406/0x0000000840c48040.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(java.base@11.0.4/FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)

"BackupServiceThread1" #266 daemon prio=5 os_prio=0 cpu=0.43ms elapsed=4156.14s tid=0x00007f2fac631800 nid=0x50e waiting on condition  [0x00007f2f455c7000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000ecf518b8> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
	at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:72)
	at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
	at org.apache.geode.internal.cache.backup.BackupService$$Lambda$390/0x0000000840be0c40.call(Unknown Source)
	at java.util.concurrent.FutureTask.run(java.base@11.0.4/FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)

   Locked ownable synchronizers:
	- <0x00000000ece9fc10> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	- <0x00000000eceb37b0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
	- <0x00000000ecf51990> (a java.util.concurrent.ThreadPoolExecutor$Worker)


"Pooled Message Processor 3" #86 daemon prio=10 os_prio=0 cpu=41.38ms elapsed=4176.70s tid=0x00007fcfb403f800 nid=0x307 waiting on condition  [0x00007fd0348d4000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
	- parking to wait for  <0x00000000ecf68520> (a java.util.concurrent.CountDownLatch$Sync)
	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
	at org.apache.geode.internal.cache.backup.BackupTask.getPreparedDiskStores(BackupTask.java:60)
	at org.apache.geode.internal.cache.backup.BackupService.prepareBackup(BackupService.java:63)
	at org.apache.geode.internal.cache.backup.PrepareBackup.run(PrepareBackup.java:41)
	at org.apache.geode.internal.cache.backup.PrepareBackupRequest.createResponse(PrepareBackupRequest.java:66)
	at org.apache.geode.internal.admin.remote.CliLegacyMessage.process(CliLegacyMessage.java:37)
	at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:372)
	at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:436)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
	at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:475)
	at org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:406)
	at org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$174/0x0000000840a39840.invoke(Unknown Source)
	at org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
	at org.apache.geode.internal.logging.LoggingThreadFactory$$Lambda$175/0x0000000840a39c40.run(Unknown Source)
	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)

{noformat}

Test output shows these as the last log entries written to logs & captured by gradle:

{noformat}
[vm0] [info 2019/10/15 22:40:23.697 GMT <Pooled Waiting Message Processor 1> tid=0xd0] Configured redundancy of 1 copies has been restored to /IncrementalBackupDistributedTest_testIncompleteInBaseline_region-2

[info 2019/10/15 22:40:23.698 GMT <Pooled Waiting Message Processor 1> tid=0xf1] Configured redundancy of 1 copies has been restored to /IncrementalBackupDistributedTest_testIncompleteInBaseline_region-2

[vm1] [info 2019/10/15 22:40:26.230 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840@2aba62dc
[vm1] [info 2019/10/15 22:40:26.230 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840@2aba62dc
[vm1] [info 2019/10/15 22:40:26.240 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40@6eaf6978
{noformat}




> CI failure: hang in IncrementalBackupDistributedTest.testIncompleteInBaseline
> -----------------------------------------------------------------------------
>
>                 Key: GEODE-7310
>                 URL: https://issues.apache.org/jira/browse/GEODE-7310
>             Project: Geode
>          Issue Type: Improvement
>          Components: persistence
>            Reporter: Bruce J Schuchardt
>            Priority: Major
>
> At least three distributed unit test runs have recently hung in this test.
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1158
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1164
> and
> https://concourse.apachegeode-ci.info/teams/main/pipelines/apache-develop-main/jobs/DistributedTestOpenJDK11/builds/1173
> Here are some threads that might be of interest in the latest hang:
> {noformat}
>    java.lang.Thread.State: RUNNABLE
> 	at java.net.SocketInputStream.socketRead0(java.base@11.0.4/Native Method)
> 	at java.net.SocketInputStream.socketRead(java.base@11.0.4/SocketInputStream.java:115)
> 	at java.net.SocketInputStream.read(java.base@11.0.4/SocketInputStream.java:168)
> 	at java.net.SocketInputStream.read(java.base@11.0.4/SocketInputStream.java:140)
> 	at java.io.BufferedInputStream.fill(java.base@11.0.4/BufferedInputStream.java:252)
> 	at java.io.BufferedInputStream.read(java.base@11.0.4/BufferedInputStream.java:271)
> 	- locked <0x00000000e3c4e9b8> (a java.io.BufferedInputStream)
> 	at java.io.DataInputStream.readByte(java.base@11.0.4/DataInputStream.java:270)
> 	at sun.rmi.transport.StreamRemoteCall.executeCall(java.rmi@11.0.4/StreamRemoteCall.java:222)
> 	at sun.rmi.server.UnicastRef.invoke(java.rmi@11.0.4/UnicastRef.java:161)
> 	at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(java.rmi@11.0.4/RemoteObjectInvocationHandler.java:209)
> 	at java.rmi.server.RemoteObjectInvocationHandler.invoke(java.rmi@11.0.4/RemoteObjectInvocationHandler.java:161)
> 	at com.sun.proxy.$Proxy55.executeMethodOnObject(Unknown Source)
> 	at org.apache.geode.test.dunit.VM.executeMethodOnObject(VM.java:576)
> 	at org.apache.geode.test.dunit.VM.invoke(VM.java:431)
> 	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.testIncompleteInBaseline(IncrementalBackupDistributedTest.java:327)
> "RMI TCP Connection(4)-172.17.0.9" #40 daemon prio=5 os_prio=0 cpu=1139.47ms elapsed=4212.37s tid=0x00007f2f94004800 nid=0x1f7 waiting on condition  [0x00007f301d826000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> 	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
> 	- parking to wait for  <0x00000000ecf14698> (a java.util.concurrent.CountDownLatch$Sync)
> 	at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.4/LockSupport.java:234)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(java.base@11.0.4/AbstractQueuedSynchronizer.java:1079)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(java.base@11.0.4/AbstractQueuedSynchronizer.java:1369)
> 	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:278)
> 	at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.awaitWithCheck(StoppableCountDownLatch.java:120)
> 	at org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:93)
> 	at org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:692)
> 	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:639)
> 	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:620)
> 	at org.apache.geode.distributed.internal.ReplyProcessor21.waitForReplies(ReplyProcessor21.java:534)
> 	at org.apache.geode.internal.cache.backup.BackupStep.send(BackupStep.java:66)
> 	at org.apache.geode.internal.cache.backup.BackupOperation.performBackupSteps(BackupOperation.java:121)
> 	at org.apache.geode.internal.cache.backup.BackupOperation.performBackupUnderLock(BackupOperation.java:92)
> 	at org.apache.geode.internal.cache.backup.BackupOperation.performBackup(BackupOperation.java:77)
> 	at org.apache.geode.internal.cache.backup.BackupOperation.backupAllMembers(BackupOperation.java:71)
> 	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.performBackup(IncrementalBackupDistributedTest.java:565)
> 	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.performBackup(IncrementalBackupDistributedTest.java:560)
> 	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest.lambda$testIncompleteInBaseline$515fd116$1(IncrementalBackupDistributedTest.java:327)
> 	at org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40.call(Unknown Source)
> "BackupServiceThread1" #276 daemon prio=10 os_prio=0 cpu=0.33ms elapsed=4152.24s tid=0x00007f377021b000 nid=0x50c waiting on condition  [0x00007f38391d6000]
>    java.lang.Thread.State: WAITING (parking)
> 	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
> 	- parking to wait for  <0x00000000e3ce6438> (a java.util.concurrent.CountDownLatch$Sync)
> 	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
> 	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
> 	at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:72)
> 	at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
> 	at org.apache.geode.internal.cache.backup.BackupService$$Lambda$406/0x0000000840c48040.call(Unknown Source)
> 	at java.util.concurrent.FutureTask.run(java.base@11.0.4/FutureTask.java:264)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
> 	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)
> "BackupServiceThread1" #266 daemon prio=5 os_prio=0 cpu=0.43ms elapsed=4156.14s tid=0x00007f2fac631800 nid=0x50e waiting on condition  [0x00007f2f455c7000]
>    java.lang.Thread.State: WAITING (parking)
> 	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
> 	- parking to wait for  <0x00000000ecf518b8> (a java.util.concurrent.CountDownLatch$Sync)
> 	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
> 	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
> 	at org.apache.geode.internal.cache.backup.BackupTask.backup(BackupTask.java:72)
> 	at org.apache.geode.internal.cache.backup.BackupService.lambda$prepareBackup$0(BackupService.java:62)
> 	at org.apache.geode.internal.cache.backup.BackupService$$Lambda$390/0x0000000840be0c40.call(Unknown Source)
> 	at java.util.concurrent.FutureTask.run(java.base@11.0.4/FutureTask.java:264)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
> 	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)
>    Locked ownable synchronizers:
> 	- <0x00000000ece9fc10> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> 	- <0x00000000eceb37b0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> 	- <0x00000000ecf51990> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "Pooled Message Processor 3" #86 daemon prio=10 os_prio=0 cpu=41.38ms elapsed=4176.70s tid=0x00007fcfb403f800 nid=0x307 waiting on condition  [0x00007fd0348d4000]
>    java.lang.Thread.State: WAITING (parking)
> 	at jdk.internal.misc.Unsafe.park(java.base@11.0.4/Native Method)
> 	- parking to wait for  <0x00000000ecf68520> (a java.util.concurrent.CountDownLatch$Sync)
> 	at java.util.concurrent.locks.LockSupport.park(java.base@11.0.4/LockSupport.java:194)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.4/AbstractQueuedSynchronizer.java:885)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1039)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@11.0.4/AbstractQueuedSynchronizer.java:1345)
> 	at java.util.concurrent.CountDownLatch.await(java.base@11.0.4/CountDownLatch.java:232)
> 	at org.apache.geode.internal.cache.backup.BackupTask.getPreparedDiskStores(BackupTask.java:60)
> 	at org.apache.geode.internal.cache.backup.BackupService.prepareBackup(BackupService.java:63)
> 	at org.apache.geode.internal.cache.backup.PrepareBackup.run(PrepareBackup.java:41)
> 	at org.apache.geode.internal.cache.backup.PrepareBackupRequest.createResponse(PrepareBackupRequest.java:66)
> 	at org.apache.geode.internal.admin.remote.CliLegacyMessage.process(CliLegacyMessage.java:37)
> 	at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:372)
> 	at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:436)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.4/ThreadPoolExecutor.java:1128)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.4/ThreadPoolExecutor.java:628)
> 	at org.apache.geode.distributed.internal.ClusterOperationExecutors.runUntilShutdown(ClusterOperationExecutors.java:475)
> 	at org.apache.geode.distributed.internal.ClusterOperationExecutors.doProcessingThread(ClusterOperationExecutors.java:406)
> 	at org.apache.geode.distributed.internal.ClusterOperationExecutors$$Lambda$174/0x0000000840a39840.invoke(Unknown Source)
> 	at org.apache.geode.internal.logging.LoggingThreadFactory.lambda$newThread$0(LoggingThreadFactory.java:119)
> 	at org.apache.geode.internal.logging.LoggingThreadFactory$$Lambda$175/0x0000000840a39c40.run(Unknown Source)
> 	at java.lang.Thread.run(java.base@11.0.4/Thread.java:834)
> {noformat}
> Test output shows these as the last log entries written to logs & captured by gradle:
> {noformat}
> [vm0] [info 2019/10/15 22:40:23.697 GMT <Pooled Waiting Message Processor 1> tid=0xd0] Configured redundancy of 1 copies has been restored to /IncrementalBackupDistributedTest_testIncompleteInBaseline_region-2
> [info 2019/10/15 22:40:23.698 GMT <Pooled Waiting Message Processor 1> tid=0xf1] Configured redundancy of 1 copies has been restored to /IncrementalBackupDistributedTest_testIncompleteInBaseline_region-2
> [vm1] [info 2019/10/15 22:40:26.230 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840@2aba62dc
> [vm1] [info 2019/10/15 22:40:26.230 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$417/0x0000000840c20840@2aba62dc
> [vm1] [info 2019/10/15 22:40:26.240 GMT <RMI TCP Connection(4)-172.17.0.9> tid=0x28] Received method: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40.call with 0 args on object: org.apache.geode.internal.cache.backup.IncrementalBackupDistributedTest$$Lambda$418/0x0000000840c20c40@6eaf6978
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)