Posted to issues@flink.apache.org by "Chesnay Schepler (Jira)" <ji...@apache.org> on 2022/06/15 12:19:00 UTC

[jira] [Commented] (FLINK-28077) KeyedStateCheckpointingITCase.testWithMemoryBackendSync runs into timeout

    [ https://issues.apache.org/jira/browse/FLINK-28077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554545#comment-17554545 ] 

Chesnay Schepler commented on FLINK-28077:
------------------------------------------

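The TaskManager (TM) is crashing because a task gets stuck during cancellation. The task thread blocks in {{Thread.join()}}, called from {{ChannelStateWriteRequestExecutorImpl.close()}}: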
{code}
java.lang.Object.wait(Native Method)
java.lang.Thread.join(Thread.java:1252)
java.lang.Thread.join(Thread.java:1326)
org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl.close(ChannelStateWriteRequestExecutorImpl.java:166)
org.apache.flink.runtime.checkpoint.channel.ChannelStateWriterImpl.close(ChannelStateWriterImpl.java:234)
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.cancel(SubtaskCheckpointCoordinatorImpl.java:560)
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.close(SubtaskCheckpointCoordinatorImpl.java:547)
org.apache.flink.streaming.runtime.tasks.StreamTask$$Lambda$1220/1213892815.close(Unknown Source)
org.apache.flink.util.IOUtils.closeAll(IOUtils.java:254)
org.apache.flink.core.fs.AutoCloseableRegistry.doClose(AutoCloseableRegistry.java:72)
org.apache.flink.util.AbstractAutoCloseableRegistry.close(AbstractAutoCloseableRegistry.java:127)
org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:938)
org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$1(Task.java:923)
org.apache.flink.runtime.taskmanager.Task$$Lambda$1886/761627220.run(Unknown Source)
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:923)
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
java.lang.Thread.run(Thread.java:748)
{code}
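
For illustration only: the top frames above ({{Object.wait}} reached through the no-argument {{Thread.join()}}) are what an untimed join looks like when the joined thread never terminates. The following is a minimal, standalone sketch of that blocking pattern and of a bounded alternative. It is not Flink code and not a proposed fix for this ticket. The class name, both worker helpers and {{closeWithTimeout}} are hypothetical names invented for this example.

```java
import java.util.concurrent.CountDownLatch;

public class BoundedJoinSketch {

    /**
     * Hypothetical helper: interrupts the worker and waits at most
     * timeoutMillis for it to terminate. timeoutMillis must be > 0,
     * because Thread.join(0) means "wait forever".
     * Returns true if the worker has terminated.
     */
    static boolean closeWithTimeout(Thread worker, long timeoutMillis)
            throws InterruptedException {
        worker.interrupt();
        worker.join(timeoutMillis);
        return !worker.isAlive();
    }

    /** A worker that swallows interrupts and only exits once the latch is released. */
    static Thread startStubbornWorker(CountDownLatch release) {
        Thread t = new Thread(() -> {
            while (true) {
                try {
                    release.await();
                    return;
                } catch (InterruptedException ignored) {
                    // Ignore the interrupt and keep waiting; this simulates
                    // a worker that does not react to cancellation.
                }
            }
        }, "stubborn-worker");
        t.setDaemon(true); // do not keep the JVM alive in this demo
        t.start();
        return t;
    }

    /** A worker that exits as soon as it is interrupted. */
    static Thread startCooperativeWorker() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Exit on interrupt.
            }
        }, "cooperative-worker");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws Exception {
        Thread cooperative = startCooperativeWorker();
        System.out.println("cooperative worker terminated: "
                + closeWithTimeout(cooperative, 2_000));

        CountDownLatch release = new CountDownLatch(1);
        Thread stubborn = startStubbornWorker(release);
        // An untimed stubborn.join() here would block indefinitely.
        // The bounded variant returns after the timeout instead.
        System.out.println("stubborn worker terminated: "
                + closeWithTimeout(stubborn, 200));

        release.countDown(); // let the stubborn worker finish
        stubborn.join(2_000);
    }
}
```

Whether a bounded join is appropriate in the actual close path is a separate design question. The sketch only shows why the closing thread cannot make progress while the joined thread stays alive.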

> KeyedStateCheckpointingITCase.testWithMemoryBackendSync runs into timeout
> -------------------------------------------------------------------------
>
>                 Key: FLINK-28077
>                 URL: https://issues.apache.org/jira/browse/FLINK-28077
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing, Tests
>    Affects Versions: 1.16.0
>            Reporter: Matthias Pohl
>            Priority: Major
>              Labels: test-stability
>
> [Build #36209|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=36209&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=9370] got stuck in {{KeyedStateCheckpointingITCase.testWithMemoryBackendSync}}:
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x00007f849c00b800 nid=0x19c3 waiting on condition [0x00007f84a45b7000]
>    java.lang.Thread.State: WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x0000000080074870> (a java.util.concurrent.CompletableFuture$Signaller)
> 	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> 	at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> 	at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> 	at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> 	at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> 	at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1989)
> 	at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1951)
> 	at org.apache.flink.test.checkpointing.KeyedStateCheckpointingITCase.testProgramWithBackend(KeyedStateCheckpointingITCase.java:175)
> 	at org.apache.flink.test.checkpointing.KeyedStateCheckpointingITCase.testWithMemoryBackendSync(KeyedStateCheckpointingITCase.java:104)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [...]
> {code}


