You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Chesnay Schepler (Jira)" <ji...@apache.org> on 2022/06/15 12:19:00 UTC
[jira] [Commented] (FLINK-28077) KeyedStateCheckpointingITCase.testWithMemoryBackendSync runs into timeout
[ https://issues.apache.org/jira/browse/FLINK-28077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554545#comment-17554545 ]
Chesnay Schepler commented on FLINK-28077:
------------------------------------------
The TM is crashing because a task gets stuck during cancellation:
{code}
java.lang.Object.wait(Native Method)
java.lang.Thread.join(Thread.java:1252)
java.lang.Thread.join(Thread.java:1326)
org.apache.flink.runtime.checkpoint.channel.ChannelStateWriteRequestExecutorImpl.close(ChannelStateWriteRequestExecutorImpl.java:166)
org.apache.flink.runtime.checkpoint.channel.ChannelStateWriterImpl.close(ChannelStateWriterImpl.java:234)
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.cancel(SubtaskCheckpointCoordinatorImpl.java:560)
org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.close(SubtaskCheckpointCoordinatorImpl.java:547)
org.apache.flink.streaming.runtime.tasks.StreamTask$$Lambda$1220/1213892815.close(Unknown Source)
org.apache.flink.util.IOUtils.closeAll(IOUtils.java:254)
org.apache.flink.core.fs.AutoCloseableRegistry.doClose(AutoCloseableRegistry.java:72)
org.apache.flink.util.AbstractAutoCloseableRegistry.close(AbstractAutoCloseableRegistry.java:127)
org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUp(StreamTask.java:938)
org.apache.flink.runtime.taskmanager.Task.lambda$restoreAndInvoke$1(Task.java:923)
org.apache.flink.runtime.taskmanager.Task$$Lambda$1886/761627220.run(Unknown Source)
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:923)
org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
java.lang.Thread.run(Thread.java:748)
{code}
> KeyedStateCheckpointingITCase.testWithMemoryBackendSync runs into timeout
> -------------------------------------------------------------------------
>
> Key: FLINK-28077
> URL: https://issues.apache.org/jira/browse/FLINK-28077
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing, Tests
> Affects Versions: 1.16.0
> Reporter: Matthias Pohl
> Priority: Major
> Labels: test-stability
>
> [Build #36209|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=36209&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=9370] got stuck in {{KeyedStateCheckpointingITCase.testWithMemoryBackendSync}}:
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x00007f849c00b800 nid=0x19c3 waiting on condition [0x00007f84a45b7000]
> java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x0000000080074870> (a java.util.concurrent.CompletableFuture$Signaller)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
> at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
> at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
> at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1989)
> at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1951)
> at org.apache.flink.test.checkpointing.KeyedStateCheckpointingITCase.testProgramWithBackend(KeyedStateCheckpointingITCase.java:175)
> at org.apache.flink.test.checkpointing.KeyedStateCheckpointingITCase.testWithMemoryBackendSync(KeyedStateCheckpointingITCase.java:104)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [...]
> {code}
--
This message was sent by Atlassian Jira
(v8.20.7#820007)