Posted to issues@flink.apache.org by "Zhipeng Zhang (Jira)" <ji...@apache.org> on 2023/04/24 07:51:00 UTC

[jira] [Updated] (FLINK-31910) Using BroadcastUtils#withBroadcast in iteration perround mode got stuck

     [ https://issues.apache.org/jira/browse/FLINK-31910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhipeng Zhang updated FLINK-31910:
----------------------------------
    Description: 
A job that uses BroadcastUtils#withBroadcast inside an iteration in per-round mode gets stuck. The thread dump suggests that the head operator and the criteria node are both blocked, each waiting for a mail in its task mailbox.

 
{code:java}
"output-head-Parallel Collection Source -> Sink: Unnamed (4/4)#0" #228 prio=5 os_prio=31 tid=0x00007f9e1d083800 nid=0x19b07 waiting on condition [0x0000700013db6000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000747a83270> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
    at org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.take(TaskMailboxImpl.java:149)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:335)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753)
    at org.apache.flink.runtime.taskmanager.Task$$Lambda$1383/280145505.run(Unknown Source)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
    at java.lang.Thread.run(Thread.java:748)
{code}
 

A demo that reproduces this bug can be found here: https://github.com/zhipeng93/flink-ml/tree/FLINK-31910-demo-case
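
For readers unfamiliar with this kind of thread dump, the following is a minimal sketch of why a thread that waits on an empty mailbox shows up as TIMED_WAITING (parking). It is not Flink's TaskMailboxImpl: the class MailboxWaitSketch, its methods, and the one-second timeout are hypothetical, chosen only to mirror the Condition.await / LockSupport.parkNanos frames in the trace above. It says nothing about why the mail never arrives in the reported job.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a task mailbox; NOT Flink's TaskMailboxImpl. */
final class MailboxWaitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Queue<Runnable> mails = new ArrayDeque<>();

    /** Blocks until a mail is available, re-checking the queue after each timed wait. */
    Runnable take() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            while (mails.isEmpty()) {
                // Timed wait on the condition (timeout is an assumption of this sketch).
                // While parked here the thread is reported as TIMED_WAITING.
                notEmpty.await(1, TimeUnit.SECONDS);
            }
            return mails.remove();
        } finally {
            lock.unlock();
        }
    }

    /** Enqueues a mail and wakes up one waiting consumer. */
    void put(Runnable mail) {
        lock.lock();
        try {
            mails.add(mail);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }

    /** Starts a consumer on an empty mailbox and returns the state it is observed in. */
    static Thread.State observeIdleConsumer() {
        MailboxWaitSketch mailbox = new MailboxWaitSketch();
        Thread consumer = new Thread(() -> {
            try {
                mailbox.take().run();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "mailbox-consumer");
        consumer.setDaemon(true);
        consumer.start();
        try {
            // Poll for up to five seconds until the consumer has parked.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            Thread.State state = consumer.getState();
            while (state != Thread.State.TIMED_WAITING && System.nanoTime() < deadline) {
                Thread.sleep(10);
                state = consumer.getState();
            }
            // Deliver a no-op mail so the consumer can leave take() and finish.
            mailbox.put(() -> {});
            consumer.join(5000);
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while observing the consumer", e);
        }
    }
}
```

Calling MailboxWaitSketch.observeIdleConsumer() returns the state the consumer thread was seen in while nothing had been enqueued. In the sketch the consumer only makes progress once put() is called, which matches the description of operators that sit idle waiting for a mail.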

  was:
A job that uses BroadcastUtils#withBroadcast inside an iteration in per-round mode gets stuck. The thread dump suggests that the head operator and the criteria node are both blocked, each waiting for a mail in its task mailbox.

 
{code:java}
"output-head-Parallel Collection Source -> Sink: Unnamed (4/4)#0" #228 prio=5 os_prio=31 tid=0x00007f9e1d083800 nid=0x19b07 waiting on condition [0x0000700013db6000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x0000000747a83270> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
    at org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl.take(TaskMailboxImpl.java:149)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:335)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753)
    at org.apache.flink.runtime.taskmanager.Task$$Lambda$1383/280145505.run(Unknown Source)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563)
    at java.lang.Thread.run(Thread.java:748)
{code}
 

The demo for this bug could be found here: 


> Using BroadcastUtils#withBroadcast in iteration perround mode got stuck
> -----------------------------------------------------------------------
>
>                 Key: FLINK-31910
>                 URL: https://issues.apache.org/jira/browse/FLINK-31910
>             Project: Flink
>          Issue Type: Bug
>          Components: Library / Machine Learning
>    Affects Versions: ml-2.3.0
>            Reporter: Zhipeng Zhang
>            Priority: Major


