Posted to dev@oozie.apache.org by "Junfan Zhang (Jira)" <ji...@apache.org> on 2021/12/14 05:46:00 UTC

[jira] [Comment Edited] (OOZIE-3646) Possible dead-lock in SignalXCommand

    [ https://issues.apache.org/jira/browse/OOZIE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458913#comment-17458913 ] 

Junfan Zhang edited comment on OOZIE-3646 at 12/14/21, 5:45 AM:
----------------------------------------------------------------

Thanks [~dionusos].
The test case is now attached: https://github.com/apache/oozie/pull/65

If you run the {{testPossibleDeadLock}} method, it fails.
If you instead set {{ConfigurationService.setBoolean(SignalXCommand.FORK_PARALLEL_JOBSUBMISSION, false);}}, everything is OK.
The cause is the synchronous call in {{SignalXCommand}}:
{code:java}
List<Future<ActionExecutorContext>> futures = Services.get().get(CallableQueueService.class)
        .invokeAll(tasks);
{code}

Please check it and let me know what you think [~dionusos]
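
To illustrate the failure mode outside Oozie, here is a minimal sketch that uses only plain {{java.util.concurrent}} classes, not Oozie's {{CallableQueueService}}. The class name {{ForkStarvationDemo}} and the method {{completedChildren}} are hypothetical. Every "parent" task runs on a pool thread and synchronously calls {{invokeAll}} on the same pool. A 500 ms timeout is an assumption added only so that the demo terminates instead of hanging.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it does not use any Oozie API.
final class ForkStarvationDemo {

    /**
     * Runs `parents` tasks on a fixed pool of `poolSize` threads. Each parent
     * synchronously calls invokeAll on the SAME pool for one child task.
     * Returns how many children actually ran before the timeout cancelled them.
     */
    static int completedChildren(int poolSize, int parents) {
        if (parents > poolSize) {
            throw new IllegalArgumentException("parents must not exceed poolSize");
        }
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch allStarted = new CountDownLatch(parents);
        CountDownLatch allDone = new CountDownLatch(parents);
        try {
            List<Future<Integer>> parentResults = new ArrayList<>();
            for (int i = 0; i < parents; i++) {
                parentResults.add(pool.submit(() -> {
                    // Wait until every parent occupies its own pool thread.
                    allStarted.countDown();
                    allStarted.await();
                    Callable<String> child = () -> "done";
                    // Blocking call from inside a pool thread; the timeout is
                    // only here so the demo cannot hang forever.
                    List<Future<String>> futures = pool.invokeAll(
                            Collections.singletonList(child), 500, TimeUnit.MILLISECONDS);
                    // Keep this thread occupied until all parents have returned
                    // from invokeAll, so no thread is freed early.
                    allDone.countDown();
                    allDone.await();
                    return futures.get(0).isCancelled() ? 0 : 1;
                }));
            }
            int completed = 0;
            for (Future<Integer> result : parentResults) {
                completed += result.get();
            }
            return completed;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("saturated pool: " + completedChildren(2, 2));
        System.out.println("one spare thread: " + completedChildren(2, 1));
    }
}
```

When the parents occupy every thread, the children sit in the queue and only the timeout releases the callers. Without the timeout, which is the situation described for {{invokeAll(tasks)}}, the callers would wait forever. With one spare thread the child runs normally.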




> Possible dead-lock in SignalXCommand
> ------------------------------------
>
>                 Key: OOZIE-3646
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3646
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Junfan Zhang
>            Priority: Major
>
> The limited thread execution mechanism aims to prevent the dead-lock that occurs when all active threads are executing SignalXCommand's invokeAll call.
> h2. When the dead-lock happens
> Assume the Oozie CallableQueue thread pool size is 120. When all 120 threads are executing the {{SignalXCommand.startForkedActions}} method, a dead-lock occurs.
> In {{SignalXCommand.startForkedActions}}, the call {{List<Future<ActionExecutorContext>> futures = Services.get().get(CallableQueueService.class).invokeAll(tasks);}} executes synchronously, but all CallableQueue threads are busy at that point, so the submitted tasks can never start and the callers wait forever.
> h2. Solution
> 1. Avoid calling invokeAll directly when the number of remaining free threads is less than the number of tasks.
> 2. Hold the SignalXCommand.class lock while reading the number of active threads in the CallableQueue, so that the count is correct.
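
The two points of the proposed solution could be sketched roughly as follows. This is not the code from the pull request and it does not use Oozie's {{CallableQueueService}}; it works on a plain {{ThreadPoolExecutor}}. The names {{GuardedForkSubmitter}}, {{Outcome}}, {{runForked}} and {{demo}} are hypothetical, the {{LOCK}} object merely stands in for the {{SignalXCommand.class}} lock, and running the tasks in the calling thread when too few threads are free is an assumed fallback. Note that {{getActiveCount()}} is documented as an approximate value, so this sketch narrows the window for starvation but does not prove its absence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; it only illustrates the idea with java.util.concurrent.
final class GuardedForkSubmitter {

    // Stand-in for the SignalXCommand.class lock mentioned in the issue.
    private static final Object LOCK = new Object();

    /** Records which path was taken and the task results, in task order. */
    static final class Outcome<T> {
        final boolean parallel;
        final List<T> values;

        Outcome(boolean parallel, List<T> values) {
            this.parallel = parallel;
            this.values = values;
        }
    }

    static <T> Outcome<T> runForked(ThreadPoolExecutor pool, List<Callable<T>> tasks)
            throws Exception {
        List<Future<T>> futures = new ArrayList<>();
        boolean parallel;
        synchronized (LOCK) {
            // Check and submit under one lock so concurrent callers cannot
            // both claim the same free threads.
            int free = pool.getMaximumPoolSize()
                    - pool.getActiveCount()      // approximate, per the Javadoc
                    - pool.getQueue().size();    // work already waiting
            parallel = free >= tasks.size();
            if (parallel) {
                for (Callable<T> task : tasks) {
                    futures.add(pool.submit(task)); // non-blocking submit
                }
            }
        }
        List<T> values = new ArrayList<>();
        if (parallel) {
            // Wait outside the lock so other callers are not serialized.
            for (Future<T> future : futures) {
                values.add(future.get());
            }
        } else {
            // Assumed fallback: run the tasks one by one in the calling thread.
            for (Callable<T> task : tasks) {
                values.add(task.call());
            }
        }
        return new Outcome<>(parallel, values);
    }

    /** Small driver: squares 0..taskCount-1 on a fresh pool of poolSize threads. */
    static Outcome<Integer> demo(int poolSize, int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int n = i;
                tasks.add(() -> n * n);
            }
            return runForked(pool, tasks);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Called from outside the pool, {{demo(2, 2)}} takes the parallel path because two threads are free for two tasks, while {{demo(2, 3)}} falls back to the calling thread. Both return the same values in task order.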



--
This message was sent by Atlassian Jira
(v8.20.1#820001)