Posted to issues@ambari.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2016/10/28 15:02:58 UTC

[jira] [Commented] (AMBARI-18728) During cluster install, Components get timed out icon while starting

    [ https://issues.apache.org/jira/browse/AMBARI-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15615645#comment-15615645 ] 

Hadoop QA commented on AMBARI-18728:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12835824/AMBARI-18728.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/Ambari-trunk-test-patch/9047//testReport/
Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/9047//console

This message is automatically generated.

> During cluster install, Components get timed out icon while starting
> --------------------------------------------------------------------
>
>                 Key: AMBARI-18728
>                 URL: https://issues.apache.org/jira/browse/AMBARI-18728
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Andrew Onischuk
>            Assignee: Andrew Onischuk
>             Fix For: 2.5.0
>
>         Attachments: AMBARI-18728.patch
>
>
> This was caused by a very tricky race condition in the way python multiprocessing.thread works, resulting in a deadlock in the ambari_agent.ActionQueue thread.
> The problem is the flow below.
> If all three of these steps get executed at the same time (a very rare occasion):
> 1. Process1 executes queue.get(False)
> 2. Process2 executes queue.put(largeObjectWhichTakesLongTimeToPut)
> 3. Someone kills Process2.
> This results in a deadlock in the get of Process1, which is caused by the queue locks/semaphores not being released during the put of Process2.
> I have written a script, test_race_condition.py, to emulate this behaviour, and could indeed reproduce it and test the fix for it.
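
The three steps above can be emulated roughly as follows. This is only a minimal sketch and not the reporter's test_race_condition.py, which is not shown in this message. The names emulate_race, poll_queue and put_large_object, the payload size, and the timings are all hypothetical, and the sketch assumes a POSIX platform where the "fork" start method is available. Because the race is timing dependent, a hang may or may not occur on any given run; a watchdog timeout keeps the sketch itself from blocking forever.

```python
import multiprocessing
import time
from queue import Empty

# Assumption: POSIX platform. The "fork" start method avoids re-importing
# this module in the child processes.
ctx = multiprocessing.get_context("fork")


def put_large_object(q, size):
    """Process2: put an object large enough that the put takes a while."""
    q.put(b"x" * size)


def poll_queue(q, duration):
    """Process1: call q.get(False) repeatedly for `duration` seconds."""
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        try:
            q.get(False)
        except Empty:
            pass


def emulate_race(payload_size=50_000_000, kill_after=0.05, watchdog=3.0):
    """Return True if the getter process was still stuck after `watchdog`
    seconds, False if it finished on its own. All defaults are guesses."""
    q = ctx.Queue()
    getter = ctx.Process(target=poll_queue, args=(q, 1.0))
    putter = ctx.Process(target=put_large_object, args=(q, payload_size))
    getter.start()            # step 1: Process1 polls with get(False)
    putter.start()            # step 2: Process2 puts a large object
    time.sleep(kill_after)
    putter.terminate()        # step 3: someone kills Process2
    putter.join()
    getter.join(watchdog)     # wait for the getter, but not forever
    hung = getter.is_alive()
    if hung:
        getter.terminate()    # clean up the stuck getter
        getter.join()
    return hung
```

Calling emulate_race() a number of times with different kill_after values is one way to search for the window in which the getter hangs; the return value only reports whether the watchdog had to step in.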



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)