You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ambari.apache.org by jun aoki <ju...@gmail.com> on 2014/10/09 20:34:44 UTC

Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/
-----------------------------------------------------------

Review request for Ambari and Yusaku Sako.


Bugs: AMBARI-7622
    https://issues.apache.org/jira/browse/AMBARI-7622


Repository: ambari


Description
-------

Tweaked the waiting condition upon ActionScheduler


Diffs
-----

  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 

Diff: https://reviews.apache.org/r/26510/diff/


Testing
-------


Thanks,

jun aoki


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.

> On Oct. 9, 2014, 10:28 p.m., Sid Wagle wrote:
> > ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java, line 431
> > <https://reviews.apache.org/r/26510/diff/2/?file=717035#file717035line431>
> >
> >     The NPE seems to be from,
> >     line 782: commandParamsStageCache.put(stagePk, commandParams)
> >     
> >     Seems like we have command params as null so guava cache complains about it. Maybe, "StageUtils.getGson().fromJson(s.getCommandParamsStage(), type)" returns null for empty string, note: we are passing empty string for comandParamsStage.
> >     
> >     Not sure how this would be resolved after some time?
> >     A null check in ActionScheduler might not be a bad idea, your call.

Sid, yes it is from StageUtils.getGso()... call returning a null.
I'm not too sure either how it is resolved but it has been working this way and we solved asychronous behaviour. I'd like to say it is good to go. Please give a Ship it if you are OK.


- jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56070
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ja...@apache.org>.
I just want to make my point clear to everyone.

1. Sid's idea is right and ideal.
2. We are suffering from the trunk build failing due with a *one liner
wrong if statement* (let me know if you think I'm wrong on this)
3. The test code by using asynchronous has been *running this way for some
time* and designed by some one else. (and I didn't know anything about it)
4. Thus, I just want to fix the one line (if statement) and get the trunk
build work.
5. We make another ticket to Sid's point and will come back.

Don't get me wrong. I agree Sid's idea is the right thing, but it is more
like refactoring.
I'm only interested in getting trunk work by fixing the wrong if statement.

- jun

On Fri, Oct 10, 2014 at 10:44 AM, John Speidel <js...@hortonworks.com>
wrote:

> I don't feel that it would be appropriate to merge without addressing
> Sid's concerns.
>
> -John
>
> On Fri, Oct 10, 2014 at 1:34 PM, Alejandro Fernandez <
> afernandez@hortonworks.com> wrote:
>
>>
>>
>> > On Oct. 9, 2014, 10:28 p.m., Sid Wagle wrote:
>> > >
>> ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java,
>> line 431
>> > > <
>> https://reviews.apache.org/r/26510/diff/2/?file=717035#file717035line431>
>> > >
>> > >     The NPE seems to be from,
>> > >     line 782: commandParamsStageCache.put(stagePk, commandParams)
>> > >
>> > >     Seems like we have command params as null so guava cache
>> complains about it. Maybe,
>> "StageUtils.getGson().fromJson(s.getCommandParamsStage(), type)" returns
>> null for empty string, note: we are passing empty string for
>> comandParamsStage.
>> > >
>> > >     Not sure how this would be resolved after some time?
>> > >     A null check in ActionScheduler might not be a bad idea, your
>> call.
>> >
>> > jun aoki wrote:
>> >     Sid, yes it is from StageUtils.getGso()... call returning a null.
>> >     I'm not too sure either how it is resolved but it has been working
>> this way and we solved asychronous behaviour. I'd like to say it is good to
>> go. Please give a Ship it if you are OK.
>> >
>> > Sid Wagle wrote:
>> >     Could you try putting a Null check after, line 782:
>> commandParamsStageCache.put(stagePk, commandParams) and removing the while
>> condition?
>> >     If that works there is no need to catch the NPE, I am not convinced
>> NPE is expected behavior.
>>
>> I concur.
>>
>>
>> - Alejandro
>>
>>
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/26510/#review56070
>> -----------------------------------------------------------
>>
>>
>> On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
>> >
>> > -----------------------------------------------------------
>> > This is an automatically generated e-mail. To reply, visit:
>> > https://reviews.apache.org/r/26510/
>> > -----------------------------------------------------------
>> >
>> > (Updated Oct. 9, 2014, 10:05 p.m.)
>> >
>> >
>> > Review request for Ambari and Yusaku Sako.
>> >
>> >
>> > Bugs: AMBARI-7622
>> >     https://issues.apache.org/jira/browse/AMBARI-7622
>> >
>> >
>> > Repository: ambari
>> >
>> >
>> > Description
>> > -------
>> >
>> > Tweaked the waiting condition upon ActionScheduler
>> >
>> >
>> > Diffs
>> > -----
>> >
>> >
>>  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
>> a20f252
>> >
>> > Diff: https://reviews.apache.org/r/26510/diff/
>> >
>> >
>> > Testing
>> > -------
>> >
>> >
>> > Thanks,
>> >
>> > jun aoki
>> >
>> >
>>
>>
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
>



-- 
-jun

Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by John Speidel <js...@hortonworks.com>.
I don't feel that it would be appropriate to merge without addressing Sid's
concerns.

-John

On Fri, Oct 10, 2014 at 1:34 PM, Alejandro Fernandez <
afernandez@hortonworks.com> wrote:

>
>
> > On Oct. 9, 2014, 10:28 p.m., Sid Wagle wrote:
> > >
> ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java,
> line 431
> > > <
> https://reviews.apache.org/r/26510/diff/2/?file=717035#file717035line431>
> > >
> > >     The NPE seems to be from,
> > >     line 782: commandParamsStageCache.put(stagePk, commandParams)
> > >
> > >     Seems like we have command params as null so guava cache complains
> about it. Maybe, "StageUtils.getGson().fromJson(s.getCommandParamsStage(),
> type)" returns null for empty string, note: we are passing empty string for
> comandParamsStage.
> > >
> > >     Not sure how this would be resolved after some time?
> > >     A null check in ActionScheduler might not be a bad idea, your call.
> >
> > jun aoki wrote:
> >     Sid, yes it is from StageUtils.getGso()... call returning a null.
> >     I'm not too sure either how it is resolved but it has been working
> this way and we solved asychronous behaviour. I'd like to say it is good to
> go. Please give a Ship it if you are OK.
> >
> > Sid Wagle wrote:
> >     Could you try putting a Null check after, line 782:
> commandParamsStageCache.put(stagePk, commandParams) and removing the while
> condition?
> >     If that works there is no need to catch the NPE, I am not convinced
> NPE is expected behavior.
>
> I concur.
>
>
> - Alejandro
>
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/#review56070
> -----------------------------------------------------------
>
>
> On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> >
> > -----------------------------------------------------------
> > This is an automatically generated e-mail. To reply, visit:
> > https://reviews.apache.org/r/26510/
> > -----------------------------------------------------------
> >
> > (Updated Oct. 9, 2014, 10:05 p.m.)
> >
> >
> > Review request for Ambari and Yusaku Sako.
> >
> >
> > Bugs: AMBARI-7622
> >     https://issues.apache.org/jira/browse/AMBARI-7622
> >
> >
> > Repository: ambari
> >
> >
> > Description
> > -------
> >
> > Tweaked the waiting condition upon ActionScheduler
> >
> >
> > Diffs
> > -----
> >
> >
>  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
> a20f252
> >
> > Diff: https://reviews.apache.org/r/26510/diff/
> >
> >
> > Testing
> > -------
> >
> >
> > Thanks,
> >
> > jun aoki
> >
> >
>
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Alejandro Fernandez <af...@hortonworks.com>.

> On Oct. 9, 2014, 10:28 p.m., Sid Wagle wrote:
> > ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java, line 431
> > <https://reviews.apache.org/r/26510/diff/2/?file=717035#file717035line431>
> >
> >     The NPE seems to be from,
> >     line 782: commandParamsStageCache.put(stagePk, commandParams)
> >     
> >     Seems like we have command params as null so guava cache complains about it. Maybe, "StageUtils.getGson().fromJson(s.getCommandParamsStage(), type)" returns null for empty string, note: we are passing empty string for comandParamsStage.
> >     
> >     Not sure how this would be resolved after some time?
> >     A null check in ActionScheduler might not be a bad idea, your call.
> 
> jun aoki wrote:
>     Sid, yes it is from StageUtils.getGso()... call returning a null.
>     I'm not too sure either how it is resolved but it has been working this way and we solved asychronous behaviour. I'd like to say it is good to go. Please give a Ship it if you are OK.
> 
> Sid Wagle wrote:
>     Could you try putting a Null check after, line 782: commandParamsStageCache.put(stagePk, commandParams) and removing the while condition?
>     If that works there is no need to catch the NPE, I am not convinced NPE is expected behavior.

I concur.


- Alejandro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56070
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ja...@apache.org>.
Hi Sid,

I know my fix is not comprehensive but at least this should stabilize
https://builds.apache.org/job/Ambari-trunk-Commit
with false alert failures Yusaku pointed out.

Could I just keep the fix as is and commit it, and we come back for the
ideal solution later?


On Thu, Oct 9, 2014 at 3:52 PM, Sid Wagle <sw...@hortonworks.com> wrote:

>    This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
>
> On October 9th, 2014, 10:28 p.m. UTC, *Sid Wagle* wrote:
>
>
> ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
> <https://reviews.apache.org/r/26510/diff/2/?file=717035#file717035line431> (Diff
> revision 2)
>
> public Object answer(InvocationOnMock invocation) throws Throwable {
>
>   431
>
>       Thread.sleep(100L);
>
> 431
>
>       }catch(NullPointerException e){
>
>   The NPE seems to be from,
>
> line 782: commandParamsStageCache.put(stagePk, commandParams)
>
> Seems like we have command params as null so guava cache complains about it. Maybe, "StageUtils.getGson().fromJson(s.getCommandParamsStage(), type)" returns null for empty string, note: we are passing empty string for comandParamsStage.
>
> Not sure how this would be resolved after some time?
>
> A null check in ActionScheduler might not be a bad idea, your call.
>
>  On October 9th, 2014, 10:38 p.m. UTC, *jun aoki* wrote:
>
> Sid, yes it is from StageUtils.getGso()... call returning a null.
>
> I'm not too sure either how it is resolved but it has been working this way and we solved asychronous behaviour. I'd like to say it is good to go. Please give a Ship it if you are OK.
>
>  Could you try putting a Null check after, line 782: commandParamsStageCache.put(stagePk, commandParams) and removing the while condition?
>
> If that works there is no need to catch the NPE, I am not convinced NPE is expected behavior.
>
>
> - Sid
>
> On October 9th, 2014, 10:05 p.m. UTC, jun aoki wrote:
>   Review request for Ambari and Yusaku Sako.
> By jun aoki.
>
> *Updated Oct. 9, 2014, 10:05 p.m.*
>  *Bugs: * AMBARI-7622 <https://issues.apache.org/jira/browse/AMBARI-7622>
>  *Repository: * ambari
> Description
>
> Tweaked the waiting condition upon ActionScheduler
>
>   Diffs
>
>    - ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
>    (a20f252)
>
> View Diff <https://reviews.apache.org/r/26510/diff/>
>



-- 
-jun

Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Sid Wagle <sw...@hortonworks.com>.

> On Oct. 9, 2014, 10:28 p.m., Sid Wagle wrote:
> > ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java, line 431
> > <https://reviews.apache.org/r/26510/diff/2/?file=717035#file717035line431>
> >
> >     The NPE seems to be from,
> >     line 782: commandParamsStageCache.put(stagePk, commandParams)
> >     
> >     Seems like we have command params as null so guava cache complains about it. Maybe, "StageUtils.getGson().fromJson(s.getCommandParamsStage(), type)" returns null for empty string, note: we are passing empty string for comandParamsStage.
> >     
> >     Not sure how this would be resolved after some time?
> >     A null check in ActionScheduler might not be a bad idea, your call.
> 
> jun aoki wrote:
>     Sid, yes it is from StageUtils.getGso()... call returning a null.
>     I'm not too sure either how it is resolved but it has been working this way and we solved asychronous behaviour. I'd like to say it is good to go. Please give a Ship it if you are OK.

Could you try putting a Null check after, line 782: commandParamsStageCache.put(stagePk, commandParams) and removing the while condition?
If that works there is no need to catch the NPE, I am not convinced NPE is expected behavior.


- Sid


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56070
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56070
-----------------------------------------------------------



ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
<https://reviews.apache.org/r/26510/#comment96423>

    The NPE seems to be from,
    line 782: commandParamsStageCache.put(stagePk, commandParams)
    
    Seems like we have command params as null so guava cache complains about it. Maybe, "StageUtils.getGson().fromJson(s.getCommandParamsStage(), type)" returns null for empty string, note: we are passing empty string for comandParamsStage.
    
    Not sure how this would be resolved after some time?
    A null check in ActionScheduler might not be a bad idea, your call.


- Sid Wagle


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.

> On Oct. 10, 2014, 6:49 p.m., John Speidel wrote:
> > I don't feel comfortable with this test being merged as is.
> > If you need to resolve this before a proper fix is implemented, please ignore the test via @Ignore and file a Jira to properly fix the test.

Hi John, I hear your concern and agree with you two that the test should not use asynchronous. 
But the test has been this way since 2012 and I am hesitent to drop the test coverage by adding @Ignore.
https://github.com/apache/ambari/commits/trunk/ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java


- jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56194
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by John Speidel <js...@hortonworks.com>.

> On Oct. 10, 2014, 6:49 p.m., John Speidel wrote:
> > I don't feel comfortable with this test being merged as is.
> > If you need to resolve this before a proper fix is implemented, please ignore the test via @Ignore and file a Jira to properly fix the test.
> 
> jun aoki wrote:
>     Hi John, I hear your concern and agree with you two that the test should not use asynchronous. 
>     But the test has been this way since 2012 and I am hesitent to drop the test coverage by adding @Ignore.
>     https://github.com/apache/ambari/commits/trunk/ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
> 
> John Speidel wrote:
>     Jun, first off thanks for taking the time to analyze this intermittent test failure.
>     I spent the last hour looking at this test.  
>     My conclusion is that this is a very poorly written test from it's orgin.  It is way too coarse grained, uses the NPE as an intentional error mechanism and expects several iterations of doWork() to complete based on the states returned by the test mocks.
>     Your changes, although they result in a passing test now, have several issues.
>     - They circumvent the run method which under the case being tested, catches the NPE and makes some state changes
>     - The changes are not written in a way that guarantees a deterministic behavior: "while(count < 10)" is a big red flag
>     - The changes will make the test VERY confusing to anybody who needs to fix these tests in the future
>     
>     All of the tests in this class need to be rewrittn in a way that is deterministic and doesn't require obscure NPE exceptions or multiple interations of doWork().
>     
>     I still feel it is better to @Ignore the test, or all of them in this class for that matter, and have them fixed properly when possible.
> 
> jun aoki wrote:
>     Hi John, thank you for writing down to share details of you thoughts. I agree all 3 bullets you mentioned, but only in my second patch.
>     
>     My first intention and patch is much simpler and to fix a "wrong if statement".
>     https://reviews.apache.org/r/26510/diff/1/
>     
>     I still hesitant to drop the test coverage, but let me know if I don't still convince you, I will just set a @Ignore on the test.

Jun,

The change that you made in patch 1 fixes a race condition where we could break out of the while loop if only one of the 2 conditiona are satisfied.
But, I still have a serious concern that this test could/will at some point loop forever because one or both of the conditions are not satisfied.

So, I will relunctantly agree to +1 the patch if 
- you add a relief valve to the test so that it fails out if the conditions aren't met in some period of time (something like 5 mins or we risk lots of intermittent failures)
- a Jira is filed to properly fix these tests in the near future
- we agree that the next time the test fails that we @Ignore it


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56194
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.

> On Oct. 10, 2014, 6:49 p.m., John Speidel wrote:
> > I don't feel comfortable with this test being merged as is.
> > If you need to resolve this before a proper fix is implemented, please ignore the test via @Ignore and file a Jira to properly fix the test.
> 
> jun aoki wrote:
>     Hi John, I hear your concern and agree with you two that the test should not use asynchronous. 
>     But the test has been this way since 2012 and I am hesitent to drop the test coverage by adding @Ignore.
>     https://github.com/apache/ambari/commits/trunk/ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
> 
> John Speidel wrote:
>     Jun, first off thanks for taking the time to analyze this intermittent test failure.
>     I spent the last hour looking at this test.  
>     My conclusion is that this is a very poorly written test from it's orgin.  It is way too coarse grained, uses the NPE as an intentional error mechanism and expects several iterations of doWork() to complete based on the states returned by the test mocks.
>     Your changes, although they result in a passing test now, have several issues.
>     - They circumvent the run method which under the case being tested, catches the NPE and makes some state changes
>     - The changes are not written in a way that guarantees a deterministic behavior: "while(count < 10)" is a big red flag
>     - The changes will make the test VERY confusing to anybody who needs to fix these tests in the future
>     
>     All of the tests in this class need to be rewrittn in a way that is deterministic and doesn't require obscure NPE exceptions or multiple interations of doWork().
>     
>     I still feel it is better to @Ignore the test, or all of them in this class for that matter, and have them fixed properly when possible.
> 
> jun aoki wrote:
>     Hi John, thank you for writing down to share details of you thoughts. I agree all 3 bullets you mentioned, but only in my second patch.
>     
>     My first intention and patch is much simpler and to fix a "wrong if statement".
>     https://reviews.apache.org/r/26510/diff/1/
>     
>     I still hesitant to drop the test coverage, but let me know if I don't still convince you, I will just set a @Ignore on the test.
> 
> John Speidel wrote:
>     Jun,
>     
>     The change that you made in patch 1 fixes a race condition where we could break out of the while loop if only one of the 2 conditiona are satisfied.
>     But, I still have a serious concern that this test could/will at some point loop forever because one or both of the conditions are not satisfied.
>     
>     So, I will relunctantly agree to +1 the patch if 
>     - you add a relief valve to the test so that it fails out if the conditions aren't met in some period of time (something like 5 mins or we risk lots of intermittent failures)
>     - a Jira is filed to properly fix these tests in the near future
>     - we agree that the next time the test fails that we @Ignore it

John, thank you for spending a good amount of time to do the right thing. I see your concerns valid too. 
I did not know ActionScheduler has a chance not to change the state forever. It would be really bad because then it takes up a Jenkins slave until killed manually. 
I will just mark it as @Ignored and let Sid fix with the right approach.  Will attach a new one shortly.
John/Sid, thank you again for your quite a lot of time fot this!


- jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56194
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.

> On Oct. 10, 2014, 6:49 p.m., John Speidel wrote:
> > I don't feel comfortable with this test being merged as is.
> > If you need to resolve this before a proper fix is implemented, please ignore the test via @Ignore and file a Jira to properly fix the test.
> 
> jun aoki wrote:
>     Hi John, I hear your concern and agree with you two that the test should not use asynchronous. 
>     But the test has been this way since 2012 and I am hesitent to drop the test coverage by adding @Ignore.
>     https://github.com/apache/ambari/commits/trunk/ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java
> 
> John Speidel wrote:
>     Jun, first off thanks for taking the time to analyze this intermittent test failure.
>     I spent the last hour looking at this test.  
>     My conclusion is that this is a very poorly written test from it's orgin.  It is way too coarse grained, uses the NPE as an intentional error mechanism and expects several iterations of doWork() to complete based on the states returned by the test mocks.
>     Your changes, although they result in a passing test now, have several issues.
>     - They circumvent the run method which under the case being tested, catches the NPE and makes some state changes
>     - The changes are not written in a way that guarantees a deterministic behavior: "while(count < 10)" is a big red flag
>     - The changes will make the test VERY confusing to anybody who needs to fix these tests in the future
>     
>     All of the tests in this class need to be rewrittn in a way that is deterministic and doesn't require obscure NPE exceptions or multiple interations of doWork().
>     
>     I still feel it is better to @Ignore the test, or all of them in this class for that matter, and have them fixed properly when possible.

Hi John, thank you for writing down to share details of you thoughts. I agree all 3 bullets you mentioned, but only in my second patch.

My first intention and patch is much simpler and to fix a "wrong if statement".
https://reviews.apache.org/r/26510/diff/1/

I still hesitant to drop the test coverage, but let me know if I don't still convince you, I will just set a @Ignore on the test.


- jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56194
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by John Speidel <js...@hortonworks.com>.

> On Oct. 10, 2014, 6:49 p.m., John Speidel wrote:
> > I don't feel comfortable with this test being merged as is.
> > If you need to resolve this before a proper fix is implemented, please ignore the test via @Ignore and file a Jira to properly fix the test.
> 
> jun aoki wrote:
>     Hi John, I hear your concern and agree with you two that the test should not use asynchronous. 
>     But the test has been this way since 2012 and I am hesitent to drop the test coverage by adding @Ignore.
>     https://github.com/apache/ambari/commits/trunk/ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java

Jun, first off thanks for taking the time to analyze this intermittent test failure.
I spent the last hour looking at this test.  
My conclusion is that this is a very poorly written test from it's orgin.  It is way too coarse grained, uses the NPE as an intentional error mechanism and expects several iterations of doWork() to complete based on the states returned by the test mocks.
Your changes, although they result in a passing test now, have several issues.
- They circumvent the run method which under the case being tested, catches the NPE and makes some state changes
- The changes are not written in a way that guarantees a deterministic behavior: "while(count < 10)" is a big red flag
- The changes will make the test VERY confusing to anybody who needs to fix these tests in the future

All of the tests in this class need to be rewrittn in a way that is deterministic and doesn't require obscure NPE exceptions or multiple interations of doWork().

I still feel it is better to @Ignore the test, or all of them in this class for that matter, and have them fixed properly when possible.


- John


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56194
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by John Speidel <js...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56194
-----------------------------------------------------------


I don't feel comfortable with this test being merged as is.
If you need to resolve this before a proper fix is implemented, please ignore the test via @Ignore and file a Jira to properly fix the test.

- John Speidel


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Sid Wagle <sw...@hortonworks.com>.

> On Oct. 10, 2014, 6:20 p.m., Sid Wagle wrote:
> > Approval contingent to a long term fix proposed at,
> > https://issues.apache.org/jira/browse/AMBARI-7735
> 
> jun aoki wrote:
>     Hi Sid, I just want to make sure you are OK with the first minimal patch. Is it OK with you?

Hi Jun,
I have assigned AMBARI-7735 to myself, I will create a patch for this next week and ask dev-group if we should port to 1.7.0 as well.
Could you change the patch to @Ignore the test for now?
Thanks.


- Sid


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56184
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.

> On Oct. 10, 2014, 6:20 p.m., Sid Wagle wrote:
> > Approval contingent to a long term fix proposed at,
> > https://issues.apache.org/jira/browse/AMBARI-7735

Hi Sid, I just want to make sure you are OK with the first minimal patch. Is it OK with you?


- jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56184
-----------------------------------------------------------


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56184
-----------------------------------------------------------

Ship it!


Approval contingent to a long term fix proposed at,
https://issues.apache.org/jira/browse/AMBARI-7735

- Sid Wagle


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by John Speidel <js...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56450
-----------------------------------------------------------

Ship it!


Ship It!

- John Speidel


On Oct. 13, 2014, 8:54 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2014, 8:54 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56452
-----------------------------------------------------------

Ship it!


Ship It!

- Alejandro Fernandez


On Oct. 13, 2014, 8:54 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2014, 8:54 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56444
-----------------------------------------------------------

Ship it!


Ship It!

- Sid Wagle


On Oct. 13, 2014, 8:54 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2014, 8:54 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/
-----------------------------------------------------------

(Updated Oct. 13, 2014, 8:54 p.m.)


Review request for Ambari and Yusaku Sako.


Bugs: AMBARI-7622
    https://issues.apache.org/jira/browse/AMBARI-7622


Repository: ambari


Description
-------

Tweaked the waiting condition upon ActionScheduler


Diffs (updated)
-----

  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 

Diff: https://reviews.apache.org/r/26510/diff/


Testing
-------


Thanks,

jun aoki


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56073
-----------------------------------------------------------


Also, seems we do a "scheduler.start();" from multiple places.

- Sid Wagle


On Oct. 9, 2014, 10:05 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 10:05 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/
-----------------------------------------------------------

(Updated Oct. 9, 2014, 10:05 p.m.)


Review request for Ambari and Yusaku Sako.


Changes
-------

Applied Sid's feedback.


Bugs: AMBARI-7622
    https://issues.apache.org/jira/browse/AMBARI-7622


Repository: ambari


Description
-------

Tweaked the waiting condition upon ActionScheduler


Diffs (updated)
-----

  ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 

Diff: https://reviews.apache.org/r/26510/diff/


Testing
-------


Thanks,

jun aoki


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by jun aoki <ju...@gmail.com>.

> On Oct. 9, 2014, 7:26 p.m., Sid Wagle wrote:
> > This test does something a unit test should not do, wait on a asynchronous operation.
> > How about, instead of starting ActionScheduler thread "scheduler.start()" and wait for the Thread to run, call the doWork() which is called by the run() method.

Make sense, Sid.
One thing I noticed from the jenkins failure [1] is 
doWork() is called a few times and throws NullPointerException's while ActionScheduler is running, and eventually goes into the expected state.
(I believe this is the expected behavior since it is a negative test case)
So I added a while loop along with calling doWork().kk


[1]
https://builds.apache.org/view/A-D/view/Ambari/job/Ambari-trunk-Commit/526/testReport/junit/org.apache.ambari.server.actionmanager/TestActionScheduler/testOpFailedEventRaisedForAbortedHostRole/


2014-10-09 17:36:49,198 INFO  [main] configuration.Configuration (Configuration.java:<init>(398)) - Reading password from existing file
2014-10-09 17:36:49,700 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208)) - Scheduler wakes up
2014-10-09 17:36:49,702 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209)) - Processing 1 in progress stages 
2014-10-09 17:36:49,702 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229)) - ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:49,703 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509)) - ==> Collecting commands to schedule...
2014-10-09 17:36:49,705 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615)) - Collected 2 commands to schedule in this wakeup.
2014-10-09 17:36:49,706 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261)) - Stats for role:DATANODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0, numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:49,706 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261)) - Stats for role:NAMENODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0, numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=1.0
2014-10-09 17:36:49,777 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368)) - Scheduler finished work.
2014-10-09 17:36:49,778 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:run(189)) - Exception received
java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
	at com.google.common.cache.LocalCache.put(LocalCache.java:4210)
	at com.google.common.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
	at org.apache.ambari.server.actionmanager.ActionScheduler.processHostRole(ActionScheduler.java:782)
	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:298)
	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:184)
	at java.lang.Thread.run(Thread.java:724)
2014-10-09 17:36:49,880 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208)) - Scheduler wakes up
2014-10-09 17:36:49,880 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209)) - Processing 1 in progress stages 
2014-10-09 17:36:49,881 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229)) - ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:49,881 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509)) - ==> Collecting commands to schedule...
2014-10-09 17:36:49,882 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:timeOutActionNeeded(722)) - Timing out action since agent is not heartbeating.
2014-10-09 17:36:49,883 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(587)) - Host:host1, role:DATANODE, actionId:1--1 timed out
2014-10-09 17:36:49,886 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615)) - Collected 2 commands to schedule in this wakeup.
2014-10-09 17:36:49,887 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261)) - Stats for role:DATANODE, stats=numQueued=1, numInProgress=0, numSucceeded=0, numFailed=0, numTimedOut=0, numPending=0, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:49,887 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261)) - Stats for role:NAMENODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0, numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=1.0
2014-10-09 17:36:49,888 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368)) - Scheduler finished work.
2014-10-09 17:36:49,888 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:run(189)) - Exception received
java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
	at com.google.common.cache.LocalCache.put(LocalCache.java:4210)
	at com.google.common.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
	at org.apache.ambari.server.actionmanager.ActionScheduler.processHostRole(ActionScheduler.java:782)
	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:298)
	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:184)
	at java.lang.Thread.run(Thread.java:724)
2014-10-09 17:36:49,989 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208)) - Scheduler wakes up
2014-10-09 17:36:49,989 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209)) - Processing 1 in progress stages 
2014-10-09 17:36:49,990 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229)) - ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:49,990 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509)) - ==> Collecting commands to schedule...
2014-10-09 17:36:49,991 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:timeOutActionNeeded(722)) - Timing out action since agent is not heartbeating.
2014-10-09 17:36:49,991 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(587)) - Host:host1, role:DATANODE, actionId:1--1 timed out
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615)) - Collected 2 commands to schedule in this wakeup.
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261)) - Stats for role:DATANODE, stats=numQueued=1, numInProgress=0, numSucceeded=0, numFailed=0, numTimedOut=0, numPending=0, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261)) - Stats for role:NAMENODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0, numTimedOut=0, numPending=1, numAborted=0, totalHosts=1, successFactor=1.0
2014-10-09 17:36:49,992 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368)) - Scheduler finished work.
2014-10-09 17:36:49,993 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:run(189)) - Exception received
java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
	at com.google.common.cache.LocalCache.put(LocalCache.java:4210)
	at com.google.common.cache.LocalCache$LocalManualCache.put(LocalCache.java:4804)
	at org.apache.ambari.server.actionmanager.ActionScheduler.processHostRole(ActionScheduler.java:782)
	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:298)
	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:184)
	at java.lang.Thread.run(Thread.java:724)
2014-10-09 17:36:50,093 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(208)) - Scheduler wakes up
2014-10-09 17:36:50,094 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(209)) - Processing 1 in progress stages 
2014-10-09 17:36:50,094 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(229)) - ==> STAGE_i = 1(requestId=1,StageId=-1)
2014-10-09 17:36:50,094 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(509)) - ==> Collecting commands to schedule...
2014-10-09 17:36:50,095 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:timeOutActionNeeded(722)) - Timing out action since agent is not heartbeating.
2014-10-09 17:36:50,096 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(587)) - Host:host1, role:DATANODE, actionId:1--1 timed out
2014-10-09 17:36:50,096 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(589)) - Host:host1, role:DATANODE, actionId:1--1 expired
2014-10-09 17:36:50,097 INFO  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(599)) - Removing command from queue, host=host1, commandId=1--1 
2014-10-09 17:36:50,098 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:processInProgressStage(615)) - Collected 1 commands to schedule in this wakeup.
2014-10-09 17:36:50,098 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(261)) - Stats for role:DATANODE, stats=numQueued=0, numInProgress=0, numSucceeded=0, numFailed=0, numTimedOut=1, numPending=0, numAborted=0, totalHosts=1, successFactor=0.5
2014-10-09 17:36:50,099 WARN  [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(275)) - Operation completely failed, aborting request id:1
2014-10-09 17:36:50,101 DEBUG [Thread-1] actionmanager.ActionScheduler (ActionScheduler.java:doWork(368)) - Scheduler finished work.


- jun


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56035
-----------------------------------------------------------


On Oct. 9, 2014, 6:34 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 6:34 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>


Re: Review Request 26510: AMBARI-7622 TestActionScheduler fails occasionally on builds.a.o stating expected: but was:

Posted by Sid Wagle <sw...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26510/#review56035
-----------------------------------------------------------


This test does something a unit test should not do, wait on a asynchronous operation.
How about, instead of starting ActionScheduler thread "scheduler.start()" and wait for the Thread to run, call the doWork() which is called by the run() method.

- Sid Wagle


On Oct. 9, 2014, 6:34 p.m., jun aoki wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26510/
> -----------------------------------------------------------
> 
> (Updated Oct. 9, 2014, 6:34 p.m.)
> 
> 
> Review request for Ambari and Yusaku Sako.
> 
> 
> Bugs: AMBARI-7622
>     https://issues.apache.org/jira/browse/AMBARI-7622
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Tweaked the waiting condition upon ActionScheduler
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/test/java/org/apache/ambari/server/actionmanager/TestActionScheduler.java a20f252 
> 
> Diff: https://reviews.apache.org/r/26510/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> jun aoki
> 
>