You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Timothy Chen <tn...@apache.org> on 2014/07/13 00:26:28 UTC

Review Request 23443: Fix Health check tests and send healthy state on task kill

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

Review request for mesos, Ben Mahler and Jie Yu.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.

Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.


Diffs
-----

  src/launcher/executor.cpp 9c80848 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Timothy Chen <tn...@apache.org>.

> On July 13, 2014, 11:45 p.m., Benjamin Hindman wrote:
> > src/tests/health_check_tests.cpp, line 159
> > <https://reviews.apache.org/r/23443/diff/1/?file=629112#file629112line159>
> >
> >     I don't understand how changing to ASSERT from EXPECT would change the flakiness.

Well currently the tests keeps moving on even when it fails expectation, so just having a short circuit will try to do cleanup earlier.
I never was able to repro what the tests machine was seeing where it hangs forever, but I do found issues doing a clean clone and test run so I'm trying to fix those and see what's next.


> On July 13, 2014, 11:45 p.m., Benjamin Hindman wrote:
> > src/launcher/executor.cpp, lines 601-605
> > <https://reviews.apache.org/r/23443/diff/1/?file=629111#file629111line601>
> >
> >     This is a hack which assumes autotools layout. At the very least we should have a comment here that explains what is being done. ;-)
> >     
> >     That being said, I remember at one point you had an environment variable that I told you to remove. That would have been much more generic then this, my apologies.
> >     
> >     My concern with the environment variable was that it wasn't very explicit and therefore discoverable. That is, if someone else launched the mesos-executor they wouldn't know they couldn't pass a task with a health check unless you set the environment variable (and they'd have to look in the code to figure that out)!
> >     
> >     So it seems like we should always use dirname(argv[0]), which should work for the installed case. But also provide the environment variable to override dirname[argv[0]). Even better than an environment variable would be a flag, but I think that might be more complicated to take on. Given that this is an extension of 'launcher_dir' from the slave, how about naming that environment variable (or flag) MESOS_LAUNCHER_DIR as well?
> >     
> >     Just a little different than what you originally had, sorry for my lack of foresight in the earlier comments, thanks Tim.

Sounds good, let me do this change.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47701
-----------------------------------------------------------


On July 13, 2014, 10:22 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 13, 2014, 10:22 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp 9c80848 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Timothy Chen <tn...@apache.org>.

> On July 13, 2014, 11:45 p.m., Benjamin Hindman wrote:
> > src/tests/health_check_tests.cpp, line 159
> > <https://reviews.apache.org/r/23443/diff/1/?file=629112#file629112line159>
> >
> >     I don't understand how changing to ASSERT from EXPECT would change the flakiness.
> 
> Timothy Chen wrote:
>     Well currently the tests keeps moving on even when it fails expectation, so just having a short circuit will try to do cleanup earlier.
>     I never was able to repro what the tests machine was seeing where it hangs forever, but I do found issues doing a clean clone and test run so I'm trying to fix those and see what's next.
> 
> Benjamin Hindman wrote:
>     Sorry if I'm out of the loop here, but the tests were "hanging"? Sounds like these tests have non-deterministic bugs that we need to be fixing first and foremost before we consider re-enabling them or s/EXPECT/ASSERT/. Can you update the JIRA ticket with more info on how the tests are now hanging?

Hmm so I believe it was reported the tests was hanging on CI occasinoally, but I never was able to repro on mac/ubuntu/debian. I have landed a fix and also make the tests fail fast now, and also this path bug as well.

I really can't find or repro the original problems but I do notice with the existing bugs health check will not work and the task will be left running, but it should been cleaned up driver stop and Shutdown. 

Unless there is a way to test CI manually otherwise I'd like to take another shot with all these fixes to see if it can repro again.

Let me know what you think.


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47701
-----------------------------------------------------------


On July 14, 2014, 7:13 a.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 14, 2014, 7:13 a.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp 9c80848 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Benjamin Hindman <be...@berkeley.edu>.

> On July 13, 2014, 11:45 p.m., Benjamin Hindman wrote:
> > src/tests/health_check_tests.cpp, line 159
> > <https://reviews.apache.org/r/23443/diff/1/?file=629112#file629112line159>
> >
> >     I don't understand how changing to ASSERT from EXPECT would change the flakiness.
> 
> Timothy Chen wrote:
>     Well currently the tests keeps moving on even when it fails expectation, so just having a short circuit will try to do cleanup earlier.
>     I never was able to repro what the tests machine was seeing where it hangs forever, but I do found issues doing a clean clone and test run so I'm trying to fix those and see what's next.

Sorry if I'm out of the loop here, but the tests were "hanging"? Sounds like these tests have non-deterministic bugs that we need to be fixing first and foremost before we consider re-enabling them or s/EXPECT/ASSERT/. Can you update the JIRA ticket with more info on how the tests are now hanging?


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47701
-----------------------------------------------------------


On July 14, 2014, 7:13 a.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 14, 2014, 7:13 a.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp 9c80848 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Benjamin Hindman <be...@berkeley.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47701
-----------------------------------------------------------



src/launcher/executor.cpp
<https://reviews.apache.org/r/23443/#comment83855>

    This is a hack which assumes autotools layout. At the very least we should have a comment here that explains what is being done. ;-)
    
    That being said, I remember at one point you had an environment variable that I told you to remove. That would have been much more generic then this, my apologies.
    
    My concern with the environment variable was that it wasn't very explicit and therefore discoverable. That is, if someone else launched the mesos-executor they wouldn't know they couldn't pass a task with a health check unless you set the environment variable (and they'd have to look in the code to figure that out)!
    
    So it seems like we should always use dirname(argv[0]), which should work for the installed case. But also provide the environment variable to override dirname[argv[0]). Even better than an environment variable would be a flag, but I think that might be more complicated to take on. Given that this is an extension of 'launcher_dir' from the slave, how about naming that environment variable (or flag) MESOS_LAUNCHER_DIR as well?
    
    Just a little different than what you originally had, sorry for my lack of foresight in the earlier comments, thanks Tim.



src/tests/health_check_tests.cpp
<https://reviews.apache.org/r/23443/#comment83854>

    I don't understand how changing to ASSERT from EXPECT would change the flakiness.


- Benjamin Hindman


On July 13, 2014, 10:22 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 13, 2014, 10:22 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp 9c80848 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47733
-----------------------------------------------------------


Bad patch!

Reviews applied: [23416]

Failed command: git apply --index 23416.patch

Error:
 error: patch failed: src/health-check/main.cpp:151
error: src/health-check/main.cpp: patch does not apply


- Mesos ReviewBot


On July 14, 2014, 7:13 a.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 14, 2014, 7:13 a.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp 9c80848 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47947
-----------------------------------------------------------


Bad patch!

Reviews applied: [23443]

Failed command: ./support/mesos-style.py

Error:
 Checking 490 files using filter --filter=-,+build/class,+build/deprecated,+build/endif_comment,+runtime/vlog,+whitespace/blank_line,+whitespace/comma,+whitespace/ending_newline,+whitespace/forcolon,+whitespace/indent,+whitespace/line_length,+whitespace/tab,+whitespace/todo
src/tests/health_check_tests.cpp:78:  Lines should be <= 80 characters long  [whitespace/line_length] [2]
Total errors found: 1


- Mesos ReviewBot


On July 16, 2014, 8:08 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 16, 2014, 8:08 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp a573637 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47960
-----------------------------------------------------------


Patch looks great!

Reviews applied: [23443]

All tests passed.

- Mesos ReviewBot


On July 16, 2014, 9:36 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 16, 2014, 9:36 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp a573637 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 16, 2014, 9:36 p.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.


Diffs (updated)
-----

  src/launcher/executor.cpp a573637 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Re-enable health check tests

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 16, 2014, 8:08 p.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.


Diffs (updated)
-----

  src/launcher/executor.cpp a573637 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Re-enable health check tests

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 16, 2014, 7:19 p.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.


Diffs (updated)
-----

  src/launcher/executor.cpp a573637 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Re-enable health check tests

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47931
-----------------------------------------------------------


Bad patch!

Reviews applied: [23443]

Failed command: ./support/mesos-style.py

Error:
 Checking 490 files using filter --filter=-,+build/class,+build/deprecated,+build/endif_comment,+runtime/vlog,+whitespace/blank_line,+whitespace/comma,+whitespace/ending_newline,+whitespace/forcolon,+whitespace/indent,+whitespace/line_length,+whitespace/tab,+whitespace/todo
src/tests/health_check_tests.cpp:78:  Lines should be <= 80 characters long  [whitespace/line_length] [2]
Total errors found: 1


- Mesos ReviewBot


On July 16, 2014, 5:30 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 16, 2014, 5:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp a573637 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 16, 2014, 5:30 p.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.


Diffs (updated)
-----

  src/launcher/executor.cpp a573637 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Re-enable health check tests

Posted by Adam B <ad...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47862
-----------------------------------------------------------


LGTM, but I'll let BenH give his final word since he's already had some comments.
Also, I'm still not convinced we need to change all the EXPECTs to ASSERTs. While an ASSERT will fail the test/function faster, it may also skip cleanup code causing a leak.

- Adam B


On July 15, 2014, 1:34 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 15, 2014, 1:34 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp a573637 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47842
-----------------------------------------------------------


Vinod help me setup a CI job running the health check tests with this fix and it has been passing over 3 times so far (https://builds.apache.org/view/All/job/mesos-tnachen/)

I think it should be good to re-enable again, let me know if anyone thinks different!

- Timothy Chen


On July 15, 2014, 8:34 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 15, 2014, 8:34 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp a573637 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Timothy Chen <tn...@apache.org>.

> On July 16, 2014, 4:04 p.m., Benjamin Hindman wrote:
> > src/tests/health_check_tests.cpp, line 79
> > <https://reviews.apache.org/r/23443/diff/4/?file=632794#file632794line79>
> >
> >     Please add a newline above and a comment that explains why we need to do this and why we're calling this MESOS_LAUNCHER_DIR. In fact, I was expecting the slave to always set this environment variable (like your old patch did originally), but I think this is fine for now. Also, minor nit: s/Environment_Variable* var/Environment::Variable* variable/. Thanks Tim!

Sounds good, I just reverted the EXPECT ASSERT changes and addressed your comment.
The tests has been passing all night so far. Thanks!


- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47891
-----------------------------------------------------------


On July 16, 2014, 5:30 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 16, 2014, 5:30 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp a573637 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Benjamin Hindman <be...@berkeley.edu>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47891
-----------------------------------------------------------

Ship it!


If these tests are now passing let's revert the EXPECT to ASSERT. Also, see my comment below. Otherwise, LGTM.


src/tests/health_check_tests.cpp
<https://reviews.apache.org/r/23443/#comment84118>

    Please add a newline above and a comment that explains why we need to do this and why we're calling this MESOS_LAUNCHER_DIR. In fact, I was expecting the slave to always set this environment variable (like your old patch did originally), but I think this is fine for now. Also, minor nit: s/Environment_Variable* var/Environment::Variable* variable/. Thanks Tim!


- Benjamin Hindman


On July 15, 2014, 8:34 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 15, 2014, 8:34 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp a573637 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Re-enable health check tests

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 15, 2014, 8:34 p.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Summary (updated)
-----------------

Re-enable health check tests


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description (updated)
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.


Diffs (updated)
-----

  src/launcher/executor.cpp a573637 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Mesos ReviewBot <de...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47741
-----------------------------------------------------------


Patch looks great!

Reviews applied: [23443]

All tests passed.

- Mesos ReviewBot


On July 14, 2014, 8:05 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 14, 2014, 8:05 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp 9c80848 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/#review47812
-----------------------------------------------------------


I'm splitting the task kill status update into a seperate reviewboard, and updating this patch for purely enabling all the tests and test related problems for health check.

- Timothy Chen


On July 14, 2014, 8:05 p.m., Timothy Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23443/
> -----------------------------------------------------------
> 
> (Updated July 14, 2014, 8:05 p.m.)
> 
> 
> Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.
> 
> 
> Bugs: MESOS-1533
>     https://issues.apache.org/jira/browse/MESOS-1533
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
> Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
> However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.
> 
> Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.
> 
> 
> Diffs
> -----
> 
>   src/launcher/executor.cpp 9c80848 
>   src/tests/health_check_tests.cpp 44711fd 
> 
> Diff: https://reviews.apache.org/r/23443/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Timothy Chen
> 
>


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 14, 2014, 8:05 p.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.

Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.


Diffs (updated)
-----

  src/launcher/executor.cpp 9c80848 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 14, 2014, 7:13 a.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.

Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.


Diffs (updated)
-----

  src/launcher/executor.cpp 9c80848 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen


Re: Review Request 23443: Fix Health check tests and send healthy state on task kill

Posted by Timothy Chen <tn...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23443/
-----------------------------------------------------------

(Updated July 13, 2014, 10:22 p.m.)


Review request for mesos, Adam B, Ben Mahler, Jie Yu, and Niklas Nielsen.


Bugs: MESOS-1533
    https://issues.apache.org/jira/browse/MESOS-1533


Repository: mesos-git


Description
-------

Currently the health check tests are disabled as it was flaky. Part of the reason is that all assertions was EXPECTs instead of ASSERTs so it will continue to execute. 
Another issue was it's currently discovering the path to launch health check executable from the command executor main argv[0] path.
However, the correct path is to launch the generated script that does library path resolution, which is the parent of .libs.

Also changed the command executor to also include the health state when it kills the task due to unhealthy checks.


Diffs
-----

  src/launcher/executor.cpp 9c80848 
  src/tests/health_check_tests.cpp 44711fd 

Diff: https://reviews.apache.org/r/23443/diff/


Testing
-------

make check


Thanks,

Timothy Chen