Posted to dev@mesos.apache.org by Jiang Yan Xu <ya...@jxu.me> on 2014/07/31 00:44:03 UTC

Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/
-----------------------------------------------------------

Review request for mesos and Ben Mahler.


Bugs: MESOS-1655
    https://issues.apache.org/jira/browse/MESOS-1655


Repository: mesos-git


Description
-------

Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
- The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
- Plus we were spending 10 seconds clock time on this wait so I reduced the timeout.
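
A minimal sketch of the race described above, not the Mesos test code. It models an event (the session timeout firing) that occurs after `eventDelay` plus some scheduling `jitter`, observed by a wait that gives up after `waitTimeout`. The function name and all values used with it are hypothetical and chosen only for illustration.

```cpp
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical model (not Mesos code): the event fires `eventDelay` after
// the start, delayed further by `jitter`; the wait gives up at `waitTimeout`.
// The wait succeeds only if the event arrives strictly before the deadline.
bool waitSucceeds(
    milliseconds eventDelay,
    milliseconds jitter,
    milliseconds waitTimeout)
{
  return eventDelay + jitter < waitTimeout;
}
```

With equal values, for example `waitSucceeds(milliseconds(10000), milliseconds(1), milliseconds(10000))`, any positive jitter makes the wait give up first. Making the event delay clearly shorter than the wait timeout leaves headroom for jitter.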


Diffs
-----

  src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 

Diff: https://reviews.apache.org/r/24123/diff/


Testing
-------

Ran the test for 350 iterations so far and it didn't fail.


Thanks,

Jiang Yan Xu


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/#review49205
-----------------------------------------------------------

Ship it!



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/24123/#comment86084>

    What about:
    
    Duration timeout = Milliseconds(100);


- Ben Mahler


On July 31, 2014, 12:26 a.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24123/
> -----------------------------------------------------------
> 
> (Updated July 31, 2014, 12:26 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1655
>     https://issues.apache.org/jira/browse/MESOS-1655
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
> - The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
> - Plus we were spending 10 seconds clock time on this wait so I reduced the timeout.
> 
> 
> Diffs
> -----
> 
>   src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 
> 
> Diff: https://reviews.apache.org/r/24123/diff/
> 
> 
> Testing
> -------
> 
> Ran the test for 350 iterations so far and it didn't fail.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling

Posted by Jiang Yan Xu <ya...@jxu.me>.

> On Aug. 1, 2014, 11:15 a.m., Ben Mahler wrote:
> > Are these tests faster given the split? I see this test used to take 10 seconds to run.

Yeah, each takes less than 200 ms because the wait is removed.


- Jiang Yan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/#review49368
-----------------------------------------------------------


On July 31, 2014, 11:23 p.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24123/
> -----------------------------------------------------------
> 
> (Updated July 31, 2014, 11:23 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1655
>     https://issues.apache.org/jira/browse/MESOS-1655
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
> 
> - The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
> - Split the test into two and removed the part that waits several seconds for ZooKeeperTestServer to expire the session, which is unnecessary for testing the detector and slows down the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 
> 
> Diff: https://reviews.apache.org/r/24123/diff/
> 
> 
> Testing
> -------
> 
> Ran the test for 2000 iterations.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/#review49368
-----------------------------------------------------------

Ship it!


Are these tests faster given the split? I see this test used to take 10 seconds to run.

- Ben Mahler


On Aug. 1, 2014, 6:23 a.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24123/
> -----------------------------------------------------------
> 
> (Updated Aug. 1, 2014, 6:23 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1655
>     https://issues.apache.org/jira/browse/MESOS-1655
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
> 
> - The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
> - Split the test into two and removed the part that waits several seconds for ZooKeeperTestServer to expire the session, which is unnecessary for testing the detector and slows down the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 
> 
> Diff: https://reviews.apache.org/r/24123/diff/
> 
> 
> Testing
> -------
> 
> Ran the test for 2000 iterations.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling

Posted by Jiang Yan Xu <ya...@jxu.me>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/
-----------------------------------------------------------

(Updated July 31, 2014, 11:23 p.m.)


Review request for mesos and Ben Mahler.


Changes
-------

Split the test into two and removed the part that waits several seconds for ZooKeeperTestServer to expire the session, which is unnecessary for testing the detector and slows down the test.


Summary (updated)
-----------------

Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling


Bugs: MESOS-1655
    https://issues.apache.org/jira/browse/MESOS-1655


Repository: mesos-git


Description (updated)
-------

Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling

- The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
- Split the test into two and removed the part that waits several seconds for ZooKeeperTestServer to expire the session, which is unnecessary for testing the detector and slows down the test.


Diffs (updated)
-----

  src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 

Diff: https://reviews.apache.org/r/24123/diff/


Testing (updated)
-------

Ran the test for 2000 iterations.


Thanks,

Jiang Yan Xu


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling.

Posted by Jiang Yan Xu <ya...@jxu.me>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/
-----------------------------------------------------------

(Updated July 30, 2014, 5:26 p.m.)


Review request for mesos and Ben Mahler.


Summary (updated)
-----------------

Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling.


Bugs: MESOS-1655
    https://issues.apache.org/jira/browse/MESOS-1655


Repository: mesos-git


Description
-------

Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
- The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
- Plus we were spending 10 seconds clock time on this wait so I reduced the timeout.


Diffs
-----

  src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 

Diff: https://reviews.apache.org/r/24123/diff/


Testing
-------

Ran the test for 350 iterations so far and it didn't fail.


Thanks,

Jiang Yan Xu


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling - The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race. - Plus we were spending 10 seconds clock time on this wait so I reduced the timeout.

Posted by Jiang Yan Xu <ya...@jxu.me>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/
-----------------------------------------------------------

(Updated July 30, 2014, 5:25 p.m.)


Review request for mesos and Ben Mahler.


Changes
-------

Comments.


Summary (updated)
-----------------

Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling - The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race. - Plus we were spending 10 seconds clock time on this wait so I reduced the timeout.


Bugs: MESOS-1655
    https://issues.apache.org/jira/browse/MESOS-1655


Repository: mesos-git


Description
-------

Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
- The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
- Plus we were spending 10 seconds clock time on this wait so I reduced the timeout.


Diffs (updated)
-----

  src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 

Diff: https://reviews.apache.org/r/24123/diff/


Testing
-------

Ran the test for 350 iterations so far and it didn't fail.


Thanks,

Jiang Yan Xu


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling

Posted by Jiang Yan Xu <ya...@jxu.me>.

> On July 30, 2014, 5:12 p.m., Ben Mahler wrote:
> > src/tests/zookeeper_tests.cpp, lines 202-205
> > <https://reviews.apache.org/r/24123/diff/1/?file=646315#file646315line202>
> >
> >     What's stopping it from being even lower?
> >     
> >     Say, 100 milliseconds?

OK, so with 100 milliseconds I have seen the test fail over a large number of iterations because the connection to the local ZooKeeperTestServer was lost while the server was up.
I haven't found the root cause, but 100 milliseconds is far lower than a typical ZooKeeper setup and it may interfere with ZooKeeper's tickTime. And the way we're using the test server is a bit hacky anyway.
I think the wait for ZooKeeperTestServer to expire the session is not an essential part of the detector test, so I removed it and split the test into two.
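
For context on why a very small timeout can interact with tickTime: a stock ZooKeeper server bounds the session timeout a client requests to between 2 and 20 ticks by default. The sketch below only illustrates that default rule. The function name is hypothetical, the tick values are illustrative, and ZooKeeperTestServer may configure its bounds differently.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using std::chrono::milliseconds;

// Hypothetical helper (not part of Mesos or ZooKeeper): applies the default
// ZooKeeper bounds of [2 * tickTime, 20 * tickTime] to a requested session
// timeout and returns the value the server would grant under those defaults.
milliseconds negotiatedSessionTimeout(
    milliseconds requested,
    milliseconds tickTime)
{
  assert(tickTime.count() > 0);

  const milliseconds minTimeout = 2 * tickTime;
  const milliseconds maxTimeout = 20 * tickTime;

  // Clamp the request into the allowed range.
  return std::min(std::max(requested, minTimeout), maxTimeout);
}
```

Under these assumed defaults and a tick of 2000 ms, a request for 100 ms would be raised to 4000 ms, so the timeout the client asks for and the one the server enforces can differ.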


- Jiang Yan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/#review49186
-----------------------------------------------------------


On July 31, 2014, 11:23 p.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24123/
> -----------------------------------------------------------
> 
> (Updated July 31, 2014, 11:23 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1655
>     https://issues.apache.org/jira/browse/MESOS-1655
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
> 
> - The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
> - Split the test into two and removed the part that waits several seconds for ZooKeeperTestServer to expire the session, which is unnecessary for testing the detector and slows down the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 
> 
> Diff: https://reviews.apache.org/r/24123/diff/
> 
> 
> Testing
> -------
> 
> Ran the test for 2000 iterations.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>


Re: Review Request 24123: Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24123/#review49186
-----------------------------------------------------------



src/tests/zookeeper_tests.cpp
<https://reviews.apache.org/r/24123/#comment86048>

    What's stopping it from being even lower?
    
    Say, 100 milliseconds?


- Ben Mahler


On July 30, 2014, 10:44 p.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24123/
> -----------------------------------------------------------
> 
> (Updated July 30, 2014, 10:44 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1655
>     https://issues.apache.org/jira/browse/MESOS-1655
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Fixed a flaky test: ZooKeeperTest.LeaderDetectorTimeoutHandling
> - The original ZK session timeout was the same as AWAIT_READY timeout so it's possible that AWAIT_READY timed out in a race.
> - Plus we were spending 10 seconds clock time on this wait so I reduced the timeout.
> 
> 
> Diffs
> -----
> 
>   src/tests/zookeeper_tests.cpp be9fa06818b96e5170c68810fe16cc472f1f8b28 
> 
> Diff: https://reviews.apache.org/r/24123/diff/
> 
> 
> Testing
> -------
> 
> Ran the test for 350 iterations so far and it didn't fail.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>