You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Jiang Yan Xu <ya...@jxu.me> on 2013/11/05 01:39:41 UTC

Re: Review Request 15016: Added local timeout so that Group operations fail after we locally determine that the session is timed out.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15016/
-----------------------------------------------------------

(Updated Nov. 5, 2013, 12:39 a.m.)


Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.


Changes
-------

- Separated out https://reviews.apache.org/r/15221/
- Added overloaded ctor:
Group(const URL& url,
      const Duration& timeout);


Summary (updated)
-----------------

Added local timeout so that Group operations fail after we locally determine that the session is timed out.


Repository: mesos-git


Description (updated)
-------

See summary


Diffs (updated)
-----

  src/zookeeper/group.hpp 6b6c1557658df1bb7b7df3ded5c203e2dc0c8308 
  src/zookeeper/group.cpp cd58d230a2715b9ed01002443ebd7b5366d31be6 

Diff: https://reviews.apache.org/r/15016/diff/


Testing
-------

make check.


Thanks,

Jiang Yan Xu


Re: Review Request 15016: Added local timeout so that Group operations fail after we locally determine that the session is timed out.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15016/#review28571
-----------------------------------------------------------

Ship it!


Ship It!

- Ben Mahler


On Nov. 5, 2013, 10:12 p.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15016/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 10:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary
> 
> 
> Diffs
> -----
> 
>   src/zookeeper/group.hpp 6b6c1557658df1bb7b7df3ded5c203e2dc0c8308 
>   src/zookeeper/group.cpp cd58d230a2715b9ed01002443ebd7b5366d31be6 
> 
> Diff: https://reviews.apache.org/r/15016/diff/
> 
> 
> Testing
> -------
> 
> make check.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>


Re: Review Request 15016: Added local timeout so that Group operations fail after we locally determine that the session is timed out.

Posted by Jiang Yan Xu <ya...@jxu.me>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15016/
-----------------------------------------------------------

(Updated Nov. 5, 2013, 10:12 p.m.)


Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.


Changes
-------

BenM's


Repository: mesos-git


Description
-------

See summary


Diffs (updated)
-----

  src/zookeeper/group.hpp 6b6c1557658df1bb7b7df3ded5c203e2dc0c8308 
  src/zookeeper/group.cpp cd58d230a2715b9ed01002443ebd7b5366d31be6 

Diff: https://reviews.apache.org/r/15016/diff/


Testing
-------

make check.


Thanks,

Jiang Yan Xu


Re: Review Request 15016: Added local timeout so that Group operations fail after we locally determine that the session is timed out.

Posted by Jiang Yan Xu <ya...@jxu.me>.

> On Nov. 5, 2013, 12:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 325
> > <https://reviews.apache.org/r/15016/diff/3/?file=377511#file377511line325>
> >
> >     What happens if cancel returns false?
> >     
> >     We'll set the timer to None() but there's an impending call to timedout() in the queue? Is that intended?

If timedout() dispatched -> connected() executed (timer = NONE) -> timedout() executed: timer will be NONE in the "if (timer.isSome() && zk->getSessionId() == sessionId)" check and the group will not abort.
If timedout() dispatched -> connected (timer = NONE) -> reconnecting (new timer) -> previous enqueued timedout() executed: this is what I hadn't thought about and we should not allow the timedout() method to abort everything. 

So I changed the check condition in timedout() to be:

if (timer.isSome() &&
    timer.get().timeout().expired() &&
    zk->getSessionId() == sessionId) { ... }

So because the new timer has a pending timeout, the condition will evaluate to false and the group will not abort.


> On Nov. 5, 2013, 12:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 350
> > <https://reviews.apache.org/r/15016/diff/3/?file=377511#file377511line350>
> >
> >     Why is this CHECK safe?
> >     
> >     Is it safe because expired() is always called before reconnecting()?

It should be safe because the timer is only set in reconnecting(), which won't be called twice consecutively without other timer resetting events (i.e. connected, expired) in between.


> On Nov. 5, 2013, 12:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 373
> > <https://reviews.apache.org/r/15016/diff/3/?file=377511#file377511line373>
> >
> >     If you fail to cancel it you'll have a pending call to timedout(), are you assuming that the Timer will not become SOME before the event for timedout() is processed?

It should be fine here too because in timedout():

if (timer.isSome() &&
    timer.get().timeout().expired() &&
    zk->getSessionId() == sessionId) { ... }


The sessionId will be different and thus the old timeout is ignored.


- Jiang Yan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15016/#review28161
-----------------------------------------------------------


On Nov. 5, 2013, 12:39 a.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15016/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 12:39 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary
> 
> 
> Diffs
> -----
> 
>   src/zookeeper/group.hpp 6b6c1557658df1bb7b7df3ded5c203e2dc0c8308 
>   src/zookeeper/group.cpp cd58d230a2715b9ed01002443ebd7b5366d31be6 
> 
> Diff: https://reviews.apache.org/r/15016/diff/
> 
> 
> Testing
> -------
> 
> make check.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>


Re: Review Request 15016: Added local timeout so that Group operations fail after we locally determine that the session is timed out.

Posted by Ben Mahler <be...@gmail.com>.

> On Nov. 5, 2013, 12:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 373
> > <https://reviews.apache.org/r/15016/diff/3/?file=377511#file377511line373>
> >
> >     If you fail to cancel it you'll have a pending call to timedout(), are you assuming that the Timer will not become SOME before the event for timedout() is processed?
> 
> Jiang Yan Xu wrote:
>     It should be fine here too because in timedout():
>     
>     if (timer.isSome() &&
>         timer.get().timeout().expired() &&
>         zk->getSessionId() == sessionId) { ... }
>     
>     
>     The sessionId will be different and thus the old timeout is ignored.

Sounds good.


> On Nov. 5, 2013, 12:58 a.m., Ben Mahler wrote:
> > src/zookeeper/group.cpp, line 325
> > <https://reviews.apache.org/r/15016/diff/3/?file=377511#file377511line325>
> >
> >     What happens if cancel returns false?
> >     
> >     We'll set the timer to None() but there's an impending call to timedout() in the queue? Is that intended?
> 
> Jiang Yan Xu wrote:
>     If timedout() dispatched -> connected() executed (timer = NONE) -> timedout() executed: timer will be NONE in the "if (timer.isSome() && zk->getSessionId() == sessionId)" check and the group will not abort.
>     If timedout() dispatched -> connected (timer = NONE) -> reconnecting (new timer) -> previous enqueued timedout() executed: this is what I hadn't thought about and we should not allow the timedout() method to abort everything. 
>     
>     So I changed the check condition in timedout() to be:
>     
>     if (timer.isSome() &&
>         timer.get().timeout().expired() &&
>         zk->getSessionId() == sessionId) { ... }
>     
>     So because the new timer has a pending timeout, the condition will evaluate to false and the group will not abort.

Ok great, thanks.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15016/#review28161
-----------------------------------------------------------


On Nov. 5, 2013, 10:12 p.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15016/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 10:12 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary
> 
> 
> Diffs
> -----
> 
>   src/zookeeper/group.hpp 6b6c1557658df1bb7b7df3ded5c203e2dc0c8308 
>   src/zookeeper/group.cpp cd58d230a2715b9ed01002443ebd7b5366d31be6 
> 
> Diff: https://reviews.apache.org/r/15016/diff/
> 
> 
> Testing
> -------
> 
> make check.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>


Re: Review Request 15016: Added local timeout so that Group operations fail after we locally determine that the session is timed out.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15016/#review28161
-----------------------------------------------------------



src/zookeeper/group.cpp
<https://reviews.apache.org/r/15016/#comment54777>

    Add a TODO to re-use the peer constructor when we have C++11. (Not a big deal, I would just like to have a nice way to track cleanups when we make the switch).
    
    (See: http://en.wikipedia.org/wiki/C++11#Object_construction_improvement for what I'm talking about).



src/zookeeper/group.cpp
<https://reviews.apache.org/r/15016/#comment54779>

    What happens if cancel returns false?
    
    We'll set the timer to None() but there's an impending call to timedout() in the queue? Is that intended?



src/zookeeper/group.cpp
<https://reviews.apache.org/r/15016/#comment54781>

    Why is this CHECK safe?
    
    Is it safe because expired() is always called before reconnecting()?



src/zookeeper/group.cpp
<https://reviews.apache.org/r/15016/#comment54782>

    If you fail to cancel it you'll have a pending call to timedout(), are you assuming that the Timer will not become SOME before the event for timedout() is processed?


- Ben Mahler


On Nov. 5, 2013, 12:39 a.m., Jiang Yan Xu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15016/
> -----------------------------------------------------------
> 
> (Updated Nov. 5, 2013, 12:39 a.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Vinod Kone.
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary
> 
> 
> Diffs
> -----
> 
>   src/zookeeper/group.hpp 6b6c1557658df1bb7b7df3ded5c203e2dc0c8308 
>   src/zookeeper/group.cpp cd58d230a2715b9ed01002443ebd7b5366d31be6 
> 
> Diff: https://reviews.apache.org/r/15016/diff/
> 
> 
> Testing
> -------
> 
> make check.
> 
> 
> Thanks,
> 
> Jiang Yan Xu
> 
>