You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mesos.apache.org by Benjamin Hindman <be...@berkeley.edu> on 2013/03/14 08:25:54 UTC

Review Request: Added a TERMINATING process state to avoid filtering or enqueing events after process cleanup has begun.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9922/
-----------------------------------------------------------

Review request for mesos and Vinod Kone.


Description
-------

The change in https://reviews.apache.org/r/9916 was insufficient. Rather than continue along the lines of the previous fix this change effectively reverts the previous fix and instead adds a TERMINATING process state.


This addresses bug MESOS-392.
    https://issues.apache.org/jira/browse/MESOS-392


Diffs
-----

  third_party/libprocess/include/process/process.hpp 1433115516b2110acad0f5c2ee996dfb4203ba8b 
  third_party/libprocess/src/process.cpp a1da05e83f24c4c94d4eef7ef0db662886efee7a 

Diff: https://reviews.apache.org/r/9922/diff/


Testing
-------

./bin/mesos-tests.sh --gtest_filter="*.SchedulerExit" --gtest_repeat=1000


Thanks,

Benjamin Hindman


Re: Review Request: Added a TERMINATING process state to avoid filtering or enqueing events after process cleanup has begun.

Posted by Benjamin Hindman <be...@berkeley.edu>.

> On March 14, 2013, 5:18 p.m., Vinod Kone wrote:
> > third_party/libprocess/src/process.cpp, line 2787
> > <https://reviews.apache.org/r/9922/diff/1/?file=270476#file270476line2787>
> >
> >     Why the recursive lock? Looks like you want this because of the logic you added in #2806? IIUC, this allows the follows:
> >     
> >     --> terminate A
> >     --> A gets a recursive lock
> >     --> All its pending events are fired
> >     --> Enque of an event to A is called
> >     --> Recursive lock is acquired
> >     --> Enqueuing is skipped because process is in TERMINATING
> >     
> >     If that is the case, what happens if 
> >     --> terminate A
> >     --> inside A->finalize() we do terminate B
> >     --> B has a pending event to A
> >     
> >     Doesn't this cause a deadlock?
> 
> Benjamin Hindman wrote:
>     Calling process::terminate is asynchronous, so we won't cleanup B while we are cleaning up A.

And to be clear, we are not "firing" the pending events, just deleting them, which might cause discarded callbacks to get invoked on futures.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9922/#review17877
-----------------------------------------------------------


On March 14, 2013, 7:25 a.m., Benjamin Hindman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9922/
> -----------------------------------------------------------
> 
> (Updated March 14, 2013, 7:25 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Description
> -------
> 
> The change in https://reviews.apache.org/r/9916 was insufficient. Rather than continue along the lines of the previous fix this change effectively reverts the previous fix and instead adds a TERMINATING process state.
> 
> 
> This addresses bug MESOS-392.
>     https://issues.apache.org/jira/browse/MESOS-392
> 
> 
> Diffs
> -----
> 
>   third_party/libprocess/include/process/process.hpp 1433115516b2110acad0f5c2ee996dfb4203ba8b 
>   third_party/libprocess/src/process.cpp a1da05e83f24c4c94d4eef7ef0db662886efee7a 
> 
> Diff: https://reviews.apache.org/r/9922/diff/
> 
> 
> Testing
> -------
> 
> ./bin/mesos-tests.sh --gtest_filter="*.SchedulerExit" --gtest_repeat=1000
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>


Re: Review Request: Added a TERMINATING process state to avoid filtering or enqueing events after process cleanup has begun.

Posted by Benjamin Hindman <be...@berkeley.edu>.

> On March 14, 2013, 5:18 p.m., Vinod Kone wrote:
> > third_party/libprocess/src/process.cpp, line 2787
> > <https://reviews.apache.org/r/9922/diff/1/?file=270476#file270476line2787>
> >
> >     Why the recursive lock? Looks like you want this because of the logic you added in #2806? IIUC, this allows the follows:
> >     
> >     --> terminate A
> >     --> A gets a recursive lock
> >     --> All its pending events are fired
> >     --> Enque of an event to A is called
> >     --> Recursive lock is acquired
> >     --> Enqueuing is skipped because process is in TERMINATING
> >     
> >     If that is the case, what happens if 
> >     --> terminate A
> >     --> inside A->finalize() we do terminate B
> >     --> B has a pending event to A
> >     
> >     Doesn't this cause a deadlock?

Calling process::terminate is asynchronous, so we won't cleanup B while we are cleaning up A.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9922/#review17877
-----------------------------------------------------------


On March 14, 2013, 7:25 a.m., Benjamin Hindman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9922/
> -----------------------------------------------------------
> 
> (Updated March 14, 2013, 7:25 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Description
> -------
> 
> The change in https://reviews.apache.org/r/9916 was insufficient. Rather than continue along the lines of the previous fix this change effectively reverts the previous fix and instead adds a TERMINATING process state.
> 
> 
> This addresses bug MESOS-392.
>     https://issues.apache.org/jira/browse/MESOS-392
> 
> 
> Diffs
> -----
> 
>   third_party/libprocess/include/process/process.hpp 1433115516b2110acad0f5c2ee996dfb4203ba8b 
>   third_party/libprocess/src/process.cpp a1da05e83f24c4c94d4eef7ef0db662886efee7a 
> 
> Diff: https://reviews.apache.org/r/9922/diff/
> 
> 
> Testing
> -------
> 
> ./bin/mesos-tests.sh --gtest_filter="*.SchedulerExit" --gtest_repeat=1000
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>


Re: Review Request: Added a TERMINATING process state to avoid filtering or enqueing events after process cleanup has begun.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9922/#review17877
-----------------------------------------------------------



third_party/libprocess/src/process.cpp
<https://reviews.apache.org/r/9922/#comment37856>

    Why the recursive lock? Looks like you want this because of the logic you added in #2806? IIUC, this allows the follows:
    
    --> terminate A
    --> A gets a recursive lock
    --> All its pending events are fired
    --> Enque of an event to A is called
    --> Recursive lock is acquired
    --> Enqueuing is skipped because process is in TERMINATING
    
    If that is the case, what happens if 
    --> terminate A
    --> inside A->finalize() we do terminate B
    --> B has a pending event to A
    
    Doesn't this cause a deadlock?


- Vinod Kone


On March 14, 2013, 7:25 a.m., Benjamin Hindman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9922/
> -----------------------------------------------------------
> 
> (Updated March 14, 2013, 7:25 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Description
> -------
> 
> The change in https://reviews.apache.org/r/9916 was insufficient. Rather than continue along the lines of the previous fix this change effectively reverts the previous fix and instead adds a TERMINATING process state.
> 
> 
> This addresses bug MESOS-392.
>     https://issues.apache.org/jira/browse/MESOS-392
> 
> 
> Diffs
> -----
> 
>   third_party/libprocess/include/process/process.hpp 1433115516b2110acad0f5c2ee996dfb4203ba8b 
>   third_party/libprocess/src/process.cpp a1da05e83f24c4c94d4eef7ef0db662886efee7a 
> 
> Diff: https://reviews.apache.org/r/9922/diff/
> 
> 
> Testing
> -------
> 
> ./bin/mesos-tests.sh --gtest_filter="*.SchedulerExit" --gtest_repeat=1000
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>


Re: Review Request: Added a TERMINATING process state to avoid filtering or enqueing events after process cleanup has begun.

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9922/#review17883
-----------------------------------------------------------

Ship it!


gotcha..thank you.

- Vinod Kone


On March 14, 2013, 7:25 a.m., Benjamin Hindman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9922/
> -----------------------------------------------------------
> 
> (Updated March 14, 2013, 7:25 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Description
> -------
> 
> The change in https://reviews.apache.org/r/9916 was insufficient. Rather than continue along the lines of the previous fix this change effectively reverts the previous fix and instead adds a TERMINATING process state.
> 
> 
> This addresses bug MESOS-392.
>     https://issues.apache.org/jira/browse/MESOS-392
> 
> 
> Diffs
> -----
> 
>   third_party/libprocess/include/process/process.hpp 1433115516b2110acad0f5c2ee996dfb4203ba8b 
>   third_party/libprocess/src/process.cpp a1da05e83f24c4c94d4eef7ef0db662886efee7a 
> 
> Diff: https://reviews.apache.org/r/9922/diff/
> 
> 
> Testing
> -------
> 
> ./bin/mesos-tests.sh --gtest_filter="*.SchedulerExit" --gtest_repeat=1000
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>


Re: Review Request: Added a TERMINATING process state to avoid filtering or enqueing events after process cleanup has begun.

Posted by Ben Mahler <be...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9922/#review17886
-----------------------------------------------------------

Ship it!



third_party/libprocess/src/process.cpp
<https://reviews.apache.org/r/9922/#comment37864>

    Definitely warrants a comment.


- Ben Mahler


On March 14, 2013, 7:25 a.m., Benjamin Hindman wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9922/
> -----------------------------------------------------------
> 
> (Updated March 14, 2013, 7:25 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Description
> -------
> 
> The change in https://reviews.apache.org/r/9916 was insufficient. Rather than continue along the lines of the previous fix this change effectively reverts the previous fix and instead adds a TERMINATING process state.
> 
> 
> This addresses bug MESOS-392.
>     https://issues.apache.org/jira/browse/MESOS-392
> 
> 
> Diffs
> -----
> 
>   third_party/libprocess/include/process/process.hpp 1433115516b2110acad0f5c2ee996dfb4203ba8b 
>   third_party/libprocess/src/process.cpp a1da05e83f24c4c94d4eef7ef0db662886efee7a 
> 
> Diff: https://reviews.apache.org/r/9922/diff/
> 
> 
> Testing
> -------
> 
> ./bin/mesos-tests.sh --gtest_filter="*.SchedulerExit" --gtest_repeat=1000
> 
> 
> Thanks,
> 
> Benjamin Hindman
> 
>