Posted to reviews@mesos.apache.org by Greg Mann <gr...@mesosphere.io> on 2019/06/19 18:55:33 UTC

Review Request 70899: Refactored the agent's task-killing code.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/
-----------------------------------------------------------

Review request for mesos, Benjamin Bannier, Benno Evers, Benjamin Mahler, and Joseph Wu.


Bugs: MESOS-9821
    https://issues.apache.org/jira/browse/MESOS-9821


Repository: mesos


Description
-------

This patch factors the code responsible for killing tasks
out into two helper functions. This will facilitate the
calling of this common code by the agent-draining handler.


Diffs
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70899/diff/1/


Testing
-------

Testing details at the end of this chain.


Thanks,

Greg Mann


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Greg Mann <gr...@mesosphere.io>.

> On July 2, 2019, 10:52 p.m., Joseph Wu wrote:
> > src/slave/slave.cpp
> > Lines 3673-3680 (original), 3673-3679 (patched)
> > <https://reviews.apache.org/r/70899/diff/4/?file=2152982#file2152982line3673>
> >
> >     Due to splitting out the `killPendingTask`, this logic is now executed before the check for `framework->state == Framework::TERMINATING`.  This means the agent will no longer log that warning message (Ignoring kill task ...) for pending tasks only.
> >     
> >     Is this intentional?  (It probably isn't a big deal if we don't log this warning.)

Oh whoops this was unintentional; thanks!! I think it's best to move the conditional logging back into `killTask()` just before we call into `killPendingTask()`. This maintains exactly the same behavior as before.
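
As a rough illustration of that ordering (a sketch only: `Agent`, `Framework`, and the `warnings`/`killed` vectors are hypothetical stand-ins, not the real `Slave` class or its logging), the terminating check and its warning come first in `killTask()`, and only then is `killPendingTask()` consulted:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the agent's framework bookkeeping.
struct Framework {
  enum State { RUNNING, TERMINATING };
  State state = RUNNING;
  std::set<std::string> pendingTasks;
};

// Hypothetical stand-in for the agent; the vectors replace real logging
// and real kill handling so the control flow can be observed.
struct Agent {
  std::vector<std::string> warnings; // What would go to LOG(WARNING).
  std::vector<std::string> killed;   // Tasks whose kill was processed.

  // Removes the task if it is pending; returns whether it was pending.
  bool killPendingTask(Framework& framework, const std::string& taskId) {
    if (framework.pendingTasks.erase(taskId) == 0) {
      return false;
    }
    killed.push_back(taskId);
    return true;
  }

  void killTask(Framework& framework, const std::string& taskId) {
    // The warning is emitted here, before any pending-task handling,
    // so pending tasks of a terminating framework are covered too.
    if (framework.state == Framework::TERMINATING) {
      warnings.push_back("Ignoring kill task " + taskId +
                         " because the framework is terminating");
      return;
    }

    assert(framework.state == Framework::RUNNING);

    if (killPendingTask(framework, taskId)) {
      return;
    }

    // Queued or launched tasks would be handled here.
    killed.push_back(taskId);
  }
};
```

With this ordering, a kill request for a pending task of a terminating framework records the warning and leaves the task untouched, while the same request against a running framework removes the pending task.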


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216322
-----------------------------------------------------------


On July 10, 2019, 6:59 p.m., Greg Mann wrote:
> 
> (Updated July 10, 2019, 6:59 p.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/5/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216322
-----------------------------------------------------------




src/slave/slave.cpp
Lines 3673-3680 (original), 3673-3679 (patched)
<https://reviews.apache.org/r/70899/#comment303492>

    Due to splitting out the `killPendingTask`, this logic is now executed before the check for `framework->state == Framework::TERMINATING`.  This means the agent will no longer log that warning message (Ignoring kill task ...) for pending tasks only.
    
    Is this intentional?  (It probably isn't a big deal if we don't log this warning.)


- Joseph Wu


On July 1, 2019, 12:49 p.m., Greg Mann wrote:
> 
> (Updated July 1, 2019, 12:49 p.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/4/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216565
-----------------------------------------------------------


Ship it!





- Joseph Wu


On July 10, 2019, 11:59 a.m., Greg Mann wrote:
> 
> (Updated July 10, 2019, 11:59 a.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/5/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/
-----------------------------------------------------------

(Updated July 10, 2019, 6:59 p.m.)


Review request for mesos, Benjamin Bannier, Benno Evers, Benjamin Mahler, and Joseph Wu.


Bugs: MESOS-9821
    https://issues.apache.org/jira/browse/MESOS-9821


Repository: mesos


Description
-------

This patch factors the code responsible for killing tasks
out into two helper functions. This will facilitate the
calling of this common code by the agent-draining handler.


Diffs (updated)
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70899/diff/5/

Changes: https://reviews.apache.org/r/70899/diff/4-5/


Testing
-------

Testing details at the end of this chain.


Thanks,

Greg Mann


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/
-----------------------------------------------------------

(Updated July 1, 2019, 7:49 p.m.)


Review request for mesos, Benjamin Bannier, Benno Evers, Benjamin Mahler, and Joseph Wu.


Bugs: MESOS-9821
    https://issues.apache.org/jira/browse/MESOS-9821


Repository: mesos


Description
-------

This patch factors the code responsible for killing tasks
out into two helper functions. This will facilitate the
calling of this common code by the agent-draining handler.


Diffs (updated)
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70899/diff/4/

Changes: https://reviews.apache.org/r/70899/diff/3-4/


Testing
-------

Testing details at the end of this chain.


Thanks,

Greg Mann


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Joseph Wu <jo...@mesosphere.io>.

> On June 27, 2019, 6:02 a.m., Benjamin Bannier wrote:
> > src/slave/slave.cpp
> > Lines 3755-3757 (original), 3787-3789 (patched)
> > <https://reviews.apache.org/r/70899/diff/3/?file=2151692#file2151692line3794>
> >
> >     The `CHECK` for `TERMINATING` here does not fit the branch on `TERMINATING` below, and one of them should be removed.
> 
> Greg Mann wrote:
>     I don't understand - the switch statement below is for the executor state, and I think this block _does_ make sense with respect to the conditional below for `framework->state == Framework::TERMINATING`. I'll ask Joseph to take a look at this as well, but dropping this issue for now.

I think what could be done is this: 
```
void Slave::kill(...)
{
  // ...
  CHECK_NOTNULL(framework);
  CHECK_NOTNULL(executor);

  // We don't send a status update here because a terminating
  // framework cannot send acknowledgements.
  if (framework->state == Framework::TERMINATING) {
    LOG(WARNING) << "Ignoring kill task " << taskId
                 << " of framework " << frameworkId
                 << " because the framework is terminating";
    return;
  }

  CHECK(framework->state == Framework::RUNNING)
    << framework->state;

  // ...
}
```


- Joseph


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216190
-----------------------------------------------------------


On July 1, 2019, 12:49 p.m., Greg Mann wrote:
> 
> (Updated July 1, 2019, 12:49 p.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/4/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Greg Mann <gr...@mesosphere.io>.

> On June 27, 2019, 1:02 p.m., Benjamin Bannier wrote:
> > src/slave/slave.cpp
> > Lines 3755-3757 (original), 3787-3789 (patched)
> > <https://reviews.apache.org/r/70899/diff/3/?file=2151692#file2151692line3794>
> >
> >     The `CHECK` for `TERMINATING` here does not fit the branch on `TERMINATING` below, and one of them should be removed.

I don't understand - the switch statement below is for the executor state, and I think this block _does_ make sense with respect to the conditional below for `framework->state == Framework::TERMINATING`. I'll ask Joseph to take a look at this as well, but dropping this issue for now.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216190
-----------------------------------------------------------


On July 1, 2019, 7:49 p.m., Greg Mann wrote:
> 
> (Updated July 1, 2019, 7:49 p.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/4/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Greg Mann <gr...@mesosphere.io>.

> On June 27, 2019, 1:02 p.m., Benjamin Bannier wrote:
> > src/slave/slave.cpp
> > Lines 3676-3678 (original), 3676-3678 (patched)
> > <https://reviews.apache.org/r/70899/diff/3/?file=2151692#file2151692line3676>
> >
> >     Move this comment into `killPendingTask`? Nothing here ensures that we actually send `TASK_KILLED`.

I just removed this comment, since there is already a similar one in `killPendingTask`.


- Greg


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216190
-----------------------------------------------------------


On July 1, 2019, 7:49 p.m., Greg Mann wrote:
> 
> (Updated July 1, 2019, 7:49 p.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/4/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Benjamin Bannier <bb...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216190
-----------------------------------------------------------


Fix it, then Ship it!





src/slave/slave.cpp
Lines 3676-3678 (original), 3676-3678 (patched)
<https://reviews.apache.org/r/70899/#comment303264>

    Move this comment into `killPendingTask`? Nothing here ensures that we actually send `TASK_KILLED`.



src/slave/slave.cpp
Lines 3755-3757 (original), 3787-3789 (patched)
<https://reviews.apache.org/r/70899/#comment303265>

    The `CHECK` for `TERMINATING` here does not fit the branch on `TERMINATING` below, and one of them should be removed.


- Benjamin Bannier


On June 22, 2019, 1:59 a.m., Greg Mann wrote:
> 
> (Updated June 22, 2019, 1:59 a.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/3/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/
-----------------------------------------------------------

(Updated June 21, 2019, 11:59 p.m.)


Review request for mesos, Benjamin Bannier, Benno Evers, Benjamin Mahler, and Joseph Wu.


Bugs: MESOS-9821
    https://issues.apache.org/jira/browse/MESOS-9821


Repository: mesos


Description
-------

This patch factors the code responsible for killing tasks
out into two helper functions. This will facilitate the
calling of this common code by the agent-draining handler.


Diffs (updated)
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70899/diff/3/

Changes: https://reviews.apache.org/r/70899/diff/2-3/


Testing
-------

Testing details at the end of this chain.


Thanks,

Greg Mann


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/#review216051
-----------------------------------------------------------


Fix it, then Ship it!




Refactor looks reasonable.  One possible commenting oversight below.


src/slave/slave.cpp
Lines 3782-3784 (patched)
<https://reviews.apache.org/r/70899/#comment303026>

    In the header, the comment says:
    ```
    This function should be used to kill tasks which are queued or launched, but not tasks which are pending.
    ```
    
    And the refactor seems to prevent this function from being called with a pending task.  So shouldn't `executor` be non-null?
    
    Or is there some future use-case where this is not true?
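
If the header comment is taken as a hard precondition, the helper could enforce it directly. A minimal sketch, assuming hypothetical names (`Executor`, `killQueuedOrLaunched`) and throwing an exception where the agent would more likely use `CHECK_NOTNULL`, so that the precondition can be exercised in a test:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in; not the real Mesos Executor class.
struct Executor {
  std::vector<std::string> killRequests; // Kill requests forwarded so far.
};

// Kills a queued or launched task. Per the documented contract, pending
// tasks are handled elsewhere, so a null executor here is a caller bug.
// Assumption: the real code would abort via CHECK_NOTNULL; this sketch
// throws instead.
void killQueuedOrLaunched(Executor* executor, const std::string& taskId) {
  if (executor == nullptr) {
    throw std::invalid_argument(
        "killQueuedOrLaunched() requires a non-null executor");
  }
  executor->killRequests.push_back(taskId);
}

// Example usage with a valid executor.
inline void exampleUsage() {
  Executor executor;
  killQueuedOrLaunched(&executor, "task-1");
  assert(executor.killRequests.size() == 1);
}
```

Making the precondition explicit keeps the header comment and the implementation from drifting apart if a future caller passes a pending task by mistake.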


- Joseph Wu


On June 19, 2019, 12:05 p.m., Greg Mann wrote:
> 
> (Updated June 19, 2019, 12:05 p.m.)
> 
> Diff: https://reviews.apache.org/r/70899/diff/2/


Re: Review Request 70899: Refactored the agent's task-killing code.

Posted by Greg Mann <gr...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70899/
-----------------------------------------------------------

(Updated June 19, 2019, 7:05 p.m.)


Review request for mesos, Benjamin Bannier, Benno Evers, Benjamin Mahler, and Joseph Wu.


Bugs: MESOS-9821
    https://issues.apache.org/jira/browse/MESOS-9821


Repository: mesos


Description
-------

This patch factors the code responsible for killing tasks
out into two helper functions. This will facilitate the
calling of this common code by the agent-draining handler.


Diffs (updated)
-----

  src/slave/slave.hpp 6954f53ff1531b9fcb688ef76acddf6a3d849a41 
  src/slave/slave.cpp 30039b0857a4d85b4b96fa95d7f8724d57cdec6e 


Diff: https://reviews.apache.org/r/70899/diff/2/

Changes: https://reviews.apache.org/r/70899/diff/1-2/


Testing
-------

Testing details at the end of this chain.


Thanks,

Greg Mann