You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Neil Conway <ne...@gmail.com> on 2017/05/16 20:02:25 UTC

Review Request 59320: Added test case for agent re-registration.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/
-----------------------------------------------------------

Review request for mesos and Benjamin Mahler.


Repository: mesos


Description
-------

Add an explanatory comment and a test case for a particular case in the
master's logic for handling agent re-registration.


Diffs
-----

  src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
  src/tests/partition_tests.cpp 4ff428564d1fa6cb96e6f8ec8edc331da88a3eb6 


Diff: https://reviews.apache.org/r/59320/diff/1/


Testing
-------

`make check`


Thanks,

Neil Conway


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/#review176416
-----------------------------------------------------------



Patch looks great!

Reviews applied: [59213, 59214, 59215, 59216, 59217, 59218, 59219, 59259, 59301, 59302, 59320]

Passed command: support\windows-build.bat

- Mesos Reviewbot Windows


On May 16, 2017, 1:19 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59320/
> -----------------------------------------------------------
> 
> (Updated May 16, 2017, 1:19 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add an explanatory comment and a test case for a particular case in the
> master's logic for handling agent re-registration.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
>   src/tests/partition_tests.cpp 4ff428564d1fa6cb96e6f8ec8edc331da88a3eb6 
> 
> 
> Diff: https://reviews.apache.org/r/59320/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Neil Conway <ne...@gmail.com>.

> On May 31, 2017, 12:09 a.m., Benjamin Mahler wrote:
> > src/master/master.cpp
> > Lines 6018-6025 (original), 6018-6025 (patched)
> > <https://reviews.apache.org/r/59320/diff/2/?file=1721802#file1721802line6018>
> >
> >     Hm.. what does slaveWasRemoved mean? Is it equivalen to the agent being marked as unreachable? Or is being removed something beyond being marked as unreachable (what it sounds like)? I had a hard time grasping the code below when it was referring to "unreachable" but the variable is called "removed".

`slaveWasRemoved` means that _this_ instance of the master was the one that removed the agent (because `removed` is only a cache, there might be false negatives). Similarly, `removed` contains agents that _this_ instance of the master has marked unreachable. No objection to renaming both variables, as long as we keep them in sync, although I can't think of a ton of better names.


> On May 31, 2017, 12:09 a.m., Benjamin Mahler wrote:
> > src/master/master.cpp
> > Lines 6027-6043 (original), 6027-6043 (patched)
> > <https://reviews.apache.org/r/59320/diff/2/?file=1721802#file1721802line6027>
> >
> >     Hm.. it seems like we could maybe make this code easier to read. Currently, the loop for dealing with frameworks is down below and the loop for dealing with tasks is up here. However, it seems to me that these are highly related, and it would be easier to reason about if it were consolidated into a loop over the framework and then a loop over the tasks belonging to that framework. That would also eliminate the need for `partitionAwareFrameworks`.
> >     
> >     E.g.
> >     
> >     ```
> >       foreach (const FrameworkInfo& framework, frameworks):
> >         if completed:
> >           shut down the framework, drop the tasks
> >           continue
> >           
> >         if agent was unreachable/removed and framework is not partition aware:
> >           shut down the framework, drop the tasks
> >           continue
> >         
> >         for each task in the framework:
> >           recoveredTasks.push_back(task)
> >           remove from unreachable tasks
> >     ```

Hmmm. I can see how this would makes things clearer, but it might result in some subtle behavioral differences. Right now, the master notifies the agent that it has re-registered, then sends it `ShutdownFrameworkMessage`s in some cases. Adopting the rewrite you suggested would mean shutting down frameworks while the agent thinks it hasn't re-registered yet. Not wrong but confusing.

Since this code is likely to change (see the "RFC: Partition Awareness" thread on `dev`), I'm inclined to leave this as-is in the short term.


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/#review176399
-----------------------------------------------------------


On May 31, 2017, 8:35 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59320/
> -----------------------------------------------------------
> 
> (Updated May 31, 2017, 8:35 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add an explanatory comment and a test case for a particular case in the
> master's logic for handling agent re-registration.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp a5376f9fc3c3c6c72b22cfd36ec56a1de0d03a3a 
>   src/tests/partition_tests.cpp 62a84f797201ccd18b71490949e3130d2b9c3668 
> 
> 
> Diff: https://reviews.apache.org/r/59320/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Neil Conway <ne...@gmail.com>.

> On May 31, 2017, 12:09 a.m., Benjamin Mahler wrote:
> > src/tests/partition_tests.cpp
> > Lines 2387 (patched)
> > <https://reviews.apache.org/r/59320/diff/2/?file=1721803#file1721803line2387>
> >
> >     Hm.. why disabled on windows? Do you know why or is this just copied? For example, it's not clear to me why the above partition test isn't disabled but the below one is.

I don't really have a way to run tests on Windows, and I'm nervous just depending on the Windows reviewbot/CI. I'll try enabling it and see what happens...


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/#review176399
-----------------------------------------------------------


On May 16, 2017, 8:19 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59320/
> -----------------------------------------------------------
> 
> (Updated May 16, 2017, 8:19 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add an explanatory comment and a test case for a particular case in the
> master's logic for handling agent re-registration.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
>   src/tests/partition_tests.cpp 4ff428564d1fa6cb96e6f8ec8edc331da88a3eb6 
> 
> 
> Diff: https://reviews.apache.org/r/59320/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Benjamin Mahler <bm...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/#review176399
-----------------------------------------------------------




src/master/master.cpp
Lines 6018-6025 (original), 6018-6025 (patched)
<https://reviews.apache.org/r/59320/#comment249773>

    Hm.. what does slaveWasRemoved mean? Is it equivalen to the agent being marked as unreachable? Or is being removed something beyond being marked as unreachable (what it sounds like)? I had a hard time grasping the code below when it was referring to "unreachable" but the variable is called "removed".



src/master/master.cpp
Lines 6027-6043 (original), 6027-6043 (patched)
<https://reviews.apache.org/r/59320/#comment249772>

    Hm.. it seems like we could maybe make this code easier to read. Currently, the loop for dealing with frameworks is down below and the loop for dealing with tasks is up here. However, it seems to me that these are highly related, and it would be easier to reason about if it were consolidated into a loop over the framework and then a loop over the tasks belonging to that framework. That would also eliminate the need for `partitionAwareFrameworks`.
    
    E.g.
    
    ```
      foreach (const FrameworkInfo& framework, frameworks):
        if completed:
          shut down the framework, drop the tasks
          continue
          
        if agent was unreachable/removed and framework is not partition aware:
          shut down the framework, drop the tasks
          continue
        
        for each task in the framework:
          recoveredTasks.push_back(task)
          remove from unreachable tasks
    ```



src/master/master.cpp
Lines 6058 (patched)
<https://reviews.apache.org/r/59320/#comment249766>

    "it" is referring to the task? Threw me off initially, since it sounds like "remove the framework" given the previous setence.
    
    s/it/the task/ ?



src/tests/partition_tests.cpp
Lines 2387 (patched)
<https://reviews.apache.org/r/59320/#comment249765>

    Hm.. why disabled on windows? Do you know why or is this just copied? For example, it's not clear to me why the above partition test isn't disabled but the below one is.


- Benjamin Mahler


On May 16, 2017, 8:19 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59320/
> -----------------------------------------------------------
> 
> (Updated May 16, 2017, 8:19 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add an explanatory comment and a test case for a particular case in the
> master's logic for handling agent re-registration.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
>   src/tests/partition_tests.cpp 4ff428564d1fa6cb96e6f8ec8edc331da88a3eb6 
> 
> 
> Diff: https://reviews.apache.org/r/59320/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/#review176595
-----------------------------------------------------------



Patch looks great!

Reviews applied: [59213, 59214, 59215, 59216, 59217, 59218, 59219, 59259, 59301, 59302, 59320]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On May 31, 2017, 8:35 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59320/
> -----------------------------------------------------------
> 
> (Updated May 31, 2017, 8:35 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add an explanatory comment and a test case for a particular case in the
> master's logic for handling agent re-registration.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp a5376f9fc3c3c6c72b22cfd36ec56a1de0d03a3a 
>   src/tests/partition_tests.cpp 62a84f797201ccd18b71490949e3130d2b9c3668 
> 
> 
> Diff: https://reviews.apache.org/r/59320/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/#review176731
-----------------------------------------------------------



Patch looks great!

Reviews applied: [59213, 59214, 59215, 59216, 59217, 59218, 59219, 59259, 59301, 59302, 59320]

Passed command: support\windows-build.bat

- Mesos Reviewbot Windows


On May 31, 2017, 8:35 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59320/
> -----------------------------------------------------------
> 
> (Updated May 31, 2017, 8:35 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add an explanatory comment and a test case for a particular case in the
> master's logic for handling agent re-registration.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp a5376f9fc3c3c6c72b22cfd36ec56a1de0d03a3a 
>   src/tests/partition_tests.cpp 62a84f797201ccd18b71490949e3130d2b9c3668 
> 
> 
> Diff: https://reviews.apache.org/r/59320/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Neil Conway <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/
-----------------------------------------------------------

(Updated May 31, 2017, 8:35 p.m.)


Review request for mesos and Benjamin Mahler.


Changes
-------

Improve comment, try enabling test on Windows.


Repository: mesos


Description
-------

Add an explanatory comment and a test case for a particular case in the
master's logic for handling agent re-registration.


Diffs (updated)
-----

  src/master/master.cpp a5376f9fc3c3c6c72b22cfd36ec56a1de0d03a3a 
  src/tests/partition_tests.cpp 62a84f797201ccd18b71490949e3130d2b9c3668 


Diff: https://reviews.apache.org/r/59320/diff/3/

Changes: https://reviews.apache.org/r/59320/diff/2-3/


Testing
-------

`make check`

Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).


Thanks,

Neil Conway


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/#review175204
-----------------------------------------------------------



Patch looks great!

Reviews applied: [59213, 59214, 59215, 59216, 59217, 59218, 59219, 59259, 59301, 59302, 59320]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On May 16, 2017, 8:19 p.m., Neil Conway wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59320/
> -----------------------------------------------------------
> 
> (Updated May 16, 2017, 8:19 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add an explanatory comment and a test case for a particular case in the
> master's logic for handling agent re-registration.
> 
> 
> Diffs
> -----
> 
>   src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
>   src/tests/partition_tests.cpp 4ff428564d1fa6cb96e6f8ec8edc331da88a3eb6 
> 
> 
> Diff: https://reviews.apache.org/r/59320/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> Checked that if you add `CHECK_NOTNULL(framework)` to the master code (circa the newly added comment), the incorrect `CHECK` is triggered by this test case (but not by any existing test cases).
> 
> 
> Thanks,
> 
> Neil Conway
> 
>


Re: Review Request 59320: Added test case for agent re-registration.

Posted by Neil Conway <ne...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59320/
-----------------------------------------------------------

(Updated May 16, 2017, 8:19 p.m.)


Review request for mesos and Benjamin Mahler.


Changes
-------

Add querying of state endpoint to test case.


Repository: mesos


Description
-------

Add an explanatory comment and a test case for a particular case in the
master's logic for handling agent re-registration.


Diffs (updated)
-----

  src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad 
  src/tests/partition_tests.cpp 4ff428564d1fa6cb96e6f8ec8edc331da88a3eb6 


Diff: https://reviews.apache.org/r/59320/diff/2/

Changes: https://reviews.apache.org/r/59320/diff/1-2/


Testing
-------

`make check`


Thanks,

Neil Conway