You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Bannier <be...@mesosphere.io> on 2019/01/17 11:15:59 UTC

Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/
-----------------------------------------------------------

Review request for mesos and Chun-Hung Hsiao.


Bugs: MESOS-9130
    https://issues.apache.org/jira/browse/MESOS-9130


Repository: mesos


Description
-------

While we addressed one source of flakiness of this test in
`2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
flakiness (agents send more than the expected number of
`UpdateSlaveMessage`s since they failed a timeout in registration with
the master).

This patch ensures that the agent registers successfully before
continuing with the test.


Diffs
-----

  src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 


Diff: https://reviews.apache.org/r/69781/diff/1/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Benjamin Bannier <be...@mesosphere.io>.

> On Jan. 18, 2019, 4:22 a.m., Chun-Hung Hsiao wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 3340 (patched)
> > <https://reviews.apache.org/r/69781/diff/1/?file=2120161#file2120161line3344>
> >
> >     Could it be possible that the second `UpdateSlaveMessage` has been received before this `FUTURE_PROTOBUF` is set up?

Definitely, installing two expectations right away now instead.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212122
-----------------------------------------------------------


On Jan. 18, 2019, 12:45 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2019, 12:45 p.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Chun-Hung Hsiao <ch...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212122
-----------------------------------------------------------



Ah... thanks for fixing this. Actually when I reviewed r/69687 I did have a concern about expecting two `UpdateSlaveMessage`s without a paused clock but then forgot about the concern and gave a ship-it :(


src/tests/storage_local_resource_provider_tests.cpp
Line 3319 (original), 3319 (patched)
<https://reviews.apache.org/r/69781/#comment297740>

    How about moving this to the very beginning and pause the clock for the whole test?



src/tests/storage_local_resource_provider_tests.cpp
Lines 3340 (patched)
<https://reviews.apache.org/r/69781/#comment297741>

    Could it be possible that the second `UpdateSlaveMessage` has been received before this `FUTURE_PROTOBUF` is set up?


- Chun-Hung Hsiao


On Jan. 17, 2019, 11:15 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 17, 2019, 11:15 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/1/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Chun-Hung Hsiao <ch...@apache.org>.

> On Jan. 24, 2019, 3:41 p.m., Chun-Hung Hsiao wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 3388 (patched)
> > <https://reviews.apache.org/r/69781/diff/2/?file=2120346#file2120346line3404>
> >
> >     IIUC the async loop you mentioned is the one trying to detect the creation of the domain socket, which is asynchronously called in `launchContainer`, so waiting for just this function to be dispatched might not be sufficient before pausing the clock again. How about waiting for `waitContainer` (to ensure that the post-start hook and thus the async loop has been finished) after `launchContainer` and before the pause?
> >     
> >     Alternatively, since this test doesn't actually care what happens after `launchContainer`, is it needed to resume the clock at all?
> 
> Chun-Hung Hsiao wrote:
>     I believe it's because of some timers we used in, e.g., reaper or cgroup to destroy the container. How about:
>     ```
>     AWAIT_ASSERT_EQ(0, pluginKilled);
>     
>     // Resume the clock so the plugin container can be properly destroyed.
>     Clock::resume();
>     
>     AWAIT_READY(pluginRestarted);
>     
>     Clock::pause();
>     ```
>     
>     Feel free to drop this. Gave you a ship-it.

Actually, let's just don't pause the clock again.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212275
-----------------------------------------------------------


On Jan. 18, 2019, 11:45 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2019, 11:45 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Benjamin Bannier <be...@mesosphere.io>.

> On Jan. 24, 2019, 4:41 p.m., Chun-Hung Hsiao wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 3388 (patched)
> > <https://reviews.apache.org/r/69781/diff/2/?file=2120346#file2120346line3404>
> >
> >     IIUC the async loop you mentioned is the one trying to detect the creation of the domain socket, which is asynchronously called in `launchContainer`, so waiting for just this function to be dispatched might not be sufficient before pausing the clock again. How about waiting for `waitContainer` (to ensure that the post-start hook and thus the async loop has been finished) after `launchContainer` and before the pause?
> >     
> >     Alternatively, since this test doesn't actually care what happens after `launchContainer`, is it needed to resume the clock at all?
> 
> Chun-Hung Hsiao wrote:
>     I believe it's because of some timers we used in, e.g., reaper or cgroup to destroy the container. How about:
>     ```
>     AWAIT_ASSERT_EQ(0, pluginKilled);
>     
>     // Resume the clock so the plugin container can be properly destroyed.
>     Clock::resume();
>     
>     AWAIT_READY(pluginRestarted);
>     
>     Clock::pause();
>     ```
>     
>     Feel free to drop this. Gave you a ship-it.
> 
> Chun-Hung Hsiao wrote:
>     Actually, let's just don't pause the clock again.

I went with your suggestion above, but explicitly bracketed the part requiring a running clock with `resume` and `pause`. This makes it a little clearer where exactly we need a running clock.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212275
-----------------------------------------------------------


On Jan. 25, 2019, 10:28 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 25, 2019, 10:28 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/3/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Chun-Hung Hsiao <ch...@apache.org>.

> On Jan. 24, 2019, 3:41 p.m., Chun-Hung Hsiao wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 3388 (patched)
> > <https://reviews.apache.org/r/69781/diff/2/?file=2120346#file2120346line3404>
> >
> >     IIUC the async loop you mentioned is the one trying to detect the creation of the domain socket, which is asynchronously called in `launchContainer`, so waiting for just this function to be dispatched might not be sufficient before pausing the clock again. How about waiting for `waitContainer` (to ensure that the post-start hook and thus the async loop has been finished) after `launchContainer` and before the pause?
> >     
> >     Alternatively, since this test doesn't actually care what happens after `launchContainer`, is it needed to resume the clock at all?

I believe it's because of some timers we used in, e.g., reaper or cgroup to destroy the container. How about:
```
AWAIT_ASSERT_EQ(0, pluginKilled);

// Resume the clock so the plugin container can be properly destroyed.
Clock::resume();

AWAIT_READY(pluginRestarted);

Clock::pause();
```

Feel free to drop this. Gave you a ship-it.


- Chun-Hung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212275
-----------------------------------------------------------


On Jan. 18, 2019, 11:45 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2019, 11:45 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Chun-Hung Hsiao <ch...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212275
-----------------------------------------------------------




src/tests/storage_local_resource_provider_tests.cpp
Lines 3388 (patched)
<https://reviews.apache.org/r/69781/#comment298013>

    IIUC the async loop you mentioned is the one trying to detect the creation of the domain socket, which is asynchronously called in `launchContainer`, so waiting for just this function to be dispatched might not be sufficient before pausing the clock again. How about waiting for `waitContainer` (to ensure that the post-start hook and thus the async loop has been finished) after `launchContainer` and before the pause?
    
    Alternatively, since this test doesn't actually care what happens after `launchContainer`, is it needed to resume the clock at all?


- Chun-Hung Hsiao


On Jan. 18, 2019, 11:45 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2019, 11:45 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212142
-----------------------------------------------------------



PASS: Mesos patch 69781 was successfully built and tested.

Reviews applied: `['69781']`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2801/mesos-review-69781

- Mesos Reviewbot Windows


On Jan. 18, 2019, 11:45 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2019, 11:45 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Benjamin Bannier <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/
-----------------------------------------------------------

(Updated Jan. 25, 2019, 10:28 a.m.)


Review request for mesos and Chun-Hung Hsiao.


Bugs: MESOS-9130
    https://issues.apache.org/jira/browse/MESOS-9130


Repository: mesos


Description
-------

While we addressed one source of flakiness of this test in
`2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
flakiness (agents send more than the expected number of
`UpdateSlaveMessage`s since they failed a timeout in registration with
the master).

This patch ensures that the agent registers successfully before
continuing with the test.


Diffs (updated)
-----

  src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 


Diff: https://reviews.apache.org/r/69781/diff/3/

Changes: https://reviews.apache.org/r/69781/diff/2-3/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Chun-Hung Hsiao <ch...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212307
-----------------------------------------------------------


Ship it!




Ship It!

- Chun-Hung Hsiao


On Jan. 18, 2019, 11:45 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2019, 11:45 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Benjamin Bannier <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/
-----------------------------------------------------------

(Updated Jan. 18, 2019, 12:45 p.m.)


Review request for mesos and Chun-Hung Hsiao.


Changes
-------

Addressed issues raised by Chun.


Bugs: MESOS-9130
    https://issues.apache.org/jira/browse/MESOS-9130


Repository: mesos


Description
-------

While we addressed one source of flakiness of this test in
`2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
flakiness (agents send more than the expected number of
`UpdateSlaveMessage`s since they failed a timeout in registration with
the master).

This patch ensures that the agent registers successfully before
continuing with the test.


Diffs (updated)
-----

  src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 


Diff: https://reviews.apache.org/r/69781/diff/2/

Changes: https://reviews.apache.org/r/69781/diff/1-2/


Testing
-------

`make check`


Thanks,

Benjamin Bannier


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Mesos Reviewbot Windows <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212110
-----------------------------------------------------------



PASS: Mesos patch 69781 was successfully built and tested.

Reviews applied: `['69781']`

All the build artifacts available at: http://dcos-win.westus2.cloudapp.azure.com/artifacts/mesos-reviewbot-testing/2796/mesos-review-69781

- Mesos Reviewbot Windows


On Jan. 17, 2019, 7:15 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 17, 2019, 7:15 p.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/1/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Benjamin Bannier <be...@mesosphere.io>.

> On Jan. 18, 2019, 4:42 a.m., Chun-Hung Hsiao wrote:
> > src/tests/storage_local_resource_provider_tests.cpp
> > Lines 3330 (patched)
> > <https://reviews.apache.org/r/69781/diff/1/?file=2120161#file2120161line3334>
> >
> >     Is this really required?

This one conservative approach to deal with the possibility of them being different. Alternatively one could instead advance only by their maximum value. If at all we currently advance by both durations.

Dropping.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212125
-----------------------------------------------------------


On Jan. 18, 2019, 12:45 p.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2019, 12:45 p.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/2/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>


Re: Review Request 69781: Fixed flakiness of resource provider ContainerTerminationMetric test.

Posted by Chun-Hung Hsiao <ch...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69781/#review212125
-----------------------------------------------------------




src/tests/storage_local_resource_provider_tests.cpp
Lines 3330 (patched)
<https://reviews.apache.org/r/69781/#comment297743>

    Is this really required?


- Chun-Hung Hsiao


On Jan. 17, 2019, 11:15 a.m., Benjamin Bannier wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69781/
> -----------------------------------------------------------
> 
> (Updated Jan. 17, 2019, 11:15 a.m.)
> 
> 
> Review request for mesos and Chun-Hung Hsiao.
> 
> 
> Bugs: MESOS-9130
>     https://issues.apache.org/jira/browse/MESOS-9130
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> While we addressed one source of flakiness of this test in
> `2117f671c450d7c74edc53078e8c0ed6035020aa` we introduced a new source of
> flakiness (agents send more than the expected number of
> `UpdateSlaveMessage`s since they failed a timeout in registration with
> the master).
> 
> This patch ensures that the agent registers successfully before
> continuing with the test.
> 
> 
> Diffs
> -----
> 
>   src/tests/storage_local_resource_provider_tests.cpp 164e93a3749d4d668e12de31641aecddddece384 
> 
> 
> Diff: https://reviews.apache.org/r/69781/diff/1/
> 
> 
> Testing
> -------
> 
> `make check`
> 
> 
> Thanks,
> 
> Benjamin Bannier
> 
>