You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Guangya Liu <gy...@gmail.com> on 2015/10/22 11:12:39 UTC

Review Request 39548: SchedulerTest.Suppress is flaky

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/
-----------------------------------------------------------

Review request for mesos and Vinod Kone.


Bugs: MESOS-3733
    https://issues.apache.org/jira/browse/MESOS-3733


Repository: mesos


Description
-------

Root Cause: The reason is that the DECLINE call set filter as 1hr,
the Clock::advance set as 100m. A race condition is that both DECLINE
and SUPPRESS started up in different threads and the call Clock::advance
may be called before SUPPRESS finished. The clock advanced for 100m which
is greater than 1hr, this caused the allocator start to allocate resource
again and ASSERT_TRUE(event.isPending()) will be failed.

Solution: Call SUPPRESS first, this can make sure the SUPPRESS can
be finished before DECLINE.


Diffs
-----

  src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 

Diff: https://reviews.apache.org/r/39548/diff/


Testing
-------

Platform: Ubuntu 14.04
make 
make check
bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure


Thanks,

Guangya Liu


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103561
-----------------------------------------------------------


Patch looks great!

Reviews applied: [39548]

All tests passed.

- Mesos ReviewBot


On Oct. 22, 2015, 9:12 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 22, 2015, 9:12 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Call SUPPRESS first, this can make sure the SUPPRESS can
> be finished before DECLINE.
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Guangya Liu <gy...@gmail.com>.

> On 十月 23, 2015, 3:53 a.m., Anand Mazumdar wrote:
> > src/tests/scheduler_tests.cpp, lines 999-1008
> > <https://reviews.apache.org/r/39548/diff/2/?file=1103771#file1103771line999>
> >
> >     Wondering, why did you change order here i.e. move this piece of code up ? 
> >     
> >     Now, with using `FUTURE_DISPATCH` you can gurrantee that `Suppress` would be completed before the `Clock` is advanced. Am I missing something ?

I think that the order of SUPPRESS and DECLINEN is not that important as long as we can make sure the offer will be send after decline filter time out. So here it is OK to adjust and keep the order.


- Guangya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103710
-----------------------------------------------------------


On 十月 23, 2015, 2:32 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated 十月 23, 2015, 2:32 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Call SUPPRESS first, and make sure SUPPRESS call ready before
> call DECLINE.
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Anand Mazumdar <ma...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103710
-----------------------------------------------------------



src/tests/scheduler_tests.cpp (lines 999 - 1008)
<https://reviews.apache.org/r/39548/#comment161781>

    Wondering, why did you change order here i.e. move this piece of code up ? 
    
    Now, with using `FUTURE_DISPATCH` you can gurrantee that `Suppress` would be completed before the `Clock` is advanced. Am I missing something ?



src/tests/scheduler_tests.cpp (line 1007)
<https://reviews.apache.org/r/39548/#comment161783>

    This alone won't suffice. AFAIK, the `Future` returned by `FUTURE_DISPATCH` is marked ready when the function is invoked not neccessarily when the function has completely executed.
    
    You would want to pause/settle the clock too. The modified code would look like:
    
    ```
     Future<Nothing> suppressOffers =
        FUTURE_DISPATCH(_, &MesosAllocatorProcess::suppressOffers);
    
      {
        Call call;
        call.mutable_framework_id()->CopyFrom(id);
        call.set_type(Call::SUPPRESS);
    
        mesos.send(call);
      }
    
      AWAIT_READY(suppressOffers);
      
      // Wait for suppress to be complete.
      Clock::pause();
      Clock::settle();
      
      // No offers should be sent within 100 mins because the framework
      // suppressed offers.
      Clock::advance(Minutes(100));
      Clock::settle();
    ```
    
    Here is some more info: http://mesos.apache.org/documentation/latest/testing-patterns/ . What do you think ?


- Anand Mazumdar


On Oct. 23, 2015, 2:32 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2015, 2:32 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Call SUPPRESS first, and make sure SUPPRESS call ready before
> call DECLINE.
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103711
-----------------------------------------------------------


Patch looks great!

Reviews applied: [39548]

All tests passed.

- Mesos ReviewBot


On Oct. 23, 2015, 2:32 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2015, 2:32 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Call SUPPRESS first, and make sure SUPPRESS call ready before
> call DECLINE.
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103735
-----------------------------------------------------------


Patch looks great!

Reviews applied: [39548]

All tests passed.

- Mesos ReviewBot


On Oct. 23, 2015, 5:29 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2015, 5:29 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Call SUPPRESS first, and make sure SUPPRESS call ready before
> call DECLINE.
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103752
-----------------------------------------------------------


Patch looks great!

Reviews applied: [39548]

All tests passed.

- Mesos ReviewBot


On Oct. 23, 2015, 9:05 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2015, 9:05 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Make SUPPRESS call is finished before call Clock::advance
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103810
-----------------------------------------------------------

Ship it!


Ship It!

- Vinod Kone


On Oct. 23, 2015, 9:05 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2015, 9:05 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Make SUPPRESS call is finished before call Clock::advance
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Anand Mazumdar <ma...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103745
-----------------------------------------------------------

Ship it!


LGTM !

- Anand Mazumdar


On Oct. 23, 2015, 9:05 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 23, 2015, 9:05 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Make SUPPRESS call is finished before call Clock::advance
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Guangya Liu <gy...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/
-----------------------------------------------------------

(Updated 十月 23, 2015, 9:05 a.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Address Anand's comments per IRC discussion


Bugs: MESOS-3733
    https://issues.apache.org/jira/browse/MESOS-3733


Repository: mesos


Description (updated)
-------

Root Cause: The reason is that the DECLINE call set filter as 1hr,
the Clock::advance set as 100m. A race condition is that both DECLINE
and SUPPRESS started up in different threads and the call Clock::advance
may be called before SUPPRESS finished. The clock advanced for 100m which
is greater than 1hr, this caused the allocator start to allocate resource
again and ASSERT_TRUE(event.isPending()) will be failed.

Solution: Make SUPPRESS call is finished before call Clock::advance


Diffs (updated)
-----

  src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 

Diff: https://reviews.apache.org/r/39548/diff/


Testing
-------

Platform: Ubuntu 14.04
make 
make check
bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure


Thanks,

Guangya Liu


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Guangya Liu <gy...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/
-----------------------------------------------------------

(Updated 十月 23, 2015, 5:29 a.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Address Anand's comments


Bugs: MESOS-3733
    https://issues.apache.org/jira/browse/MESOS-3733


Repository: mesos


Description
-------

Root Cause: The reason is that the DECLINE call set filter as 1hr,
the Clock::advance set as 100m. A race condition is that both DECLINE
and SUPPRESS started up in different threads and the call Clock::advance
may be called before SUPPRESS finished. The clock advanced for 100m which
is greater than 1hr, this caused the allocator start to allocate resource
again and ASSERT_TRUE(event.isPending()) will be failed.

Solution: Call SUPPRESS first, and make sure SUPPRESS call ready before
call DECLINE.


Diffs (updated)
-----

  src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 

Diff: https://reviews.apache.org/r/39548/diff/


Testing
-------

Platform: Ubuntu 14.04
make 
make check
bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure


Thanks,

Guangya Liu


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Guangya Liu <gy...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/
-----------------------------------------------------------

(Updated Oct. 23, 2015, 2:32 a.m.)


Review request for mesos and Vinod Kone.


Changes
-------

Address Vinod's comments


Bugs: MESOS-3733
    https://issues.apache.org/jira/browse/MESOS-3733


Repository: mesos


Description (updated)
-------

Root Cause: The reason is that the DECLINE call set filter as 1hr,
the Clock::advance set as 100m. A race condition is that both DECLINE
and SUPPRESS started up in different threads and the call Clock::advance
may be called before SUPPRESS finished. The clock advanced for 100m which
is greater than 1hr, this caused the allocator start to allocate resource
again and ASSERT_TRUE(event.isPending()) will be failed.

Solution: Call SUPPRESS first, and make sure SUPPRESS call ready before
call DECLINE.


Diffs (updated)
-----

  src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 

Diff: https://reviews.apache.org/r/39548/diff/


Testing
-------

Platform: Ubuntu 14.04
make 
make check
bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure


Thanks,

Guangya Liu


Re: Review Request 39548: SchedulerTest.Suppress is flaky

Posted by Vinod Kone <vi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39548/#review103660
-----------------------------------------------------------



src/tests/scheduler_tests.cpp (line 1001)
<https://reviews.apache.org/r/39548/#comment161723>

    Hmm. Does this really fix the race? What's the guarantee that SUPPRESS and DECLINE are processed by the master/allocator before you advance the clock?
    
    I think what you want do instead here is setup a FUTURE_DISPATCH() on HierarchicalAllocatorProcess::suppressOffers() and do a clock::settle() on that future before you advance the clock. 
    
    Does that make sense?


- Vinod Kone


On Oct. 22, 2015, 9:12 a.m., Guangya Liu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39548/
> -----------------------------------------------------------
> 
> (Updated Oct. 22, 2015, 9:12 a.m.)
> 
> 
> Review request for mesos and Vinod Kone.
> 
> 
> Bugs: MESOS-3733
>     https://issues.apache.org/jira/browse/MESOS-3733
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Root Cause: The reason is that the DECLINE call set filter as 1hr,
> the Clock::advance set as 100m. A race condition is that both DECLINE
> and SUPPRESS started up in different threads and the call Clock::advance
> may be called before SUPPRESS finished. The clock advanced for 100m which
> is greater than 1hr, this caused the allocator start to allocate resource
> again and ASSERT_TRUE(event.isPending()) will be failed.
> 
> Solution: Call SUPPRESS first, this can make sure the SUPPRESS can
> be finished before DECLINE.
> 
> 
> Diffs
> -----
> 
>   src/tests/scheduler_tests.cpp 7946cb48d62f4ed6d0fdbc771746518e31921f97 
> 
> Diff: https://reviews.apache.org/r/39548/diff/
> 
> 
> Testing
> -------
> 
> Platform: Ubuntu 14.04
> make 
> make check
> bin/mesos-tests.sh --gtest_filter="ContentType/SchedulerTest.Suppress/*"  --gtest_repeat=-1 --gtest_break_on_failure
> 
> 
> Thanks,
> 
> Guangya Liu
> 
>