You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Bernd Mathiske <be...@mesosphere.io> on 2016/01/11 18:15:13 UTC

Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch. Inlined awaiting fetch contention.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/
-----------------------------------------------------------

Review request for mesos, Alexander Rojas, Jie Yu, Joseph Wu, Neil Conway, and Till Toenshoff.


Bugs: MESOS-3235
    https://issues.apache.org/jira/browse/MESOS-3235


Repository: mesos


Description
-------

This mutex was prone to causing races at task startup by firmly
blocking an internal libprocess thread. The latch avoids this.

Failing to launch a task due to such a race did not get flagged
by directly related test failures, because the AWAIT catching this
situation was ineffective, having been placed inside a call from the
test. Only the subsequent wait for task completion triggered a test
failure then. By then it was obscured what exactly had happened.


Diffs
-----

  src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 

Diff: https://reviews.apache.org/r/42149/diff/


Testing
-------

make check
bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"


Thanks,

Bernd Mathiske


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch. Inlined awaiting fetch contention.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review113829
-----------------------------------------------------------


Bad patch!

Reviews applied: [42149]

Failed command: ./support/apply-review.sh -n -r 42149

Error:
 2016-01-11 20:14:04 URL:https://reviews.apache.org/r/42149/diff/raw/ [7747/7747] -> "42149.patch" [1]
Total errors found: 0
Checking 1 files
Error: Commit message summary (the first line) must not exceed 72 characters.

- Mesos ReviewBot


On Jan. 11, 2016, 5:15 p.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 11, 2016, 5:15 p.m.)
> 
> 
> Review request for mesos, Alexander Rojas, Jie Yu, Joseph Wu, Neil Conway, and Till Toenshoff.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review115059
-----------------------------------------------------------


Patch looks great!

Reviews applied: [42149]

Passed command: export OS=ubuntu:14.04;export CONFIGURATION="--verbose";export COMPILER=gcc; ./support/docker_build.sh

- Mesos ReviewBot


On Jan. 18, 2016, 11:28 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2016, 11:28 a.m.)
> 
> 
> Review request for mesos, Adam B, Alexander Rojas, Benjamin Hindman, Joseph Wu, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Bernd Mathiske <be...@mesosphere.io>.

> On Jan. 18, 2016, 8:26 a.m., Till Toenshoff wrote:
> > src/tests/fetcher_cache_tests.cpp, line 842
> > <https://reviews.apache.org/r/42149/diff/4/?file=1200150#file1200150line842>
> >
> >     Given that pause() and resume() will always be used on the same thread, we dont have a race here accross an already used latch just after this case, right?

Right.


- Bernd


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review115013
-----------------------------------------------------------


On Jan. 18, 2016, 3:28 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2016, 3:28 a.m.)
> 
> 
> Review request for mesos, Adam B, Alexander Rojas, Benjamin Hindman, Joseph Wu, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Till Toenshoff <to...@me.com>.

> On Jan. 18, 2016, 4:26 p.m., Till Toenshoff wrote:
> > src/tests/fetcher_cache_tests.cpp, line 837
> > <https://reviews.apache.org/r/42149/diff/4/?file=1200150#file1200150line837>
> >
> >     Lets either remove the brackets, making this an equally important comment, OR remove the second part as a whole.

Removed that comment while committing.


- Till


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review115013
-----------------------------------------------------------


On Jan. 18, 2016, 11:28 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2016, 11:28 a.m.)
> 
> 
> Review request for mesos, Adam B, Alexander Rojas, Benjamin Hindman, Joseph Wu, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Till Toenshoff <to...@me.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review115013
-----------------------------------------------------------

Ship it!


Looks good Bernd, thanks!


src/tests/fetcher_cache_tests.cpp (line 822)
<https://reviews.apache.org/r/42149/#comment175865>

    Lets either remove the brackets, making this an equally important comment, OR remove the second part as a whole.



src/tests/fetcher_cache_tests.cpp (line 827)
<https://reviews.apache.org/r/42149/#comment175866>

    Given that pause() and resume() will always be used on the same thread, we dont have a race here accross an already used latch just after this case, right?


- Till Toenshoff


On Jan. 18, 2016, 11:28 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2016, 11:28 a.m.)
> 
> 
> Review request for mesos, Adam B, Alexander Rojas, Benjamin Hindman, Joseph Wu, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review114999
-----------------------------------------------------------

Ship it!


Ship It!

- Alexander Rojas


On Jan. 18, 2016, 12:28 p.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 18, 2016, 12:28 p.m.)
> 
> 
> Review request for mesos, Adam B, Alexander Rojas, Benjamin Hindman, Joseph Wu, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Bernd Mathiske <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/
-----------------------------------------------------------

(Updated Jan. 18, 2016, 3:28 a.m.)


Review request for mesos, Adam B, Alexander Rojas, Benjamin Hindman, Joseph Wu, Till Toenshoff, and Timothy Chen.


Changes
-------

Addressed Alexander's review. Now conforming with standard pattern of "provide()" in "initialize()".


Bugs: MESOS-3235
    https://issues.apache.org/jira/browse/MESOS-3235


Repository: mesos


Description
-------

Also inlined the function that awaits fetch contention.

This mutex was prone to causing races at task startup by firmly
blocking an internal libprocess thread. The latch avoids this.

Failing to launch a task due to such a race did not get flagged
by directly related test failures, because the AWAIT catching this
situation was ineffective, having been placed inside a call from the
test. Only the subsequent wait for task completion triggered a test
failure then. By then it was obscured what exactly had happened.


Diffs (updated)
-----

  src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 

Diff: https://reviews.apache.org/r/42149/diff/


Testing
-------

make check
bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"


Thanks,

Bernd Mathiske


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Joseph Wu <jo...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review114096
-----------------------------------------------------------

Ship it!


LGTM.

- Joseph Wu


On Jan. 12, 2016, 6:55 a.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 6:55 a.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review114701
-----------------------------------------------------------



src/tests/fetcher_cache_tests.cpp (lines 803 - 804)
<https://reviews.apache.org/r/42149/#comment175572>

    This should go in a call to `process::initialize()`


- Alexander Rojas


On Jan. 12, 2016, 3:55 p.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 3:55 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Mesos ReviewBot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review114043
-----------------------------------------------------------


Patch looks great!

Reviews applied: [42149]

Passed command: export OS=ubuntu:14.04;export CONFIGURATION="--verbose";export COMPILER=gcc; ./support/docker_build.sh

- Mesos ReviewBot


On Jan. 12, 2016, 2:55 p.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 2:55 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Bernd Mathiske <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/
-----------------------------------------------------------

(Updated Jan. 12, 2016, 6:55 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy Chen.


Changes
-------

Made the pause/resume mechanism in the HTTP server more resilient to repeated calls, even though it is never used that way.


Bugs: MESOS-3235
    https://issues.apache.org/jira/browse/MESOS-3235


Repository: mesos


Description
-------

Also inlined the function that awaits fetch contention.

This mutex was prone to causing races at task startup by firmly
blocking an internal libprocess thread. The latch avoids this.

Failing to launch a task due to such a race did not get flagged
by directly related test failures, because the AWAIT catching this
situation was ineffective, having been placed inside a call from the
test. Only the subsequent wait for task completion triggered a test
failure then. By then it was obscured what exactly had happened.


Diffs (updated)
-----

  src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 

Diff: https://reviews.apache.org/r/42149/diff/


Testing
-------

make check
bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"


Thanks,

Bernd Mathiske


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Alexander Rojas <al...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/#review113981
-----------------------------------------------------------



src/tests/fetcher_cache_tests.cpp (lines 813 - 816)
<https://reviews.apache.org/r/42149/#comment174773>

    The problem with this second version is, if you call `pause()` twice, the effect is the same as:
    
    ```c++
    pause();
    resume();
    pause();
    ```
    
    since a latch triggers on [destruction](https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/latch.cpp#L37), which I doubt was your intention.


- Alexander Rojas


On Jan. 12, 2016, 2:16 p.m., Bernd Mathiske wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42149/
> -----------------------------------------------------------
> 
> (Updated Jan. 12, 2016, 2:16 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy Chen.
> 
> 
> Bugs: MESOS-3235
>     https://issues.apache.org/jira/browse/MESOS-3235
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Also inlined the function that awaits fetch contention.
> 
> This mutex was prone to causing races at task startup by firmly
> blocking an internal libprocess thread. The latch avoids this.
> 
> Failing to launch a task due to such a race did not get flagged
> by directly related test failures, because the AWAIT catching this
> situation was ineffective, having been placed inside a call from the
> test. Only the subsequent wait for task completion triggered a test
> failure then. By then it was obscured what exactly had happened.
> 
> 
> Diffs
> -----
> 
>   src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 
> 
> Diff: https://reviews.apache.org/r/42149/diff/
> 
> 
> Testing
> -------
> 
> make check
> bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"
> 
> 
> Thanks,
> 
> Bernd Mathiske
> 
>


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Bernd Mathiske <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/
-----------------------------------------------------------

(Updated Jan. 12, 2016, 5:16 a.m.)


Review request for mesos, Adam B, Benjamin Hindman, Till Toenshoff, and Timothy Chen.


Changes
-------

Less changes, creating the latch by need.


Bugs: MESOS-3235
    https://issues.apache.org/jira/browse/MESOS-3235


Repository: mesos


Description
-------

Also inlined the function that awaits fetch contention.

This mutex was prone to causing races at task startup by firmly
blocking an internal libprocess thread. The latch avoids this.

Failing to launch a task due to such a race did not get flagged
by directly related test failures, because the AWAIT catching this
situation was ineffective, having been placed inside a call from the
test. Only the subsequent wait for task completion triggered a test
failure then. By then it was obscured what exactly had happened.


Diffs (updated)
-----

  src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 

Diff: https://reviews.apache.org/r/42149/diff/


Testing
-------

make check
bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"


Thanks,

Bernd Mathiske


Re: Review Request 42149: Replaced mutex in HTTP server for fetcher cache tests with latch.

Posted by Bernd Mathiske <be...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42149/
-----------------------------------------------------------

(Updated Jan. 11, 2016, 12:21 p.m.)


Review request for mesos, Alexander Rojas, Jie Yu, Joseph Wu, Neil Conway, and Till Toenshoff.


Changes
-------

Separated second line from first line in commit comment.


Summary (updated)
-----------------

Replaced mutex in HTTP server for fetcher cache tests with latch.


Bugs: MESOS-3235
    https://issues.apache.org/jira/browse/MESOS-3235


Repository: mesos


Description (updated)
-------

Also inlined the function that awaits fetch contention.

This mutex was prone to causing races at task startup by firmly
blocking an internal libprocess thread. The latch avoids this.

Failing to launch a task due to such a race did not get flagged
by directly related test failures, because the AWAIT catching this
situation was ineffective, having been placed inside a call from the
test. Only the subsequent wait for task completion triggered a test
failure then. By then it was obscured what exactly had happened.


Diffs
-----

  src/tests/fetcher_cache_tests.cpp 1fb1e213d3c35479789688d1a3a49a3c6058b198 

Diff: https://reviews.apache.org/r/42149/diff/


Testing
-------

make check
bin/mesos-tests.sh --gtest_repeat=1000 --gtest_break_on_failure --gtest_filter="*HttpCachedConcurrent*"


Thanks,

Bernd Mathiske