You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2019/10/03 20:59:16 UTC

Review Request 71579: Eliminate accidental loop amplification over unreachable tasks.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71579/
-----------------------------------------------------------

Review request for mesos, haosdent huang and Meng Zhu.


Bugs: MESOS-9889
    https://issues.apache.org/jira/browse/MESOS-9889


Repository: mesos


Description
-------

Per MESOS-9889, the foreachkey operator on a multimap will actually
loop over each <key,value> entry, thus looping over the same key
multiple times. The code in the master is written such that
excessive looping occurs.

For example: <F1,T1> <F1,T2> <F1,T3>
This will consider T1,T2,T3 3 times each rather than once each!


Diffs
-----

  src/master/master.hpp 23eb2a6854b1b53f80167cc8425339530de78e81 
  src/master/master.cpp 65994aa72ec4bfefdb09b88a62db910727bb0897 


Diff: https://reviews.apache.org/r/71579/diff/1/


Testing
-------

make check


Thanks,

Benjamin Mahler


Re: Review Request 71579: Eliminate accidental loop amplification over unreachable tasks.

Posted by Benjamin Mahler <bm...@apache.org>.

> On Oct. 3, 2019, 9:25 p.m., Meng Zhu wrote:
> > Can you briefly describe the fix in the description? e.g.
> > ```
> > This patch replaces `multimap<k, v>` with `hasmap<k, vector<v>>`.
> > ```
> > 
> > A quick search of multimap shows more instances. After a few spot checks, I don't see multimap.keys().
> > Maybe we should consider getting rid of `Multimap<K, V>::keys`.
> > 
> > Also, it is not clear to me why we used multimap in a few places (other than a little bit less code), they could be replaced with `hasmap<k, vector<v>>` for better performance. e.g. a common used method, `multihashmap.contains(k, v)` copies the value of the given key:
> > https://github.com/apache/mesos/blob/master/3rdparty/stout/include/stout/multimap.hpp#L126

> Can you briefly describe the fix in the description?

will do!

> A quick search of multimap shows more instances. After a few spot checks, I don't see multimap.keys().
> Maybe we should consider getting rid of Multimap<K, V>::keys.

The keys() function does the right thing and only returns unique keys, but it also copies all the keys which is expensive.
Ideally we could do something to fix foreach(), I think we would need to track the previous key and skip it, but doing so might be problematic with some loop bodies (e.g. that mutate the container). In any case, there's the more general ticket to address this: https://issues.apache.org/jira/browse/MESOS-5037

> Also, it is not clear to me why we used multimap in a few places (other than a little bit less code), they could be replaced with hasmap<k, vector<v>> for better performance. e.g. a common used method, multihashmap.contains(k, v) copies the value of the given key:

It's also not that obvious to me, multimap can avoid the empty vector bug that we've run into a few times. I think probably it depends on whether we reason about it as a collection of items for a given key vs a bunch of keyed values without key uniqueness. Maybe more importantly whether iterator validity on insertion property of multimap is needed. If we can eliminate the error-prone foreachkey pattern, I don't see a reason to avoid multimaps.

> multihashmap.contains(k, v) copies the value of the given key

yeah, that's a bad implementation that can be fixed


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71579/#review218061
-----------------------------------------------------------


On Oct. 3, 2019, 8:59 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71579/
> -----------------------------------------------------------
> 
> (Updated Oct. 3, 2019, 8:59 p.m.)
> 
> 
> Review request for mesos, haosdent huang and Meng Zhu.
> 
> 
> Bugs: MESOS-9889
>     https://issues.apache.org/jira/browse/MESOS-9889
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Per MESOS-9889, the foreachkey operator on a multimap will actually
> loop over each <key,value> entry, thus looping over the same key
> multiple times. The code in the master is written such that
> excessive looping occurs.
> 
> For example: <F1,T1> <F1,T2> <F1,T3>
> This will consider T1,T2,T3 3 times each rather than once each!
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 23eb2a6854b1b53f80167cc8425339530de78e81 
>   src/master/master.cpp 65994aa72ec4bfefdb09b88a62db910727bb0897 
> 
> 
> Diff: https://reviews.apache.org/r/71579/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 71579: Eliminate accidental loop amplification over unreachable tasks.

Posted by Benjamin Mahler <bm...@apache.org>.

> On Oct. 3, 2019, 9:25 p.m., Meng Zhu wrote:
> > Can you briefly describe the fix in the description? e.g.
> > ```
> > This patch replaces `multimap<k, v>` with `hasmap<k, vector<v>>`.
> > ```
> > 
> > A quick search of multimap shows more instances. After a few spot checks, I don't see multimap.keys().
> > Maybe we should consider getting rid of `Multimap<K, V>::keys`.
> > 
> > Also, it is not clear to me why we used multimap in a few places (other than a little bit less code), they could be replaced with `hasmap<k, vector<v>>` for better performance. e.g. a common used method, `multihashmap.contains(k, v)` copies the value of the given key:
> > https://github.com/apache/mesos/blob/master/3rdparty/stout/include/stout/multimap.hpp#L126
> 
> Benjamin Mahler wrote:
>     > Can you briefly describe the fix in the description?
>     
>     will do!
>     
>     > A quick search of multimap shows more instances. After a few spot checks, I don't see multimap.keys().
>     > Maybe we should consider getting rid of Multimap<K, V>::keys.
>     
>     The keys() function does the right thing and only returns unique keys, but it also copies all the keys which is expensive.
>     Ideally we could do something to fix foreach(), I think we would need to track the previous key and skip it, but doing so might be problematic with some loop bodies (e.g. that mutate the container). In any case, there's the more general ticket to address this: https://issues.apache.org/jira/browse/MESOS-5037
>     
>     > Also, it is not clear to me why we used multimap in a few places (other than a little bit less code), they could be replaced with hasmap<k, vector<v>> for better performance. e.g. a common used method, multihashmap.contains(k, v) copies the value of the given key:
>     
>     It's also not that obvious to me, multimap can avoid the empty vector bug that we've run into a few times. I think probably it depends on whether we reason about it as a collection of items for a given key vs a bunch of keyed values without key uniqueness. Maybe more importantly whether iterator validity on insertion property of multimap is needed. If we can eliminate the error-prone foreachkey pattern, I don't see a reason to avoid multimaps.
>     
>     > multihashmap.contains(k, v) copies the value of the given key
>     
>     yeah, that's a bad implementation that can be fixed

I also checked all multimap and multihashmap usage, and found no other uses of foreachkey, but it's very risky without addressing https://issues.apache.org/jira/browse/MESOS-5037. Someone could easily introduce a foreachkey, especially to replace a correct keys() usage.


- Benjamin


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71579/#review218061
-----------------------------------------------------------


On Oct. 3, 2019, 8:59 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71579/
> -----------------------------------------------------------
> 
> (Updated Oct. 3, 2019, 8:59 p.m.)
> 
> 
> Review request for mesos, haosdent huang and Meng Zhu.
> 
> 
> Bugs: MESOS-9889
>     https://issues.apache.org/jira/browse/MESOS-9889
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Per MESOS-9889, the foreachkey operator on a multimap will actually
> loop over each <key,value> entry, thus looping over the same key
> multiple times. The code in the master is written such that
> excessive looping occurs.
> 
> For example: <F1,T1> <F1,T2> <F1,T3>
> This will consider T1,T2,T3 3 times each rather than once each!
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 23eb2a6854b1b53f80167cc8425339530de78e81 
>   src/master/master.cpp 65994aa72ec4bfefdb09b88a62db910727bb0897 
> 
> 
> Diff: https://reviews.apache.org/r/71579/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 71579: Eliminate accidental loop amplification over unreachable tasks.

Posted by Meng Zhu <mz...@mesosphere.io>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71579/#review218061
-----------------------------------------------------------


Ship it!




Can you briefly describe the fix in the description? e.g.
```
This patch replaces `multimap<k, v>` with `hasmap<k, vector<v>>`.
```

A quick search of multimap shows more instances. After a few spot checks, I don't see multimap.keys().
Maybe we should consider getting rid of `Multimap<K, V>::keys`.

Also, it is not clear to me why we used multimap in a few places (other than a little bit less code), they could be replaced with `hasmap<k, vector<v>>` for better performance. e.g. a common used method, `multihashmap.contains(k, v)` copies the value of the given key:
https://github.com/apache/mesos/blob/master/3rdparty/stout/include/stout/multimap.hpp#L126

- Meng Zhu


On Oct. 3, 2019, 1:59 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71579/
> -----------------------------------------------------------
> 
> (Updated Oct. 3, 2019, 1:59 p.m.)
> 
> 
> Review request for mesos, haosdent huang and Meng Zhu.
> 
> 
> Bugs: MESOS-9889
>     https://issues.apache.org/jira/browse/MESOS-9889
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Per MESOS-9889, the foreachkey operator on a multimap will actually
> loop over each <key,value> entry, thus looping over the same key
> multiple times. The code in the master is written such that
> excessive looping occurs.
> 
> For example: <F1,T1> <F1,T2> <F1,T3>
> This will consider T1,T2,T3 3 times each rather than once each!
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 23eb2a6854b1b53f80167cc8425339530de78e81 
>   src/master/master.cpp 65994aa72ec4bfefdb09b88a62db910727bb0897 
> 
> 
> Diff: https://reviews.apache.org/r/71579/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>


Re: Review Request 71579: Eliminate accidental loop amplification over unreachable tasks.

Posted by Mesos Reviewbot <re...@mesos.apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71579/#review218062
-----------------------------------------------------------



Patch looks great!

Reviews applied: [71579]

Passed command: export OS='ubuntu:14.04' BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose --disable-libtool-wrappers --disable-parallel-test-execution' ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh

- Mesos Reviewbot


On Oct. 3, 2019, 8:59 p.m., Benjamin Mahler wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71579/
> -----------------------------------------------------------
> 
> (Updated Oct. 3, 2019, 8:59 p.m.)
> 
> 
> Review request for mesos, haosdent huang and Meng Zhu.
> 
> 
> Bugs: MESOS-9889
>     https://issues.apache.org/jira/browse/MESOS-9889
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Per MESOS-9889, the foreachkey operator on a multimap will actually
> loop over each <key,value> entry, thus looping over the same key
> multiple times. The code in the master is written such that
> excessive looping occurs.
> 
> For example: <F1,T1> <F1,T2> <F1,T3>
> This will consider T1,T2,T3 3 times each rather than once each!
> 
> 
> Diffs
> -----
> 
>   src/master/master.hpp 23eb2a6854b1b53f80167cc8425339530de78e81 
>   src/master/master.cpp 65994aa72ec4bfefdb09b88a62db910727bb0897 
> 
> 
> Diff: https://reviews.apache.org/r/71579/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>