Posted to dev@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2018/08/15 19:32:45 UTC

[Performance WG] Meeting Notes - August 15

For folks who missed it, here are my notes. Thanks to Jie for presenting!

(1) Jie presented a containerization benchmark:

https://reviews.apache.org/r/68266/

The motivation to add this was the mount table read issue that came up
originally in MESOS-8418 [1]. We only pushed a short-term fix for container
metrics requests, and the performance of recovering / launching containers
is still affected by it until MESOS-9081 [2] is fixed.

The benchmark launches N containers, then destroys them and waits for them
to terminate. Running it against a 24 core (48 hyperthreaded core) machine
for 1000 containers showed roughly a minute for getting 1000 containers
launched and a similar amount of time to get them destroyed. We haven't
spent time optimizing this so there should be room for improvement.
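
For anyone who wants to picture the shape of the measurement, here is a
minimal sketch of a launch-then-destroy timing harness. It is not the code
in the review: the names runBenchmark, BenchmarkResult and averageMillis are
hypothetical, and the launch/destroy callbacks are assumed to block until
the container has actually started or terminated.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical result type: wall-clock time spent in each phase.
struct BenchmarkResult {
  std::chrono::duration<double> launch;   // seconds to launch all containers
  std::chrono::duration<double> destroy;  // seconds to destroy all containers
};

// Hypothetical harness: launches n containers, then destroys them, timing
// each phase separately. The callbacks stand in for the real containerizer
// calls and are assumed to return only once the operation has completed.
BenchmarkResult runBenchmark(
    std::size_t n,
    const std::function<void(std::size_t)>& launch,
    const std::function<void(std::size_t)>& destroy) {
  using Clock = std::chrono::steady_clock;

  const Clock::time_point start = Clock::now();
  for (std::size_t i = 0; i < n; ++i) {
    launch(i);
  }
  const Clock::time_point launched = Clock::now();

  for (std::size_t i = 0; i < n; ++i) {
    destroy(i);
  }
  const Clock::time_point destroyed = Clock::now();

  return BenchmarkResult{launched - start, destroyed - launched};
}

// Average time per container in milliseconds for one phase.
double averageMillis(std::chrono::duration<double> total, std::size_t n) {
  assert(n > 0);
  return total.count() * 1000.0 / static_cast<double>(n);
}
```

With the rough numbers above (about 60 seconds for 1000 containers),
averageMillis works out to about 60 ms per container for each phase.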

It will be interesting to extend this to compare apples to apples with the
numbers in:

https://kubernetes.io/blog/2018/05/24/kubernetes-containerd-integration-goes-ga/#pod-startup-latency

Please share feedback on the review if you have any thoughts!


(2) I gave an update on other performance work:

  (a) State serving improvements: The parallel serving for the /state
endpoint is committed and will land in 1.7.0. There's still more to do here
to improve performance by avoiding an extra trip through the master's queue
during authorization [3], and to migrate all reads over to parallel
serving.

  (b) There were some authentication scalability issues discovered during a
scale test. Fixes are underway. [4] [5] [6] [7]

  (c) There are still some ongoing performance improvements to the
allocator. Recently, range subtraction for ports resources was improved
[8]. The main focus right now is additional copy elimination of Resources,
and the approach that we're attempting is to make Resources copy-on-write
[9]. Regardless, 1.7.0 already includes some significant improvements
to allocation cycle performance and I'll be sure to cover them in the blog
post.
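
To illustrate what copy-on-write means here, below is a minimal sketch of
the general technique. It is not the approach taken in [9]: Item and
CowItems are hypothetical stand-ins (not the Mesos Resource / Resources
types), and the sketch assumes single-threaded use, since checking
use_count() is not a safe detach test when copies are shared across threads.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a single resource entry.
struct Item {
  std::string name;
  double amount;
};

// Hypothetical copy-on-write container. Copying a CowItems only copies a
// shared_ptr; the underlying vector is duplicated lazily, on first write.
class CowItems {
public:
  CowItems() : data_(std::make_shared<std::vector<Item>>()) {}

  std::size_t size() const { return data_->size(); }

  // Reads never trigger a copy.
  const Item& at(std::size_t i) const {
    assert(i < data_->size());
    return (*data_)[i];
  }

  // Writes detach from any other owners first.
  void add(Item item) {
    detach();
    data_->push_back(std::move(item));
  }

  void setAmount(std::size_t i, double amount) {
    assert(i < data_->size());
    detach();
    (*data_)[i].amount = amount;
  }

  // True if both objects still point at the same underlying storage.
  bool sharesStorageWith(const CowItems& other) const {
    return data_ == other.data_;
  }

private:
  // Make a private copy of the storage if it is shared with another object.
  void detach() {
    if (data_.use_count() > 1) {
      data_ = std::make_shared<std::vector<Item>>(*data_);
    }
  }

  std::shared_ptr<std::vector<Item>> data_;
};
```

The point is that code which copies the object but only reads from it
never pays for a deep copy; the cost is deferred to the first mutation.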


(3) Lastly, we had a very brief discussion about jemalloc. James agreed to
share on the existing thread some more details about potential
compatibility issues if we were to link our libraries against it.


Agenda Doc:
https://docs.google.com/document/d/12hWGuzbqyNWc2l1ysbPcXwc0pzHEy4bodagrlNGCuQU


[1] https://issues.apache.org/jira/browse/MESOS-8418
[2] https://issues.apache.org/jira/browse/MESOS-9081
[3] https://issues.apache.org/jira/browse/MESOS-9082
[4] https://issues.apache.org/jira/browse/MESOS-9144
[5] https://issues.apache.org/jira/browse/MESOS-9145
[6] https://issues.apache.org/jira/browse/MESOS-9146
[7] https://issues.apache.org/jira/browse/MESOS-9147
[8] https://issues.apache.org/jira/browse/MESOS-9086
[9] https://issues.apache.org/jira/browse/MESOS-6765