Posted to reviews@mesos.apache.org by Anindya Sinha <an...@apple.com> on 2017/05/16 06:24:13 UTC

Re: Review Request 53841: Added metrics for sorting of the sorters in the allocator.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53841/
-----------------------------------------------------------

(Updated May 16, 2017, 6:24 a.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
-------

Rebased and minor edits.


Bugs: MESOS-6579
    https://issues.apache.org/jira/browse/MESOS-6579


Repository: mesos


Description
-------

The following metrics have been added:
a) allocator/mesos/roles/sort_runs: Number of role-level sorts.
b) allocator/mesos/roles/sort_run: Latency of role-level sorts.
c) allocator/mesos/quotas/sort_runs: Number of quota-level sorts.
d) allocator/mesos/quotas/sort_run: Latency of quota-level sorts.
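
As a rough illustration of what a run counter and a latency measurement around a sort could look like, here is a minimal, self-contained sketch. It is not the code in this patch: `SortMetrics`, `LazySorter`, and `update` are hypothetical names, the sketch uses `std::chrono` rather than the Mesos/libprocess metrics classes, and it assumes the sort only does real work when a `dirty` flag has been set.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the two values tracked per sorter:
// a run counter (like `sort_runs`) and accumulated latency (like `sort_run`).
struct SortMetrics
{
  uint64_t runs = 0;
  std::chrono::steady_clock::duration total{0};
};

// Hypothetical, simplified sorter: clients are ordered by ascending share,
// and the ordering is only recomputed when the `dirty` flag is set.
class LazySorter
{
public:
  using Entry = std::pair<std::string, double>;

  // Adds a client or changes its share; either way the order may be stale.
  void update(const std::string& client, double share)
  {
    assert(share >= 0.0);
    dirty = true;
    for (Entry& entry : clients) {
      if (entry.first == client) {
        entry.second = share;
        return;
      }
    }
    clients.emplace_back(client, share);
  }

  std::vector<std::string> sort()
  {
    if (dirty) {
      const auto start = std::chrono::steady_clock::now();

      std::stable_sort(
          clients.begin(),
          clients.end(),
          [](const Entry& a, const Entry& b) { return a.second < b.second; });

      // Only sorts that actually ran are counted and timed.
      metrics.total += std::chrono::steady_clock::now() - start;
      ++metrics.runs;
      dirty = false;
    }

    std::vector<std::string> result;
    for (const Entry& entry : clients) {
      result.push_back(entry.first);
    }
    return result;
  }

  SortMetrics metrics;

private:
  std::vector<Entry> clients;
  bool dirty = false;
};
```

In this sketch a call to `sort()` on an unchanged sorter returns the cached order and touches neither value, so the counter reflects how often the order was really recomputed.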


Diffs (updated)
-----

  src/master/allocator/mesos/hierarchical.hpp 123f97cf495bff0f822838e09df0d88818f04da6 
  src/master/allocator/sorter/drf/metrics.hpp 61568cb520826ab59d675824b212e0d3deb63764 
  src/master/allocator/sorter/drf/metrics.cpp ff63fbac5bbcf54e1ae39c3b650c0dafe7ea46d4 
  src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
  src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
  src/tests/hierarchical_allocator_tests.cpp 08180b9975869de328f0c095dd3cddf0c84fbecf 


Diff: https://reviews.apache.org/r/53841/diff/4/

Changes: https://reviews.apache.org/r/53841/diff/3-4/


Testing
-------

All tests passed.


Thanks,

Anindya Sinha


Re: Review Request 53841: Added metrics for sorting in the role and quota sorters.

Posted by Anindya Sinha <an...@apple.com>.

> On May 26, 2017, 4:56 p.m., Jiang Yan Xu wrote:
> > For this and the next review, could you summarize how these metrics can be used to reason about the allocator/sorter's performance?
> > 
> > I agree that conceptually we'd like something that tells us how well the (dirty or overall) sort performs but it's not immediately clear how to derive that from the provided metrics because `sort` on each sorter is called many times during one allocation, multiple times per agent. The three sorters are of the same implementation, how to interpret the metrics from each? The amount of work split between each sorter seems to be pretty dynamic?
> > 
> > Also given the frequency that the timer (relatively expensive) is invoked, how much overhead would it cost the sort()? This is probably worth measuring if we add these.

The latency in role (and quota) level sorts indicates how much time in the allocation cycle is spent on sorting. This can be significant if:

i) A very large number of roles/quotas are being sorted; and
ii) Events that set the `dirty` flag to `true` occur a significant number of times, so that the sorts actually run a significant number of times (as indicated by the number of sorts reported in `allocator/mesos/roles/sort_runs` or `allocator/mesos/quotas/sort_runs`).

These stats by themselves are not indicative of an actual problem, but they should help in diagnosing such conditions.
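
One way to read the two values together is sketched below: divide the accumulated sort time by the number of sorts to get a mean cost per sort, and by the allocation time over the same window to get the share of the cycle spent sorting. The names `SortSummary` and `summarize` are hypothetical, and the sketch assumes that the total sort time and the total allocation time have already been sampled over the same interval; it says nothing about how the metrics endpoint reports them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of one sampling window.
struct SortSummary
{
  double meanMsPerSort;      // Average latency of a single sort.
  double shareOfAllocation;  // Fraction of allocation time spent sorting.
};

// `sortRuns`: number of sorts in the window (a `sort_runs` delta).
// `totalSortMs`: time spent in those sorts, in milliseconds.
// `allocationMs`: time spent in allocation over the same window.
SortSummary summarize(uint64_t sortRuns, double totalSortMs, double allocationMs)
{
  assert(totalSortMs >= 0.0 && allocationMs >= 0.0);

  SortSummary summary{0.0, 0.0};

  // Guard against empty windows to avoid dividing by zero.
  if (sortRuns > 0) {
    summary.meanMsPerSort = totalSortMs / static_cast<double>(sortRuns);
  }
  if (allocationMs > 0.0) {
    summary.shareOfAllocation = totalSortMs / allocationMs;
  }

  return summary;
}
```

A high run count with a low mean cost points at frequent re-sorting, while a low run count with a high mean cost points at a large number of roles or quotas per sort.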


- Anindya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53841/#review176110
-----------------------------------------------------------


Re: Review Request 53841: Added metrics for sorting in the role and quota sorters.

Posted by Jiang Yan Xu <ya...@jxu.me>.

For this and the next review, could you summarize how these metrics can be used to reason about the allocator/sorter's performance?

I agree that conceptually we'd like something that tells us how well the (dirty or overall) sort performs but it's not immediately clear how to derive that from the provided metrics because `sort` on each sorter is called many times during one allocation, multiple times per agent. The three sorters are of the same implementation, how to interpret the metrics from each? The amount of work split between each sorter seems to be pretty dynamic?

Also given the frequency that the timer (relatively expensive) is invoked, how much overhead would it cost the sort()? This is probably worth measuring if we add these.
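
A measurement of that overhead could be set up roughly as follows. This is only a sketch under stated assumptions: it times `std::sort` on a plain vector with `std::chrono::steady_clock` instead of the real sorter and the real metrics timer, and `OverheadResult` and `measure` are hypothetical names. The difference between `instrumented` and `plain` approximates the cost of the per-call timing.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical result of one comparison run.
struct OverheadResult
{
  Clock::duration plain;         // Loop without per-call timing.
  Clock::duration instrumented;  // Same loop with per-call timing.
  Clock::duration accumulated;   // What the per-call timer recorded.
  int64_t checksum;              // Keeps the sorts from being optimized away.
};

OverheadResult measure(const std::vector<int>& input, int iterations)
{
  assert(iterations > 0);

  OverheadResult result{
      Clock::duration::zero(),
      Clock::duration::zero(),
      Clock::duration::zero(),
      0};

  // Baseline: sort a fresh copy on every iteration, no inner timer.
  Clock::time_point begin = Clock::now();
  for (int i = 0; i < iterations; ++i) {
    std::vector<int> copy = input;
    std::sort(copy.begin(), copy.end());
    result.checksum += copy.empty() ? 0 : copy.front();
  }
  result.plain = Clock::now() - begin;

  // Instrumented: the same work, with two clock reads around each sort.
  begin = Clock::now();
  for (int i = 0; i < iterations; ++i) {
    std::vector<int> copy = input;
    const Clock::time_point start = Clock::now();
    std::sort(copy.begin(), copy.end());
    result.accumulated += Clock::now() - start;
    result.checksum += copy.empty() ? 0 : copy.front();
  }
  result.instrumented = Clock::now() - begin;

  return result;
}
```

Results from a loop like this are noisy, so it would need many iterations and input sizes that resemble a realistic number of clients before drawing conclusions.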

- Jiang Yan Xu


Re: Review Request 53841: Added metrics for sorting in the role and quota sorters.

Posted by Anindya Sinha <an...@apple.com>.

(Updated May 24, 2017, 4:23 a.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
-------

Rebased after the 1st patch in the chain was pushed.


Bugs: MESOS-6579
    https://issues.apache.org/jira/browse/MESOS-6579


Repository: mesos


Description
-------

The following metrics have been added:
a) allocator/mesos/roles/sort_runs: Number of role-level sorts.
b) allocator/mesos/roles/sort_run: Latency of role-level sorts.
c) allocator/mesos/quotas/sort_runs: Number of quota-level sorts.
d) allocator/mesos/quotas/sort_run: Latency of quota-level sorts.


Diffs (updated)
-----

  docs/monitoring.md cb2833642e7e41c03c98ea92f7300d156a216a2e 
  src/master/allocator/mesos/hierarchical.hpp 123f97cf495bff0f822838e09df0d88818f04da6 
  src/master/allocator/sorter/drf/metrics.hpp 61568cb520826ab59d675824b212e0d3deb63764 
  src/master/allocator/sorter/drf/metrics.cpp ff63fbac5bbcf54e1ae39c3b650c0dafe7ea46d4 
  src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
  src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
  src/tests/hierarchical_allocator_tests.cpp f911110068a50c822aa90b864329ae87c9b5f8bb 


Diff: https://reviews.apache.org/r/53841/diff/7/

Changes: https://reviews.apache.org/r/53841/diff/6-7/


Testing
-------

All tests passed.


Thanks,

Anindya Sinha


Re: Review Request 53841: Added metrics for sorting in the role and quota sorters.

Posted by Anindya Sinha <an...@apple.com>.

(Updated May 23, 2017, 4:30 p.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
-------

Rebase.


Bugs: MESOS-6579
    https://issues.apache.org/jira/browse/MESOS-6579


Repository: mesos


Description
-------

The following metrics have been added:
a) allocator/mesos/roles/sort_runs: Number of role-level sorts.
b) allocator/mesos/roles/sort_run: Latency of role-level sorts.
c) allocator/mesos/quotas/sort_runs: Number of quota-level sorts.
d) allocator/mesos/quotas/sort_run: Latency of quota-level sorts.


Diffs (updated)
-----

  docs/monitoring.md a027f4905a0e6e41ff4e1348d34fd7aa5f1cbe61 
  src/master/allocator/mesos/hierarchical.hpp 123f97cf495bff0f822838e09df0d88818f04da6 
  src/master/allocator/sorter/drf/metrics.hpp 61568cb520826ab59d675824b212e0d3deb63764 
  src/master/allocator/sorter/drf/metrics.cpp ff63fbac5bbcf54e1ae39c3b650c0dafe7ea46d4 
  src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
  src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
  src/tests/hierarchical_allocator_tests.cpp 6dee2296d5a14185dbf7eee17968b20148839bfd 


Diff: https://reviews.apache.org/r/53841/diff/6/

Changes: https://reviews.apache.org/r/53841/diff/5-6/


Testing
-------

All tests passed.


Thanks,

Anindya Sinha


Re: Review Request 53841: Added metrics for sorting in the role and quota sorters.

Posted by Anindya Sinha <an...@apple.com>.

(Updated May 16, 2017, 7:22 a.m.)


Review request for mesos, James Peach and Jiang Yan Xu.


Changes
-------

Updated docs.


Summary (updated)
-----------------

Added metrics for sorting in the role and quota sorters.


Bugs: MESOS-6579
    https://issues.apache.org/jira/browse/MESOS-6579


Repository: mesos


Description
-------

The following metrics have been added:
a) allocator/mesos/roles/sort_runs: Number of role-level sorts.
b) allocator/mesos/roles/sort_run: Latency of role-level sorts.
c) allocator/mesos/quotas/sort_runs: Number of quota-level sorts.
d) allocator/mesos/quotas/sort_run: Latency of quota-level sorts.


Diffs (updated)
-----

  docs/monitoring.md c42afce0a26dc15e033ee1375b7cb23d55be40ab 
  src/master/allocator/mesos/hierarchical.hpp 123f97cf495bff0f822838e09df0d88818f04da6 
  src/master/allocator/sorter/drf/metrics.hpp 61568cb520826ab59d675824b212e0d3deb63764 
  src/master/allocator/sorter/drf/metrics.cpp ff63fbac5bbcf54e1ae39c3b650c0dafe7ea46d4 
  src/master/allocator/sorter/drf/sorter.hpp fee58d6d1f08163e2a06a4a20c891fe535c3dcff 
  src/master/allocator/sorter/drf/sorter.cpp 26b77f578f3235a8792c72d4575d607cdb2c7de7 
  src/tests/hierarchical_allocator_tests.cpp 08180b9975869de328f0c095dd3cddf0c84fbecf 


Diff: https://reviews.apache.org/r/53841/diff/5/

Changes: https://reviews.apache.org/r/53841/diff/4-5/


Testing
-------

All tests passed.


Thanks,

Anindya Sinha