
Posted to reviews@mesos.apache.org by Benjamin Mahler <bm...@apache.org> on 2017/08/03 22:08:36 UTC

Re: Review Request 61125: Track the total number of time series samples.

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61125/#review182175
-----------------------------------------------------------



I would imagine if people want to know this information they would use a counter, e.g.:

```
request_latency_ms/p50
request_latency_ms/p90
...
request_latency_ms/total = 1000

vs.

request_count = 1000
```

Here, getting the total number of requests by looking at the total number of latency samples seems rather indirect and a bit redundant with just keeping a separate counter. Do you have any examples that can motivate this?

- Benjamin Mahler


On July 25, 2017, 11:57 p.m., James Peach wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61125/
> -----------------------------------------------------------
> 
> (Updated July 25, 2017, 11:57 p.m.)
> 
> 
> Review request for mesos, Benjamin Mahler, Kevin Klues, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-6918
>     https://issues.apache.org/jira/browse/MESOS-6918
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The time series only keeps a window of data, so we can know
> the current number of samples, but we can't get a count of the
> total number of samples. Add a counter to the TimeSeries so
> that we can know the total number of samples seen in the series.
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/include/process/statistics.hpp e9f1fc23bf83f92a2e7de94dba0df48272cc3394 
>   3rdparty/libprocess/include/process/timeseries.hpp 64b10a8d551ba33e252aa33987e3d5da8d56a1d6 
>   3rdparty/libprocess/src/tests/statistics_tests.cpp 144b5109cfb7640b29bec8de8f5b2ad00665212f 
> 
> 
> Diff: https://reviews.apache.org/r/61125/diff/1/
> 
> 
> Testing
> -------
> 
> make check (Fedora 26)
> 
> 
> Thanks,
> 
> James Peach
> 
>

Re: Review Request 61125: Track the total number of time series samples.

Posted by Jiang Yan Xu <ya...@jxu.me>.


> On Aug. 3, 2017, 3:08 p.m., Benjamin Mahler wrote:
> > I would imagine if people want to know this information they would use a counter, e.g.:
> > 
> > ```
> > request_latency_ms/p50
> > request_latency_ms/p90
> > ...
> > request_latency_ms/total = 1000
> > 
> > vs.
> > 
> > request_count = 1000
> > ```
> > 
> > Here, getting the total number of requests by looking at the total number of latency samples seems rather indirect and a bit redundant with just keeping a separate counter. Do you have any examples that can motivate this?

IIUC it would only make sense together with the "sum" field, which is added in /r/61126/. Splitting the patches probably made this less clear than keeping them together in one patch would have.
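
As a rough illustration of why a cumulative sample count pairs naturally with a cumulative sum: given two snapshots of both values, the mean sample value over the interval between them falls out directly. This is only a sketch; the `Snapshot` struct, its field names and `intervalMean` are hypothetical and are not taken from /r/61126/ or from libprocess.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of two cumulative fields. The names are
// assumptions made for illustration only.
struct Snapshot
{
  std::uint64_t total; // Number of samples ever recorded.
  double sum;          // Sum of all sample values ever recorded.
};

// Mean sample value over the interval between two snapshots.
// Requires that at least one sample arrived in between.
inline double intervalMean(const Snapshot& earlier, const Snapshot& later)
{
  assert(later.total > earlier.total);
  return (later.sum - earlier.sum) /
         static_cast<double>(later.total - earlier.total);
}
```

Neither field alone yields this: a windowed count stops growing once the window is full, and a sum without a matching count cannot be normalized.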


- Jiang Yan



Re: Review Request 61125: Track the total number of time series samples.

Posted by James Peach <jp...@apache.org>.


> On Aug. 3, 2017, 10:08 p.m., Benjamin Mahler wrote:
> > I would imagine if people want to know this information they would use a counter, e.g.:
> > 
> > ```
> > request_latency_ms/p50
> > request_latency_ms/p90
> > ...
> > request_latency_ms/total = 1000
> > 
> > vs.
> > 
> > request_count = 1000
> > ```
> > 
> > Here, getting the total number of requests by looking at the total number of latency samples seems rather indirect and a bit redundant with just keeping a separate counter. Do you have any examples that can motivate this?
> 
> Jiang Yan Xu wrote:
>     IIUC it would only make sense together with the "sum" field, which is added in /r/61126/. Splitting the patches probably made this less clear than keeping them together in one patch would have.

I think that the current behaviour is not that useful, since you are basically publishing a saturating counter. When every cluster I look at has 1000 for the value of `allocator/mesos/allocation_run_ms/count`, that's not helping me at all. If you count all the events, then I can plot a graph that shows me a rate and I can use that to reason about the system.
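
To make the difference concrete, here is a minimal sketch of a windowed series that also keeps a running total. It is a simplified stand-in, not the libprocess `TimeSeries` API: the `WindowedSeries` class, the fixed-capacity eviction policy and the `ratePerSecond` helper are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a windowed time series; not the libprocess
// TimeSeries API.
class WindowedSeries
{
public:
  explicit WindowedSeries(std::size_t capacity) : capacity_(capacity)
  {
    assert(capacity > 0);
  }

  void add(double value)
  {
    if (window_.size() == capacity_) {
      window_.pop_front(); // Evict the oldest sample once the window is full.
    }
    window_.push_back(value);
    ++total_; // Never decremented, so it keeps growing after the window fills.
  }

  // Number of samples currently retained; saturates at the capacity.
  std::size_t count() const { return window_.size(); }

  // Number of samples ever added.
  std::uint64_t total() const { return total_; }

private:
  std::size_t capacity_;
  std::deque<double> window_;
  std::uint64_t total_ = 0;
};

// Events per second between two readings of a monotonic counter.
inline double ratePerSecond(
    std::uint64_t earlier, std::uint64_t later, double elapsedSeconds)
{
  assert(later >= earlier && elapsedSeconds > 0.0);
  return static_cast<double>(later - earlier) / elapsedSeconds;
}
```

With a capacity of 1000, `count()` reads 1000 on any busy cluster regardless of load, while two readings of `total()` taken a known interval apart give a rate that can be graphed.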


- James

