Posted to user@flink.apache.org by Philipp Bussche <ph...@gmail.com> on 2017/09/17 13:17:48 UTC

Sink metric numRecordsIn drops temporarily

Hi there, I noticed some odd behaviour on my Grafana dashboard: sink-related
metrics show activity when there should not be any. For this particular sink,
activity is triggered by data being ingested through a cronjob at a fixed
time, yet the dashboard shows activity outside that time as well.
I had a closer look. My graph applies the NonNegativeDerivative function to
the metric (the data actually sits in Graphite). With this function disabled,
the graph shows that the numRecordsIn counter drops for a short period and
then returns to its previous value. With NonNegativeDerivative applied, that
drop and recovery shows up on the graph as data activity.
Why would the value of a counter temporarily decrease and then go back to
its previous level?
Please see the attached screenshots.

Thanks
Philipp

<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t576/Sink_numRecordsIn_NonNegativeDerivative_.png> 
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t576/Sink_numRecordsIn_Value_Drops.png> 



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Sink metric numRecordsIn drops temporarily

Posted by Michael Fong <mc...@gmail.com>.
Hi,

Thank you for the description. What I am trying to understand is what the
counter value is at the moment of the spikes. Grafana averages a run of
consecutive data values before rendering the result to the UI. That is, if
the metric value is not transmitted continuously and some data points come
out as zero, the average over time will be lower than the snapshot value. I
would suggest first checking the actual value by zooming in to the finest
resolution allowed by the data retention policy set in Graphite (per minute
or per second, depending on the settings).
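
The averaging effect described above can be illustrated with a small sketch.
It is a simplified model only: the function name consolidate_by_average and
the sample readings are hypothetical, and it is an assumption that the
dashboard in question consolidates data points this way.

```python
from typing import List


def consolidate_by_average(points: List[float], bucket_size: int) -> List[float]:
    """Average consecutive points in buckets of bucket_size (hypothetical helper)."""
    buckets = [points[i:i + bucket_size] for i in range(0, len(points), bucket_size)]
    return [sum(bucket) / len(bucket) for bucket in buckets]


# Hypothetical per-minute readings of a gauge; one reading was reported as zero.
raw = [100.0, 100.0, 0.0, 100.0]

# Viewed at full resolution, only the third point is off.
fine = consolidate_by_average(raw, bucket_size=1)

# Viewed zoomed out (four points per rendered value), the average is pulled down.
coarse = consolidate_by_average(raw, bucket_size=4)

print(fine)    # [100.0, 100.0, 0.0, 100.0]
print(coarse)  # [75.0]
```

Zooming in to the finest stored resolution corresponds to bucket_size=1 here,
which is why it shows the actual stored values instead of an average.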


I do not actually have a concrete answer for that counter in Flink. Perhaps
someone who knows the semantics of this metric better can comment. However,
there is one possibility that we have observed in other Java applications.
It usually happens with a fast-growing counter whose next value would
exceed the positive upper bound of its type. Normally, a metrics library
does not reset the value to 0. Taking the long data type as an example, if I
remember correctly, Long.MAX_VALUE + 1 = Long.MIN_VALUE. Therefore, taking
NonNegativeDerivative (the delta) results in a very high peak in the graph.
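
The wraparound mentioned above can be checked with a short sketch. Python
integers do not overflow, so the helper to_signed_64 (a hypothetical name)
emulates the two's complement wraparound of a 64-bit signed Java long. It
says nothing about how Flink itself stores the counter.

```python
LONG_MAX = 2**63 - 1   # same value as Java's Long.MAX_VALUE
LONG_MIN = -(2**63)    # same value as Java's Long.MIN_VALUE


def to_signed_64(n: int) -> int:
    """Wrap an arbitrary integer into the signed 64-bit range (two's complement)."""
    return (n + 2**63) % 2**64 - 2**63


# Incrementing a counter that sits at the upper bound wraps to the lower bound.
wrapped = to_signed_64(LONG_MAX + 1)

print(wrapped == LONG_MIN)  # True
```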

Hope this helps.

On Sun, Sep 17, 2017 at 11:02 PM, Philipp Bussche <philipp.bussche@gmail.com> wrote:

> Hi, thank you for your answer. On September 11th, which is the day shown in
> the screenshot, the counter was sitting at 26.91k and went down to 26.01k
> when the drop happened. This happened 3 times during that day, and each time
> the counter went back to the same value.
> Philipp

Re: Sink metric numRecordsIn drops temporarily

Posted by Philipp Bussche <ph...@gmail.com>.
Hi, thank you for your answer. On September 11th, which is the day shown in
the screenshot, the counter was sitting at 26.91k and went down to 26.01k
when the drop happened. This happened 3 times during that day, and each time
the counter went back to the same value.
Philipp
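
To illustrate why such a drop and recovery shows up as activity, here is a
minimal sketch of a non-negative derivative over the values reported above.
The function non_negative_derivative is a simplified model written for this
example, not Graphite's implementation, and the sample series is an
approximation built from the rounded 26.91k and 26.01k readings.

```python
from typing import List, Optional


def non_negative_derivative(values: List[Optional[int]]) -> List[Optional[int]]:
    """Per-step increase of a counter; decreases and gaps are reported as None."""
    deltas: List[Optional[int]] = [None]  # the first point has no predecessor
    for prev, cur in zip(values, values[1:]):
        if prev is None or cur is None or cur < prev:
            deltas.append(None)  # negative step is suppressed, not plotted
        else:
            deltas.append(cur - prev)
    return deltas


# Approximate readings: steady, temporary drop, recovery, steady.
samples = [26910, 26910, 26010, 26910, 26910]
result = non_negative_derivative(samples)

print(result)  # [None, 0, None, 900, 0]
```

The drop itself is hidden, but the recovery back to the previous value is a
positive step of about 900, which is what appears as apparent sink activity
even though no new records arrived.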

Re: Sink metric numRecordsIn drops temporarily

Posted by Michael Fong <mc...@gmail.com>.
I forgot to cc user@flink.apache.org.

Hi,
>
> Just wondering what the value of that counter is (without applying the
> NonNegativeDerivative function) when you observe the spikes? If I remember
> correctly, Grafana is known to aggregate values by averaging them over the
> selected time range before rendering them in the front end. The charts show
> values across multiple days; what values does the metric show at minute
> scale?
>
> Regards,
>
> Michael