Posted to user@flink.apache.org by Steven Wu <st...@gmail.com> on 2018/11/21 21:50:11 UTC

backpressure metrics

Flink has two backpressure-related metrics: “lastCheckpointAlignmentBuffered”
and “checkpointAlignmentTime”, but they seem to always report zero.
Similarly, in the web UI, “Buffered During Alignment” always shows zero,
even though backpressure testing shows high backpressure for some operators.
Has anyone else seen a similar problem?

We are running Flink 1.4.0 with some cherry-picked fixes. There was a bug
(since fixed) in 1.5 and above, which shouldn't affect us:
https://issues.apache.org/jira/browse/FLINK-10135

Thanks,
Steven

Re: backpressure metrics

Posted by Steven Wu <st...@gmail.com>.
Nagarjun, thanks a lot for the reply, which makes sense to me. Yes, we are
running in AT_LEAST_ONCE mode.


Re: backpressure metrics

Posted by Nagarjun Guraja <na...@gmail.com>.
Hi Steven,

The 'Buffered During Alignment' metric you are talking about will always be
zero when the job runs in AT_LEAST_ONCE mode, since no records are buffered
for barrier alignment in that mode. Is that the case with your job? My
understanding is that backpressure can only be monitored on demand, by
sampling thread stack traces and interpreting them in terms of contention
for network buffers.
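
To make the sampling idea concrete, here is a minimal sketch in plain Java
(not Flink's actual code). It assumes the sampler counts how often a task
thread is found inside LocalBufferPool.requestBufferBlocking, and that the
ratio is bucketed as OK / LOW / HIGH at 0.10 and 0.50, which is how I recall
the monitoring docs describing it. The class BackPressureSketch and its
method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BackPressureSketch {

    // Assumption: the frame the sampler looks for when a task waits for a network buffer.
    static final String BLOCKING_CLASS =
            "org.apache.flink.runtime.io.network.buffer.LocalBufferPool";
    static final String BLOCKING_METHOD = "requestBufferBlocking";

    /** Takes numSamples stack traces of the given thread, pausing delayMillis after each one. */
    static List<StackTraceElement[]> sample(Thread task, int numSamples, long delayMillis)
            throws InterruptedException {
        List<StackTraceElement[]> samples = new ArrayList<>();
        for (int i = 0; i < numSamples; i++) {
            samples.add(task.getStackTrace());
            Thread.sleep(delayMillis);
        }
        return samples;
    }

    /** Fraction of samples in which the thread was inside the blocking buffer request. */
    static double blockedRatio(List<StackTraceElement[]> samples) {
        if (samples.isEmpty()) {
            return 0.0;
        }
        int blocked = 0;
        for (StackTraceElement[] trace : samples) {
            for (StackTraceElement frame : trace) {
                if (BLOCKING_CLASS.equals(frame.getClassName())
                        && BLOCKING_METHOD.equals(frame.getMethodName())) {
                    blocked++;
                    break; // count each sample at most once
                }
            }
        }
        return (double) blocked / samples.size();
    }

    /** Assumed thresholds: up to 0.10 is OK, up to 0.50 is LOW, above that is HIGH. */
    static String classify(double ratio) {
        if (ratio <= 0.10) {
            return "OK";
        }
        if (ratio <= 0.50) {
            return "LOW";
        }
        return "HIGH";
    }

    public static void main(String[] args) throws InterruptedException {
        // Samples the current thread, which never waits on a Flink buffer pool.
        List<StackTraceElement[]> samples = sample(Thread.currentThread(), 5, 10);
        double ratio = blockedRatio(samples);
        System.out.println(ratio + " -> " + classify(ratio));
    }
}
```

The point is that this figure comes from where the task threads spend their
time, not from checkpoint alignment, so it can be high while the alignment
metrics stay at zero.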

Regards,
Nagarjun

*Success is not final, failure is not fatal: it is the courage to continue
that counts. *
*- Winston Churchill - *

