Posted to user@flink.apache.org by Terry Chia-Wei Wu <te...@gmail.com> on 2020/07/31 02:40:40 UTC

is it possible one task manager stuck and still fetching data from Kinesis?

We are running Flink 1.10 with 900+ task managers and Kinesis as the input
stream. The problem we are seeing is that only the max age of the Kinesis
shards keeps growing, while the average age stays very low, meaning most
shards have a very low age. We already checked for data skew, but the data is
quite uniformly distributed. Any idea how this can happen and how to debug
it? I'm wondering whether it is possible for one TM's operator to be stuck
while the source keeps fetching data, so that the Kinesis age keeps growing.

Terry

Re: is it possible one task manager stuck and still fetching data from Kinesis?

Posted by Till Rohrmann <tr...@apache.org>.
Hi Terry,

I am not a Kinesis expert, which is why I've pulled in Thomas and Max, who
might know more about Flink's Kinesis behaviour. What could help, though,
would be access to the Flink cluster logs to see whether something fishy is
going on.

Cheers,
Till
