Posted to user@flink.apache.org by Fanbin Bu <fa...@coinbase.com> on 2020/03/30 23:43:44 UTC

some subtask taking too long

Hi,

I'm running Flink 1.9 on EMR, using Flink SQL with the Blink planner to read
from and write to JDBC input/output. My SQL is just a LISTAGG over a window
for the last 7 days. However, I notice that one or two subtasks take
too long to finish. This thread describes a similar issue:
http://mail-archives.apache.org/mod_mbox/flink-user/201901.mbox/%3CCAEv5b0yD+0WBXgAnfT0b=ZqLC8rPE9_izzE3g+9Vxw8oK9w2=A@mail.gmail.com%3E

Any idea on how to debug this?

Thanks
Fanbin

Re: some subtask taking too long

Posted by Piotr Nowojski <pi...@ververica.com>.
Hey,

The thread you are referring to is about a DataStream API job with a long checkpointing issue, whereas from your message it seems that you are using the Table API (SQL) to process batch data. Or what exactly do you mean by:

>  i notice that there are one or two subtasks that take too long to finish

Aside from that, could this simply be a data skew problem, where some subset of keys is used much more heavily than others?
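
One rough way to check for skew is to pull a sample of the grouping-key values from the source table and see how much of the sample the most frequent keys account for. The following is only a minimal sketch in Python: the function name, the sample data, and the idea of sampling keys outside of Flink are all assumptions for illustration, not something taken from the job in question.

```python
from collections import Counter


def key_skew_report(keys, top_n=3):
    """Return (key, count, share) tuples for the most frequent keys in a sample."""
    counts = Counter(keys)
    total = sum(counts.values())
    # most_common() yields nothing for an empty sample, so no division by zero.
    return [(k, c, c / total) for k, c in counts.most_common(top_n)]


# Hypothetical sample of grouping-key values, e.g. exported from the JDBC source.
sample = ["user_a"] * 80 + ["user_b"] * 15 + ["user_c"] * 5

for key, count, share in key_skew_report(sample):
    print(f"{key}: {count} rows ({share:.0%} of sample)")
```

If a handful of keys account for most of the rows, the subtasks that receive those keys will likely run much longer than the rest.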

Piotrek

> On 31 Mar 2020, at 01:43, Fanbin Bu <fa...@coinbase.com> wrote:
> 
> Hi,
> 
> I'm running Flink 1.9 on EMR, using Flink SQL with the Blink planner to read from and write to JDBC input/output. My SQL is just a LISTAGG over a window for the last 7 days. However, I notice that one or two subtasks take too long to finish. This thread describes a similar issue: http://mail-archives.apache.org/mod_mbox/flink-user/201901.mbox/%3CCAEv5b0yD+0WBXgAnfT0b=ZqLC8rPE9_izzE3g+9Vxw8oK9w2=A@mail.gmail.com%3E
> 
> Any idea on how to debug this?
> 
> Thanks
> Fanbin
>