Posted to user@flink.apache.org by JING ZHANG <be...@gmail.com> on 2021/09/01 02:42:03 UTC

Re: Flink performance with multiple operators reshuffling data

Hi Jason,
> In our case, our input/output ratio of these Flink operators are all 1 to
> 1, so I guess it doesn't matter that much..
Yes. With a 1-to-1 ratio no operator reduces the data volume, so their order
matters little.
> But I think the keys we are using in general are pretty uniform.
Cool. You could run the job for a while and check whether any data skew
shows up. If it does, you can then decide how to address it.
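
To make the skew check and the key-salting idea from the quoted message more
concrete, here is a minimal sketch in plain Python. It does not use any Flink
API. The names subtask_for, skew_ratio and salt are made up for illustration,
the CRC32-modulo assignment is only a stand-in for how Flink maps keys to
subtasks, and the workload (one hot key plus 1000 uniform keys) is
hypothetical.

```python
import random
import zlib
from collections import Counter


def subtask_for(key: str, parallelism: int) -> int:
    # Assumption: a stable hash modulo parallelism stands in for Flink's
    # real key-to-subtask assignment, which works differently in detail.
    return zlib.crc32(key.encode("utf-8")) % parallelism


def skew_ratio(keys, parallelism: int) -> float:
    # Busiest subtask's record count divided by the mean count per subtask.
    # A value of 1.0 means the load is perfectly even.
    load = Counter(subtask_for(k, parallelism) for k in keys)
    mean = len(keys) / parallelism
    return max(load.values()) / mean


def salt(key: str, buckets: int, rng: random.Random) -> str:
    # Append a random bucket number so one hot key becomes several keys.
    return f"{key}#{rng.randrange(buckets)}"


rng = random.Random(42)

# Hypothetical workload: one hot key with 9000 records, 1000 uniform keys.
keys = ["hot"] * 9000 + [f"user-{i}" for i in range(1000)]

before = skew_ratio(keys, parallelism=8)
after = skew_ratio([salt(k, 16, rng) for k in keys], parallelism=8)

print(f"skew before salting: {before:.2f}")
print(f"skew after salting:  {after:.2f}")
```

If the ratio stays close to 1 on real traffic, salting is probably not worth
the extra complexity. If you do salt keys for an aggregation, the partial
results per salted key still have to be merged per original key in a second
step.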

Best,
JING ZHANG

Jason Liu <ja...@ucla.edu> wrote on Tue, Aug 31, 2021 at 4:23 PM:

> Thanks for the help guys!
>
> Yeah, we can potentially append random strings to the keys and duplicate
> data across them to avoid skew, if necessary. But I think the keys we
> are using in general are pretty uniform.
> The "lowest selectivity up front" method is really interesting
> though. In our case, our input/output ratio of these Flink operators are all
> 1 to 1, so I guess it doesn't matter that much..?
> It's good to know Flink would be scalable in this situation.
>
> -Jason