
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/10/26 08:50:42 UTC

[GitHub] [flink] StephanEwen commented on pull request #13595: [FLINK-19582][network] Introduce sort-merge based blocking shuffle to Flink

StephanEwen commented on pull request #13595:
URL: https://github.com/apache/flink/pull/13595#issuecomment-716404546

@wsry

I am a bit confused about the use of max buffers. That value is the upper limit on the number of buffers that will be assigned to the shuffle from the global pool. It is only used for streaming pipelined connections, to limit the amount of in-flight buffers (so checkpoints don't take too long). For batch it should not be used, because there is no harm in using as much memory as possible.

The min buffers value is actually what decouples the memory use from the parallelism. If the min-buffers is tied to the number of subpartitions, then we still have the problem that shuffles fail on large parallelism (when the parallelism is higher than the number of available memory buffers).

So, in conclusion, I think we should not have a max value (because it does not help in decoupling from parallelism) and also need to decouple the min value from the parallelism.
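
To make the failure mode concrete, here is a minimal sketch of the two ways the min value could be derived. All names are hypothetical and not part of Flink's API, and "one buffer per subpartition" is only an illustrative assumption about how a coupled minimum might scale.

```java
// Hypothetical illustration only; these names are not Flink APIs.
final class BufferDemand {

    // Coupled minimum: assumed here to be one buffer per subpartition,
    // so the requirement grows with the parallelism of the consumer.
    static int minBuffersCoupled(int numSubpartitions) {
        return numSubpartitions;
    }

    // Decoupled minimum: a fixed value that ignores the parallelism.
    static int minBuffersDecoupled(int fixedMin) {
        return fixedMin;
    }

    // The shuffle can only start if its minimum fits into the pool.
    static boolean canStart(int minBuffers, int availableBuffers) {
        return minBuffers <= availableBuffers;
    }
}
```

With an assumed pool of 2048 buffers, `canStart(minBuffersCoupled(10000), 2048)` is false, while `canStart(minBuffersDecoupled(512), 2048)` is true at any parallelism.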

I also liked the idea of having the `taskmanager.network.sort-shuffle.min-parallelism` as a flag. That way low parallelisms (< 50) could use the hash shuffle and larger parallelisms could use the sort shuffle.
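
Roughly, the selection could look like the sketch below. The class, enum, and method names are hypothetical, and treating the threshold as inclusive is an assumption; only the option name `taskmanager.network.sort-shuffle.min-parallelism` comes from the discussion.

```java
// Hypothetical illustration only; these names are not Flink APIs.
final class ShuffleSelector {

    enum ShuffleType { HASH, SORT_MERGE }

    // minParallelism stands in for the value of
    // taskmanager.network.sort-shuffle.min-parallelism.
    // Assumption: a parallelism equal to the threshold uses sort-merge.
    static ShuffleType choose(int parallelism, int minParallelism) {
        return parallelism >= minParallelism
                ? ShuffleType.SORT_MERGE
                : ShuffleType.HASH;
    }
}
```

With a threshold of 50, a parallelism of 10 would pick the hash shuffle and a parallelism of 1000 would pick the sort shuffle.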
