Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2018/09/14 14:50:31 UTC

[GitHub] pnowojski opened a new pull request #6698: [FLINK-8581][network] Move flushing remote subpartitions from OutputFlusher to netty

pnowojski opened a new pull request #6698: [FLINK-8581][network] Move flushing remote subpartitions from OutputFlusher to netty
URL: https://github.com/apache/flink/pull/6698
 
 
   The first commit comes from https://github.com/apache/flink/pull/6697
   
   This solves GC issues in low-latency setups (small flushTimeout) with many output channels, and it generally improves low-latency performance significantly.
       
   OutputFlusher remains for now, to trigger flushes for local subpartitions.
       
   Registering periodic flushes in netty is unfortunately not the most beautiful thing in the world at the moment. It is complicated by two things:
       1. we know about flushTimeout only in flink-streaming-java and StreamTask, which is long after the point where we actually create the subpartitions
       2. we do not know beforehand which subpartitions will be local and which will be remote
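
To illustrate the two complications above, here is a minimal sketch of deferred registration: a periodic flush is scheduled only once both the flush timeout and the locality of a subpartition are known. It is not the code in this pull request. `PeriodicFlushSketch`, `Subpartition`, `setFlushTimeout`, `setRemote` and `demo` are hypothetical names, and a plain `ScheduledExecutorService` is assumed as a stand-in for the netty event loop.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; these are not Flink's actual classes. */
public class PeriodicFlushSketch {

    /** Stand-in for a subpartition that learns its timeout and locality late. */
    static final class Subpartition {
        private final AtomicInteger flushCount = new AtomicInteger();
        private long flushTimeoutMs = -1;          // unknown at creation time
        private Boolean remote;                    // null until locality is known
        private ScheduledFuture<?> periodicFlush;  // set once registration happens

        void flush() {
            flushCount.incrementAndGet();
        }

        /** Called late, once the configured timeout is known. */
        synchronized void setFlushTimeout(long timeoutMs, ScheduledExecutorService loop) {
            this.flushTimeoutMs = timeoutMs;
            maybeRegister(loop);
        }

        /** Called once it is known whether the consumer is local or remote. */
        synchronized void setRemote(boolean isRemote, ScheduledExecutorService loop) {
            this.remote = isRemote;
            maybeRegister(loop);
        }

        /** Registers at most once, and only for remote subpartitions with a known timeout. */
        private void maybeRegister(ScheduledExecutorService loop) {
            if (periodicFlush == null && flushTimeoutMs > 0 && Boolean.TRUE.equals(remote)) {
                periodicFlush = loop.scheduleAtFixedRate(
                        this::flush, flushTimeoutMs, flushTimeoutMs, TimeUnit.MILLISECONDS);
            }
        }

        synchronized boolean isRegistered() {
            return periodicFlush != null;
        }
    }

    /** Walks through both arrival orders and reports the registration state. */
    static String demo() {
        // Assumption: a single-threaded scheduler plays the role of the event loop.
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        try {
            Subpartition remote = new Subpartition();
            remote.setFlushTimeout(5, loop);          // timeout known, locality not yet
            boolean pending = remote.isRegistered();
            remote.setRemote(true, loop);             // now both are known

            Subpartition local = new Subpartition();
            local.setRemote(false, loop);             // locality known first
            local.setFlushTimeout(5, loop);           // local: never registered here

            return "pending=" + pending
                    + " remote=" + remote.isRegistered()
                    + " local=" + local.isRegistered();
        } finally {
            loop.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the order in which the two pieces of information arrive does not matter, and local subpartitions are left to a separate flusher, which mirrors the role OutputFlusher keeps in this change.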
   
   ![Benchmark results](https://docs.google.com/spreadsheets/d/e/2PACX-1vQ4ImkIhEVyd0JuC0_KBzSiZk1ugqRYYJ29fftj8f7bvQHsyNTrS9PBS2g7YaI6q7kfyHXpWWsnb5lq/pubchart?oid=1194867281&format=image)
   
   Average throughput is significantly higher only for extreme cases. However, the most important improvement here is solving (mitigating?) the current GC issues, which is visible on the "min" graph. Without this change, 1ms latency with 1000+ output channels suffers from frequent, very long GC pauses.
   
   ## Verifying this change
   
   This change is covered by existing network stack tests, stress tests, and almost all IT cases.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (**yes** / no / don't know)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services