Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/02/15 10:03:48 UTC

[GitHub] zhijiangW opened a new pull request #7713: [FLINK-10995][network] Copy intermediate serialization results only once for broadcast mode

URL: https://github.com/apache/flink/pull/7713
 
 
   ## What is the purpose of the change
   
   *The current channel selector emits a record either to a single channel or, in broadcast mode, to all channels. In broadcast mode, the intermediate serialization result is copied into a separate `BufferBuilder` requested for every subpartition, which seriously hurts performance, especially in large-scale jobs.*
   
   *Instead, we can copy the serialized record into a single target `BufferBuilder` and let all subpartitions share the corresponding `BufferConsumer`, which avoids the redundant copies. For jobs that mix broadcast and non-broadcast emits, the previous `BufferBuilder` must be finished before switching from broadcast to non-broadcast, and vice versa.*
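   
   The difference between the two strategies can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in this pull request: `SharedBuffer`, `broadcastPerChannel`, and `broadcastShared` are hypothetical names, and a plain integer stands in for the real buffer reference counting.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class BroadcastCopySketch {

    /** Hypothetical stand-in for a network buffer with a reference count. */
    static final class SharedBuffer {
        final ByteBuffer data;
        int refCount;

        SharedBuffer(int capacity) {
            this.data = ByteBuffer.allocate(capacity);
        }
    }

    /** Counts how many times the serialized bytes were copied into a buffer. */
    static int copies = 0;

    /** Copies the serialized record into a freshly allocated buffer. */
    static SharedBuffer copyOf(byte[] serialized) {
        SharedBuffer buffer = new SharedBuffer(serialized.length);
        buffer.data.put(serialized);
        copies++;
        return buffer;
    }

    /** Old behavior (as described above): one copy per subpartition. */
    static List<SharedBuffer> broadcastPerChannel(byte[] serialized, int numChannels) {
        List<SharedBuffer> perChannel = new ArrayList<>();
        for (int i = 0; i < numChannels; i++) {
            SharedBuffer buffer = copyOf(serialized);
            buffer.refCount = 1; // each copy has exactly one owner
            perChannel.add(buffer);
        }
        return perChannel;
    }

    /** Proposed behavior: copy once, then share the same buffer with every subpartition. */
    static List<SharedBuffer> broadcastShared(byte[] serialized, int numChannels) {
        SharedBuffer shared = copyOf(serialized);
        List<SharedBuffer> perChannel = new ArrayList<>();
        for (int i = 0; i < numChannels; i++) {
            shared.refCount++; // one reference per consuming subpartition
            perChannel.add(shared);
        }
        return perChannel;
    }

    public static void main(String[] args) {
        byte[] record = {1, 2, 3, 4};

        copies = 0;
        broadcastPerChannel(record, 4);
        System.out.println("per-channel copies: " + copies);

        copies = 0;
        List<SharedBuffer> shared = broadcastShared(record, 4);
        System.out.println("shared copies: " + copies
                + ", refCount: " + shared.get(0).refCount);
    }
}
```

   In this model the per-channel strategy copies the record once per subpartition, while the shared strategy copies it once and raises the reference count to the number of subpartitions, which is the kind of property a reference-count test can check.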
   
   ## Brief change log
   
     - *Adjust the related logic so that all channels share the same `BufferBuilder` in broadcast mode*
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
     - *Added one new test in `RecordWriterTest` to verify the reference count for broadcast mode*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
     - The serializers: (yes)
     - The runtime per-record code paths (performance sensitive): (yes)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (no)
     - If yes, how is the feature documented? (not applicable)
