Posted to issues@flink.apache.org by tison1 <gi...@git.apache.org> on 2018/07/17 04:12:23 UTC

[GitHub] flink pull request #6345: [FLINK-9869] Send PartitionInfo in batch to Improv...

GitHub user tison1 opened a pull request:

    https://github.com/apache/flink/pull/6345

    [FLINK-9869] Send PartitionInfo in batch to Improve performance

    ## What is the purpose of the change
    
    Currently we send each partition info as soon as it arrives. We could instead `cachePartitionInfo` and then `sendPartitionInfoAsync`, which would improve performance.
    
    This change also improves task deployment.
    
    ## Brief change log
    
    - `Execution`
      - now deploys tasks in another thread
      - as described above, now first calls `cachePartitionInfo` and then `sendPartitionInfoAsync`
    - add a config option `JobManagerOptions#UPDATE_PARTITION_INFO_SEND_INTERVAL`, which configures the time window for `cachePartitionInfo`
    - update `ExecutionGraphDeploymentTest` and `ExecutionVertexDeploymentTest`, which also test the changes above
    
    ## Verifying this change
    
    This change is already covered by existing tests, such as `ExecutionGraphDeploymentTest` and `ExecutionVertexDeploymentTest`
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (don't know)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
      - The S3 file system connector: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no, it's internal)
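
The batching described in the pull request above (first `cachePartitionInfo`, then `sendPartitionInfoAsync` once a time window elapses) could be sketched roughly as follows. This is only an illustration under assumptions, not the code from the PR: the class `PartitionInfoBatcher`, its constructor parameters, and the `Consumer`-based sender are hypothetical, and only the two method names are borrowed from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical sketch of time-window batching; NOT Flink's Execution class.
 * Items are cached and handed to the sender as one batch per window.
 */
final class PartitionInfoBatcher<T> {
    private final Object lock = new Object();
    private final List<T> cache = new ArrayList<>();
    private final ScheduledExecutorService scheduler; // assumed to be supplied by the caller
    private final long sendIntervalMillis;            // assumed stand-in for the configured window
    private final Consumer<List<T>> sender;           // assumed stand-in for the actual RPC call
    private boolean flushScheduled = false;

    PartitionInfoBatcher(ScheduledExecutorService scheduler,
                         long sendIntervalMillis,
                         Consumer<List<T>> sender) {
        this.scheduler = scheduler;
        this.sendIntervalMillis = sendIntervalMillis;
        this.sender = sender;
    }

    /** Caches one item; the first item of a window schedules a single flush. */
    void cachePartitionInfo(T info) {
        synchronized (lock) {
            cache.add(info);
            if (!flushScheduled) {
                flushScheduled = true;
                scheduler.schedule(this::sendPartitionInfoAsync,
                        sendIntervalMillis, TimeUnit.MILLISECONDS);
            }
        }
    }

    /** Drains the cache and sends everything collected so far as one batch. */
    void sendPartitionInfoAsync() {
        List<T> batch;
        synchronized (lock) {
            batch = new ArrayList<>(cache);
            cache.clear();
            flushScheduled = false;
        }
        // Call the sender outside the lock so slow sends do not block producers.
        if (!batch.isEmpty()) {
            sender.accept(batch);
        }
    }
}
```

In this sketch the window length is a plain constructor argument; in the PR it is said to come from `JobManagerOptions#UPDATE_PARTITION_INFO_SEND_INTERVAL`. Items that arrive within one window are delivered together, so the sender is invoked once per window rather than once per item.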


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tison1/flink partition-improve

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/6345.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6345
    
----
commit ca9ffbb99e91a8415d7469cba4bf2075615edc0d
Author: 陈梓立 <wa...@...>
Date:   2018-07-17T04:11:36Z

    [FLINK-9869] Send PartitionInfo in batch to Improve performance

----


---

[GitHub] flink issue #6345: [FLINK-9869] Send PartitionInfo in batch to Improve perfo...

Posted by tison1 <gi...@git.apache.org>.
Github user tison1 commented on the issue:

    https://github.com/apache/flink/pull/6345
  
    cc @sihuazhou 


---

[GitHub] flink issue #6345: [FLINK-9869] Send PartitionInfo in batch to Improve perfo...

Posted by tison1 <gi...@git.apache.org>.
Github user tison1 commented on the issue:

    https://github.com/apache/flink/pull/6345
  
    OK. This PR is about performance improvement. I will try to provide a benchmark, but since the change was inspired by our own batch table tasks, it might take some time. Still, since this PR sends partition info concurrently and deploys tasks in another thread, it should theoretically help.
    
    Keep it up on Flink 1.6! I will still nudge you guys to review this one, though (laughs)


---

[GitHub] flink issue #6345: [FLINK-9869] Send PartitionInfo in batch to Improve perfo...

Posted by tison1 <gi...@git.apache.org>.
Github user tison1 commented on the issue:

    https://github.com/apache/flink/pull/6345
  
    cc @tillrohrmann @fhueske 


---

[GitHub] flink issue #6345: [FLINK-9869] Send PartitionInfo in batch to Improve perfo...

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the issue:

    https://github.com/apache/flink/pull/6345
  
    Thanks for opening this PR @tison1. The Flink community is currently preparing the Flink 1.6 release and, thus, it could take a bit longer until someone reviews your PR. Please bear with us until then. Thanks a lot!


---