Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/02/28 07:42:30 UTC

[GitHub] zhijiangW opened a new pull request #7856: [FLINK-11776][coordination] Refactor to simplify the process of scheduleOrUpdateConsumers

URL: https://github.com/apache/flink/pull/7856
 
 
   ## What is the purpose of the change
   
   *Based on the work in FLINK-11417, `Execution#scheduleOrUpdateConsumers` no longer needs to consider the race condition with the scheduling process, so we can refactor it to make it easier to handle.*
   
   - *The concurrent data structure for caching partial input channel descriptors can be replaced by a plain list.*
   - *If the consumer is in `CREATED` state, we only need to schedule it and the partition info would be known during deployment.*
   - *If the consumer is in `SCHEDULED` state, we do not need to do anything.*
   - *If the consumer is in `RUNNING` state, we can send partition info immediately.*
   - *If the consumer is in `DEPLOYING` state, we can cache the partition info and send it in a batch after the consumer switches to `RUNNING` state.*
   - *`PartialInputChannelDeploymentDescriptor` is no longer needed, because we can cache the partition info directly.*
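
The per-state handling listed above can be sketched roughly as follows. This is a simplified illustration, not the code in this pull request: the class `ConsumerUpdateSketch`, its `State` enum, the `String` partition info, and the methods `onPartitionAvailable`, `startDeploying`, and `switchToRunning` are all hypothetical names chosen for the example, and it assumes every call arrives on a single thread so a plain list is sufficient.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of one consumer execution; not Flink's actual classes. */
class ConsumerUpdateSketch {

    /** Only the four states discussed in the description are modeled. */
    enum State { CREATED, SCHEDULED, DEPLOYING, RUNNING }

    State state = State.CREATED;

    // Plain list instead of a concurrent queue: assumes single-threaded access.
    final List<String> cachedPartitionInfos = new ArrayList<>();

    // Stand-in for "partition info sent to the running consumer task".
    final List<String> sentPartitionInfos = new ArrayList<>();

    /** Called when a producer partition becomes available for this consumer. */
    void onPartitionAvailable(String partitionInfo) {
        switch (state) {
            case CREATED:
                // Only schedule; the partition info becomes known during deployment.
                state = State.SCHEDULED;
                break;
            case SCHEDULED:
                // Nothing to do.
                break;
            case DEPLOYING:
                // Cache now, send in a batch once the consumer is running.
                cachedPartitionInfos.add(partitionInfo);
                break;
            case RUNNING:
                // Send immediately.
                sentPartitionInfos.add(partitionInfo);
                break;
        }
    }

    /** Hypothetical transition used to reach the DEPLOYING state in this sketch. */
    void startDeploying() {
        state = State.DEPLOYING;
    }

    /** On switching to RUNNING, flush everything cached during deployment as one batch. */
    void switchToRunning() {
        state = State.RUNNING;
        sentPartitionInfos.addAll(cachedPartitionInfos);
        cachedPartitionInfos.clear();
    }
}
```

In this sketch the cached entries are raw partition infos rather than a wrapper descriptor, which mirrors the last bullet's point that an intermediate descriptor type is unnecessary.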
   
   This refactoring also prepares for the future introduction of *ShuffleMaster* in FLINK-11391.
   
   ## Brief change log
   
     - *Remove `PartialInputChannelDeploymentDescriptor`*
     - *Refactor the process of `Execution#scheduleOrUpdateConsumers`*
     - *Change the data structure for caching partition info from a concurrent queue to a list*
   
   ## Verifying this change
   
   This change is already covered by existing tests, such as *ScheduleOrUpdateConsumersTest*.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (yes / **no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / **no**)
     - The serializers: (yes / **no** / don't know)
     - The runtime per-record code paths (performance sensitive): (yes / **no** / don't know)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / **no** / don't know)
     - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes / **no**)
     - If yes, how is the feature documented? (**not applicable** / docs / JavaDocs / not documented)
