Posted to issues@flink.apache.org by tzulitai <gi...@git.apache.org> on 2016/06/08 11:14:29 UTC

[GitHub] flink pull request #2081: [FLINK-4020][streaming-connectors] Move shard list...

GitHub user tzulitai opened a pull request:

    https://github.com/apache/flink/pull/2081

    [FLINK-4020][streaming-connectors] Move shard list querying to open() for Kinesis consumer

    Remove shard list querying from the constructor, and let each subtask independently discover in open() which shards it should consume from. This change is a prerequisite for [FLINK-3231](https://issues.apache.org/jira/browse/FLINK-3231).
    
    Explanation for some changes that might seem unrelated:
    1. Renamed some variables / methods: since shard assignment to subtasks now behaves (and will continue to behave after FLINK-3231) more like "discovering shards to consume" than "being assigned shards", I've changed the "assignedShards"-related names to "discoveredShards".
    2. I've removed some tests because the corresponding parts of the code will change substantially with the upcoming [FLINK-3231](https://issues.apache.org/jira/browse/FLINK-3231). Tests will be added back with FLINK-3231.
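
As a rough illustration of the per-subtask discovery described above, the following Java sketch shows one way each subtask could pick its own shards from the full shard list without any central assignment. The modulo rule, the class name `ShardDiscoverySketch`, and the method `discoverShards` are assumptions made for illustration only; they are not taken from this PR's code. In the real consumer, logic of this kind would run inside open(), using the subtask index and parallelism from the Flink runtime context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShardDiscoverySketch {

    /**
     * Hypothetical rule (an assumption, not the PR's logic): a subtask keeps
     * every shard whose position in the full shard list, modulo the
     * parallelism, equals its own subtask index.
     */
    static List<String> discoverShards(List<String> allShards, int subtaskIndex, int parallelism) {
        List<String> discovered = new ArrayList<>();
        for (int i = 0; i < allShards.size(); i++) {
            if (i % parallelism == subtaskIndex) {
                discovered.add(allShards.get(i));
            }
        }
        return discovered;
    }

    public static void main(String[] args) {
        // Stand-in for the shard list each subtask would query for itself.
        List<String> allShards = Arrays.asList("shard-0", "shard-1", "shard-2", "shard-3", "shard-4");
        int parallelism = 2;
        for (int subtask = 0; subtask < parallelism; subtask++) {
            System.out.println("subtask " + subtask + " -> "
                    + discoverShards(allShards, subtask, parallelism));
        }
    }
}
```

A rule like this only yields a consistent split if every subtask sees the shard list in the same order, which is one reason each subtask has to be able to fetch the full list reliably.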

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tzulitai/flink FLINK-4020

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2081.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2081
    
----
commit 1db426be73f572aec2041cb1a9da6ad49425f392
Author: Gordon Tai <go...@vm5.com>
Date:   2016-06-08T10:46:02Z

    [FLINK-4020] Move shard list querying to open() for Kinesis consumer

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2081: [FLINK-4020][streaming-connectors] Move shard list queryi...

Posted by rmetzger <gi...@git.apache.org>.
GitHub user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2081
  
    Okay, thank you. I'll wait then.



[GitHub] flink issue #2081: [FLINK-4020][streaming-connectors] Move shard list queryi...

Posted by tzulitai <gi...@git.apache.org>.
GitHub user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/2081
  
    Hi @rmetzger,
    Thanks for letting me know. However, I'd like to close this PR for now for the following reasons:
    
    1. The new shard-to-subtask assignment logic introduced with this change will actually need to be moved again to run() as part of implementing Kinesis reshard handling [FLINK-3231](https://issues.apache.org/jira/browse/FLINK-3231).
    2. I've been testing this change a bit more on Kinesis streams with high shard counts, and it seems the implementation needs a stronger guarantee that all subtasks can get the shard list, rather than failing with Amazon's LimitExceededException even after 3 retries. Since the implementation for FLINK-3231 will have a separate thread that polls for changes in the shard list, I'd like to strengthen this guarantee as part of FLINK-3231's PR.
    
    I'm almost done with FLINK-3231, and will reopen a PR to resolve FLINK-3231 and FLINK-4020 together. I'll keep you updated!
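
To make the retry concern in point 2 concrete, here is a minimal Java sketch of retrying a throttled shard list query with exponential backoff and random jitter. It is an assumption about one possible approach, not the implementation in this PR or in FLINK-3231: `ShardListRetrySketch`, `callWithBackoff`, and the nested `LimitExceededException` (a stand-in for Amazon's exception of the same name) are hypothetical, and the retry count and delays are illustrative.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

public class ShardListRetrySketch {

    /** Hypothetical stand-in for Amazon's LimitExceededException. */
    static class LimitExceededException extends RuntimeException {
        LimitExceededException(String message) {
            super(message);
        }
    }

    /**
     * Runs the query, retrying when it is throttled. The wait doubles on
     * every retry, and a random jitter is added so that many subtasks
     * querying at once do not all retry at the same moment.
     */
    static <T> T callWithBackoff(Callable<T> query, int maxRetries, long baseBackoffMillis)
            throws Exception {
        int attempt = 0;
        while (true) {
            try {
                return query.call();
            } catch (LimitExceededException e) {
                if (attempt >= maxRetries) {
                    throw e; // retries exhausted, surface the failure
                }
                long backoff = baseBackoffMillis << attempt;
                long jitter = ThreadLocalRandom.current().nextLong(baseBackoffMillis + 1);
                Thread.sleep(backoff + jitter);
                attempt++;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated query: throttled on the first two calls, then succeeds.
        String result = callWithBackoff(() -> {
            if (++calls[0] <= 2) {
                throw new LimitExceededException("rate limit exceeded");
            }
            return "shard list";
        }, 3, 10L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Backoff alone does not guarantee success under sustained throttling; it only lowers the chance that all subtasks exhaust their retries together.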



[GitHub] flink issue #2081: [FLINK-4020][streaming-connectors] Move shard list queryi...

Posted by tzulitai <gi...@git.apache.org>.
GitHub user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/2081
  
    Hi @rmetzger,
    Update: I'm closing this PR now. The new PR with FLINK-4020 & FLINK-3231 is at https://github.com/apache/flink/pull/2131.



[GitHub] flink issue #2081: [FLINK-4020][streaming-connectors] Move shard list queryi...

Posted by rmetzger <gi...@git.apache.org>.
GitHub user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2081
  
    I'll try to review this change soon.



[GitHub] flink pull request #2081: [FLINK-4020][streaming-connectors] Move shard list...

Posted by tzulitai <gi...@git.apache.org>.
GitHub user tzulitai closed the pull request at:

    https://github.com/apache/flink/pull/2081

