You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by tzulitai <gi...@git.apache.org> on 2016/08/24 17:12:20 UTC

[GitHub] flink pull request #2414: [FLINK-4341] Let idle consumer subtasks emit max v...

GitHub user tzulitai opened a pull request:

    https://github.com/apache/flink/pull/2414

    [FLINK-4341] Let idle consumer subtasks emit max value watermarks and fail on resharding

    This is a short-term fix, until the min-watermark service for the JobManager described in the JIRA discussion is available.
    
    The way this fix works is that we let idle subtasks that initially don't get assigned shards emit a `Long.MAX_VALUE` watermark. Also, we _only fail hard if an idle subtask_ is assigned new shards when resharding happens, to avoid messing up the watermarks. So, if all subtasks are not initially idle on startup (i.e., when total number of shards > consumer parallelism), the Kinesis consumer can still transparently handle resharding like before without failing.
    
    I've tested exactly-once with our manual tests (with and w/o resharding) and the fix works nicely, still retaining exactly-once guarantee despite non-transparency. However, I'm a bit unsure on how to test if the unbounded state with window operators is also fixed with this change, so we're still yet to clarify this.
    
    R: @rmetzger and @aljoscha for review. Thanks in advance!

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tzulitai/flink FLINK-4341

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2414.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2414
    
----
commit bc8e50d99be745300f7418c58e9d30abc5469ba3
Author: Gordon Tai <tz...@gmail.com>
Date:   2016-08-24T08:38:06Z

    [FLINK-4341] Let idle consumer subtasks emit max value watermarks and fail on resharding
    
    This no longer allows the Kinesis consumer to transparently handle resharding.
    This is a short-term workaround until we have a min-watermark notification service available in the JobManager.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/2414
  
    Ah yes, correct. I'll update this soon.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2414
  
    Thank you for the pull request. I'll merge it to master and the release-1.1 branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/2414
  
    To include the missing case @rmetzger mentioned, it turns out the fix is actually more complicated than I expected due to correct state determination after every reshard, and requires a bit of rework on our current shard discovery mechanism to get it right.
    
    Heads-up notice that this will probably need a re-review. Sorry for the delay, I'm currently still on it, hopefully will update the PR by the end of today ;)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/2414
  
    Thanks @rmetzger !


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2414
  
    Thank you for opening a pull request to fix the issue.
    
    I think we also need to cover another case: What happens when the number of shards has been reduced in a resharding and some fetchers are now without a shard? I think in that case, the worker also needs to emit a final Long.MAX_VALUE, and it has to fail once it gets a shard assigned again.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #2414: [FLINK-4341] Let idle consumer subtasks emit max v...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/2414


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the issue:

    https://github.com/apache/flink/pull/2414
  
    Minus @rmetzger's comment this looks good to merge! Thanks for fixing this @tzulitai!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink issue #2414: [FLINK-4341] Let idle consumer subtasks emit max value wa...

Posted by tzulitai <gi...@git.apache.org>.
Github user tzulitai commented on the issue:

    https://github.com/apache/flink/pull/2414
  
    @rmetzger, @aljoscha the changes are ready for another review now, thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---