You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/08/26 17:01:45 UTC

[jira] [Commented] (FLINK-2577) Watermarks Stall When a Source Finishes Prematurely

    [ https://issues.apache.org/jira/browse/FLINK-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14713614#comment-14713614 ] 

ASF GitHub Bot commented on FLINK-2577:
---------------------------------------

GitHub user aljoscha opened a pull request:

    https://github.com/apache/flink/pull/1060

    [FLINK-2577] Fix Stalling Watermarks when Sources Close

    Before, when one source closes early it will not emit watermarks
    anymore. Downstream operations don't know about this and expect
    watermarks to keep on coming. This leads to watermarks not being
    forwarded anymore.
    
    Now, when a source closes it will emit a final watermark with timestamp
    Long.MAX_VALUE. This will have the effect of allowing the watermarks
    from the other operations to propagate though because the watermark is
    defined as the minimum over all inputs.
    
    The Long.MAX_VALUE watermark has the added benefit of notifying
    operations that no more elements will arrive in the future.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aljoscha/flink watermark-fix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1060.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1060
    
----
commit 4132802d2b8fb673473b02e2f918ece0262b7c5c
Author: Aljoscha Krettek <al...@gmail.com>
Date:   2015-08-26T13:46:20Z

    [FLINK-2577] Fix Stalling Watermarks when Sources Close
    
    Before, when one source closes early it will not emit watermarks
    anymore. Downstream operations don't know about this and expect
    watermarks to keep on coming. This leads to watermarks not being
    forwarded anymore.
    
    Now, when a source closes it will emit a final watermark with timestamp
    Long.MAX_VALUE. This will have the effect of allowing the watermarks
    from the other operations to propagate though because the watermark is
    defined as the minimum over all inputs.
    
    The Long.MAX_VALUE watermark has the added benefit of notifying
    operations that no more elements will arrive in the future.

----


> Watermarks Stall When a Source Finishes Prematurely
> ---------------------------------------------------
>
>                 Key: FLINK-2577
>                 URL: https://issues.apache.org/jira/browse/FLINK-2577
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 0.10
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>             Fix For: 0.10
>
>
> The problem with a streaming source that closes is that downstream operations never notice that it is not running anymore and keep waiting for watermarks from all upstream operations (including the source). This has the effect that watermarks just stop propagating through the topology.
> I think an easy fix is to change sources to emit a last watermark of +Inf before closing. Because watermarks are always the minimum of all watermarks on the inputs this would have the effect of advancing only depending on the other inputs.
> The added benefit would be that once all sources emit a +Inf watermark the operator also get's a last +Inf watermark which tells it that all sources are done. Right now, streaming operators (and user code) have no way of telling if there are going to come elements in the future. This is especially problematic in Co-Map (Co-FlatMap) operations where you have one input that feeds a hash-table and the other input is elements that you want to stream by this hash-table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)