Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/01/29 21:55:39 UTC

[jira] [Commented] (FLINK-3296) DataStream.write*() methods are not flushing properly

    [ https://issues.apache.org/jira/browse/FLINK-3296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124194#comment-15124194 ] 

ASF GitHub Bot commented on FLINK-3296:
---------------------------------------

GitHub user rmetzger opened a pull request:

    https://github.com/apache/flink/pull/1563

    [FLINK-3296] Remove 'flushing' behavior of the OutputFormat in DataStream API

    I removed the `FileSinkFunctionByMillis` and removed all the `millis` arguments on the writing functions.
    
    The whole "buffering" and "flushing" functionality was broken: elements were kept in an ArrayList and sent to the OutputFormat on "flush()". However, the flush was never actually triggered periodically; the flush condition was checked only when a new record arrived. So when a stream has no elements for a while, the last few elements just stay in the list until new elements arrive.
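
To illustrate the failure mode described above, here is a minimal sketch in plain Java. It is not the actual `FileSinkFunctionByMillis` code: the class names `BufferingSink` and `DirectSink`, the `written` list standing in for the OutputFormat, and the explicit `nowMillis` parameter (used instead of a wall clock so the behavior is deterministic) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of a sink that buffers records and
// "flushes" by time. NOT the real FileSinkFunctionByMillis.
class BufferingSink {
    private final List<String> buffer = new ArrayList<>();
    // Stands in for the OutputFormat that finally receives the records.
    private final List<String> written = new ArrayList<>();
    private final long intervalMillis;
    private long lastFlushMillis;

    BufferingSink(long intervalMillis, long startMillis) {
        this.intervalMillis = intervalMillis;
        this.lastFlushMillis = startMillis;
    }

    // The flush condition is evaluated only here, i.e. only when a
    // record arrives. No timer ever calls it on an idle stream.
    void invoke(String record, long nowMillis) {
        buffer.add(record);
        if (nowMillis - lastFlushMillis >= intervalMillis) {
            written.addAll(buffer);
            buffer.clear();
            lastFlushMillis = nowMillis;
        }
    }

    List<String> written() { return written; }

    int buffered() { return buffer.size(); }
}

// A sink without the buffer: every record is handed on immediately.
// This is an assumption about what removing the flushing logic amounts to.
class DirectSink {
    private final List<String> written = new ArrayList<>();

    void invoke(String record) { written.add(record); }

    List<String> written() { return written; }
}
```

With a 100 ms interval, records arriving at t=10 and t=20 remain in the buffer no matter how much time passes afterwards; they reach the output only once another record arrives, for example at t=500. The `DirectSink` variant has no such dependency on later records.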

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rmetzger/flink flink3296

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1563.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1563
    
----
commit 3275adaaf27f6e1ec74ffc2a48169239da0e1f5b
Author: Robert Metzger <rm...@apache.org>
Date:   2016-01-28T13:56:29Z

    [FLINK-3296] Remove 'flushing' behavior of the OutputFormat support of the DataStream API

----


> DataStream.write*() methods are not flushing properly
> -----------------------------------------------------
>
>                 Key: FLINK-3296
>                 URL: https://issues.apache.org/jira/browse/FLINK-3296
>             Project: Flink
>          Issue Type: Bug
>          Components: Streaming Connectors
>            Reporter: Robert Metzger
>            Assignee: Robert Metzger
>            Priority: Critical
>
> The DataStream.write() methods rely on the {{FileSinkFunctionByMillis}} class, which contains logic for flushing records, even though the underlying stream is never flushed. This is misleading for users, because files are not written when they would expect.
> The code was initially written with FileOutputFormats in mind, but the types were not set correctly. This PR opened the write() method to any output format: https://github.com/apache/flink/pull/706/files



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)