Posted to issues@flink.apache.org by "Aljoscha Krettek (JIRA)" <ji...@apache.org> on 2015/08/20 11:09:46 UTC

[jira] [Commented] (FLINK-2550) Rework DataStream API

    [ https://issues.apache.org/jira/browse/FLINK-2550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704556#comment-14704556 ] 

Aljoscha Krettek commented on FLINK-2550:
-----------------------------------------

I think there are still some issues that need to be ironed out (and/or put into design documents):
- Window Semantics
- Window Meta Information

The first concerns how windows are specified. The second concerns information that the user might want to have about a window. For example, consider:

{code:java}
DataStream input = ...
DataStream result = input
  .keyBy(...)
  .window(10 sec).every(2 sec)
  .sum(...)
  .map(new MyUserFunction)
{code}

The user function {{MyUserFunction}} gets the aggregation result for each window, but it has no way of discovering which window the data actually came from. Also, the "from-which-window" question only makes sense for certain types of windows, such as time windows. So I'm not quite sure yet how this should be represented in the API.
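
To make the "from-which-window" question concrete, here is a minimal, self-contained sketch of one possible shape: the aggregation emits its result together with the time window it came from. This is plain Java, not Flink API. The names {{TimeWindow}}, {{WindowResult}} and {{slidingSum}} are hypothetical and only for illustration, and the alignment of window starts to multiples of the slide is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class WindowMetaSketch {

    /** Hypothetical half-open time window [start, end), in milliseconds. */
    public static final class TimeWindow {
        public final long start;
        public final long end;

        public TimeWindow(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Hypothetical wrapper: an aggregation result plus the window it came from. */
    public static final class WindowResult<T> {
        public final TimeWindow window;
        public final T value;

        public WindowResult(TimeWindow window, T value) {
            this.window = window;
            this.value = value;
        }
    }

    /**
     * Sums values per sliding window. Window starts are aligned to multiples
     * of slide (an assumption); windows containing no element are not emitted.
     */
    public static List<WindowResult<Long>> slidingSum(
            long[] timestamps, long[] values, long size, long slide) {
        List<WindowResult<Long>> out = new ArrayList<>();
        if (timestamps.length == 0) {
            return out;
        }
        long minTs = Long.MAX_VALUE;
        long maxTs = Long.MIN_VALUE;
        for (long ts : timestamps) {
            minTs = Math.min(minTs, ts);
            maxTs = Math.max(maxTs, ts);
        }
        // Earliest aligned start whose window can still cover minTs.
        long firstStart = Math.floorDiv(minTs - size, slide) * slide + slide;
        // Latest aligned start that is not after maxTs.
        long lastStart = Math.floorDiv(maxTs, slide) * slide;
        for (long start = firstStart; start <= lastStart; start += slide) {
            long end = start + size;
            long sum = 0;
            boolean nonEmpty = false;
            for (int i = 0; i < timestamps.length; i++) {
                if (timestamps[i] >= start && timestamps[i] < end) {
                    sum += values[i];
                    nonEmpty = true;
                }
            }
            if (nonEmpty) {
                out.add(new WindowResult<Long>(new TimeWindow(start, end), sum));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] ts = {1, 7, 12};
        long[] vals = {1, 2, 3};
        // A downstream user function can now see which window each sum belongs to.
        for (WindowResult<Long> r : slidingSum(ts, vals, 10, 5)) {
            System.out.println("[" + r.window.start + ", " + r.window.end + ") sum=" + r.value);
        }
    }
}
```

A wrapper like this only fits windows that have a natural time extent. For count-based or other window types the metadata would have to look different, which is exactly the open question.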

> Rework DataStream API
> ---------------------
>
>                 Key: FLINK-2550
>                 URL: https://issues.apache.org/jira/browse/FLINK-2550
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 0.9
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>             Fix For: 0.10
>
>
> After discussions on the mailing list we arrived at a consensus to rework the streaming API to make it more fool-proof and easier to use. The resulting design document is available here: https://cwiki.apache.org/confluence/display/FLINK/Streams+and+Operations+on+Streams


