You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by StephanEwen <gi...@git.apache.org> on 2015/09/23 21:27:03 UTC

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/1175

    [FLINK-2753] [streaming] [api breaking] Add first parts of new window API for key grouped windows

    This follows the API design outlined in https://cwiki.apache.org/confluence/display/FLINK/Streams+and+Operations+on+Streams
    
    This change is API breaking because it adds new generic type parameters to Java and Scala classes, breaking binary compatibility.
    
    This new API uses in the background the dedicated operators added in a previous pull request, which improved robustness and performance and fixed several correctness bugs.
    
    ### New window API
    
    ```java
    DataStream<MyType> stream = ...;
    
    stream.keyBy("id")
          .window(Time.of(5, SECONDS))
          .reduceWindow( (a, b) -> a.fuse(b) )
    ```
    
    The Pull Request also introduces the first parts of defining the programs Time characteristic on the StreamExecutionEnvironment:
      - *Processing time*
      - *Ingestion time*
      - *Event time*
    
    Eventually, the other window mechanisms should be replaced by this:
    
      - Windows per key are covered in this new mechanism
      - Global windows (global parallel discretization) were decided to be dropped for safety and replaced by single-group windows (non parallel) with possible parallel pre-aggregation (for aligned time windows)
    
    ### Follow-up
    
      - Adding the generic window operator based on generic policies (adaption of the current windowing system by @aljoscha ), see https://issues.apache.org/jira/browse/FLINK-2677
      - Automatically activating and configuring Watermark generation at the data sources
      - Adding a dedicated Event Time Window operator
      - Adding the single-group non-parallel windows

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink windows

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1175
    
----
commit 61c3666a272413940e65a2195b87c7472a8e8806
Author: Stephan Ewen <se...@apache.org>
Date:   2015-09-23T10:05:54Z

    [FLINK-2753] [streaming] [api breaking] Add first parts of new window API for key grouped windows
    
    This follows the API design outlined in https://cwiki.apache.org/confluence/display/FLINK/Streams+and+Operations+on+Streams
    
    This is API breaking because it adds new generic type parameters to Java and Scala classes, breaking binary compatibility.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1175#issuecomment-142940983
  
    @StephanEwen  How attached are you to your Generic Type names? Because in the current code base we have single character uppercase style (T, E, V, C and so on, as is the sun/oracle style) and multi character uppercase style (OUT, KEY). Now you introduced a third style in some of your new classes. Here the generic parameters have the same naming convention as class names.
    
    I think we have to standardize a bit here. :smile: 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1175#issuecomment-142941895
  
    Not super attached. I like the new style, but I see the point that it is too much style diversity. So let's change it to the style `OUT`, `KEY`, ... ?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the pull request:

    https://github.com/apache/flink/pull/1175#issuecomment-142943745
  
    I don't favor a particular style, I just think consistency is key here. The google style guide, for example allows single char (T, E, K) and camel case with an affixed T, i.e. ResultT, KeyT, and so on.
    
    I'll adapt your code while finishing the windowing stuff based on what we decide. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1175#issuecomment-142881222
  
    BTW: For now, the old mechanism still exists. Once this one subsumed it, we can remove the old mechanism...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/1175#issuecomment-142856852
  
    Can you also update the documentation to reflect the new windowing API there?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen closed the pull request at:

    https://github.com/apache/flink/pull/1175


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1175#discussion_r40295722
  
    --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---
    @@ -516,6 +527,30 @@ public void registerType(Class<?> type) {
     	}
     
     	// --------------------------------------------------------------------------------------------
    +	//  Time characteristic
    +	// --------------------------------------------------------------------------------------------
    +
    +	/**
    +	 * Sets the time characteristic for the stream, e.g., processing time, event time,
    +	 * or ingestion time.
    +	 * 
    +	 * @param characteristic The time characteristic.
    +	 */
    +	public void setStreamTimeCharacteristic(TimeCharacteristic characteristic) {
    --- End diff --
    
    Are there plans to allow setting different time characteristics for different streams in the same topology?
    For example an ingestion time and a event time stream?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1175#issuecomment-144015350
  
    Subsumed by #1184 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1175#issuecomment-142881137
  
    I would like to update the documentation once it is finalized. I actually expect this to change a bit while we add more features.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2753] [streaming] [api breaking] Add fi...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1175#discussion_r40303645
  
    --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java ---
    @@ -516,6 +527,30 @@ public void registerType(Class<?> type) {
     	}
     
     	// --------------------------------------------------------------------------------------------
    +	//  Time characteristic
    +	// --------------------------------------------------------------------------------------------
    +
    +	/**
    +	 * Sets the time characteristic for the stream, e.g., processing time, event time,
    +	 * or ingestion time.
    +	 * 
    +	 * @param characteristic The time characteristic.
    +	 */
    +	public void setStreamTimeCharacteristic(TimeCharacteristic characteristic) {
    --- End diff --
    
    Yes, we are thinking about it for the future. Semantics for that are not trivial, so the first version will not do that...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---