Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2018/08/07 21:36:51 UTC

[GitHub] walterddr opened a new pull request #6521: [FLINK-5315][table] Adding support for distinct operation for table API on DataStream

URL: https://github.com/apache/flink/pull/6521
 
 
   ## What is the purpose of the change
   
   * Adding `distinct` aggregation support for Table API. Example usages are:
     - For built-in expressions `'a.count.distinct`
     - For user-defined aggregate functions `udaggFunc.distinct('a, 'b)`
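The intended semantics of the `distinct` modifier can be illustrated with a small, dependency-free sketch. This is not the code in this pull request and does not use Flink's API: `Agg`, `Count`, `Distinct`, and `aggregate` are hypothetical names introduced only for illustration. The sketch assumes that a distinct aggregation forwards each input value to the underlying aggregate at most once, and it ignores retraction and state handling.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DistinctAggSketch {

    /** Minimal aggregate contract (hypothetical, not Flink's AggregateFunction). */
    interface Agg<T, R> {
        void accumulate(T value);
        R result();
    }

    /** Plain COUNT: counts every non-null input value. */
    static final class Count<T> implements Agg<T, Long> {
        private long n;
        public void accumulate(T value) { if (value != null) n++; }
        public Long result() { return n; }
    }

    /** Distinct modifier: forwards a value to the wrapped aggregate only the first time it is seen. */
    static final class Distinct<T, R> implements Agg<T, R> {
        private final Agg<T, R> inner;
        private final Set<T> seen = new HashSet<>();
        Distinct(Agg<T, R> inner) { this.inner = inner; }
        public void accumulate(T value) {
            // Set.add returns false for values that were already seen.
            if (seen.add(value)) inner.accumulate(value);
        }
        public R result() { return inner.result(); }
    }

    /** Feeds all values into the aggregate and returns its result. */
    static <T, R> R aggregate(Agg<T, R> agg, List<T> values) {
        for (T v : values) agg.accumulate(v);
        return agg.result();
    }
}
```

Under these assumptions, wrapping `Count` in `Distinct` plays the role of `'a.count.distinct`, and wrapping any other `Agg` mirrors `udaggFunc.distinct('a, 'b)` for a single argument.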
   
   ## Brief change log
   
     - *Added the `distinctAgg` operator as an aggregation expression*
     - *Created aggregation resolve rules in `operators` that accept the distinct aggregation modifier before resolving the actual aggregation*
     - *Modified the UDAGG function interface to add the `distinct` modifier API*
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
     - *Added integration tests for UDAGG function calls and expression aggregations, respectively.*
     - *Added unit tests for both cases (prefix modifier for UDAGG, suffix modifier for expressions) and for unsupported use cases (suffix modifier for UDAGG).*
     - *Backward compatibility for other aggregations is covered by existing unit tests.*
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no, but added the `private[flink]` modifier to the `AggregationFunction` API, which might have been exposed to the Java API)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (yes)
     - If yes, how is the feature documented? (not documented yet)
   
