
Posted to issues@flink.apache.org by "Lynch Lee (JIRA)" <ji...@apache.org> on 2018/02/09 09:33:00 UTC

[jira] [Commented] (FLINK-6428) Add support DISTINCT in dataStream SQL

    [ https://issues.apache.org/jira/browse/FLINK-6428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358160#comment-16358160 ] 

Lynch Lee commented on FLINK-6428:
----------------------------------

[~fhueske] I want to use Flink SQL in my product and would appreciate your advice. Thanks.

For this SQL:   SELECT DISTINCT a, b, c FROM t GROUP BY a, b, c

Why must the fields b and c be included in the GROUP BY keys when the DISTINCT is on field a?

> Add support DISTINCT in dataStream SQL
> --------------------------------------
>
>                 Key: FLINK-6428
>                 URL: https://issues.apache.org/jira/browse/FLINK-6428
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API & SQL
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>            Priority: Major
>
> Add support for DISTINCT in DataStream SQL as follows:
> DATA:
> {code}
> (name, age)
> (kevin, 28),
> (sunny, 6),
> (jack, 6)
> {code}
> SQL:
> {code}
> SELECT DISTINCT age FROM MyTable
> {code}
> RESULTS:
> {code}
> 28, 6
> {code}
> To DataStream:
> {code}
> inputDS
>   .keyBy() // KeyBy on all fields
>   .flatMap() //  Eliminate duplicate data
> {code}
> [~fhueske] do we need this feature?
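
The keyBy/flatMap outline in the quoted description can be illustrated with a minimal sketch in plain Python. It does not use Flink's API: the function name `distinct_stream`, the `parallelism` parameter, and the per-partition sets standing in for keyed state are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Iterator, Set, Tuple

Row = Tuple[Hashable, ...]


def distinct_stream(rows: Iterable[Row], parallelism: int = 4) -> Iterator[Row]:
    """Emit each distinct row once, in first-arrival order (hypothetical helper)."""
    # One "seen" set per simulated partition, standing in for keyed state.
    seen_by_partition: Dict[int, Set[Row]] = defaultdict(set)
    for row in rows:
        # keyBy on all fields: the whole row is the key, so equal rows
        # always land in the same partition.
        partition = hash(row) % parallelism
        seen = seen_by_partition[partition]
        # flatMap: emit zero or one output per input row.
        if row not in seen:
            seen.add(row)
            yield row


ages = [(28,), (6,), (6,)]  # the age column of the sample data
print(list(distinct_stream(ages)))  # [(28,), (6,)]
```

Because the key is the entire row, two rows that share a first field but differ in another field are both kept, for example `(1, "x")` and `(1, "y")`. In this sketch the state grows with the number of distinct rows and is never cleaned up.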



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)