Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/10/22 17:10:01 UTC

[jira] [Commented] (BEAM-9198) BeamSQL aggregation analytics functionality

    [ https://issues.apache.org/jira/browse/BEAM-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17219191#comment-17219191 ] 

Beam JIRA Bot commented on BEAM-9198:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.


> BeamSQL aggregation analytics functionality 
> --------------------------------------------
>
>                 Key: BEAM-9198
>                 URL: https://issues.apache.org/jira/browse/BEAM-9198
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Rui Wang
>            Priority: P2
>              Labels: gsoc, gsoc2020, mentor, stale-P2
>          Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Mentor email: ruwang@google.com. Feel free to send emails for your questions.
> Project Information
> ---------------------
> BeamSQL has a long list of aggregation and analytic functions still to support.
> To begin with, you will need to support this syntax:
> {code:sql}
> analytic_function_name ( [ argument_list ] )
>   OVER (
>     [ PARTITION BY partition_expression_list ]
>     [ ORDER BY expression [{ ASC | DESC }] [, ...] ]
>     [ window_frame_clause ]
>   )
> {code}
> As there is a long list of analytic functions, a good starting point is to support rank() first.
> This will require touching core components of BeamSQL:
> 1. SQL parser to support the syntax above.
> 2. SQL core to implement the physical relational operator.
> 3. Distributed algorithms to implement a list of functions in a distributed manner. 
> 4. Enable in ZetaSQL dialect.
> To understand what SQL analytics functionality is, you could check this great explanation doc: https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts.
> To know about Beam's programming model, check: https://beam.apache.org/documentation/programming-guide/#overview
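
For readers new to the OVER clause quoted above, here is a minimal single-machine sketch of what rank() computes for one PARTITION BY column and one ORDER BY column. It is an illustration only, not BeamSQL's implementation: the function name rank_over, its parameters, and the sample rows are all hypothetical, and window frames are ignored.

```python
from collections import defaultdict
from operator import itemgetter


def rank_over(rows, partition_by, order_by, descending=False):
    """Return copies of rows with a 'rank' key, mimicking
    RANK() OVER (PARTITION BY partition_by ORDER BY order_by).

    Hypothetical helper for illustration; not part of Beam.
    """
    # Group rows by their partition key.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_by]].append(row)

    ranked = []
    for part in partitions.values():
        # Each partition is sorted and ranked independently of the others.
        ordered = sorted(part, key=itemgetter(order_by), reverse=descending)
        prev_value, prev_rank = object(), 0
        for position, row in enumerate(ordered, start=1):
            # Ties share a rank; the next distinct value skips ahead (1, 1, 3).
            rank = prev_rank if row[order_by] == prev_value else position
            ranked.append({**row, "rank": rank})
            prev_value, prev_rank = row[order_by], rank
    return ranked


# Made-up sample data.
rows = [
    {"team": "a", "score": 10},
    {"team": "a", "score": 10},
    {"team": "a", "score": 7},
    {"team": "b", "score": 5},
]
result = rank_over(rows, "team", "score", descending=True)
```

Because each partition is ranked without reference to any other, grouping by the partition key is a plausible first step for a distributed version, though that is an assumption of this sketch rather than something the issue prescribes.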



--
This message was sent by Atlassian Jira
(v8.3.4#803005)