Posted to issues@beam.apache.org by "Beam JIRA Bot (Jira)" <ji...@apache.org> on 2020/08/14 17:07:02 UTC

[jira] [Commented] (BEAM-9198) BeamSQL aggregation analytics functionality

    [ https://issues.apache.org/jira/browse/BEAM-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177907#comment-17177907 ] 

Beam JIRA Bot commented on BEAM-9198:
-------------------------------------

This issue is assigned but has not received an update in 30 days so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign so someone else may work on it. In 7 days the issue will be automatically unassigned.

> BeamSQL aggregation analytics functionality 
> --------------------------------------------
>
>                 Key: BEAM-9198
>                 URL: https://issues.apache.org/jira/browse/BEAM-9198
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Rui Wang
>            Assignee: John Mora
>            Priority: P2
>              Labels: gsoc, gsoc2020, mentor, stale-assigned
>          Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> Mentor email: ruwang@google.com. Feel free to send emails for your questions.
> Project Information
> ---------------------
> BeamSQL has a long list of aggregation and aggregation analytics functionalities to support.
> To begin with, you will need to support this syntax:
> {code:sql}
> analytic_function_name ( [ argument_list ] )
>   OVER (
>     [ PARTITION BY partition_expression_list ]
>     [ ORDER BY expression [{ ASC | DESC }] [, ...] ]
>     [ window_frame_clause ]
>   )
> {code}
> As there is a long list of analytics functions, a good starting point is to support rank() first.
> This will require touching core components of BeamSQL:
> 1. SQL parser to support the syntax above.
> 2. SQL core to implement the physical relational operator.
> 3. Distributed algorithms to implement a list of functions in a distributed manner. 
> 4. Enable in ZetaSQL dialect.
> To understand what SQL analytics functionality is, you could check this great explanation doc: https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts.
> To learn about Beam's programming model, check: https://beam.apache.org/documentation/programming-guide/#overview
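
As a rough illustration of the rank() semantics that the description names as a starting point, the sketch below emulates RANK() OVER (PARTITION BY ... ORDER BY ... ASC) in plain Python on an in-memory list. The helper name `rank_over` and the sample rows are hypothetical, and this is not Beam's implementation: a distributed version would have to group rows by partition key across workers before ordering them.

```python
from collections import defaultdict
from operator import itemgetter


def rank_over(rows, partition_by, order_by):
    """Return (row, rank) pairs emulating
    RANK() OVER (PARTITION BY partition_by ORDER BY order_by ASC).

    Hypothetical helper for illustration only; not part of Beam.
    """
    # Group rows by partition key (single-process stand-in for a
    # group-by-key step in a distributed pipeline).
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_by]].append(row)

    ranked = []
    for key in sorted(partitions):
        ordered = sorted(partitions[key], key=itemgetter(order_by))
        prev_value, prev_rank = None, 0
        for position, row in enumerate(ordered, start=1):
            # Ties share a rank; the next distinct value skips ahead (1, 1, 3).
            if position == 1 or row[order_by] != prev_value:
                prev_rank = position
            prev_value = row[order_by]
            ranked.append((row, prev_rank))
    return ranked


# Made-up sample data: two partitions, one tie inside partition "a".
rows = [
    {"team": "a", "score": 10},
    {"team": "a", "score": 10},
    {"team": "a", "score": 20},
    {"team": "b", "score": 5},
]
result = rank_over(rows, "team", "score")
ranks = [rank for _, rank in result]
```

Sorting each partition in memory keeps the sketch short; it assumes every partition fits on one worker, which a real implementation could not take for granted.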



--
This message was sent by Atlassian Jira
(v8.3.4#803005)