Posted to issues@calcite.apache.org by "Rui Wang (Jira)" <ji...@apache.org> on 2020/08/04 03:06:00 UTC

[jira] [Commented] (CALCITE-4146) Implement EMIT Syntax

    [ https://issues.apache.org/jira/browse/CALCITE-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170523#comment-17170523 ] 

Rui Wang commented on CALCITE-4146:
-----------------------------------

The second.

The "emit every 1 minute" strategy will propagate to both TableFunctionScanRel nodes. Each TableFunctionScanRel will emit data for each window (if there is any) every minute, and the JOIN will then be applied to data in the same window from both sides.
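
As a rough illustration of the behavior described above, here is a minimal Python sketch of a single firing of the one-minute trigger. It is not Calcite code: the 60-second tumbling window, the (event_time, key, value) row layout, and all function names are assumptions made for the example.

```python
from collections import defaultdict

WINDOW_SIZE = 60  # hypothetical tumbling window length, in seconds


def window_start(event_time):
    """Return the start of the tumbling window that contains event_time."""
    return event_time - event_time % WINDOW_SIZE


def emit_per_window(events):
    """Group the (event_time, key, value) rows seen so far by window start."""
    windows = defaultdict(list)
    for event_time, key, value in events:
        windows[window_start(event_time)].append((key, value))
    return windows


def join_same_window(left_events, right_events):
    """Join rows on key, pairing only rows that fall in the same window."""
    left = emit_per_window(left_events)
    right = emit_per_window(right_events)
    result = []
    # Only windows that have data on both sides can produce join output.
    for start in sorted(left.keys() & right.keys()):
        for lkey, lval in left[start]:
            for rkey, rval in right[start]:
                if lkey == rkey:
                    result.append((start, lkey, lval, rval))
    return result


# One firing of the trigger: each side emits whatever it holds per window.
left_events = [(5, "a", 1), (70, "a", 2)]
right_events = [(10, "a", 10), (130, "a", 30)]
joined = join_same_window(left_events, right_events)
```

In this example only the window starting at 0 has rows on both sides, so it is the only window that contributes to the join result; the rows at 70 and 130 wait for a matching window on the other side.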



[~julianhyde] I converted this JIRA into an umbrella issue to host high-level discussions. Smaller tasks will be created as sub-tasks.

> Implement EMIT Syntax
> ---------------------
>
>                 Key: CALCITE-4146
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4146
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Rui Wang
>            Assignee: Rui Wang
>            Priority: Major
>
> The goal is to support the following syntax:
> {code:sql}
> SELECT clause
> FROM TUMBLE/HOP/SESSION
> [EMIT] 
> {code}
> The EMIT syntax is proposed in [One SQL to Rule Them All|https://arxiv.org/pdf/1905.12133.pdf]. It gives streaming SQL queries a way to control materialization latency.
> Regarding the types of emit strategies: due to page limits, that paper lists only two strategies, but Calcite should support at least four categories:
> 1. Event time triggers. Emitting depends on the relationship between the
> watermark and the event timestamps of events. Handling late data is also
> included in this category.
> 2. Processing time triggers. Emitting depends on the system clock. This is
> a natural way to emit, e.g. emit the current result every hour regardless
> of whether the data in a window is already complete.
> 3. Data-driven triggers. E.g. emit when accumulated events exceed a
> threshold (e.g. emit once 1000 events have accumulated).
> 4. Composite triggers. There is a need to combine 1, 2, and 3 with OR and AND to
> achieve better latency control.
> More context is discussed in [CALCITE-3272|https://issues.apache.org/jira/browse/CALCITE-3272?focusedCommentId=17166580&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17166580] and in the [EMIT syntax proposal for event-timestamp semantic windowing|https://lists.apache.org/thread.html/r5bd9a6f7af2c0cd81aecd4de512fd889fbf15f112cc3704f188b1d4f%40%3Cdev.calcite.apache.org%3E] email thread.
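
The four trigger categories listed in the description can be sketched as composable predicates. This is only an illustration in Python under stated assumptions, not a proposed Calcite API: the WindowState fields and the helper names (after_watermark, every, after_count, any_of, all_of) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WindowState:
    window_end: int   # event-time end of the window
    watermark: int    # current event-time watermark
    now: int          # current processing time
    last_emit: int    # processing time of the previous emit
    event_count: int  # events accumulated since the previous emit


def after_watermark():
    """Event time trigger: fire once the watermark passes the window end."""
    return lambda s: s.watermark >= s.window_end


def every(interval):
    """Processing time trigger: fire when `interval` has elapsed since the last emit."""
    return lambda s: s.now - s.last_emit >= interval


def after_count(threshold):
    """Data-driven trigger: fire when enough events have accumulated."""
    return lambda s: s.event_count >= threshold


def any_of(*triggers):
    """Composite trigger: OR of the given triggers."""
    return lambda s: any(t(s) for t in triggers)


def all_of(*triggers):
    """Composite trigger: AND of the given triggers."""
    return lambda s: all(t(s) for t in triggers)


# Fire when the window is complete, OR when a minute has passed AND
# at least 1000 events have accumulated.
trigger = any_of(after_watermark(), all_of(every(60), after_count(1000)))

early = WindowState(window_end=600, watermark=300, now=120, last_emit=60, event_count=1000)
sparse = WindowState(window_end=600, watermark=300, now=120, last_emit=60, event_count=999)
complete = WindowState(window_end=600, watermark=600, now=61, last_emit=60, event_count=0)
```

With these states, `trigger(early)` and `trigger(complete)` fire while `trigger(sparse)` does not, which shows how OR and AND composition trades latency against the number of emitted updates.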



--
This message was sent by Atlassian Jira
(v8.3.4#803005)