Posted to dev@calcite.apache.org by "Wanglan (Lan)" <la...@huawei.com> on 2016/02/14 07:44:57 UTC

Re: About Stream SQL

Great discussions!

It seems that some kind of agreement has been reached. In my opinion, the window definitions are the basic concepts we should describe clearly first. How is the progress going? Do we need to create a JIRA issue or something?

By the way, happy Chinese New Year ;)!

Lan
-----Original Message-----
From: Fabian Hueske [mailto:fhueske@gmail.com]
Sent: February 6, 2016 17:29
To: dev@calcite.apache.org
Subject: Re: About Stream SQL

Excellent! I missed the punctuations in the todo list.

What kind of strategies do you have in mind to handle events that arrive too late? I see:
1. dropping late events
2. computing an updated window result for each late-arriving element (implies that the window state is stored for a certain period before it is discarded)
3. computing a delta to the previous window result for each late-arriving element (requires window state as well, not applicable to all aggregation types)
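
To make the three strategies concrete, here is a minimal Python sketch for a SUM over tumbling windows. It is only an illustration under stated assumptions, not how Calcite or Flink implement this: the function name, the record format `(window_start, kind, value)`, and the rule "the watermark is the largest rowtime seen so far" are all made up for the example.

```python
def sum_with_late_events(events, size, strategy):
    """Sum values per tumbling window and handle late events.

    events:   iterable of (rowtime, value) pairs in arrival order
    size:     window length; windows are [start, start + size)
    strategy: "drop", "update" or "delta" (hypothetical names)

    Assumption: the watermark is simply the largest rowtime seen so far,
    and a window is emitted once the watermark reaches its end.
    """
    if strategy not in ("drop", "update", "delta"):
        raise ValueError("unknown strategy: %r" % (strategy,))

    state = {}       # window start -> running sum (kept after emission)
    emitted = set()  # window starts whose first result was already emitted
    out = []
    watermark = float("-inf")

    for rowtime, value in events:
        start = rowtime - rowtime % size
        if start in emitted:
            # Late event: its window result has already been emitted.
            if strategy == "drop":
                continue
            state[start] += value
            if strategy == "update":
                # Strategy 2: re-emit the full, corrected window result.
                out.append((start, "update", state[start]))
            else:
                # Strategy 3: emit only the change to the previous result.
                out.append((start, "delta", value))
        else:
            state[start] = state.get(start, 0) + value

        watermark = max(watermark, rowtime)
        # Emit every not-yet-emitted window that the watermark has passed.
        for s in sorted(state):
            if s not in emitted and s + size <= watermark:
                emitted.add(s)
                out.append((s, "result", state[s]))
    return out


events = [(1, 10), (3, 5), (12, 1), (4, 7), (25, 2)]  # (4, 7) arrives late
for strategy in ("drop", "update", "delta"):
    print(strategy, sum_with_late_events(events, 10, strategy))
```

The sketch keeps window state forever; a real system would discard it after some retention period, which is exactly the trade-off mentioned for strategies 2 and 3.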

It would be nice if strategies to handle late-arrivers could be defined in the query.

I think the plans of the Flink community are quite well aligned with your ideas for SQL on Streams.
Should we start by updating / extending the Stream document on the Calcite website to include the new window definitions (TUMBLE, HOP) and a discussion of punctuations/watermarks/time bounds?
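
As a starting point for describing the window definitions, here is a small Python sketch of how a row could be assigned to tumbling and hopping windows. The function names, the half-open `[start, end)` intervals, and the alignment of window starts to multiples of the size or slide are assumptions for illustration; the exact semantics of TUMBLE and HOP are what the document would need to pin down.

```python
def tumble_window(rowtime, size):
    """Return the single tumbling window [start, end) containing rowtime.

    Assumption: windows are aligned to multiples of `size`.
    """
    start = rowtime - rowtime % size
    return (start, start + size)


def hop_windows(rowtime, slide, size):
    """Return all hopping windows [start, end) containing rowtime.

    Assumption: window starts are multiples of `slide` and each window
    covers `size` time units, so windows overlap when size > slide.
    """
    windows = []
    start = rowtime - rowtime % slide  # latest window start <= rowtime
    while start > rowtime - size:      # window still covers rowtime
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


print(tumble_window(125, 60))     # (120, 180)
print(hop_windows(125, 60, 180))  # [(0, 180), (60, 240), (120, 300)]
```

With `slide == size` the hopping case reduces to the tumbling one, which may be a useful sanity check for whatever definitions end up in the document.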

Fabian

2016-02-06 2:35 GMT+01:00 Julian Hyde <jh...@apache.org>:

> Let me rephrase: The *majority* of the literature, of which I cited 
> just one example, calls them punctuation, and a couple of recent 
> papers out of Mountain View don't change that.
>
> There are some fine distinctions between punctuation, heartbeats, 
> watermarks and rowtime bounds, mostly in terms of how they are 
> generated and propagated, that matter little when planning the query.
>
> On Fri, Feb 5, 2016 at 5:18 PM, Ted Dunning <te...@gmail.com> wrote:
> > On Fri, Feb 5, 2016 at 5:10 PM, Julian Hyde <jh...@apache.org> wrote:
> >
> >> Yes, watermarks, absolutely. The "to do" list has "punctuation", 
> >> which is the same thing. (Actually, I prefer to call it "rowtime bound"
> >> because it feels more like a dynamic constraint than a piece of 
> >> data, but the literature[1] calls them punctuation.)
> >>
> >
> > Some of the literature calls them punctuation, other literature [1] 
> > calls them watermarks.
> >
> > [1] http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf
>