Posted to commits@samza.apache.org by "Yi Pan (Data Infrastructure) (JIRA)" <ji...@apache.org> on 2015/05/16 04:04:00 UTC

[jira] [Commented] (SAMZA-552) Implement window operator in Samza

    [ https://issues.apache.org/jira/browse/SAMZA-552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546497#comment-14546497 ] 

Yi Pan (Data Infrastructure) commented on SAMZA-552:
----------------------------------------------------

Merged to samza-sql based on RB review.

> Implement window operator in Samza
> ----------------------------------
>
>                 Key: SAMZA-552
>                 URL: https://issues.apache.org/jira/browse/SAMZA-552
>             Project: Samza
>          Issue Type: New Feature
>          Components: sql
>    Affects Versions: 0.9.0
>            Reporter: Yi Pan (Data Infrastructure)
>            Assignee: Yi Pan (Data Infrastructure)
>              Labels: design, project
>         Attachments: DESIGN-SAMZA-552-3.md, DESIGN-SAMZA-552-3.pdf, DESIGN-SAMZA-552-6.md, DESIGN-SAMZA-552-6.pdf, DESIGN-SAMZA-552-7.md, DESIGN-SAMZA-552-7.pdf, SAMZA-552-0.patch
>
>
> The discussion is about how to support tuple-based and/or time-based window operators in the Samza physical operator layer.
> Here are a few observations:
> # Tuple-based windows reflect the “physical ordering” of events, while time-based windows have semantic meaning to users
> # A total ordering between tuples is possible within Samza/Kafka, given a deterministic MessageSelector on all input streams and the offsets within each stream
> # Whether tuples or time are used to measure the window size, a window termination condition is needed to close a window so that the job is not wedged forever
> The following questions have to be answered to fully implement a window operator:
> # How do we determine that a window is closed and no new tuples will be added?
> ## For tuple-based windows, how do we close the window if messages do not arrive or are delayed?
> ## For time-based windows, how do we close the window if
> ### the messages are not strictly ordered by time?
> ### a message with a timestamp greater than the window boundary does not arrive or is delayed?
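
To make the questions above concrete, here is a minimal Java sketch of one possible termination policy for a time-based window. It is not taken from the attached design documents or from the Samza API: the class name TimeWindowSketch, its methods, and the parameters allowedLatenessMs and idleTimeoutMs are all hypothetical. The sketch assumes a window closes either when a tuple's event time passes the window end plus an allowed lateness (which tolerates out-of-order tuples), or when no tuple has arrived for a wall-clock timeout (which handles stalled or delayed input).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of Samza or the SAMZA-552 design. */
final class TimeWindowSketch {
    private final long startMs;            // inclusive window start (event time)
    private final long endMs;              // exclusive window end (event time)
    private final long allowedLatenessMs;  // assumed tolerance for out-of-order tuples
    private final long idleTimeoutMs;      // assumed wall-clock bound when input stalls
    private final List<Long> timestamps = new ArrayList<>();
    private long maxEventTimeSeen = Long.MIN_VALUE;
    private long lastArrivalWallClockMs;
    private boolean closed = false;

    TimeWindowSketch(long startMs, long endMs, long allowedLatenessMs,
                     long idleTimeoutMs, long nowMs) {
        this.startMs = startMs;
        this.endMs = endMs;
        this.allowedLatenessMs = allowedLatenessMs;
        this.idleTimeoutMs = idleTimeoutMs;
        this.lastArrivalWallClockMs = nowMs;
    }

    /** Offers a tuple; returns true if it was added to this window. */
    boolean offer(long eventTimeMs, long nowMs) {
        if (closed) {
            return false; // a closed window accepts no new tuples
        }
        lastArrivalWallClockMs = nowMs;
        maxEventTimeSeen = Math.max(maxEventTimeSeen, eventTimeMs);
        boolean accepted = false;
        if (eventTimeMs >= startMs && eventTimeMs < endMs) {
            timestamps.add(eventTimeMs); // out-of-order tuples are still accepted
            accepted = true;
        }
        maybeClose(nowMs);
        return accepted;
    }

    /** Intended to be called periodically so a stalled input cannot wedge the window. */
    boolean maybeClose(long nowMs) {
        if (closed) {
            return true;
        }
        // Condition 1: event time has advanced past the end plus allowed lateness.
        boolean eventTimePassed = maxEventTimeSeen != Long.MIN_VALUE
                && maxEventTimeSeen >= endMs + allowedLatenessMs;
        // Condition 2: nothing has arrived for idleTimeoutMs of wall-clock time.
        boolean idleTooLong = nowMs - lastArrivalWallClockMs >= idleTimeoutMs;
        closed = eventTimePassed || idleTooLong;
        return closed;
    }

    boolean isClosed() {
        return closed;
    }

    int size() {
        return timestamps.size();
    }
}
```

The same idle timeout could, under the same assumptions, serve as the termination condition for a tuple-based window whose count is never reached. The trade-off is that a timeout-based close depends on wall-clock time, so the window contents may not be reproducible on replay.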


