You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Stephan Ewen (JIRA)" <ji...@apache.org> on 2017/01/17 13:59:26 UTC
[jira] [Created] (FLINK-5529) Improve / extends windowing
documentation
Stephan Ewen created FLINK-5529:
-----------------------------------
Summary: Improve / extends windowing documentation
Key: FLINK-5529
URL: https://issues.apache.org/jira/browse/FLINK-5529
Project: Flink
Issue Type: Sub-task
Components: Documentation
Reporter: Stephan Ewen
Assignee: Kostas Kloudas
Fix For: 1.2.0, 1.3.0
Suggested Outline:
{code}
Windows
(0) Outline: The anatomy of a window operation
stream
[.keyBy(...)] <- keyed versus non-keyed windows
.window(...) <- required: "assigner"
[.trigger(...)] <- optional: "trigger" (else default trigger)
[.evictor(...)] <- optional: "evictor" (else no evictor)
[.allowedLateness()] <- optional, else zero
.reduce/fold/apply() <- required: "function"
(1) Types of windows
- tumble
- slide
- session
- global
(2) Pre-defined windows
timeWindow() (tumble, slide)
countWindow() (tumble, slide)
- mention that count windows are inherently
resource leaky unless limited key space
(3) Window Functions
- apply: most basic, iterates over elements in window
- aggregating: reduce and fold, can be used with "apply()" which will get one element
- forward reference to state size section
(4) Advanced Windows
- assigner
- simple
- merging
- trigger
- registering timers (processing time, event time)
- state in triggers
- life cycle of a window
- create
- state
- cleanup
- when is window contents purged
- when is state dropped
- when is metadata (like merging set) dropped
(5) Late data
- picture
- fire vs fire_and_purge: late accumulates vs late resurrects (cf discarding mode)
(6) Evictors
- TDB
(7) State size: How large will the state be?
Basic rule: Each element has one copy per window it is assigned to
--> num windows * num elements in window
--> example: tumbline is one copy, sliding(n,m) is n/m copies
--> per key
Pre-aggregation:
- if reduce or fold is set -> one element per window (rather than num elements in window)
- evictor voids pre-aggregation from the perspective of state
Special rules:
- fold cannot pre-aggregate on session windows (and other merging windows)
(8) Non-keyed windows
- all elements through the same windows
- currently not parallel
- possible parallel in the future when having pre-aggregation functions
- inherently (by definition) produce a result stream with parallelism one
- state similar to one key of keyed windows
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)