Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/02 08:30:00 UTC
[jira] [Commented] (FLINK-8780) Add Broadcast State documentation.
[ https://issues.apache.org/jira/browse/FLINK-8780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460708#comment-16460708 ]
ASF GitHub Bot commented on FLINK-8780:
---------------------------------------
Github user tzulitai commented on a diff in the pull request:
https://github.com/apache/flink/pull/5922#discussion_r185417488
--- Diff: docs/dev/stream/state/broadcast_state.md ---
@@ -0,0 +1,281 @@
+---
+title: "The Broadcast State Pattern"
+nav-parent_id: streaming_state
+nav-pos: 2
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+* ToC
+{:toc}
+
+[Working with State](state.html) described operator state which is either **evenly** distributed among the parallel
+tasks of an operator, or state which **upon restore**, its partial (task) states are **unioned** and the whole state is
+used to initialize the restored parallel tasks.
+
+A third type of supported *operator state* is the *Broadcast State*. Broadcast state was introduced to support use-cases
+where some data coming from one stream is required to be broadcasted to all downstream tasks, where it is stored locally
+and is used to process all incoming elements on the other stream. As an example where broadcast state can emerge as a
+natural fit, one can imagine a low-throughput stream containing a set of rules which we want to evaluate against all
+elements coming from another stream. Having the above type of use-cases in mind, broadcast state differs from the rest
+of operator states in that:
+ 1. it has a map format,
+ 2. it is only available to streams whose elements are *broadcasted*,
--- End diff --
This is a bit confusing, at least as far as I understand it.
The broadcast state is available to both the broadcast input and the non-broadcast input, i.e. the broadcast state can be read in both `flatMap1` and `flatMap2` of a co-flat map.
However, it can only be updated from the broadcast input.
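To make the distinction concrete, here is a minimal plain-Java sketch of the access rule as I understand it. It deliberately does not use the Flink API: the class `BroadcastStateSketch`, the `readOnlyView` helper, and the keyword-matching logic are all hypothetical, and the two method names only mirror `processBroadcastElement` / `processElement` from Flink's `BroadcastProcessFunction`. The point is just that both sides see the same map, while only the broadcast side may write to it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the access rule discussed above; NOT the Flink API.
class BroadcastStateSketch {
    // Broadcast state has a map format: here, rule name -> keyword to look for.
    private final Map<String, String> rules = new HashMap<>();

    // Broadcast side: read-write access, so rules can be added or replaced.
    void processBroadcastElement(String ruleName, String keyword) {
        rules.put(ruleName, keyword);
    }

    // Hypothetical helper: the view handed to the non-broadcast side.
    Map<String, String> readOnlyView() {
        return Collections.unmodifiableMap(rules);
    }

    // Non-broadcast side: may read the state, but only through the read-only view.
    List<String> processElement(String event) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, String> rule : readOnlyView().entrySet()) {
            if (event.contains(rule.getValue())) {
                matched.add(rule.getKey());
            }
        }
        Collections.sort(matched); // deterministic order for the caller
        return matched;
    }

    public static void main(String[] args) {
        BroadcastStateSketch sketch = new BroadcastStateSketch();
        sketch.processBroadcastElement("r1", "error");
        System.out.println(sketch.processElement("disk error on node 3"));
    }
}
```

With this model, a write attempted through `readOnlyView()` fails with an `UnsupportedOperationException`, which is the behaviour I would expect the docs to describe for the non-broadcast side.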
> Add Broadcast State documentation.
> ----------------------------------
>
> Key: FLINK-8780
> URL: https://issues.apache.org/jira/browse/FLINK-8780
> Project: Flink
> Issue Type: Bug
> Components: DataStream API, Documentation, Streaming
> Affects Versions: 1.5.0
> Reporter: Kostas Kloudas
> Assignee: Kostas Kloudas
> Priority: Critical
> Fix For: 1.5.0
>
>