Posted to issues@flink.apache.org by tzulitai <gi...@git.apache.org> on 2018/05/02 08:29:14 UTC

[GitHub] flink pull request #5922: [FLINK-8780] [docs] Add Broadcast State documentat...

Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5922#discussion_r185417488
  
    --- Diff: docs/dev/stream/state/broadcast_state.md ---
    @@ -0,0 +1,281 @@
    +---
    +title: "The Broadcast State Pattern"
    +nav-parent_id: streaming_state
    +nav-pos: 2
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements.  See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership.  The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License.  You may obtain a copy of the License at
    +
    +  http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied.  See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +* ToC
    +{:toc}
    +
    +[Working with State](state.html) described operator state which is either **evenly** distributed among the parallel
    +tasks of an operator, or state which **upon restore**, its partial (task) states are **unioned** and the whole state is 
    +used to initialize the restored parallel tasks.
    +
    +A third type of supported *operator state* is the *Broadcast State*. Broadcast state was introduced to support use-cases
    +where some data coming from one stream is required to be broadcasted to all downstream tasks, where it is stored locally
    +and is used to process all incoming elements on the other stream. As an example where broadcast state can emerge as a 
    +natural fit, one can imagine a low-throughput stream containing a set of rules which we want to evaluate against all 
    +elements coming from another stream. Having the above type of use-cases in mind, broadcast state differs from the rest 
    +of operator states in that:
    + 1. it has a map format,
    + 2. it is only available to streams whose elements are *broadcasted*,
    --- End diff --
    
    This is a bit confusing, at least as far as I understand it.
    
    The broadcast state is available to both the broadcast input and the non-broadcast input, i.e. it can be read in both `flatMap1` and `flatMap2` of a co-flat map.
    However, it can only be updated from the broadcast input.
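
    To make the distinction concrete, here is a minimal plain-Java sketch of the access rules as I understand them. It is not Flink API: the class `BroadcastStateModel` and its methods are hypothetical names made up for illustration, and the read-only view is modelled with `Collections.unmodifiableMap`. In Flink itself, I believe the equivalent split is the read-only context handed to `processElement` versus the writable context handed to `processBroadcastElement` of a `BroadcastProcessFunction`, but treat that mapping as an assumption.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical plain-Java model of the access rules; this is NOT Flink API. */
public class BroadcastStateModel {

    // Map-shaped state, matching point 1 of the quoted documentation.
    private final Map<String, String> rules = new HashMap<>();

    /** Broadcast input: has read-write access, so it may update the state. */
    public void onBroadcastElement(String ruleName, String rule) {
        rules.put(ruleName, rule);
    }

    /** Non-broadcast input: sees the same state, but only through a read-only view. */
    public String onDataElement(String ruleName) {
        Map<String, String> readOnlyView = Collections.unmodifiableMap(rules);
        return readOnlyView.get(ruleName);
    }

    /** Returns true if a write attempted through the read-only view is rejected. */
    public boolean writeFromDataSideIsRejected() {
        Map<String, String> readOnlyView = Collections.unmodifiableMap(rules);
        try {
            readOnlyView.put("rule-x", "anything");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        BroadcastStateModel model = new BroadcastStateModel();
        model.onBroadcastElement("rule-1", "amount > 100");
        System.out.println("read on data side: " + model.onDataElement("rule-1"));
        System.out.println("write on data side rejected: " + model.writeFromDataSideIsRejected());
    }
}
```

    If that matches the intended semantics, point 2 could say that the state is readable from both inputs but writable only from the broadcast one.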


---