Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/03 09:34:00 UTC
[jira] [Commented] (FLINK-7301) Rework state documentation
[ https://issues.apache.org/jira/browse/FLINK-7301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112487#comment-16112487 ]
ASF GitHub Bot commented on FLINK-7301:
---------------------------------------
Github user alpinegizmo commented on a diff in the pull request:
https://github.com/apache/flink/pull/4441#discussion_r131087040
--- Diff: docs/dev/stream/state/index.md ---
@@ -0,0 +1,56 @@
+---
+title: "State & Fault Tolerance"
+nav-id: streaming_state
+nav-title: "State & Fault Tolerance"
+nav-parent_id: streaming
+nav-pos: 3
+nav-show_overview: true
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Stateful functions and operators store data across the processing of individual elements/events, making state a critical building block for
+any type of more elaborate operation.
+
+For example:
+
+ - When an application searches for certain event patterns, the state will store the sequence of events encountered so far.
+ - When aggregating events per minute/hour/day, the state holds the pending aggregates.
+ - When training a machine learning model over a stream of data points, the state holds the current version of the model parameters.
+ - When historic data needs to be managed, the state allows efficient access to events that occurred in the past.
+
+Flink needs to be aware of the state in order to make state fault tolerant using [checkpoints](checkpointing.html) and allow [savepoints]({{ site.baseurl }}/ops/state/savepoints.html) of streaming applications.
--- End diff --
"and to allow [savepoints]"
> Rework state documentation
> --------------------------
>
> Key: FLINK-7301
> URL: https://issues.apache.org/jira/browse/FLINK-7301
> Project: Flink
> Issue Type: Improvement
> Components: Documentation
> Reporter: Timo Walther
> Assignee: Timo Walther
>
> The documentation about state is spread across different pages, but this is not consistent and it is hard to find what you need. I propose:
> Mention State Backends and link to them in "Streaming/Working with State".
> Create category "State & Fault Tolerance" under "Streaming". Move "Working with State", "Checkpointing" and "Queryable State".
> Move API related parts (90%) of "Deployment/State & Fault Tolerance/State Backends" to "Streaming/State & Fault Tolerance/State Backends".
> Move all tuning things from "Debugging/Large State" to "Deployment/State & Fault Tolerance/State Backends".
> Move "Streaming/Working with State/Custom Serialization for Managed State" to "Streaming/State & Fault Tolerance/Custom Serialization" (Add a link from previous position, also link from "Data Types & Serialization").