Posted to commits@flink.apache.org by mj...@apache.org on 2015/10/01 14:08:56 UTC

flink git commit: [doc] fixed typos in "Internals -> Fault Tolerance for Data Streaming"

Repository: flink
Updated Branches:
  refs/heads/master 846ad7064 -> bbd97354b


[doc] fixed typos in "Internals -> Fault Tolerance for Data Streaming"


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/bbd97354
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/bbd97354
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/bbd97354

Branch: refs/heads/master
Commit: bbd97354b8e681dd68f8ad7528eef433227c5c89
Parents: 846ad70
Author: mjsax <mj...@apache.org>
Authored: Thu Oct 1 14:06:49 2015 +0200
Committer: mjsax <mj...@apache.org>
Committed: Thu Oct 1 14:08:08 2015 +0200

----------------------------------------------------------------------
 docs/internals/stream_checkpointing.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/bbd97354/docs/internals/stream_checkpointing.md
----------------------------------------------------------------------
diff --git a/docs/internals/stream_checkpointing.md b/docs/internals/stream_checkpointing.md
index 27eae6b..1c8f74f 100644
--- a/docs/internals/stream_checkpointing.md
+++ b/docs/internals/stream_checkpointing.md
@@ -20,7 +20,7 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-This document describes Flink' fault tolerance mechanism for streaming data flows.
+This document describes Flink's fault tolerance mechanism for streaming data flows.
 
 * This will be replaced by the TOC
 {:toc}
@@ -87,9 +87,9 @@ their descendant records) have passed through the entire data flow topology.
   <img src="{{ site.baseurl }}/internals/fig/stream_aligning.svg" alt="Aligning data streams at operators with multiple inputs" style="width:100%; padding-top:10px; padding-bottom:10px;" />
 </div>
 
-Operators that receive more than one input stream need to *align* the input streams on the snapshot barriers. The figure above illutrates this:
+Operators that receive more than one input stream need to *align* the input streams on the snapshot barriers. The figure above illustrates this:
 
-  - As soon as the operator received snapshot barrier *n* from an incoming stream, it cannot process any further records from that stream until it has received the
+  - As soon as the operator received snapshot barrier *n* from an incoming stream, it cannot process any further records from that stream until it has received
 the barrier *n* from the other inputs as well. Otherwise, it would have mixed records that belong to snapshot *n* and with records that belong to snapshot *n+1*.
   - Streams that report barrier *n* are temporarily set aside. Records that are received from these streams are not processed, but put into an input buffer.
   - Once the last stream has received barrier *n*, the operator emits all pending outgoing records, and then emits snapshot *n* barriers itself.
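
To make the alignment bookkeeping concrete, here is a purely illustrative sketch. `BarrierAligner`, `onBarrier`, and `onRecord` are not Flink classes or methods; the code only mirrors the block-and-buffer steps listed above for a single operator, assuming single-threaded access and string records.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch of barrier alignment; not Flink's actual implementation.
public class BarrierAligner {

    private final int numInputChannels;
    private final Set<Integer> blockedChannels = new HashSet<>();
    private final Queue<String> setAsideRecords = new ArrayDeque<>();

    public BarrierAligner(int numInputChannels) {
        this.numInputChannels = numInputChannels;
    }

    /** Barrier n arrived on a channel; returns true once all channels have delivered it. */
    public boolean onBarrier(int channel) {
        blockedChannels.add(channel);
        if (blockedChannels.size() == numInputChannels) {
            // Alignment complete: the operator would now snapshot its state,
            // emit barrier n downstream, and then drain the set-aside records.
            blockedChannels.clear();
            while (!setAsideRecords.isEmpty()) {
                process(setAsideRecords.poll());
            }
            return true;
        }
        return false;
    }

    /** A regular record arrived; records from already-blocked channels are set aside. */
    public void onRecord(int channel, String record) {
        if (blockedChannels.contains(channel)) {
            setAsideRecords.add(record);   // arrived after barrier n, so it belongs to snapshot n+1
        } else {
            process(record);               // still belongs to snapshot n, processed immediately
        }
    }

    private void process(String record) {
        System.out.println("processing " + record);
    }
}
```
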
@@ -103,7 +103,7 @@ When operators contain any form of *state*, this state must be part of the snaps
   - *User-defined state*: This is state that is created and modified directly by the transformation functions (like `map()` or `filter()`). User-defined state can either be a simple variable in the function's java object, or the associated key/value state of a function (see [State in Streaming Applications]({{ site.baseurl }}/apis/streaming_guide.html#stateful-computation) for details).
   - *System state*: This state refers to data buffers that are part of the operator's computation. A typical example for this state are the *window buffers*, inside which the system collects (and aggregates) records for windows until the window is evaluated and evicted.
 
-Operators snapshot their state at the point in time when they received all snapshot barriers from their input streams, before emitting the barriers to their output streams. At that point, all updates to the state from records before the barriers will have been made, and no updates that depend on records from after the barriers have been applied. Because the state of a snapshot may be potentially large, it is stored in a configurable *state backend*. By default, this is the JobManager's memory, but for serious setups, a distributed reliable storage should be configured (such as HDFS). After the state has been stored, the operator acknowledges the checkpoint, emity the snapshot barrier into the output streams, and proceeds.
+Operators snapshot their state at the point in time when they received all snapshot barriers from their input streams, before emitting the barriers to their output streams. At that point, all updates to the state from records before the barriers will have been made, and no updates that depend on records from after the barriers have been applied. Because the state of a snapshot may be potentially large, it is stored in a configurable *state backend*. By default, this is the JobManager's memory, but for serious setups, a distributed reliable storage should be configured (such as HDFS). After the state has been stored, the operator acknowledges the checkpoint, emits the snapshot barrier into the output streams, and proceeds.
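
As a rough illustration of choosing a non-default state backend programmatically, the snippet below points checkpoint state at a distributed file system. `FsStateBackend` and `setStateBackend` exist in Flink, but their packages and signatures have moved between releases, and the HDFS URI is a placeholder, so treat the exact names as assumptions and check the docs of the version in use.

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StateBackendExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Store operator snapshots in a reliable file system (e.g. HDFS) instead of
        // the default JobManager memory; the URI below is only a placeholder.
        env.setStateBackend(new FsStateBackend("hdfs://namenode:9000/flink/checkpoints"));
    }
}
```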
 
 The resulting snapshot now contains:
 
@@ -118,16 +118,16 @@ The resulting snapshot now contains:
 ### Exactly Once vs. At Least Once
 
 The alignment step may add latency to the streaming program. Usually, this extra latency is in the order of a few milliseconds, but we have seen cases where the latency
-of some outliers increased noticeably. For applications that require consistenty super low latencies (few milliseconds) for all records, Flink has a switch to skip the 
+of some outliers increased noticeably. For applications that require consistently super low latencies (few milliseconds) for all records, Flink has a switch to skip the
 stream alignment during a checkpoint. Checkpoint snapshots are still drawn as soon as an operator has seen the checkpoint barrier from each input.
 
 When the alignment is skipped, an operator keeps processing all inputs, even after some checkpoint barriers for checkpoint *n* arrived. That way, the operator also processes
 elements that belong to checkpoint *n+1* before the state snapshot for checkpoint *n* was taken.
 On a restore, these records will occur as duplicates, because they are both included in the state snapshot of checkpoint *n*, and will be replayed as part
-of the data after checkoint *n*.
+of the data after checkpoint *n*.
 
-*NOTE*: Alignment happens only for operators wih multiple predecessors (joins) as well as operators with multiple senders (after a stream repartitionging/shuffle).
-Because of that, dataflows with only embarassingly parallel streaming operations (`map()`, `flatMap()`, `filter()`, ...) actually give *exactly once* guarantees even
+*NOTE*: Alignment happens only for operators with multiple predecessors (joins) as well as operators with multiple senders (after a stream repartitioning/shuffle).
+Because of that, dataflows with only embarrassingly parallel streaming operations (`map()`, `flatMap()`, `filter()`, ...) actually give *exactly once* guarantees even
 in *at least once* mode.
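
The switch between the two modes is exposed when enabling checkpointing. A minimal sketch follows; the two-argument `enableCheckpointing` overload and `CheckpointingMode` are part of the DataStream API, though availability depends on the Flink version, and the 5000 ms interval is just an example value.

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AtLeastOnceExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 5 seconds without barrier alignment: lower latency,
        // but records may be replayed (at least once) after a restore.
        env.enableCheckpointing(5000, CheckpointingMode.AT_LEAST_ONCE);
    }
}
```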
 
 <!--