Posted to commits@spark.apache.org by sr...@apache.org on 2015/06/09 13:19:53 UTC

spark git commit: [STREAMING] [DOC] Remove duplicated description about WAL

Repository: spark
Updated Branches:
  refs/heads/master 1b499993a -> e6fb6cedf


[STREAMING] [DOC] Remove duplicated description about WAL

I noticed a duplicated description of the WAL in the Kafka integration guide:

```
To ensure zero-data loss, you have to additionally enable Write Ahead Logs in Spark Streaming. To ensure zero data loss, enable the Write Ahead Logs (introduced in Spark 1.2).
```

Let's remove the duplication.

I didn't file a JIRA issue because this is minor.

Author: Kousuke Saruta <sa...@oss.nttdata.co.jp>

Closes #6719 from sarutak/remove-multiple-description and squashes the following commits:

cc9bb21 [Kousuke Saruta] Removed duplicated description about WAL


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e6fb6ced
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e6fb6ced
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e6fb6ced

Branch: refs/heads/master
Commit: e6fb6cedf3ecbde6f01d4753d7d05d0c52827fce
Parents: 1b49999
Author: Kousuke Saruta <sa...@oss.nttdata.co.jp>
Authored: Tue Jun 9 12:19:01 2015 +0100
Committer: Sean Owen <so...@cloudera.com>
Committed: Tue Jun 9 12:19:01 2015 +0100

----------------------------------------------------------------------
 docs/streaming-kafka-integration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/e6fb6ced/docs/streaming-kafka-integration.md
----------------------------------------------------------------------
diff --git a/docs/streaming-kafka-integration.md b/docs/streaming-kafka-integration.md
index d6d5605..998c8c9 100644
--- a/docs/streaming-kafka-integration.md
+++ b/docs/streaming-kafka-integration.md
@@ -7,7 +7,7 @@ title: Spark Streaming + Kafka Integration Guide
 ## Approach 1: Receiver-based Approach
 This approach uses a Receiver to receive the data. The Receiver is implemented using the Kafka high-level consumer API. As with all receivers, the data received from Kafka through a Receiver is stored in Spark executors, and then jobs launched by Spark Streaming process the data.
 
-However, under default configuration, this approach can lose data under failures (see [receiver reliability](streaming-programming-guide.html#receiver-reliability). To ensure zero-data loss, you have to additionally enable Write Ahead Logs in Spark Streaming. To ensure zero data loss, enable the Write Ahead Logs (introduced in Spark 1.2). This synchronously saves all the received Kafka data into write ahead logs on a distributed file system (e.g HDFS), so that all the data can be recovered on failure. See [Deploying section](streaming-programming-guide.html#deploying-applications) in the streaming programming guide for more details on Write Ahead Logs.
+However, under the default configuration, this approach can lose data under failures (see [receiver reliability](streaming-programming-guide.html#receiver-reliability)). To ensure zero data loss, you have to additionally enable Write Ahead Logs in Spark Streaming (introduced in Spark 1.2). This synchronously saves all the received Kafka data into write ahead logs on a distributed file system (e.g., HDFS), so that all the data can be recovered on failure. See the [Deploying section](streaming-programming-guide.html#deploying-applications) in the streaming programming guide for more details on Write Ahead Logs.
 
 Next, we discuss how to use this approach in your streaming application.
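
For reference, enabling the WAL described above comes down to one configuration flag plus a checkpoint directory on a fault-tolerant file system. Below is a minimal Scala sketch (not part of this commit); the ZooKeeper quorum, consumer group, topic name, and checkpoint path are placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf()
  .setAppName("KafkaWALSketch")
  // Synchronously persist all received data to the write ahead log (Spark 1.2+).
  .set("spark.streaming.receiver.writeAheadLog.enable", "true")

val ssc = new StreamingContext(conf, Seconds(10))
// The WAL is written under the checkpoint directory, so it must live on a
// fault-tolerant file system such as HDFS (placeholder path below).
ssc.checkpoint("hdfs://namenode:8020/spark/checkpoints")

// Receiver-based stream; with the WAL enabled, a non-replicated storage
// level is sufficient, since the log already provides durability.
val stream = KafkaUtils.createStream(
  ssc, "zkhost:2181", "wal-example-group", Map("mytopic" -> 1),
  StorageLevel.MEMORY_AND_DISK_SER)

stream.map(_._2).count().print()  // e.g., count records per batch

ssc.start()
ssc.awaitTermination()
```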
 


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org