Posted to commits@spark.apache.org by pw...@apache.org on 2014/01/13 04:49:52 UTC

[1/3] git commit: Rename DStream.foreach to DStream.foreachRDD

Updated Branches:
  refs/heads/master 074f50232 -> 28a6b0cdb


Rename DStream.foreach to DStream.foreachRDD

`foreachRDD` makes it clear that the granularity of this operator is per-RDD.
As it stands, `foreach` is inconsistent with `map`, `filter`, and the other
DStream operators which get pushed down to individual records within each RDD.
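
An illustrative sketch (editorial, not part of this commit) of the granularity difference; the socket source and one-second batch interval are arbitrary:

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext("local[2]", "GranularitySketch", Seconds(1))
    val lines = ssc.socketTextStream("localhost", 9999)

    // map and filter run per record, inside every RDD of the stream:
    val words = lines.flatMap(_.split(" ")).filter(_.nonEmpty)

    // foreachRDD runs once per RDD, i.e. once per batch interval:
    words.foreachRDD(rdd => println("records in this batch: " + rdd.count()))

    ssc.start()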


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/f4d77f8c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/f4d77f8c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/f4d77f8c

Branch: refs/heads/master
Commit: f4d77f8cb8a9eab43bea35e8e6c9bc0d2c2b53a8
Parents: 288a878
Author: Patrick Wendell <pw...@gmail.com>
Authored: Sat Jan 11 10:50:14 2014 -0800
Committer: Patrick Wendell <pw...@gmail.com>
Committed: Sun Jan 12 17:21:00 2014 -0800

----------------------------------------------------------------------
 docs/streaming-programming-guide.md                       |  4 ++--
 .../apache/spark/streaming/examples/RawNetworkGrep.scala  |  2 +-
 .../spark/streaming/examples/TwitterAlgebirdCMS.scala     |  4 ++--
 .../spark/streaming/examples/TwitterAlgebirdHLL.scala     |  4 ++--
 .../spark/streaming/examples/TwitterPopularTags.scala     |  4 ++--
 .../streaming/examples/clickstream/PageViewStream.scala   |  2 +-
 .../main/scala/org/apache/spark/streaming/DStream.scala   | 10 +++++-----
 .../org/apache/spark/streaming/PairDStreamFunctions.scala |  4 ++--
 .../apache/spark/streaming/api/java/JavaDStreamLike.scala |  8 ++++----
 .../org/apache/spark/streaming/BasicOperationsSuite.scala |  2 +-
 10 files changed, 22 insertions(+), 22 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/docs/streaming-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/streaming-programming-guide.md b/docs/streaming-programming-guide.md
index 1c9ece6..3273817 100644
--- a/docs/streaming-programming-guide.md
+++ b/docs/streaming-programming-guide.md
@@ -175,7 +175,7 @@ When an output operator is called, it triggers the computation of a stream. Curr
 <table class="table">
 <tr><th style="width:30%">Operator</th><th>Meaning</th></tr>
 <tr>
-  <td> <b>foreach</b>(<i>func</i>) </td>
+  <td> <b>foreachRDD</b>(<i>func</i>) </td>
   <td> The fundamental output operator. Applies a function, <i>func</i>, to each RDD generated from the stream. This function should have side effects, such as printing output, saving the RDD to external files, or writing it over the network to an external system. </td>
 </tr>
 
@@ -375,7 +375,7 @@ There are two failure behaviors based on which input sources are used.
 1. _Using HDFS files as input source_ - Since the data is reliably stored on HDFS, all data can be re-computed and therefore no data will be lost due to any failure.
 1. _Using any input source that receives data through a network_ - For network-based data sources like Kafka and Flume, the received input data is replicated in memory between nodes of the cluster (default replication factor is 2). So if a worker node fails, then the system can recompute the lost data from the left-over copy of the input data. However, if the worker node where a network receiver was running fails, then a tiny bit of data may be lost, that is, the data received by the system but not yet replicated to other node(s). The receiver will be started on a different node and it will continue to receive data.
 
-Since all data is modeled as RDDs with their lineage of deterministic operations, any recomputation always leads to the same result. As a result, all DStream transformations are guaranteed to have _exactly-once_ semantics. That is, the final transformed result will be the same even if there was a worker node failure. However, output operations (like `foreach`) have _at-least-once_ semantics, that is, the transformed data may get written to an external entity more than once in the event of a worker failure. While this is acceptable for saving to HDFS using the `saveAs*Files` operations (as the file will simply get over-written by the same data), additional transaction-like mechanisms may be necessary to achieve exactly-once semantics for output operations.
+Since all data is modeled as RDDs with their lineage of deterministic operations, any recomputation always leads to the same result. As a result, all DStream transformations are guaranteed to have _exactly-once_ semantics. That is, the final transformed result will be the same even if there was a worker node failure. However, output operations (like `foreachRDD`) have _at-least-once_ semantics, that is, the transformed data may get written to an external entity more than once in the event of a worker failure. While this is acceptable for saving to HDFS using the `saveAs*Files` operations (as the file will simply get over-written by the same data), additional transaction-like mechanisms may be necessary to achieve exactly-once semantics for output operations.
 
 ## Failure of the Driver Node
 A system that is required to operate 24/7 needs to be able to tolerate the failure of the driver node as well. Spark Streaming does this by periodically saving the state of the DStream computation to an HDFS file, which can be used to restart the streaming computation in the event of a failure of the driver node. This checkpointing is enabled by setting an HDFS directory for checkpointing using `ssc.checkpoint(<checkpoint directory>)` as described [earlier](#rdd-checkpointing-within-dstreams). To elaborate, the following state is periodically saved to a file.
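
(Editorial aside, not part of the diff: one transaction-like mechanism the paragraph above alludes to is an idempotent write keyed by batch time, so a replayed batch overwrites rather than duplicates. `ExternalStore` is a hypothetical sink, not a Spark API, and `transformed` stands for any DStream[String].)

    // Replay-safe output: upsert keyed by batch time, so recomputing a
    // batch rewrites the same rows instead of appending duplicates.
    object ExternalStore {
      private val table = scala.collection.concurrent.TrieMap[Long, Seq[String]]()
      def upsert(batchTime: Long, rows: Seq[String]): Unit = table(batchTime) = rows
    }

    transformed.foreachRDD((rdd, time) => {
      // collect() assumes small per-batch output.
      ExternalStore.upsert(time.milliseconds, rdd.collect().toSeq)
    })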

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/examples/src/main/scala/org/apache/spark/streaming/examples/RawNetworkGrep.scala
----------------------------------------------------------------------
diff --git a/examples/src/main/scala/org/apache/spark/streaming/examples/RawNetworkGrep.scala b/examples/src/main/scala/org/apache/spark/streaming/examples/RawNetworkGrep.scala
index 3d08d86..99b79c3 100644
--- a/examples/src/main/scala/org/apache/spark/streaming/examples/RawNetworkGrep.scala
+++ b/examples/src/main/scala/org/apache/spark/streaming/examples/RawNetworkGrep.scala
@@ -58,7 +58,7 @@ object RawNetworkGrep {
     val rawStreams = (1 to numStreams).map(_ =>
       ssc.rawSocketStream[String](host, port, StorageLevel.MEMORY_ONLY_SER_2)).toArray
     val union = ssc.union(rawStreams)
-    union.filter(_.contains("the")).count().foreach(r =>
+    union.filter(_.contains("the")).count().foreachRDD(r =>
       println("Grep count: " + r.collect().mkString))
     ssc.start()
   }

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdCMS.scala
----------------------------------------------------------------------
diff --git a/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdCMS.scala b/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdCMS.scala
index 80b5a98..483c4d3 100644
--- a/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdCMS.scala
+++ b/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdCMS.scala
@@ -81,7 +81,7 @@ object TwitterAlgebirdCMS {
     val exactTopUsers = users.map(id => (id, 1))
       .reduceByKey((a, b) => a + b)
 
-    approxTopUsers.foreach(rdd => {
+    approxTopUsers.foreachRDD(rdd => {
       if (rdd.count() != 0) {
         val partial = rdd.first()
         val partialTopK = partial.heavyHitters.map(id =>
@@ -96,7 +96,7 @@ object TwitterAlgebirdCMS {
       }
     })
 
-    exactTopUsers.foreach(rdd => {
+    exactTopUsers.foreachRDD(rdd => {
       if (rdd.count() != 0) {
         val partialMap = rdd.collect().toMap
         val partialTopK = rdd.map(

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdHLL.scala
----------------------------------------------------------------------
diff --git a/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdHLL.scala b/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdHLL.scala
index cb2f2c5..94c2bf2 100644
--- a/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdHLL.scala
+++ b/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterAlgebirdHLL.scala
@@ -67,7 +67,7 @@ object TwitterAlgebirdHLL {
 
     val exactUsers = users.map(id => Set(id)).reduce(_ ++ _)
 
-    approxUsers.foreach(rdd => {
+    approxUsers.foreachRDD(rdd => {
       if (rdd.count() != 0) {
         val partial = rdd.first()
         globalHll += partial
@@ -76,7 +76,7 @@ object TwitterAlgebirdHLL {
       }
     })
 
-    exactUsers.foreach(rdd => {
+    exactUsers.foreachRDD(rdd => {
       if (rdd.count() != 0) {
         val partial = rdd.first()
         userSet ++= partial

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterPopularTags.scala
----------------------------------------------------------------------
diff --git a/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterPopularTags.scala b/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterPopularTags.scala
index 16c10fe..8a70d4a 100644
--- a/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterPopularTags.scala
+++ b/examples/src/main/scala/org/apache/spark/streaming/examples/TwitterPopularTags.scala
@@ -56,13 +56,13 @@ object TwitterPopularTags {
 
 
     // Print popular hashtags
-    topCounts60.foreach(rdd => {
+    topCounts60.foreachRDD(rdd => {
       val topList = rdd.take(5)
       println("\nPopular topics in last 60 seconds (%s total):".format(rdd.count()))
       topList.foreach{case (count, tag) => println("%s (%s tweets)".format(tag, count))}
     })
 
-    topCounts10.foreach(rdd => {
+    topCounts10.foreachRDD(rdd => {
       val topList = rdd.take(5)
       println("\nPopular topics in last 10 seconds (%s total):".format(rdd.count()))
       topList.foreach{case (count, tag) => println("%s (%s tweets)".format(tag, count))}

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/examples/src/main/scala/org/apache/spark/streaming/examples/clickstream/PageViewStream.scala
----------------------------------------------------------------------
diff --git a/examples/src/main/scala/org/apache/spark/streaming/examples/clickstream/PageViewStream.scala b/examples/src/main/scala/org/apache/spark/streaming/examples/clickstream/PageViewStream.scala
index da6b67b..bb44bc3 100644
--- a/examples/src/main/scala/org/apache/spark/streaming/examples/clickstream/PageViewStream.scala
+++ b/examples/src/main/scala/org/apache/spark/streaming/examples/clickstream/PageViewStream.scala
@@ -91,7 +91,7 @@ object PageViewStream {
       case "popularUsersSeen" =>
         // Look for users in our existing dataset and print it out if we have a match
         pageViews.map(view => (view.userID, 1))
-          .foreach((rdd, time) => rdd.join(userList)
+          .foreachRDD((rdd, time) => rdd.join(userList)
             .map(_._2._2)
             .take(10)
             .foreach(u => println("Saw user %s at time %s".format(u, time))))

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
----------------------------------------------------------------------
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala b/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
index b98f4a5..93d57db 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
@@ -487,15 +487,15 @@ abstract class DStream[T: ClassTag] (
    * Apply a function to each RDD in this DStream. This is an output operator, so
    * 'this' DStream will be registered as an output stream and therefore materialized.
    */
-  def foreach(foreachFunc: RDD[T] => Unit) {
-    this.foreach((r: RDD[T], t: Time) => foreachFunc(r))
+  def foreachRDD(foreachFunc: RDD[T] => Unit) {
+    this.foreachRDD((r: RDD[T], t: Time) => foreachFunc(r))
   }
 
   /**
    * Apply a function to each RDD in this DStream. This is an output operator, so
    * 'this' DStream will be registered as an output stream and therefore materialized.
    */
-  def foreach(foreachFunc: (RDD[T], Time) => Unit) {
+  def foreachRDD(foreachFunc: (RDD[T], Time) => Unit) {
     ssc.registerOutputStream(new ForEachDStream(this, context.sparkContext.clean(foreachFunc)))
   }
 
@@ -719,7 +719,7 @@ abstract class DStream[T: ClassTag] (
       val file = rddToFileName(prefix, suffix, time)
       rdd.saveAsObjectFile(file)
     }
-    this.foreach(saveFunc)
+    this.foreachRDD(saveFunc)
   }
 
   /**
@@ -732,7 +732,7 @@ abstract class DStream[T: ClassTag] (
       val file = rddToFileName(prefix, suffix, time)
       rdd.saveAsTextFile(file)
     }
-    this.foreach(saveFunc)
+    this.foreachRDD(saveFunc)
   }
 
   def register() {

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/streaming/src/main/scala/org/apache/spark/streaming/PairDStreamFunctions.scala
----------------------------------------------------------------------
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/PairDStreamFunctions.scala b/streaming/src/main/scala/org/apache/spark/streaming/PairDStreamFunctions.scala
index 56dbcbd..69d80c3 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/PairDStreamFunctions.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/PairDStreamFunctions.scala
@@ -582,7 +582,7 @@ extends Serializable {
       val file = rddToFileName(prefix, suffix, time)
       rdd.saveAsHadoopFile(file, keyClass, valueClass, outputFormatClass, conf)
     }
-    self.foreach(saveFunc)
+    self.foreachRDD(saveFunc)
   }
 
   /**
@@ -612,7 +612,7 @@ extends Serializable {
       val file = rddToFileName(prefix, suffix, time)
       rdd.saveAsNewAPIHadoopFile(file, keyClass, valueClass, outputFormatClass, conf)
     }
-    self.foreach(saveFunc)
+    self.foreachRDD(saveFunc)
   }
 
   private def getKeyClass() = implicitly[ClassTag[K]].runtimeClass

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
----------------------------------------------------------------------
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala b/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
index 64f38ce..4b5d5ec 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
@@ -244,16 +244,16 @@ trait JavaDStreamLike[T, This <: JavaDStreamLike[T, This, R], R <: JavaRDDLike[T
    * Apply a function to each RDD in this DStream. This is an output operator, so
    * 'this' DStream will be registered as an output stream and therefore materialized.
    */
-  def foreach(foreachFunc: JFunction[R, Void]) {
-    dstream.foreach(rdd => foreachFunc.call(wrapRDD(rdd)))
+  def foreachRDD(foreachFunc: JFunction[R, Void]) {
+    dstream.foreachRDD(rdd => foreachFunc.call(wrapRDD(rdd)))
   }
 
   /**
    * Apply a function to each RDD in this DStream. This is an output operator, so
    * 'this' DStream will be registered as an output stream and therefore materialized.
    */
-  def foreach(foreachFunc: JFunction2[R, Time, Void]) {
-    dstream.foreach((rdd, time) => foreachFunc.call(wrapRDD(rdd), time))
+  def foreachRDD(foreachFunc: JFunction2[R, Time, Void]) {
+    dstream.foreachRDD((rdd, time) => foreachFunc.call(wrapRDD(rdd), time))
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/f4d77f8c/streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
----------------------------------------------------------------------
diff --git a/streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala b/streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
index ee6b433..9a187ce 100644
--- a/streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
+++ b/streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala
@@ -383,7 +383,7 @@ class BasicOperationsSuite extends TestSuiteBase {
     val input = Seq(Seq(1), Seq(2), Seq(3), Seq(4))
     val stream = new TestInputStream[Int](ssc, input, 2)
     ssc.registerInputStream(stream)
-    stream.foreach(_ => {})  // Dummy output stream
+    stream.foreachRDD(_ => {})  // Dummy output stream
     ssc.start()
     Thread.sleep(2000)
     def getInputFromSlice(fromMillis: Long, toMillis: Long) = {


[2/3] git commit: Adding deprecated versions of old code

Posted by pw...@apache.org.
Adding deprecated versions of old code
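
A self-contained sketch of the pattern this commit applies: the old name remains as a thin @deprecated delegate to the new one, so existing callers keep compiling, with a warning. Types are simplified here; the real methods take RDD-valued functions:

    object DeprecationSketch {
      class Stream {
        @deprecated("use foreachRDD", "0.9.0")
        def foreach(f: Int => Unit): Unit = foreachRDD(f)  // old name delegates

        def foreachRDD(f: Int => Unit): Unit = f(42)       // canonical name
      }

      def main(args: Array[String]): Unit = {
        val s = new Stream
        s.foreach(println)     // still compiles; emits a deprecation warning
        s.foreachRDD(println)  // preferred going forward
      }
    }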


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/e6e20cee
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/e6e20cee
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/e6e20cee

Branch: refs/heads/master
Commit: e6e20ceee0f7edc161be611ea903c0e1609f9069
Parents: f4d77f8
Author: Patrick Wendell <pw...@gmail.com>
Authored: Sun Jan 12 18:54:03 2014 -0800
Committer: Patrick Wendell <pw...@gmail.com>
Committed: Sun Jan 12 18:54:03 2014 -0800

----------------------------------------------------------------------
 .../org/apache/spark/streaming/DStream.scala    | 31 +++++++++++++++-----
 .../streaming/api/java/JavaDStreamLike.scala    | 22 ++++++++++++++
 2 files changed, 45 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/e6e20cee/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
----------------------------------------------------------------------
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala b/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
index 93d57db..9432a70 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/DStream.scala
@@ -17,19 +17,20 @@
 
 package org.apache.spark.streaming
 
-import StreamingContext._
-import org.apache.spark.streaming.dstream._
-import org.apache.spark.streaming.scheduler.Job
-import org.apache.spark.Logging
-import org.apache.spark.rdd.RDD
-import org.apache.spark.storage.StorageLevel
-import org.apache.spark.util.MetadataCleaner
+import java.io.{IOException, ObjectInputStream, ObjectOutputStream}
 
+import scala.deprecated
 import scala.collection.mutable.HashMap
 import scala.reflect.ClassTag
 
-import java.io.{ObjectInputStream, IOException, ObjectOutputStream}
+import StreamingContext._
 
+import org.apache.spark.Logging
+import org.apache.spark.rdd.RDD
+import org.apache.spark.storage.StorageLevel
+import org.apache.spark.streaming.dstream._
+import org.apache.spark.streaming.scheduler.Job
+import org.apache.spark.util.MetadataCleaner
 
 /**
  * A Discretized Stream (DStream), the basic abstraction in Spark Streaming, is a continuous
@@ -487,6 +488,20 @@ abstract class DStream[T: ClassTag] (
    * Apply a function to each RDD in this DStream. This is an output operator, so
    * 'this' DStream will be registered as an output stream and therefore materialized.
    */
+  @deprecated("use foreachRDD", "0.9.0")
+  def foreach(foreachFunc: RDD[T] => Unit) = this.foreachRDD(foreachFunc)
+
+  /**
+   * Apply a function to each RDD in this DStream. This is an output operator, so
+   * 'this' DStream will be registered as an output stream and therefore materialized.
+   */
+  @deprecated("use foreachRDD", "0.9.0")
+  def foreach(foreachFunc: (RDD[T], Time) => Unit) = this.foreachRDD(foreachFunc)
+
+  /**
+   * Apply a function to each RDD in this DStream. This is an output operator, so
+   * 'this' DStream will be registered as an output stream and therefore materialized.
+   */
   def foreachRDD(foreachFunc: RDD[T] => Unit) {
     this.foreachRDD((r: RDD[T], t: Time) => foreachFunc(r))
   }

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/e6e20cee/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
----------------------------------------------------------------------
diff --git a/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala b/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
index 4b5d5ec..cea4795 100644
--- a/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
+++ b/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaDStreamLike.scala
@@ -243,6 +243,28 @@ trait JavaDStreamLike[T, This <: JavaDStreamLike[T, This, R], R <: JavaRDDLike[T
   /**
    * Apply a function to each RDD in this DStream. This is an output operator, so
    * 'this' DStream will be registered as an output stream and therefore materialized.
+   *
+   * @deprecated  As of release 0.9.0, replaced by foreachRDD
+   */
+  @Deprecated
+  def foreach(foreachFunc: JFunction[R, Void]) {
+    foreachRDD(foreachFunc)
+  }
+
+  /**
+   * Apply a function to each RDD in this DStream. This is an output operator, so
+   * 'this' DStream will be registered as an output stream and therefore materialized.
+   *
+   * @deprecated  As of release 0.9.0, replaced by foreachRDD
+   */
+  @Deprecated
+  def foreach(foreachFunc: JFunction2[R, Time, Void]) {
+    foreachRDD(foreachFunc)
+  }
+
+  /**
+   * Apply a function to each RDD in this DStream. This is an output operator, so
+   * 'this' DStream will be registered as an output stream and therefore materialized.
    */
   def foreachRDD(foreachFunc: JFunction[R, Void]) {
     dstream.foreachRDD(rdd => foreachFunc.call(wrapRDD(rdd)))


[3/3] git commit: Merge pull request #398 from pwendell/streaming-api

Posted by pw...@apache.org.
Merge pull request #398 from pwendell/streaming-api

Rename DStream.foreach to DStream.foreachRDD

`foreachRDD` makes it clear that the granularity of this operator is per-RDD.
As it stands, `foreach` is inconsistent with `map`, `filter`, and the other
DStream operators which get pushed down to individual records within each RDD.
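
For callers the migration is a one-token rename. An editorial sketch assuming a typical word-count job:

    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.StreamingContext._

    val ssc = new StreamingContext("local[2]", "MigrationSketch", Seconds(1))
    val wordCounts = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)

    // Before this PR (deprecated as of 0.9.0, still compiles):
    //   wordCounts.foreach(rdd => println(rdd.take(10).mkString(", ")))
    // After:
    wordCounts.foreachRDD(rdd => println(rdd.take(10).mkString(", ")))

    ssc.start()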


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/28a6b0cd
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/28a6b0cd
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/28a6b0cd

Branch: refs/heads/master
Commit: 28a6b0cdbc75d58e36b1da3dcf257c61e44b0f7a
Parents: 074f502 e6e20ce
Author: Patrick Wendell <pw...@gmail.com>
Authored: Sun Jan 12 19:49:36 2014 -0800
Committer: Patrick Wendell <pw...@gmail.com>
Committed: Sun Jan 12 19:49:36 2014 -0800

----------------------------------------------------------------------
 docs/streaming-programming-guide.md             |  4 +-
 .../streaming/examples/RawNetworkGrep.scala     |  2 +-
 .../streaming/examples/TwitterAlgebirdCMS.scala |  4 +-
 .../streaming/examples/TwitterAlgebirdHLL.scala |  4 +-
 .../streaming/examples/TwitterPopularTags.scala |  4 +-
 .../examples/clickstream/PageViewStream.scala   |  2 +-
 .../org/apache/spark/streaming/DStream.scala    | 41 +++++++++++++-------
 .../spark/streaming/PairDStreamFunctions.scala  |  4 +-
 .../streaming/api/java/JavaDStreamLike.scala    | 26 ++++++++++++-
 .../spark/streaming/BasicOperationsSuite.scala  |  2 +-
 10 files changed, 65 insertions(+), 28 deletions(-)
----------------------------------------------------------------------