Posted to reviews@spark.apache.org by jaceklaskowski <gi...@git.apache.org> on 2014/05/19 23:58:10 UTC

[GitHub] spark pull request: Small updates to Streaming Programming Guide

GitHub user jaceklaskowski opened a pull request:

    https://github.com/apache/spark/pull/830

    Small updates to Streaming Programming Guide

    Please merge. More updates will come soon.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jaceklaskowski/spark docs-streaming-guide

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/830.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #830
    
----
commit 3ccc9ce7e07310dddb937a3a2df685e6f77f27f0
Author: Jacek Laskowski <ja...@japila.pl>
Date:   2014-05-19T21:56:09Z

    Small updates to Streaming Programming Guide

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12822328
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -105,23 +104,22 @@ generating multiple new records from each record in the source DStream. In this
     each line will be split into multiple words and the stream of words is represented as the
     `words` DStream.  Next, we want to count these words.
     
    +The `words` DStream is further mapped (one-to-one transformation) to a DStream of `(word,
    +1)` pairs, which is then reduced to get the frequency of words in each batch of data.
    +Finally, `wordCounts.print()` will print the first ten counts generated every second.
    +
     {% highlight scala %}
    -import org.apache.spark.streaming.StreamingContext._
     // Count each word in each batch
    -val pairs = words.map(word => (word, 1))
    -val wordCounts = pairs.reduceByKey(_ + _)
    +val pairs: DStream[(String, Int)] = words.map((_, 1))
    +val wordCounts: DStream[(String, Int)] = pairs.reduceByKey(_ + _)
     
    -// Print a few of the counts to the console
    +// Print the first ten elements of each RDD generated in this DStream to the console
    --- End diff --
    
    Can you make these changes for the Java example as well?
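For readers following the diff without a Spark shell, here is a minimal sketch of the same flatMap, map, and reduce-by-key steps applied to a single batch using plain Scala collections. It is only a local analogue: `BatchWordCount` and `countWords` are hypothetical names, `groupBy` stands in for `reduceByKey(_ + _)`, and the empty-string filter is an addition that the guide's example does not have.

```scala
object BatchWordCount {
  // Count the words in one batch of lines, mirroring flatMap -> map -> reduceByKey.
  def countWords(batch: Seq[String]): Map[String, Int] = {
    // Split each line into words; drop empty strings produced by blank lines (assumption).
    val words = batch.flatMap(_.split(" ")).filter(_.nonEmpty)
    // One-to-one transformation of each word into a (word, 1) pair.
    val pairs = words.map(word => (word, 1))
    // Group the pairs by word and sum the 1s; a local stand-in for reduceByKey(_ + _).
    pairs.groupBy(_._1).map { case (word, ps) => (word, ps.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit = {
    val batch = Seq("to be or not to be", "be quick")
    countWords(batch).toSeq.sortBy(_._1).foreach(println)
  }
}
```

In the streaming version the same computation would run once per batch interval, and `print()` would show only the first ten results of each batch.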



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by jaceklaskowski <gi...@git.apache.org>.

Github user jaceklaskowski closed the pull request at:

    https://github.com/apache/spark/pull/830



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by jaceklaskowski <gi...@git.apache.org>.

Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12870325
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -306,12 +304,16 @@ need to know to write your streaming applications.
     ## Linking
     
     To write your own Spark Streaming program, you will have to add the following dependency to your
    - SBT or Maven project:
    + sbt or Maven project:
    --- End diff --
    
    It should be all lowercase, as it is on [the website](http://www.scala-sbt.org/):
    
    > sbt is a build tool for Scala, Java, and more.



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12822349
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -306,12 +304,16 @@ need to know to write your streaming applications.
     ## Linking
     
     To write your own Spark Streaming program, you will have to add the following dependency to your
    - SBT or Maven project:
    + sbt or Maven project:
    --- End diff --
    
    Can you double-check that lowercase sbt is consistent with the other mentions of SBT elsewhere in the Spark docs?



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12822294
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._
     val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1))
     {% endhighlight %}
     
    -Using this context, we then create a new DStream
    -by specifying the IP address and port of the data server.
    +Using this context, we can create a DStream that represents streaming data from TCP 
    +source hostname (`localhost`) and port (`9999`).
     
     {% highlight scala %}
     // Create a DStream that will connect to serverIP:serverPort, like localhost:9999
    -val lines = ssc.socketTextStream("localhost", 9999)
    +import org.apache.spark.streaming.dstream._
    +val lines: DStream[String] = ssc.socketTextStream("localhost", 9999)
     {% endhighlight %}
     
    -This `lines` DStream represents the stream of data that will be received from the data
    -server. Each record in this DStream is a line of text. Next, we want to split the lines by
    +Each record in this DStream is a line of text. Next, we want to split the lines by
     space into words.
     
     {% highlight scala %}
     // Split each line into words
    -val words = lines.flatMap(_.split(" "))
    +val words: DStream[String] = lines.flatMap(_.split(" "))
    --- End diff --
    
    Same as above.



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12822287
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._
     val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1))
     {% endhighlight %}
     
    -Using this context, we then create a new DStream
    -by specifying the IP address and port of the data server.
    +Using this context, we can create a DStream that represents streaming data from TCP 
    +source hostname (`localhost`) and port (`9999`).
     
     {% highlight scala %}
     // Create a DStream that will connect to serverIP:serverPort, like localhost:9999
    -val lines = ssc.socketTextStream("localhost", 9999)
    +import org.apache.spark.streaming.dstream._
    +val lines: DStream[String] = ssc.socketTextStream("localhost", 9999)
    --- End diff --
    
    I don't recommend this change. Omitting the type and letting the Scala compiler infer it is the usual Scala way of doing things.
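To illustrate the trade-off being discussed, here is a small sketch using plain Scala collections rather than DStreams (the object and value names are hypothetical). The compiler infers the same types that an explicit annotation would state, so the annotations change only what the reader sees, not what the program does.

```scala
object InferredTypes {
  val lines = Seq("hello world", "hello spark")   // inferred as Seq[String]
  val words = lines.flatMap(_.split(" "))          // inferred as Seq[String]
  val pairs = words.map(word => (word, 1))         // inferred as Seq[(String, Int)]

  // Explicit annotations compile only if they agree with the inferred types.
  val typedWords: Seq[String] = words
  val typedPairs: Seq[(String, Int)] = pairs

  def main(args: Array[String]): Unit =
    println(typedPairs)
}
```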



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by jaceklaskowski <gi...@git.apache.org>.

Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-43816303
  
    Please review the changes that were introduced after @tdas's comments.



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-43571707
  
     Merged build triggered. 



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-54369605
  
    Apologies for dropping the ball on this PR, but I have updated the streaming guide and incorporated most of your suggestions.
    
    Mind closing this PR?



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-43564252
  
    Can one of the admins verify this patch?



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-43571716
  
    Merged build started. 



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12878638
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._
     val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1))
     {% endhighlight %}
     
    -Using this context, we then create a new DStream
    -by specifying the IP address and port of the data server.
    +Using this context, we can create a DStream that represents streaming data from TCP 
    +source hostname (`localhost`) and port (`9999`).
     
     {% highlight scala %}
     // Create a DStream that will connect to serverIP:serverPort, like localhost:9999
    -val lines = ssc.socketTextStream("localhost", 9999)
    +import org.apache.spark.streaming.dstream._
    +val lines: DStream[String] = ssc.socketTextStream("localhost", 9999)
    --- End diff --
    
    I do get the point. I am a little conflicted. On one hand, we would like to keep it consistent with the other guides (for example, the [MLlib guide] does not show types, and probably none of the others do). On the other hand, I do get your logic that this makes the code easier for newcomers to understand. Let me think about it a little bit more. :)



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-43575323
  
    Merged build finished. 



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12877300
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -306,12 +305,16 @@ need to know to write your streaming applications.
     ## Linking
     
     To write your own Spark Streaming program, you will have to add the following dependency to your
    - SBT or Maven project:
    + sbt or Maven project:
     
         groupId = org.apache.spark
         artifactId = spark-streaming_{{site.SCALA_BINARY_VERSION}}
         version = {{site.SPARK_VERSION}}
     
    +For sbt, in `build.sbt` use the following:
    +
    +    libraryDependencies += "org.apache.spark" %% "spark-streaming" % "{{site.SPARK_VERSION}}"
    +
    --- End diff --
    
    Can you also please add this for Maven as well, to keep things consistent? :)
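As a side note on the sbt line in this diff, the `%%` operator appends the project's Scala binary version to the artifact name, which is why it lines up with the `spark-streaming_{{site.SCALA_BINARY_VERSION}}` artifactId listed above it. A sketch of a `build.sbt`, where the concrete version numbers are assumed placeholders rather than values taken from this pull request:

```scala
// build.sbt (sketch; the version numbers below are assumed placeholders)
scalaVersion := "2.10.4"

// %% appends the Scala binary version (here _2.10) to the artifact name...
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.0.0"

// ...so the line above resolves the same artifact as this explicit form:
// libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.0.0"
```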



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12822250
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._
     val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1))
     {% endhighlight %}
     
    -Using this context, we then create a new DStream
    -by specifying the IP address and port of the data server.
    +Using this context, we can create a DStream that represents streaming data from TCP 
    +source hostname (`localhost`) and port (`9999`).
    --- End diff --
    
    streaming data from [a] TCP source. Also ... hostname (say, `localhost`)



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by jaceklaskowski <gi...@git.apache.org>.

Github user jaceklaskowski commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-54374855
  
    No worries. Do what you think is going to be the best solution for the project. I don't mind closing the PR.



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by jaceklaskowski <gi...@git.apache.org>.

Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12870111
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._
     val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1))
     {% endhighlight %}
     
    -Using this context, we then create a new DStream
    -by specifying the IP address and port of the data server.
    +Using this context, we can create a DStream that represents streaming data from TCP 
    +source hostname (`localhost`) and port (`9999`).
     
     {% highlight scala %}
     // Create a DStream that will connect to serverIP:serverPort, like localhost:9999
    -val lines = ssc.socketTextStream("localhost", 9999)
    +import org.apache.spark.streaming.dstream._
    +val lines: DStream[String] = ssc.socketTextStream("localhost", 9999)
    --- End diff --
    
    I fully agree, and I do follow the rule while developing Scala applications, but since Scala is a statically typed language, knowing the types while reading the docs helps the reader comprehend what types are in play. That was the only reason to include them: to let users open the scaladoc and search for more information with the types explicitly described.
    
    I myself was wondering which types I should be reading about, and although I had started with `ssc` and followed along, I found it a bit troublesome for newcomers to Spark and Scala. *The easier the better* was the idea behind the change.
    
    In Spark's [Quick Start](http://spark.apache.org/docs/latest/quick-start.html) it's quite different: there the types are presented with the results.
    
    In either case, I needed types while reading along without access to Spark's shell/REPL.
    
    Would you agree with the reasoning?



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12822300
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -105,23 +104,22 @@ generating multiple new records from each record in the source DStream. In this
     each line will be split into multiple words and the stream of words is represented as the
     `words` DStream.  Next, we want to count these words.
     
    +The `words` DStream is further mapped (one-to-one transformation) to a DStream of `(word,
    +1)` pairs, which is then reduced to get the frequency of words in each batch of data.
    +Finally, `wordCounts.print()` will print the first ten counts generated every second.
    +
     {% highlight scala %}
    -import org.apache.spark.streaming.StreamingContext._
     // Count each word in each batch
    -val pairs = words.map(word => (word, 1))
    -val wordCounts = pairs.reduceByKey(_ + _)
    +val pairs: DStream[(String, Int)] = words.map((_, 1))
    --- End diff --
    
    Same as above.



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12877361
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -355,21 +358,21 @@ object has to be created, which is the main entry point of all Spark Streaming f
     A `JavaStreamingContext` object can be created by using
     
     {% highlight scala %}
    -new JavaStreamingContext(master, appName, batchInterval, [sparkHome], [jars])
    +new JavaStreamingContext(master, appName, batchDuration, [sparkHome], [jars])
     {% endhighlight %}
     </div>
     </div>
     
     The `master` parameter is a standard [Spark cluster URL](scala-programming-guide.html#master-urls)
    -and can be "local" for local testing. The `appName` is a name of your program,
    -which will be shown on your cluster's web UI. The `batchInterval` is the size of the batches,
    +and can be `local` for local testing. The `appName` is a name of your program,
    --- End diff --
    
    I just realized that saying "local" is a bad idea because "local" ends up having only one slot, which is bad for Spark Streaming (as the receiver takes up one core, leaving none for computation). Can you replace this `local` with `local[*]` (which detects the number of cores in the local system)?
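The arithmetic behind this comment can be sketched as follows. `LocalSlots` is a hypothetical helper, not part of Spark: it only models how many threads a local master string implies and how many are left once each receiver has occupied one, under the assumption that `local` means one thread, `local[N]` means N threads, and `local[*]` means one per available core.

```scala
object LocalSlots {
  // Matches a fixed thread count such as "local[4]".
  private val Fixed = """local\[(\d+)\]""".r

  // Hypothetical helper (not a Spark API): threads implied by a local master string.
  def slots(master: String, cores: Int = Runtime.getRuntime.availableProcessors): Int =
    master match {
      case "local"    => 1
      case "local[*]" => cores
      case Fixed(n)   => n.toInt
      case other      => throw new IllegalArgumentException(s"not a local master: $other")
    }

  // Threads left for processing once each receiver has occupied one.
  def slotsForProcessing(master: String, receivers: Int, cores: Int): Int =
    math.max(0, slots(master, cores) - receivers)

  def main(args: Array[String]): Unit = {
    println(slotsForProcessing("local", receivers = 1, cores = 4))
    println(slotsForProcessing("local[*]", receivers = 1, cores = 4))
  }
}
```

With a single receiver, plain `local` leaves nothing for processing the received data, while `local[*]` on a multi-core machine does.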



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12878644
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -306,12 +304,16 @@ need to know to write your streaming applications.
     ## Linking
     
     To write your own Spark Streaming program, you will have to add the following dependency to your
    - SBT or Maven project:
    + sbt or Maven project:
    --- End diff --
    
    Cool. 



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by jaceklaskowski <gi...@git.apache.org>.

Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12869294
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -83,21 +82,21 @@ import org.apache.spark.streaming.api._
     val ssc = new StreamingContext("local", "NetworkWordCount", Seconds(1))
     {% endhighlight %}
     
    -Using this context, we then create a new DStream
    -by specifying the IP address and port of the data server.
    +Using this context, we can create a DStream that represents streaming data from TCP 
    +source hostname (`localhost`) and port (`9999`).
    --- End diff --
    
    That was the exact copy from scaladoc for [org.apache.spark.streaming.StreamingContext#socketTextStream](http://people.apache.org/~pwendell/spark-1.0.0-rc9-docs/api/scala/index.html#org.apache.spark.streaming.StreamingContext) as I thought it could've built...a consistent learning environment. I'll think that and the scaladoc.



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by jaceklaskowski <gi...@git.apache.org>.

Github user jaceklaskowski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12870184
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -105,23 +104,22 @@ generating multiple new records from each record in the source DStream. In this
     each line will be split into multiple words and the stream of words is represented as the
     `words` DStream.  Next, we want to count these words.
     
    +The `words` DStream is further mapped (one-to-one transformation) to a DStream of `(word,
    +1)` pairs, which is then reduced to get the frequency of words in each batch of data.
    +Finally, `wordCounts.print()` will print the first ten counts generated every second.
    +
     {% highlight scala %}
    -import org.apache.spark.streaming.StreamingContext._
     // Count each word in each batch
    -val pairs = words.map(word => (word, 1))
    -val wordCounts = pairs.reduceByKey(_ + _)
    +val pairs: DStream[(String, Int)] = words.map((_, 1))
    +val wordCounts: DStream[(String, Int)] = pairs.reduceByKey(_ + _)
     
    -// Print a few of the counts to the console
    +// Print the first ten elements of each RDD generated in this DStream to the console
    --- End diff --
    
    Done



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-54373915
  
    I added you as an author in one of the commits in this PR https://github.com/apache/spark/pull/2254/commits
    Thanks!



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-43571533
  
    Jenkins, test this please.



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on a diff in the pull request:

    https://github.com/apache/spark/pull/830#discussion_r12877362
  
    --- Diff: docs/streaming-programming-guide.md ---
    @@ -579,7 +582,7 @@ This is applied on a DStream containing words (say, the `pairs` DStream containi
     1)` pairs in the [earlier example](#a-quick-example)).
     
     {% highlight scala %}
    -val runningCounts = pairs.updateStateByKey[Int](updateFunction _)
    +val runningCounts = pairs updateStateByKey updateFunction
    --- End diff --
    
    I wouldn't recommend this change either. We try to balance readability of the code with the Scala way of doing things. Maybe it is just me, but this way of writing (without periods) is more Scala-like yet harder to read for new Scala users.
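For comparison, here are the two call styles side by side on a plain Scala list (a sketch with hypothetical names; `map` stands in for `updateStateByKey`). Both forms compile to the same call, so the choice is purely about how the code reads.

```scala
object CallStyles {
  def double(n: Int): Int = n * 2

  val numbers = List(1, 2, 3)

  // Dot notation with an explicit method-to-function conversion.
  val dotted = numbers.map(double _)

  // Infix notation: the same call without the period, parentheses, or underscore.
  val infix = numbers map double

  def main(args: Array[String]): Unit =
    println(dotted == infix)
}
```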



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by tdas <gi...@git.apache.org>.

Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-54375006
  
    Thanks, and sorry for dropping the ball on this PR :(



[GitHub] spark pull request: Small updates to Streaming Programming Guide

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/830#issuecomment-43575325
  
    
    Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15088/

