Posted to commits@spark.apache.org by fe...@apache.org on 2018/01/09 06:08:23 UTC
spark git commit: [SPARK-21293][SPARKR][DOCS] structured streaming doc update
Repository: spark
Updated Branches:
refs/heads/master 8486ad419 -> 02214b094
[SPARK-21293][SPARKR][DOCS] structured streaming doc update
## What changes were proposed in this pull request?
Documentation update for SparkR Structured Streaming: drop the "(experimental)" label from the SparkR guide and vignette, and add R examples (`isStreaming`, windowed counts, watermarking) to the Structured Streaming Programming Guide.
Author: Felix Cheung <fe...@hotmail.com>
Closes #20197 from felixcheung/rwadoc.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/02214b09
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/02214b09
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/02214b09
Branch: refs/heads/master
Commit: 02214b094390e913f52e71d55c9bb8a81c9e7ef9
Parents: 8486ad419
Author: Felix Cheung <fe...@hotmail.com>
Authored: Mon Jan 8 22:08:19 2018 -0800
Committer: Felix Cheung <fe...@apache.org>
Committed: Mon Jan 8 22:08:19 2018 -0800
----------------------------------------------------------------------
R/pkg/vignettes/sparkr-vignettes.Rmd | 2 +-
docs/sparkr.md | 2 +-
docs/structured-streaming-programming-guide.md | 32 +++++++++++++++++++--
3 files changed, 32 insertions(+), 4 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/02214b09/R/pkg/vignettes/sparkr-vignettes.Rmd
----------------------------------------------------------------------
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd b/R/pkg/vignettes/sparkr-vignettes.Rmd
index 2e66242..feca617 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -1042,7 +1042,7 @@ unlink(modelPath)
## Structured Streaming
-SparkR supports the Structured Streaming API (experimental).
+SparkR supports the Structured Streaming API.
You can check the Structured Streaming Programming Guide for [an introduction](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#programming-model) to its programming model and basic concepts.
http://git-wip-us.apache.org/repos/asf/spark/blob/02214b09/docs/sparkr.md
----------------------------------------------------------------------
diff --git a/docs/sparkr.md b/docs/sparkr.md
index 997ea60..6685b58 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -596,7 +596,7 @@ The following example shows how to save/load a MLlib model by SparkR.
# Structured Streaming
-SparkR supports the Structured Streaming API (experimental). Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. For more information see the R API on the [Structured Streaming Programming Guide](structured-streaming-programming-guide.html)
+SparkR supports the Structured Streaming API. Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. For more information see the R API on the [Structured Streaming Programming Guide](structured-streaming-programming-guide.html)
# R Function Name Conflicts
http://git-wip-us.apache.org/repos/asf/spark/blob/02214b09/docs/structured-streaming-programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/structured-streaming-programming-guide.md b/docs/structured-streaming-programming-guide.md
index 31fcfab..de13e28 100644
--- a/docs/structured-streaming-programming-guide.md
+++ b/docs/structured-streaming-programming-guide.md
@@ -827,8 +827,8 @@ df.isStreaming()
{% endhighlight %}
</div>
<div data-lang="r" markdown="1">
-{% highlight bash %}
-Not available.
+{% highlight r %}
+isStreaming(df)
{% endhighlight %}
</div>
</div>
@@ -886,6 +886,19 @@ windowedCounts = words.groupBy(
{% endhighlight %}
</div>
+<div data-lang="r" markdown="1">
+{% highlight r %}
+words <- ... # streaming DataFrame of schema { timestamp: Timestamp, word: String }
+
+# Group the data by window and word and compute the count of each group
+windowedCounts <- count(
+ groupBy(
+ words,
+ window(words$timestamp, "10 minutes", "5 minutes"),
+ words$word))
+{% endhighlight %}
+
+</div>
</div>
@@ -960,6 +973,21 @@ windowedCounts = words \
{% endhighlight %}
</div>
+<div data-lang="r" markdown="1">
+{% highlight r %}
+words <- ... # streaming DataFrame of schema { timestamp: Timestamp, word: String }
+
+# Group the data by window and word and compute the count of each group
+
+words <- withWatermark(words, "timestamp", "10 minutes")
+windowedCounts <- count(
+ groupBy(
+ words,
+ window(words$timestamp, "10 minutes", "5 minutes"),
+ words$word))
+{% endhighlight %}
+
+</div>
</div>
In this example, we are defining the watermark of the query on the value of the column "timestamp",
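
Taken together, the R snippets this commit adds compose into one end-to-end streaming word-count query. The sketch below is illustrative only and is not part of the commit: the socket source, its host/port, the `includeTimestamp` option, and the console sink are assumptions borrowed from the guide's other language examples; the watermark and window calls follow the R code in the diff above.

```r
library(SparkR)
sparkR.session()

# Assumed input: a socket source on localhost:9999 emitting lines of text,
# with an arrival timestamp attached to each line (hypothetical endpoint,
# chosen only for illustration).
lines <- read.stream("socket", host = "localhost", port = 9999,
                     includeTimestamp = "true")

# Split each line into words, keeping the timestamp column
words <- selectExpr(lines, "explode(split(value, ' ')) AS word", "timestamp")

# Handle late data, then group by sliding window and word, as in the diff
words <- withWatermark(words, "timestamp", "10 minutes")
windowedCounts <- count(
  groupBy(
    words,
    window(words$timestamp, "10 minutes", "5 minutes"),
    words$word))

# The new isStreaming() example: confirms this is a streaming DataFrame
isStreaming(windowedCounts)

# Start the query, writing to the console sink; append mode works with
# a watermark because finalized windows can be emitted exactly once
query <- write.stream(windowedCounts, "console", outputMode = "append")
```

Note that `count`/`groupBy` are applied in prefix form here, matching the style of the examples added in the diff rather than a magrittr pipeline.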