Posted to issues@spark.apache.org by "Mario Briggs (JIRA)" <ji...@apache.org> on 2016/03/03 14:10:18 UTC
[jira] [Created] (SPARK-13650) Usage of the window() function on DStream
Mario Briggs created SPARK-13650:
------------------------------------
Summary: Usage of the window() function on DStream
Key: SPARK-13650
URL: https://issues.apache.org/jira/browse/SPARK-13650
Project: Spark
Issue Type: Bug
Components: Streaming
Affects Versions: 1.6.0, 1.5.2, 2.0.0
Reporter: Mario Briggs
Priority: Minor
Is there any guidance on the usage of the window() function on DStream? Here is my academic use case, for which it fails.
Standard word count:
val ssc = new StreamingContext(sparkConf, Seconds(6))
val messages = KafkaUtils.createDirectStream(...)
val words = messages.map(_._2).flatMap(_.split(" "))
val window = words.window(Seconds(12), Seconds(6))
window.count().print()
For the first batch interval it prints the count, and then it hangs (inside the UnionRDD).
I call the use case above academic because similar functionality can be achieved with the more compact API
words.countByWindow(Seconds(12), Seconds(6))
which works fine.
Is the first approach above not the intended way of using the window() API?
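
For reference, here is a minimal plain-Scala sketch of the result I would expect from either form above. It does not use Spark at all; it only models the window arithmetic, assuming a 6-second batch interval and window(Seconds(12), Seconds(6)), so that each window covers the current batch and the one before it. The object name, the helper windowedCounts and the batch contents are made up for illustration.

```scala
// Plain-Scala model of the expected per-window counts; this is not Spark code.
object WindowCountSketch {
  // Hypothetical word lists, one entry per 6-second batch.
  val batches: Seq[Seq[String]] = Seq(
    Seq("a", "b", "c"),
    Seq("d", "e"),
    Seq("f"),
    Seq("g", "h", "i", "j")
  )

  // A 12-second window over 6-second batches spans two batches.
  val batchesPerWindow = 2

  // For each batch, sum the sizes of the batches that fall inside its window.
  // The very first window contains only one batch.
  def windowedCounts(bs: Seq[Seq[String]]): Seq[Long] =
    bs.indices.map { i =>
      val from = math.max(0, i - batchesPerWindow + 1)
      bs.slice(from, i + 1).map(_.size.toLong).sum
    }

  def main(args: Array[String]): Unit =
    windowedCounts(batches).foreach(println)
}
```

With these made-up batches the sketch prints 3, 5, 3, 5: one count per slide interval, each covering at most two batches. That is the kind of sequence I expected window(...).count().print() to keep producing after the first interval.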