Posted to issues@spark.apache.org by "Mario Briggs (JIRA)" <ji...@apache.org> on 2016/03/03 14:10:18 UTC

[jira] [Created] (SPARK-13650) Usage of the window() function on DStream

Mario Briggs created SPARK-13650:
------------------------------------

             Summary: Usage of the window() function on DStream
                 Key: SPARK-13650
                 URL: https://issues.apache.org/jira/browse/SPARK-13650
             Project: Spark
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.6.0, 1.5.2, 2.0.0
            Reporter: Mario Briggs
            Priority: Minor


Is there any guidance on the usage of the window() function on DStream? Here is my academic use case, for which it fails.

Standard word count:

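 // 6-second batch interval
 val ssc = new StreamingContext(sparkConf, Seconds(6))
 val messages = KafkaUtils.createDirectStream(...)
 // Take the value of each Kafka record and split it into words
 val words = messages.map(_._2).flatMap(_.split(" "))
 // 12-second window, sliding every 6 seconds
 val window = words.window(Seconds(12), Seconds(6))
 window.count().print()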

For the first batch interval it prints the count, and then it hangs (inside the unionRDD).

I call the above use case academic because similar functionality is available through the more compact API
       words.countByWindow(Seconds(12), Seconds(6))
which works fine.

Is the first approach above not the intended way of using the .window() API?
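
For reference, the counts I would expect from either form can be sketched in plain Scala, without Spark. This is only a model of the windowing arithmetic under my own assumptions: each inner Seq stands for the words of one 6-second batch, and windowCounts is a hypothetical helper, not part of the Spark API.

```scala
// Plain-Scala model of DStream windowing; no Spark required.
// Assumption: each inner Seq holds the words that arrived in one 6-second batch.
object WindowCountSketch {
  // Hypothetical helper: number of elements in each window, where a window
  // spans `windowBatches` batches and advances by `slideBatches` batches.
  def windowCounts[A](batches: Seq[Seq[A]], windowBatches: Int, slideBatches: Int): Seq[Long] =
    batches.indices
      .filter(i => (i + 1) % slideBatches == 0) // a window closes every `slideBatches` batches
      .map { i =>
        // early windows are only partially filled
        val start = math.max(0, i - windowBatches + 1)
        batches.slice(start, i + 1).map(_.size.toLong).sum
      }

  def main(args: Array[String]): Unit = {
    val batches = Seq(
      "a b c".split(" ").toSeq,   // batch 1: 3 words
      "d e".split(" ").toSeq,     // batch 2: 2 words
      "f g h i".split(" ").toSeq  // batch 3: 4 words
    )
    // window(Seconds(12), Seconds(6)) over 6-second batches:
    // 2 batches wide, sliding by 1 batch
    println(windowCounts(batches, 2, 1).mkString(", "))
  }
}
```

With these three sample batches the model yields the counts 3, 5 and 6: the first window holds only batch 1, and every later window holds the two most recent batches. That is the sequence I would expect window(...).count() to print, one value per batch interval, instead of stopping after the first.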




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
