Posted to issues@spark.apache.org by "Tathagata Das (JIRA)" <ji...@apache.org> on 2015/04/09 20:26:16 UTC

[jira] [Commented] (SPARK-6803) [SparkR] Support SparkR Streaming

    [ https://issues.apache.org/jira/browse/SPARK-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14487872#comment-14487872 ] 

Tathagata Das commented on SPARK-6803:
--------------------------------------

I know of many use cases where Python is the desired language for streaming, especially in the devops domain (devops people love Python, as I have heard) for streaming machine data and processing it. I do not know of any need for writing streaming applications in R. Nonetheless, it is very cool. :D

BTW, the main challenge in building the Python API for streaming was to make the streaming scheduler in Java call back into Python to run an arbitrary Python function (an RDD-to-RDD transformation). Setting up this callback through Py4J was interesting. I am curious to know how this was solved with R in this prototype.
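
To make the callback idea concrete, here is a rough sketch of the shape it takes on the Python side. This is an illustration under stated assumptions, not the actual PySpark code: the interface name "example.streaming.TransformCallback" is hypothetical, plain lists stand in for RDDs, and run_batches is a pure-Python stand-in for the JVM scheduler so the sketch runs without a JVM. The one real piece is the Py4J convention of a nested Java class with an implements list, which is how a Python object declares the Java interface it implements.

```python
# Sketch only: the interface name and the driver loop below are hypothetical.


class TransformCallback:
    """Wraps a Python function so that a JVM-side scheduler could invoke it."""

    def __init__(self, func):
        self.func = func

    def call(self, batch_time_ms, batch):
        # Invoked once per batch; returns the transformed batch.
        return self.func(batch_time_ms, batch)

    class Java:
        # Py4J convention: the Java interfaces this Python object implements.
        implements = ["example.streaming.TransformCallback"]  # hypothetical name


def run_batches(callback, batches, interval_ms=1000):
    """Pure-Python stand-in for the JVM streaming scheduler."""
    results = []
    for i, batch in enumerate(batches):
        # In a real setup this call would cross from the JVM into Python.
        results.append(callback.call(i * interval_ms, batch))
    return results


def double_all(batch_time_ms, batch):
    # An arbitrary user function: batch in, batch out (lists stand in for RDDs).
    return [x * 2 for x in batch]


results = run_batches(TransformCallback(double_all), [[1, 2], [3]])
```

In a real deployment the wrapped object would be handed to the JVM through a Py4J gateway with its callback server enabled, and the Java scheduler would call it once per batch interval.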



> [SparkR] Support SparkR Streaming
> ---------------------------------
>
>                 Key: SPARK-6803
>                 URL: https://issues.apache.org/jira/browse/SPARK-6803
>             Project: Spark
>          Issue Type: New Feature
>          Components: SparkR, Streaming
>            Reporter: Hao
>             Fix For: 1.4.0
>
>
> Adds R API for Spark Streaming.
> An experimental version is presented in repo [1], which follows the PySpark streaming design. Also, this PR can be further broken down into sub-task issues.
> [1] https://github.com/hlin09/spark/tree/SparkR-streaming/ 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
