Posted to issues@spark.apache.org by "Russell Jurney (JIRA)" <ji...@apache.org> on 2016/12/27 04:08:58 UTC
[jira] [Commented] (SPARK-18955) Add ability to emit kafka events to DStream or KafkaDStream
[ https://issues.apache.org/jira/browse/SPARK-18955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15779554#comment-15779554 ]
Russell Jurney commented on SPARK-18955:
----------------------------------------
Can I please get feedback as to whether this patch would be accepted? I don't want to do the work if it isn't even something that would be merged.
> Add ability to emit kafka events to DStream or KafkaDStream
> -----------------------------------------------------------
>
> Key: SPARK-18955
> URL: https://issues.apache.org/jira/browse/SPARK-18955
> Project: Spark
> Issue Type: New Feature
> Components: DStreams, PySpark
> Affects Versions: 2.0.2
> Reporter: Russell Jurney
> Labels: features, newbie
>
> Any I/O that needs doing in Spark Streaming seems to have to be done inside a DStream.foreachRDD loop. For instance, in PySpark, if I want to emit a Kafka event for each record, I have to call DStream.foreachRDD and use kafka-python inside it.
> It really seems like I/O of this kind should be part of the pyspark.streaming or pyspark.streaming.kafka API, with equivalent Scala APIs. Something like DStream.emitKafkaEvents or KafkaDStream.emitKafkaEvents would seem to make sense.
> If this is a good idea, and it seems feasible, I'd like to take a crack at it as my first patch for Spark. Advice would be appreciated. What would need to be modified to make this happen?
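For reference, a minimal sketch of the foreachRDD workaround described in the issue, using kafka-python. The broker address, topic name, and the to_bytes helper are illustrative assumptions, not part of any existing Spark API:

```python
def to_bytes(record):
    """Serialize a record to bytes for Kafka (illustrative serializer)."""
    return str(record).encode("utf-8")

def send_partition(records):
    # Create the producer inside the partition function so it is not
    # pickled on the driver and shipped to executors.
    from kafka import KafkaProducer  # deferred import; runs on executors
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    for record in records:
        producer.send("events", to_bytes(record))  # topic name is assumed
    producer.flush()

def emit_kafka_events(rdd):
    # One producer per partition rather than per record.
    rdd.foreachPartition(send_partition)

# Usage on an existing DStream (e.g. from KafkaUtils.createDirectStream):
# stream.foreachRDD(emit_kafka_events)
```

A built-in DStream.emitKafkaEvents would presumably wrap exactly this pattern, hiding the producer lifecycle from the user.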
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org