Posted to issues@spark.apache.org by "Dan Dutrow (JIRA)" <ji...@apache.org> on 2015/12/09 17:04:11 UTC

[jira] [Commented] (SPARK-2388) Streaming from multiple different Kafka topics is problematic

    [ https://issues.apache.org/jira/browse/SPARK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048885#comment-15048885 ] 

Dan Dutrow commented on SPARK-2388:
-----------------------------------

The Kafka Direct API lets you insert a callback function that gives you access to the topic name and other metadata besides the key.
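
A minimal sketch of that callback, assuming the Spark 1.x direct API: the variant of `KafkaUtils.createDirectStream` that takes explicit offsets accepts a `messageHandler` over Kafka's `MessageAndMetadata`, which exposes the topic name alongside each record. Broker addresses, topics, and starting offsets below are placeholders.

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object TopicAwareStream {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("topic-aware"), Seconds(5))

    // Assumed broker address and starting offsets -- adjust for your cluster.
    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val fromOffsets = Map(
      TopicAndPartition("retarget", 0) -> 0L,
      TopicAndPartition("datapair", 0) -> 0L)

    // The callback: MessageAndMetadata carries topic, partition, offset,
    // key, and payload; here we keep the topic next to each message.
    val messageHandler = (mmd: MessageAndMetadata[String, String]) =>
      (mmd.topic, mmd.message)

    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder, (String, String)](
      ssc, kafkaParams, fromOffsets, messageHandler)

    // With the topic attached, each topic can get its own pipeline.
    stream.filter(_._1 == "retarget").map(_._2).print()
    stream.filter(_._1 == "datapair").map(_._2).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```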

> Streaming from multiple different Kafka topics is problematic
> -------------------------------------------------------------
>
>                 Key: SPARK-2388
>                 URL: https://issues.apache.org/jira/browse/SPARK-2388
>             Project: Spark
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.0.0
>            Reporter: Sergey
>             Fix For: 1.0.1
>
>
> Default way of creating stream out of Kafka source would be as
>     val stream = KafkaUtils.createStream(ssc,"localhost:2181","logs", Map("retarget" -> 2,"datapair" -> 2))
> However, if the two topics - in this case "retarget" and "datapair" - are very different, there is no way to set up different filters, mapping functions, etc., as the topics are effectively merged into a single stream.
> However, the instance of KafkaInputDStream created by this call internally invokes ConsumerConnector.createMessageStream(), which returns a *map* of KafkaStreams keyed by topic. It would be great if this map were exposed somehow, so that the aforementioned call
>     val streamS = KafkaUtils.createStreamS(...)
> returned a map of streams.
> Regards,
> Sergey Malov
> Collective Media
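
Until something like the proposed createStreamS exists, the quoted issue can be worked around on the receiver-based API by calling createStream once per topic, so each topic gets its own transformations. A sketch, assuming Spark 1.x and the ZooKeeper address and group name from the issue; the per-topic transformations are illustrative only.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object PerTopicStreams {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("per-topic"), Seconds(5))

    // One receiver-based stream per topic instead of one merged stream.
    val retarget =
      KafkaUtils.createStream(ssc, "localhost:2181", "logs", Map("retarget" -> 2))
    val datapair =
      KafkaUtils.createStream(ssc, "localhost:2181", "logs", Map("datapair" -> 2))

    // Each stream now carries its own filter/mapping pipeline.
    retarget.map(_._2).filter(_.nonEmpty).print()
    datapair.map(_._2).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The cost of this approach is one receiver (and thus one core) per createStream call, which is part of why the direct API's metadata callback is the more economical answer.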



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org