Posted to users@kafka.apache.org by Chen Song <ch...@gmail.com> on 2015/01/23 20:25:20 UTC

Spark Streaming action not triggered with Kafka inputs

I am running into some problems with Spark Streaming when reading from
Kafka. I am using Spark 1.2.0 built on CDH5.
The example is based on:
https://github.com/apache/spark/blob/master/examples/scala-2.10/src/main/scala/org/apache/spark/examples/streaming/KafkaWordCount.scala
* It works with the default implementation:

    val topicMap = topics.split(",").map((_, numThreads.toInt)).toMap
    val lines = KafkaUtils.createStream(ssc, zkQuorum, group, topicMap).map(_._2)

* However, I then changed it to parallel receiving, as shown below:

    val topicMap = topics.split(",").map((_, 1)).toMap
    val parallelInputs = (1 to numThreads.toInt) map { _ =>
      KafkaUtils.createStream(ssc, zkQuorum, group, topicMap)
    }

    ssc.union(parallelInputs)

After the change, the job stage just hangs and never finishes. It looks
like no action is triggered on the streaming job. When I check the
"Streaming" tab, it shows the message below:
Batch Processing Statistics

   No statistics have been generated yet.


Am I doing anything wrong on the parallel receiving part?
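
For reference, here is a minimal, self-contained sketch of the parallel
receiving setup described above. It is not a confirmed fix for the hang. It
assumes the Spark 1.x receiver-based Kafka API (the spark-streaming-kafka
artifact); the object name ParallelKafkaCount, the 2-second batch interval,
and the argument order are made up for illustration. Two points from the
Spark Streaming programming guide may be relevant here: each receiver
occupies one core for as long as the job runs, so the application needs
more cores than receivers, and a DStream only does work once an output
operation such as print() is registered on it.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Hypothetical driver, used only to illustrate the pattern.
object ParallelKafkaCount {
  def main(args: Array[String]): Unit = {
    if (args.length < 4) {
      System.err.println(
        "Usage: ParallelKafkaCount <zkQuorum> <group> <topics> <numReceivers>")
      System.exit(1)
    }
    val Array(zkQuorum, group, topics, numReceivers) = args.take(4)

    // The master is expected to come from spark-submit. It must provide
    // more cores than numReceivers, because every receiver pins one core.
    val conf = new SparkConf().setAppName("ParallelKafkaCount")
    val ssc = new StreamingContext(conf, Seconds(2)) // assumed batch interval

    // One consumer thread per topic inside each receiver.
    val topicMap = topics.split(",").map((_, 1)).toMap

    // Create numReceivers independent receivers over the same topics.
    val parallelInputs = (1 to numReceivers.toInt).map { _ =>
      KafkaUtils.createStream(ssc, zkQuorum, group, topicMap)
    }

    // Merge the receivers into one DStream and keep only the message value.
    val lines = ssc.union(parallelInputs).map(_._2)

    // Output operation: without one, no batch jobs are ever scheduled.
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

With 4 receivers, for example, a local test would need a master of at least
local[5] so that one core is left for processing the batches.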
-- 
Chen Song

Re: Spark Streaming action not triggered with Kafka inputs

Posted by Chen Song <ch...@gmail.com>.
Sorry, this was meant to go to the Spark users list. Please ignore this thread.

-- 
Chen Song