You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@storm.apache.org by Chen Wang <ch...@gmail.com> on 2014/08/08 01:37:58 UTC

Bolt to read from kafka

Folks,
My user case is a bit different: I need to read from a specific kafka topic
at specific time. Thus I have a signal spout signaling the downstream bolt
to start read from the topic.(and stop reading if timeout) So apparently I
will have multiple bolt threads read from the same kafka topic.

I am currently using the High Level API(consumer group) to read from kafka.
i am most concerned about error recovery. If one processing bolt dies, will
other running bolt threads picks up the failed message? Or I have to start
another thread in order to pick up the failed message? What would be
a good practice to ensure the message can be processed at least once,
without depending on the kakfa spout?

Note that all threads are using the same group id.

Thanks,
Chen

Re: Bolt to read from kafka

Posted by Stephen Armstrong <st...@linqia.com>.
Would this work for your use-case?

Have the regular KafkaSpout reading from that topic at all times. Keep your
signal spout as well. Have a "valve-bolt" that listens to both spouts.
Normally, the bolt throws away all tuples from the KafkaSpout. When your
signal spout says so, the stateful bolt switches to "passthrough mode", and
passes the KafkaSpout tuples downstream to the rest of your topology. Then
you get the same kind of switchable Kafka behaviour without having to
rewrite KafkaSpout as a bolt.

Steve


On Thu, Aug 7, 2014 at 4:37 PM, Chen Wang <ch...@gmail.com>
wrote:

> Folks,
> My user case is a bit different: I need to read from a specific kafka
> topic at specific time. Thus I have a signal spout signaling the downstream
> bolt to start read from the topic.(and stop reading if timeout) So
> apparently I will have multiple bolt threads read from the same kafka topic.
>
> I am currently using the High Level API(consumer group) to read from
> kafka. i am most concerned about error recovery. If one processing bolt
> dies, will other running bolt threads picks up the failed message? Or I
> have to start another thread in order to pick up the failed message? What
> would be
> a good practice to ensure the message can be processed at least once,
> without depending on the kakfa spout?
>
> Note that all threads are using the same group id.
>
> Thanks,
> Chen
>
>