Posted to dev@storm.apache.org by Daniela Stoiber <da...@gmail.com> on 2016/05/12 20:50:30 UTC

Kafka, Storm and Redis

Hello

 

I have a question regarding the combination of Kafka, Storm and Redis.

 

I have created a Kafka producer, which produces messages like this:

1000 100 on

2000 150 off

 

The first two values are IDs; the third value is the state of the device,
"on" or "off".

 

I have also created a Kafka spout in Storm and I already receive these
messages in my Storm topology.

 

Now I would like to store the "on" messages in Redis and to delete the "off"
messages from Redis. There should always be an up-to-date list of all "on"
devices in Redis, which I can then use for my analysis.

Unfortunately I have no idea how to implement this. Do I have to split the
messages, or can I store/delete them as they are? How could this be done?
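To make the desired behaviour concrete, here is a minimal sketch of the
store/delete logic (Python used for illustration; the set and key names are
made up, and a plain Python set stands in for Redis, where the same logic
would map to SADD and SREM):

```python
def apply_message(on_devices, message):
    """Parse a line like '1000 100 on' and update the set of 'on' devices."""
    id1, id2, state = message.split()
    device = (id1, id2)             # the two IDs together identify the device
    if state == "on":
        on_devices.add(device)      # Redis equivalent: SADD on_devices "1000:100"
    elif state == "off":
        on_devices.discard(device)  # Redis equivalent: SREM on_devices "1000:100"
    return on_devices

devices = set()
apply_message(devices, "1000 100 on")
apply_message(devices, "2000 150 on")
apply_message(devices, "2000 150 off")
# devices now contains only the device that is still "on"
```

So the messages do need to be split into their three tokens first; after that,
each message turns into exactly one add or one remove on the set.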

 

Thank you very much in advance.

 

Regards,

Daniela 


Re: AW: Kafka, Storm and Redis

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
What do you mean by a list of "on" devices in Redis?  Are you talking about http://redis.io/commands#list lists?  Or is it just a set of keys that you can scan to see which devices are on?
How exact does this list need to be?  Is it OK if one of the states is wrong until a new message for that device is sent?  How frequently are messages sent? The issue is that this type of problem requires the events to be processed in order and at least once.
Storm does not guarantee the in-order aspect of this, except in a few specific situations.  So unless you are very careful, it is entirely possible that two messages sent close together in time could be processed out of order by Storm.  If they were both for the same device, the state would end up flipped.  What is more, with at-least-once processing Storm replays out of order.  It will not roll back and restart from the last success, it will just replay the one message that failed.
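To make the hazard concrete: if each message carried a per-device sequence
number (which the messages shown above do not, so this is purely an assumption
for illustration), a consumer could refuse to apply stale replays like this:

```python
last_seq = {}   # device key -> highest sequence number applied so far
state = {}      # device key -> "on" or "off"

def apply(device, seq, new_state):
    """Apply an update only if it is newer than what we have already seen."""
    if seq <= last_seq.get(device, -1):
        return False            # stale or replayed message: ignore it
    last_seq[device] = seq
    state[device] = new_state
    return True

apply("1000:100", 1, "on")
apply("1000:100", 3, "off")
apply("1000:100", 2, "on")      # late arrival, ignored: state stays "off"
```

Without something like this, a replayed or reordered "on" would silently
overwrite a newer "off".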

So in my opinion, if all you care about is keeping the state up to date, you probably don't need/want Storm (or really any stream processing, for that matter).
The first thing you need to do is guarantee that, during event ingestion, all events for a given device go through the same partition.  If you don't do this, Kafka does not guarantee order either, and you are dead in the water.
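Keying the producer by device ID is how this is normally achieved: a keyed
Kafka producer hashes the message key to pick the partition (the Java client's
default partitioner uses murmur2), so every message for the same device lands
in the same partition. The idea, sketched without a broker and with an
arbitrary stable hash standing in:

```python
import hashlib

def partition_for(key, num_partitions):
    """Deterministically map a message key to a partition, the way a keyed
    Kafka producer does. Any stable hash illustrates the point; Kafka's
    Java client actually uses murmur2."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every event for device "1000:100" maps to the same partition, so the
# relative order of that device's events is preserved within the partition.
p1 = partition_for("1000:100", 8)
p2 = partition_for("1000:100", 8)
```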
Once you have that, write a very simple piece of code that reads messages from one or more Kafka partitions, parses the data, updates Redis, informs Kafka that you are done with the message, and repeats.
If you want to make it more efficient, you can read a "batch" of messages from Kafka, then do a batch write to Redis and batch-ack the messages to Kafka.
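The batched variant might look like the following sketch (Python for
illustration; the set name "on_devices" and key format are made up). The
parsing and collapsing step is shown runnably; the Kafka poll, the Redis
pipeline, and the offset commit are only indicated in comments because they
depend on whichever client libraries you use:

```python
def batch_to_redis_ops(messages):
    """Turn a batch of raw messages into a minimal list of Redis commands.

    Later messages in the batch win, so per device we keep only the final
    state: one SADD or SREM per device instead of one write per message.
    """
    final = {}                              # device key -> last state seen
    for msg in messages:
        id1, id2, state = msg.split()
        final[f"{id1}:{id2}"] = state
    ops = []
    for device, state in final.items():
        if state == "on":
            ops.append(("SADD", "on_devices", device))
        else:
            ops.append(("SREM", "on_devices", device))
    return ops

# The surrounding loop would then be:
#   1. poll a batch of messages from the Kafka partition(s)
#   2. send the returned ops to Redis in one pipeline round trip
#   3. commit the batch's offsets back to Kafka, and repeat
ops = batch_to_redis_ops(["1000 100 on", "2000 150 on", "2000 150 off"])
```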
 - Bobby 


AW: Kafka, Storm and Redis

Posted by Daniela Stoiber <da...@gmail.com>.
Hello

 

Does no one have any idea regarding this topic? Would a tokenizer bolt be
helpful? But with a tokenizer bolt I can only split the string, right? How
can I assign the split string to fields?
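In a Storm split bolt the assignment is positional: the bolt declares named
output fields, and the values it emits line up with those names in order. The
underlying idea, sketched in Python with illustrative field names (a real bolt
would declare these via its output fields declaration):

```python
FIELDS = ("id1", "id2", "state")   # illustrative names for the three tokens

def to_fields(message):
    """Split a raw message and pair each token with a named field,
    the way declared output fields line up positionally with the
    values a bolt emits."""
    values = message.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} tokens, got {len(values)}")
    return dict(zip(FIELDS, values))

record = to_fields("1000 100 on")
# record maps "id1", "id2" and "state" to the three tokens of the message
```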

 

Thank you very much in advance.

 

Regards,

Daniela 

 
