Posted to user@storm.apache.org by Ritesh Sinha <ku...@gmail.com> on 2015/08/07 07:24:24 UTC

Adding Kafka Topics dynamically to storm as it is created while the topology is running

I have a topology which runs in the following way:
It reads data from a Kafka topic and passes it to Storm. Storm processes the
data and stores it into two different DBs (MongoDB & Cassandra).

Here, the Kafka topic is named after the customer, and the database name in
MongoDB and Cassandra is the same as the name of the Kafka topic.

Now, suppose I have submitted the topology, it is running, and I get a
new customer.

I will add a new topic in Kafka. So, is it possible to make Storm read
data from that Kafka topic while the cluster is running?

Thanks

Re: Adding Kafka Topics dynamically to storm as it is created while the topology is running

Posted by Ritesh Sinha <ku...@gmail.com>.
Here the basic table structure for all the customers will be similar. I will
distinguish them by creating different DBs, and the name of each DB will be
the same as the name of the topic created in Kafka. While inserting or
updating a table in the DB, I will use the topic name to create the connection.

On Fri, Aug 7, 2015 at 12:11 PM, Kishore Senji <ks...@gmail.com> wrote:


Re: Adding Kafka Topics dynamically to storm as it is created while the topology is running

Posted by Kishore Senji <ks...@gmail.com>.
>> I will add a new topic in Kafka. So, is it possible to make Storm read
>> data from that Kafka topic while the cluster is running?

I assume you are referring to making the Spout read data from the new topic
that is created at runtime (and not really adding a new Spout, thereby
changing the topology, as that is not possible). If so, this is not possible
with the KafkaSpout that ships with Storm. You would have to extend it or
create your own Spout which can do that.

But why are you creating a new topic for every customer and expecting to
read from them as and when they are created? Assume for a moment that this
were possible with the KafkaSpout that ships with Storm: your downstream
bolt, which stores data into MongoDB, would still have to know which message
belongs to which customer in order to store the record appropriately.

Why wouldn't you have only one topic and identify the customer as part of the
message?
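To picture the single-topic design suggested above: every message carries the
customer id in its payload, and the bolt that writes to MongoDB/Cassandra
derives the target database name from that field instead of from the topic
name. The class name, message format ("customerId|payload"), and helper below
are all hypothetical; the routing idea is sketched in plain Java, independent
of the Storm and Kafka APIs:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one shared topic, customer id embedded in each message.
public class CustomerRouter {

    // Parses a message of the assumed form "customerId|payload" and returns
    // the database name to write to (here the customer id itself, mirroring
    // the topic-per-customer naming scheme discussed in this thread).
    public static String dbNameFor(String message) {
        int sep = message.indexOf('|');
        if (sep < 0) {
            throw new IllegalArgumentException("message has no customer prefix: " + message);
        }
        return message.substring(0, sep);
    }

    public static void main(String[] args) {
        // Messages arriving from a single topic, tagged per customer.
        String[] messages = { "acme|order:42", "globex|order:7" };
        Map<String, String> routed = new HashMap<>();
        for (String m : messages) {
            routed.put(dbNameFor(m), m.substring(m.indexOf('|') + 1));
        }
        System.out.println(routed);
    }
}
```

With this scheme, a new customer needs no new topic and no redeploy; the bolt
simply opens a connection to whatever database name the message names.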


On Thu, Aug 6, 2015 at 10:24 PM, Ritesh Sinha <
kumarriteshranjansinha@gmail.com> wrote:


Re: Adding Kafka Topics dynamically to storm as it is created while the topology is running

Posted by Ritesh Sinha <ku...@gmail.com>.
I have a doubt regarding performance.
Here are the two solutions I can think of:

   1. Write a bolt which gets triggered whenever a new customer name is
   added to a Kafka topic. That bolt will have code which submits the
   new topology.
   2. Create independent jars for each customer and submit the
   topologies.

Which approach is better and more efficient?

Will deploying multiple jars create any problem? And in the first case, what
if the Storm Nimbus fails; will that be a problem?


Thanks !!
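For the second option, each per-customer deployment would be parameterized by
only one input, the customer name, since (as described earlier in the thread)
the Kafka topic and the DB name are both the same as the customer name. A
minimal sketch of that naming convention, with a hypothetical class name and
topology-name suffix, in plain Java:

```java
// Hypothetical per-customer deployment parameters, all derived from one
// input: the customer name doubles as Kafka topic and database name.
public class CustomerDeployment {
    final String customer;

    CustomerDeployment(String customer) {
        this.customer = customer;
    }

    String kafkaTopic()   { return customer; }                // topic = customer
    String databaseName() { return customer; }                // DB = customer
    String topologyName() { return customer + "-topology"; }  // assumed suffix

    public static void main(String[] args) {
        CustomerDeployment d = new CustomerDeployment("acme");
        System.out.println(d.kafkaTopic() + " " + d.databaseName()
                + " " + d.topologyName());
        // prints "acme acme acme-topology"
    }
}
```

The jar itself can then be identical for every customer; only this one
parameter changes per submission.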



On Fri, Aug 7, 2015 at 12:07 PM, Abhishek Agarwal <ab...@gmail.com>
wrote:


Re: Adding Kafka Topics dynamically to storm as it is created while the topology is running

Posted by Abhishek Agarwal <ab...@gmail.com>.
Yes. Since the Kafka spout takes the topic in its configuration, you will
have to add a new spout with a different config. Either you resubmit the
topology (along with the jar), or you can have a separate topology for
each consumer.

On Fri, Aug 7, 2015 at 11:52 AM, Ritesh Sinha <
kumarriteshranjansinha@gmail.com> wrote:



-- 
Regards,
Abhishek Agarwal

Re: Adding Kafka Topics dynamically to storm as it is created while the topology is running

Posted by Ritesh Sinha <ku...@gmail.com>.
Do you mean creating a new jar for each customer and deploying it on
the cluster?

On Fri, Aug 7, 2015 at 11:45 AM, Abhishek Agarwal <ab...@gmail.com>
wrote:


Re: Adding Kafka Topics dynamically to storm as it is created while the topology is running

Posted by Abhishek Agarwal <ab...@gmail.com>.
You will have to re-deploy your topology with a new Kafka spout for the
other topic.
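The reason a redeploy is needed: the spout takes its topic when it is
constructed, so the set of topics is fixed at the moment the topology jar is
submitted. A simplified stand-in (not the real Storm API; the class name is
hypothetical) makes that constraint concrete:

```java
// Simplified stand-in for a Kafka spout: the topic is captured at
// construction time and cannot change while the topology is running.
public class FixedTopicSpout {
    private final String topic; // immutable after the topology is submitted

    public FixedTopicSpout(String topic) {
        this.topic = topic;
    }

    public String getTopic() {
        return topic;
    }

    public static void main(String[] args) {
        FixedTopicSpout spout = new FixedTopicSpout("acme");
        System.out.println(spout.getTopic()); // prints "acme"
        // A topic for a new customer means constructing a new spout,
        // i.e. rebuilding and resubmitting the topology.
    }
}
```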

On Fri, Aug 7, 2015 at 10:54 AM, Ritesh Sinha <
kumarriteshranjansinha@gmail.com> wrote:



-- 
Regards,
Abhishek Agarwal