Posted to user@storm.apache.org by "Adaryl \"Bob\" Wakefield, MBA" <ad...@hotmail.com> on 2015/03/09 07:24:30 UTC

Lambda architecture and multi-tenancy

Let’s say you put together a real-time streaming solution using Storm, Kafka, the necessary Zookeeper, and whatever storage tech you decide on. Is it true that these applications are so resource intensive that each one needs to live by itself on its own machine? Put another way, for the ingestion portion, is the minimum number of machines required here 9?

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData 

Re: Lambda architecture and multi-tenancy

Posted by Krishnanand Khambadkone <kk...@yahoo.com>.
No. 


Re: Lambda architecture and multi-tenancy

Posted by Krishnanand Khambadkone <kk...@yahoo.com>.
There is obviously no one-size-fits-all answer; it depends on a lot of factors. How much data will you be ingesting? What is the data source: a firehose, a web front end, or an app that batches its messages? How much processing will you be doing in the Storm/Kafka layer, and at what rate will you persist data to your sink? All of these factors will determine your topology. Storm and Spark are memory intensive, but if you are streaming, as would be the case with Kafka, that should not be much of an issue.
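
As a rough illustration of how those factors turn into numbers, here is a back-of-envelope sketch in Python. Every figure and name in it (the message rate, the message size, the sink write rate, and both helper functions) is a hypothetical placeholder chosen for illustration, not a measurement from any real deployment.

```python
def estimate_bandwidth_mb_per_s(messages_per_second, avg_message_bytes):
    """Sustained ingest bandwidth in MB/s (1 MB taken as 1,000,000 bytes)."""
    return messages_per_second * avg_message_bytes / 1_000_000


def sink_keeps_up(ingest_mb_per_s, sink_write_mb_per_s):
    """True if the sink can persist data at least as fast as it arrives."""
    return sink_write_mb_per_s >= ingest_mb_per_s


if __name__ == "__main__":
    # Hypothetical workload: 50,000 messages/s at 500 bytes each.
    ingest = estimate_bandwidth_mb_per_s(50_000, 500)
    print("ingest MB/s:", ingest)
    # Hypothetical sink that sustains 40 MB/s of writes.
    print("sink keeps up:", sink_keeps_up(ingest, 40.0))
```

If the sink cannot keep up with the ingest rate, the backlog grows no matter how the machines are laid out, so this is worth estimating before counting nodes.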
 


Re: Lambda architecture and multi-tenancy

Posted by Michael Ritsema <mi...@gmail.com>.
I don't use Kafka, but I run Storm as a single-node deployment in
production with my main app. DRPC (which I can disable if I want)
takes up about 25% of one core at idle, but nothing else shows up on
the radar.

You can deploy Storm on EC2 small instances with no problem.


Re: Lambda architecture and multi-tenancy

Posted by Brian O'Neill <bo...@alumni.brown.edu>.
+1.

We actually went through this pain…
Not only did we separate them out, but we also moved Zookeeper to physical
machines (instead of virtual ones).
If this is a production system, I would keep them separate and stick with
the 9.

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 <http://www.twitter.com/boneill42>


 


From:  Harsha <st...@harsha.io>
Reply-To:  <us...@storm.apache.org>
Date:  Monday, March 9, 2015 at 11:05 AM
To:  "<Adaryl Bob  Wakefield>", MBA <ad...@hotmail.com>,
<us...@storm.apache.org>
Subject:  Re: Lambda architecture and multi-tenancy

Adaryl,
IMO, I would suggest you go with 9 machines. The reason is that both Kafka
and Zookeeper are disk intensive, so putting both on the same machine is
risky and not a best practice. Also, Storm and Kafka are both heavy users of
Zookeeper, i.e. they make frequent updates to and reads from Zookeeper.
Having all of them on the same machines is risky, and performance will suffer.

-Harsha
 




Re: Lambda architecture and multi-tenancy

Posted by Harsha <st...@harsha.io>.
Adaryl,
IMO, I would suggest you go with 9 machines. The reason is that both Kafka and Zookeeper are disk intensive, so putting both on the same machine is risky and not a best practice. Also, Storm and Kafka are both heavy users of Zookeeper, i.e. they make frequent updates to and reads from Zookeeper. Having all of them on the same machines is risky, and performance will suffer.

-Harsha
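
To make the machine counts discussed in this thread concrete, here is a small Python sketch. The thread never says how the figure of 9 is derived; three nodes each for Zookeeper, Kafka, and Storm is an assumption made here purely for illustration, and the function names are hypothetical. The quorum helper relies only on the fact that a Zookeeper ensemble needs a majority of its servers to be up.

```python
# Assumed layout: the thread does not break down the figure of 9.
# Three nodes per service is a guess used only for this illustration.
ASSUMED_LAYOUT = {"zookeeper": 3, "kafka": 3, "storm": 3}


def machines_needed(layout, colocate=False):
    """Count machines for a dedicated layout or a fully co-located one."""
    if colocate:
        # Every service shares the same hosts, so the largest
        # per-service node count sets the number of machines.
        return max(layout.values())
    # Each service gets its own hosts.
    return sum(layout.values())


def zookeeper_failures_tolerated(ensemble_size):
    """Servers that can fail while a majority of the ensemble stays up."""
    return (ensemble_size - 1) // 2


if __name__ == "__main__":
    print("dedicated:", machines_needed(ASSUMED_LAYOUT))
    print("co-located:", machines_needed(ASSUMED_LAYOUT, colocate=True))
    print("ZK failures tolerated:", zookeeper_failures_tolerated(3))
```

Under these assumptions the co-located layout saves six machines, at the cost of the disk and Zookeeper contention described above.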


