Posted to user@hbase.apache.org by Em <ma...@yahoo.de> on 2011/11/14 07:54:54 UTC

HBase Stack

Hello list,

I was asked whether it is a good idea to replace the M in LAMP with
HBase and the P with Java servlets (e.g. on Tomcat), so that you run
your web server, your HBase instance, Hadoop, etc. on the same machine.

Is the performance difference compared to a LAMP stack large?

It is clear that a lot of benefits, like redundancy, are not
available in this setup. However, if the idea and user base grow, you can
quickly add these features to the environment by just setting up new
machines and connecting them with each other.

When I was asked about this, I had no answer.
Hopefully you can shed some light on this!

Kind regards,
Em

Re: HBase Stack

Posted by Ian Varley <iv...@salesforce.com>.
Em,

To add to what Joey said, consider that there are very significant trade-offs you make when building something on HBase (or any of the new generation of non-relational databases). For starters, you don't get:

 - A declarative query language like SQL that can build optimal physical access plans from arbitrarily complex logical queries
 - Secondary indexing (so if you want to look things up by something other than the primary key, you can't do it without a full table scan)
 - Multi-row or multi-object suspended transactions (so you can't just "roll back" a set of changes like you can in a relational database, nor can you keep operations isolated from other concurrent readers until they commit)

Scalable data storage systems like HBase may eventually make up for these deficiencies, but that hasn't happened yet. Today, using HBase is only appropriate if you have a really large amount of data and you can predict and design for pretty much all of your access patterns up front.
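
To make the secondary-indexing point above concrete, here is a minimal sketch in plain Java. It is not the HBase client API: a sorted in-memory map stands in for a table that can only be read by row key, and the class and method names (KeyedStoreSketch, putUser, findByEmailScan, findByEmailIndexed) are made up for illustration. It also assumes that e-mail addresses are unique.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a store that can only look rows up by row key.
// This is NOT the HBase API; a TreeMap merely models "sorted by key".
class KeyedStoreSketch {
    // Main "table": row key (user id) -> e-mail address.
    private final TreeMap<String, String> usersById = new TreeMap<>();
    // Hand-maintained "index table": e-mail address -> user id.
    // Assumption for this sketch: e-mail addresses are unique.
    private final TreeMap<String, String> idByEmail = new TreeMap<>();

    // Every write has to update both maps; nothing does it for us, and
    // the two updates are not atomic with respect to each other.
    void putUser(String userId, String email) {
        String oldEmail = usersById.put(userId, email);
        if (oldEmail != null) {
            idByEmail.remove(oldEmail);
        }
        idByEmail.put(email, userId);
    }

    // Without an index: visit every row and compare the value.
    List<String> findByEmailScan(String email) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, String> row : usersById.entrySet()) {
            if (row.getValue().equals(email)) {
                ids.add(row.getKey());
            }
        }
        return ids;
    }

    // With the hand-built index: one keyed lookup, or null if absent.
    String findByEmailIndexed(String email) {
        return idByEmail.get(email);
    }

    public static void main(String[] args) {
        KeyedStoreSketch store = new KeyedStoreSketch();
        store.putUser("u1", "a@example.org");
        store.putUser("u2", "b@example.org");
        System.out.println(store.findByEmailScan("b@example.org"));
        System.out.println(store.findByEmailIndexed("b@example.org"));
    }
}
```

The second map is the part a relational database would maintain for you; in this sketch the application has to keep it consistent on every write.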

Ian

On Nov 14, 2011, at 8:14 AM, Joey Echeverria wrote:

> I don't think I would try to use a single-node HBase cluster to
> replace a MySQL database. HBase has a sweet spot, both in terms of
> scale and data access patterns. In general, it should not be viewed as
> a drop-in replacement for MySQL. My questions to you would be:
> 
> 1) How much data do you need to store?
> 2) What are your access patterns? Lots of joins, individual row
> lookups, range scans, etc.
> 
> -Joey


Re: HBase Stack

Posted by Joey Echeverria <jo...@cloudera.com>.
I don't think I would try to use a single-node HBase cluster to
replace a MySQL database. HBase has a sweet spot, both in terms of
scale and data access patterns. In general, it should not be viewed as
a drop-in replacement for MySQL. My questions to you would be:

1) How much data do you need to store?
2) What are your access patterns? Lots of joins, individual row
lookups, range scans, etc.
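
As a rough illustration of why the access-pattern question matters, the following sketch contrasts an individual row lookup with a range scan on a key-sorted store. A TreeMap stands in for the store (this is not the HBase client API), and the key layout "<userId>#<sequence>" as well as the names AccessPatternSketch, rowKey, lookup and scanUser are assumptions made up for the example. Joins have no counterpart in this model and would have to be done in application code.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration of two access patterns on a key-sorted store.
// A TreeMap stands in for the store; this is not the HBase client API.
class AccessPatternSketch {
    // Composite key "<userId>#<zero-padded sequence>" keeps one user's
    // rows next to each other in sort order.
    static String rowKey(String userId, int seq) {
        return String.format("%s#%06d", userId, seq);
    }

    // Individual row lookup: a single keyed read.
    static String lookup(TreeMap<String, String> table, String userId, int seq) {
        return table.get(rowKey(userId, seq));
    }

    // Range scan: all rows of one user, read as one contiguous key range.
    // '$' is the character right after '#', so the half-open range
    // ["user#", "user$") covers exactly the keys with prefix "user#".
    static SortedMap<String, String> scanUser(TreeMap<String, String> table, String userId) {
        return table.subMap(userId + "#", userId + "$");
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put(rowKey("alice", 1), "first post");
        table.put(rowKey("alice", 2), "second post");
        table.put(rowKey("bob", 1), "hello");
        System.out.println(lookup(table, "alice", 2));
        System.out.println(scanUser(table, "alice").keySet());
    }
}
```

Both reads are cheap only because the key was designed for them in advance; a question the key layout does not anticipate falls back to scanning everything.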

-Joey

On Mon, Nov 14, 2011 at 1:54 AM, Em <ma...@yahoo.de> wrote:
> Hello list,
>
> I was asked whether it is a good idea to replace the M in LAMP with
> HBase and the P with Java servlets (e.g. on Tomcat), so that you run
> your web server, your HBase instance, Hadoop, etc. on the same machine.
>
> Is the performance difference compared to a LAMP stack large?
>
> It is clear that a lot of benefits, like redundancy, are not
> available in this setup. However, if the idea and user base grow, you can
> quickly add these features to the environment by just setting up new
> machines and connecting them with each other.
>
> When I was asked about this, I had no answer.
> Hopefully you can shed some light on this!
>
> Kind regards,
> Em
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

Re: HBase Stack

Posted by Ayon Sinha <ay...@yahoo.com>.
HBase runs a lot of components as Java processes, and it is based on the premise that they will run on separate commodity machines. If you take a single commodity machine without much memory, CPU, or disk, you end up clogging the system with all these processes on the same hardware.

Now if, say, you think your application will be so hot that within a few months of launching you will need to move to HBase, then you may be better off starting with a single BIG server running HBase and then scaling out. But this is hypothetical, and I do not have personal experience to compare the pain of migrating from MySQL to HBase with the pain of scaling out a single-server HBase setup.

BTW, Lars George's book (HBase: The Definitive Guide) is very thick and comprehensive. (I am not affiliated with Lars or the book publisher. I'm just an owner/reader of that book.)
 
-Ayon



________________________________
From: Em <ma...@yahoo.de>
To: common-user@hadoop.apache.org
Sent: Tuesday, November 15, 2011 11:59 AM
Subject: Re: HBase Stack

That was exactly my idea if the answer is "don't use HBase until time
tells you to do so".

Can you go into a little more detail about why you think HBase is not a
good choice for a small project? Understanding the reasons (or reading some
references) would help a lot.

By the way, is this only relevant for HBase, or for Hadoop too?

Regards,
Em


Re: HBase Stack

Posted by Em <ma...@yahoo.de>.
That was exactly my idea if the answer is "don't use HBase until time
tells you to do so".

Can you go into a little more detail about why you think HBase is not a
good choice for a small project? Understanding the reasons (or reading some
references) would help a lot.

By the way, is this only relevant for HBase, or for Hadoop too?

Regards,
Em

On 15.11.2011 20:55, Joey Echeverria wrote:
> You can certainly run HBase on a single server, but I don't think
> you'd want to. Very few projects ever reach a scale where a single
> MySQL server can't handle it. In my opinion, you should start with the
> easy solution (MySQL) and only bring HBase into the mix when your
> scale really demands it. If you're worried about being locked into a
> specific technology, then spend your design efforts building
> interfaces that are easy to swap out when the time comes.
> 
> -Joey

Re: HBase Stack

Posted by Joey Echeverria <jo...@cloudera.com>.
You can certainly run HBase on a single server, but I don't think
you'd want to. Very few projects ever reach a scale where a single
MySQL server can't handle it. In my opinion, you should start with the
easy solution (MySQL) and only bring HBase into the mix when your
scale really demands it. If you're worried about being locked into a
specific technology, then spend your design efforts building
interfaces that are easy to swap out when the time comes.
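
A minimal sketch of what such a swappable interface could look like in Java follows. All names here (SwappableStoreSketch, PlayerStore, InMemoryPlayerStore, GreetingService) are hypothetical and chosen only for illustration; the in-memory class stands in for a real backend, and a MySQL-backed or HBase-backed class would implement the same two methods.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical example; none of these types come from MySQL or HBase.
class SwappableStoreSketch {

    // The application depends only on this interface.
    interface PlayerStore {
        void save(String playerId, String displayName);
        Optional<String> findName(String playerId);
    }

    // In-memory implementation standing in for a real backend.
    static class InMemoryPlayerStore implements PlayerStore {
        private final Map<String, String> names = new HashMap<>();

        @Override
        public void save(String playerId, String displayName) {
            names.put(playerId, displayName);
        }

        @Override
        public Optional<String> findName(String playerId) {
            return Optional.ofNullable(names.get(playerId));
        }
    }

    // Application code sees only PlayerStore, so swapping the backend
    // means changing the one line that constructs the store.
    static class GreetingService {
        private final PlayerStore store;

        GreetingService(PlayerStore store) {
            this.store = store;
        }

        String greet(String playerId) {
            return store.findName(playerId)
                    .map(name -> "Welcome back, " + name + "!")
                    .orElse("Welcome, new player!");
        }
    }

    public static void main(String[] args) {
        PlayerStore store = new InMemoryPlayerStore();
        store.save("p1", "Em");
        GreetingService service = new GreetingService(store);
        System.out.println(service.greet("p1"));
        System.out.println(service.greet("p2"));
    }
}
```

Keeping the interface in terms of keyed reads and writes, rather than SQL strings, is one way to avoid baking relational features into the rest of the code.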

-Joey

On Tue, Nov 15, 2011 at 2:17 PM, Em <ma...@yahoo.de> wrote:
> Thank you both for your answers.
>
> There is no real project.
> But the scenario we talked about was something like a community site,
> a browser game or something similar on a low budget (low in the sense
> that he is a student thinking about how to realize some of his
> ideas from a technical perspective).
> When it starts, the amount of data is relatively small and a large
> percentage will fit into RAM. However, if the project becomes more
> successful, one wants to focus on making the project more awesome
> (adding/improving features) instead of refactoring the whole
> data management and architecture.
>
> The question he asked was what I think about using HBase right from the
> beginning and where I expect problems. Since I have no experience with
> HBase, I had no answer.
> I don't even know whether it would be good advice to plan for a
> refactoring of the data management, if HBase is not an option for a
> single-server project (not even in the beginning).
>
> Regards,
> Em

-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434

Re: HBase Stack

Posted by Em <ma...@yahoo.de>.
Thank you both for your answers.

There is no real project.
But the scenario we talked about was something like a community site,
a browser game or something similar on a low budget (low in the sense
that he is a student thinking about how to realize some of his
ideas from a technical perspective).
When it starts, the amount of data is relatively small and a large
percentage will fit into RAM. However, if the project becomes more
successful, one wants to focus on making the project more awesome
(adding/improving features) instead of refactoring the whole
data management and architecture.

The question he asked was what I think about using HBase right from the
beginning and where I expect problems. Since I have no experience with
HBase, I had no answer.
I don't even know whether it would be good advice to plan for a
refactoring of the data management, if HBase is not an option for a
single-server project (not even in the beginning).

Regards,
Em

On 15.11.2011 19:08, Travis Camechis wrote:
> Agreed. What is the current size of your data?

Re: HBase Stack

Posted by Travis Camechis <ca...@gmail.com>.
Agreed. What is the current size of your data?

On Tue, Nov 15, 2011 at 12:54 PM, Ayon Sinha <ay...@yahoo.com> wrote:

> I believe one of the biggest problems you will face with HBase in a small
> setup is that MySQL is happy with a single-machine setup (less maintenance
> headache for small-scale projects), compared to HBase running in
> pseudo-distributed mode. In pseudo-distributed mode, a single HBase machine
> will have too much overhead. HBase will really shine when you grow really
> big and need to scale out. That's when HBase will pull ahead of MySQL
> really fast.
>
> This particular scenario is very well described in the book HBase: The
> Definitive Guide. When you have to grow, a LAMP stack needs things like
> memcached + sharding (lots of headaches), compared to HBase (where the
> headaches keep getting smaller with more community support and stability).
>
> -Ayon

Re: HBase Stack

Posted by Ayon Sinha <ay...@yahoo.com>.
I believe one of the biggest problems you will face with HBase in a small setup is that MySQL is happy with a single-machine setup (less maintenance headache for small-scale projects), compared to HBase running in pseudo-distributed mode. In pseudo-distributed mode, a single HBase machine will have too much overhead. HBase will really shine when you grow really big and need to scale out. That's when HBase will pull ahead of MySQL really fast.

This particular scenario is very well described in the book HBase: The Definitive Guide. When you have to grow, a LAMP stack needs things like memcached + sharding (lots of headaches), compared to HBase (where the headaches keep getting smaller with more community support and stability).
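
For readers who have not seen the "memcached + sharding" approach, here is a rough sketch of what the application ends up doing itself. It is only an illustration under stated assumptions: plain HashMaps stand in for the MySQL servers and for memcached, no real client library is used, and the names (ShardingSketch, shardFor) are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of application-side sharding with a read cache.
// HashMaps stand in for the database servers and for the cache.
class ShardingSketch {
    private final List<Map<String, String>> shards = new ArrayList<>();
    private final Map<String, String> cache = new HashMap<>();

    ShardingSketch(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new HashMap<>());
        }
    }

    // The application decides which server owns a key. floorMod keeps
    // the result non-negative even for negative hash codes.
    int shardFor(String key) {
        return Math.floorMod(key.hashCode(), shards.size());
    }

    void put(String key, String value) {
        shards.get(shardFor(key)).put(key, value);
        cache.remove(key); // drop any stale cached copy
    }

    // Cache-aside read: try the cache, fall back to the owning shard,
    // then remember the result for next time.
    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String value = shards.get(shardFor(key)).get(key);
        if (value != null) {
            cache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        ShardingSketch db = new ShardingSketch(4);
        db.put("user:42", "Em");
        System.out.println(db.shardFor("user:42") + " -> " + db.get("user:42"));
    }
}
```

With this simple modulo scheme, changing the number of shards changes the owner of most keys, so existing data has to be moved; that is a good part of the headache.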
 
-Ayon



________________________________
From: Em <ma...@yahoo.de>
To: common-user@hadoop.apache.org
Sent: Tuesday, November 15, 2011 9:38 AM
Subject: Re: HBase Stack

Hi Travis,

I think I wasn't very clear about my question:
If the project grows, you will be able to have machines optimized for
special tasks (HBase servers and Tomcat servers, maybe divided into
subgroups with special hardware requirements for more efficiency).
And this is what you should do if your project grows and you gain the
revenue necessary to pay for it.

My question was more targeted at the starting point of a (small) project:
How does a machine with Linux, Java (Tomcat) and MySQL compare with the
same setup with HBase being the database server?
For this example, one can assume that you access your data in MySQL by
primary key.

Regards,
Em


Re: HBase Stack

Posted by Em <ma...@yahoo.de>.
Hi Travis,

I think I wasn't very clear about my question:
If the project grows, you will be able to have machines optimized for
special tasks (HBase servers and Tomcat servers, maybe divided into
subgroups with special hardware requirements for more efficiency).
And this is what you should do if your project grows and you gain the
revenue necessary to pay for it.

My question was more targeted at the starting point of a (small) project:
How does a machine with Linux, Java (Tomcat) and MySQL compare with the
same setup with HBase being the database server?
For this example, one can assume that you access your data in MySQL by
primary key.

Regards,
Em

On 15.11.2011 17:41, Travis Camechis wrote:
> I don't think you would want to run all of this on the same machine,
> especially if your application/data requirements are fairly large.

Re: HBase Stack

Posted by Travis Camechis <ca...@gmail.com>.
I don't think you would want to run all of this on the same machine,
especially if your application/data requirements are fairly large.

On Tue, Nov 15, 2011 at 11:27 AM, Em <ma...@yahoo.de> wrote:

> Hello folks,
>
> It seems like you deal with HBase questions here.
>
> Below you will find my question.
>
> Thanks!
>
> Em
>
> -------- original message --------
> Hello list,
>
> I was asked whether it is a good idea to replace the M in LAMP with
> HBase and the P with Java servlets (e.g. on Tomcat), so that you run
> your web server, your HBase instance, Hadoop, etc. on the same machine.
>
> Is the performance difference compared to a LAMP stack large?
>
> It is clear that a lot of benefits, like redundancy, are not
> available in this setup. However, if the idea and user base grow, you can
> quickly add these features to the environment by just setting up new
> machines and connecting them with each other.
>
> When I was asked about this, I had no answer.
> Hopefully you can shed some light on this!
>
> Kind regards,
> Em
>
>

HBase Stack

Posted by Em <ma...@yahoo.de>.
Hello folks,

It seems like you deal with HBase questions here.

Below you will find my question.

Thanks!

Em

-------- original message --------
Hello list,

I was asked whether it is a good idea to replace the M in LAMP with
HBase and the P with Java servlets (e.g. on Tomcat), so that you run
your web server, your HBase instance, Hadoop, etc. on the same machine.

Is the performance difference compared to a LAMP stack large?

It is clear that a lot of benefits, like redundancy, are not
available in this setup. However, if the idea and user base grow, you can
quickly add these features to the environment by just setting up new
machines and connecting them with each other.

When I was asked about this, I had no answer.
Hopefully you can shed some light on this!

Kind regards,
Em