Posted to user@hbase.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2010/01/15 21:54:42 UTC

HBase on 1 box? how big?

Hello,

I understand running HBase on a single box is kind of
pointless (thanks Andrew Purtell for the reply about numbers of
boxes)... but I was wondering what kind of box might one need to
host/run various HBase/Hadoop processes?

Imagine I just need to have "HBase in a box", so to speak. :)

I understand it depends on the volume of data, DB structure, request rates...
I don't have those numbers, but say I want HBase to have 100M rows with 
data from Apache logs and want to run the common web analytics/stats 
reports on a nightly basis.

* Would an EC2 Large Instance suffice?
-- Large Instance 7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores
with 2 EC2 Compute Units each), 850 GB of local instance storage, 64-bit platform

* How about EC2 Small Instance?
-- Small Instance (Default) 1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core
with 1 EC2 Compute Unit), 160 GB of local instance storage, 32-bit platform

Thanks,
Otis
P.S.
hw specs from http://aws.amazon.com/ec2/#instance
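
One rough way to frame the sizing question is a back-of-the-envelope disk estimate. The sketch below uses the 100M rows and the local storage figures quoted in the post; the 500-byte average row size and the 3x overhead factor are assumptions picked for illustration, not measurements, and the function names are made up.

```python
# Back-of-the-envelope disk sizing for the scenario in the post.
ROWS = 100_000_000          # from the post: 100M rows of Apache log data
AVG_ROW_BYTES = 500         # ASSUMPTION: average raw size of one log row
OVERHEAD_FACTOR = 3.0       # ASSUMPTION: multiplier for storage overhead and headroom

# Local instance storage in GB, as quoted in the post.
INSTANCE_DISK_GB = {"small": 160, "large": 850}


def estimated_gb(rows, avg_row_bytes, overhead_factor):
    """Estimated on-disk footprint in decimal gigabytes."""
    return rows * avg_row_bytes * overhead_factor / 1e9


def fits_on_disk(instance, needed_gb):
    """True if the estimate is within the instance's local storage."""
    return needed_gb <= INSTANCE_DISK_GB[instance]


needed = estimated_gb(ROWS, AVG_ROW_BYTES, OVERHEAD_FACTOR)
for name in INSTANCE_DISK_GB:
    print(f"{name}: need ~{needed:.0f} GB, fits on disk: {fits_on_disk(name, needed)}")
```

This only covers disk. Memory and CPU for the nightly reports are not modeled, and the result moves a lot with the assumed row size.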

Re: HBase on 1 box? how big?

Posted by stack <st...@duboce.net>.
On Fri, Jan 15, 2010 at 2:30 PM, Otis Gospodnetic <
otis_gospodnetic@yahoo.com> wrote:

> For example, you may have an app that you want to demo to a customer, and
> you can't ask them for N boxes for the demo.  But you can ask them for 1 box
> to install something on.
>

Can't you do this now?  Just do ./bin/start-hbase.sh with the default
config?  It requires ssh'ing to localhost but that ain't too hard to set up?



> Or maybe you can run everything from a memory stick? ;)  Hey, is there a
> technical reason why having all jars, scripts, configs, etc. on a stick, and
> have the configs point to dirs on the stick for holding data?  I'm not
> joking, really! :)
>
>
You can point HBase to a non-standard location for configs -- see hbase-env.sh
-- and do the same for logging, so I don't see a reason why you couldn't do
hbase-on-a-stick (could go nicely with a few of those jumbo shrimp).

St.Ack
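
A minimal sketch of the "on a stick" idea: generate an hbase-site.xml whose data directories all live under one mount point. The property names `hbase.rootdir`, `hbase.tmp.dir` and `hbase.zookeeper.property.dataDir` are HBase settings as I understand them, so check them against the hbase-default.xml of your version. The mount point `/media/stick`, the directory layout, and the helper name are hypothetical.

```python
import xml.etree.ElementTree as ET


def stick_site_xml(mount_point):
    """Build hbase-site.xml text that keeps all HBase data under mount_point."""
    # ASSUMPTION: this layout under the mount point is illustrative only.
    props = {
        "hbase.rootdir": f"file://{mount_point}/hbase/data",
        "hbase.tmp.dir": f"{mount_point}/hbase/tmp",
        "hbase.zookeeper.property.dataDir": f"{mount_point}/hbase/zookeeper",
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical mount point for the memory stick.
print(stick_site_xml("/media/stick"))
```

The generated file would go into whatever conf directory HBase is pointed at; how that directory and the log location are selected is left to hbase-env.sh, as the reply suggests.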




> Thanks,
> Otis
>
>
> ----- Original Message ----
> > From: Andrew Purtell <ap...@apache.org>
> > To: hbase-user@hadoop.apache.org
> > Sent: Fri, January 15, 2010 4:17:35 PM
> > Subject: Re: HBase on 1 box? how big?
> >
> > On that scale, why not use MySQL or Postgres?
> >
> > "HBase in a box" is like "dynamic equilibrium", or "virtual reality", or
> > "jumbo shrimp"... :-)
> >
> >   - Andy

RE: HBase on 1 box? how big?

Posted by Jonathan Gray <jl...@streamy.com>.
A bit late to the party but my two cents...

I am currently using a single node HBase instance in production (beta) for a
client.

The use case is simply to add random access capabilities atop some large
HDFS files.  It's static data (rebuilt every few weeks) and close to 1TB or
so (with plans to be more than 10X that within months).  Attempts at loading
it into simpler KV stores or MySQL proved to be very time consuming.
Instead I simply converted from the existing MapFiles into HFiles using
HFileOutputFormat, and am serving it using a single node instance of HBase.
There is no attempt at high availability, obviously.

The lookups are fast enough (the slowest take tens of ms), and there is no
significant concurrency (10 req/sec at the high end), so that is not a concern
right now.  I can rebuild the entire DB in a few minutes, and hot data gets
cached via the LRU block cache.
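
To illustrate why repeated reads of hot data get cheap, here is a toy least-recently-used cache keyed by block id. It is not HBase's block cache implementation, only the general LRU idea; the class name, the loader callback, and sizing the cache in blocks rather than bytes are all assumptions.

```python
from collections import OrderedDict


class ToyLruBlockCache:
    """Toy LRU cache: keeps the most recently read blocks, evicts the oldest."""

    def __init__(self, capacity, load_block):
        self.capacity = capacity        # max number of cached blocks (assumption: counted, not sized)
        self.load_block = load_block    # hypothetical callback that reads a block from disk
        self.blocks = OrderedDict()     # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            self.hits += 1
            return self.blocks[block_id]
        # Miss: load the block, cache it, and evict the least recently used one if full.
        self.misses += 1
        data = self.load_block(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return data


cache = ToyLruBlockCache(capacity=2, load_block=lambda b: f"data-for-{b}")
for block in ["a", "b", "a", "c", "b"]:
    cache.get(block)
print(cache.hits, cache.misses)
```

With a small capacity and a skewed access pattern, the blocks that are read often stay cached while cold ones fall out.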

It's also ready to scale out as necessary and gives us more capacity than we
would ever need just by adding more nodes.

But be warned... Not only can this kind of setup not give you high
availability, if you aren't careful, you'll get quite low availability.  The
kinds of shops that might run a single node of HBase might also be sharing
that node with other processes/services.  Be careful not to cause CPU/IO
starvation as GC pauses, ZK timeouts, etc... can take down a single node of
HBase rather easily.

I'm not sure single nodes of HBase make much sense for MapReduce/analytics
workloads.  The reason it works well in this situation is that the concurrency
is very low and the reads are fully random, single-key lookups.  HBase is
really just adding a small layer above HDFS, acting as an indexed HFile reader
and block cacher.  Streaming data to/from HBase will always be less efficient
than to/from HDFS.

JG

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Saturday, February 06, 2010 2:11 PM
To: hbase-user@hadoop.apache.org
Subject: Re: HBase on 1 box? how big?

On Thu, Feb 4, 2010 at 12:40 PM, Ross Rick <ri...@semanticresearch.com> wrote:
> I would want to use a properly scaled HBase for the desktop, and then, as
> is appropriate for my app, push that data to cluster.   This process is
> nearly seamless when the underlying language is the same.  Success in the
> future isn't likely just about how big you can get.  Dare I say it, it's
> probably more like 'rightsizing' your data.
>

You are right.  We should have a better one-box story than we do.
Development is generally off at the other end of the scale, on making
the multinode installs run smoothly, with a working one-box getting only
as much attention as it takes to run unit tests and simple loadings.
Our other-than-single-box -- or two to three nodes even -- focus has
probably hurt us over time since that's what noobs start on.  If they
don't get a good feeling running on a small cluster, why would they
expect it to be different when they move beyond that?

It's my sense that if someone was up for working on our one-box story,
they'd only get encouragement.

Thanks for writing,
St.Ack



Re: HBase on 1 box? how big?

Posted by Stack <st...@duboce.net>.
On Thu, Feb 4, 2010 at 12:40 PM, Ross Rick <ri...@semanticresearch.com> wrote:
> I would want to use a properly scaled HBase for the desktop, and then, as is appropriate for my app, push that data to cluster.   This process is nearly seamless when the underlying language is the same.  Success in the future isn't likely just about how big you can get.  Dare I say it, it's probably more like 'rightsizing' your data.
>

You are right.  We should have a better one-box story than we do.
Development is generally off at the other end of the scale, on making
the multinode installs run smoothly, with a working one-box getting only
as much attention as it takes to run unit tests and simple loadings.
Our other-than-single-box -- or two to three nodes even -- focus has
probably hurt us over time since that's what noobs start on.  If they
don't get a good feeling running on a small cluster, why would they
expect it to be different when they move beyond that?

It's my sense that if someone was up for working on our one-box story,
they'd only get encouragement.

Thanks for writing,
St.Ack

Re: HBase on 1 box? how big?

Posted by Hubert Chang <hu...@gmail.com>.
Agree with you.

One person can deploy the WordPress blog system as his site, and one big
enterprise can deploy the same WordPress system as its enterprise blog platform.

But with HBase, you could not develop a WordPress-like product because it's
suited for 5 or more nodes and not for 1 node.


Ross Rick-2 wrote:
> 
> Allow me to disagree and take a few arrows here.  Not picking at what is
> currently being proposed, but rather with a perception of the future.   
> 
> In my mind, Applications are going to continue to blur the lines from
> phone to cloud.   Users are not normally inclined to accept answers like,
> "Well this is a different platform, you have to do something completely
> different".   They expect differences between platforms, but the successes
> of the future will likely smooth those differences rather than accentuate
> them. 
> 
> Developers will surely want to use the same storage paradigm for all
> scales and let the different system manage their scales.   For example, my
> application uses HSQLDB for the desktop and Oracle/Whatever for
> Enterprise, but damn near the same model is used. 
> 
> I would want to use a properly scaled HBase for the desktop, and then, as
> is appropriate for my app, push that data to cluster.   This process is
> nearly seamless when the underlying language is the same.  Success in the
> future isn't likely just about how big you can get.  Dare I say it, it's
> probably more like 'rightsizing' your data.
> 
> Rick
> 
>  



Re: HBase on 1 box? how big?

Posted by Ross Rick <ri...@semanticresearch.com>.
Allow me to disagree and take a few arrows here.  Not picking at what is currently being proposed, but rather with a perception of the future.   

In my mind, applications are going to continue to blur the lines from phone to cloud.   Users are not normally inclined to accept answers like, "Well this is a different platform, you have to do something completely different".   They expect differences between platforms, but the successes of the future will likely smooth those differences rather than accentuate them.

Developers will surely want to use the same storage paradigm for all scales and let the different systems manage their scales.   For example, my application uses HSQLDB for the desktop and Oracle/Whatever for Enterprise, but damn near the same model is used.

I would want to use a properly scaled HBase for the desktop, and then, as is appropriate for my app, push that data to cluster.   This process is nearly seamless when the underlying language is the same.  Success in the future isn't likely just about how big you can get.  Dare I say it, it's probably more like 'rightsizing' your data.

Rick

 
On Jan 15, 2010, at 4:45 PM, Andrew Purtell wrote:

> As long as we are all clear about the usefulness of a single host system.
> For map-reduce over BigTable, nothing more than development, functional
> testing, and toy demo scenarios. 
> 
>   - Andy


Re: HBase on 1 box? how big?

Posted by Andrew Purtell <ap...@apache.org>.
As long as we are all clear about the usefulness of a single host system.
For map-reduce over BigTable, nothing more than development, functional
testing, and toy demo scenarios. 

   - Andy



----- Original Message ----
> From: Otis Gospodnetic <ot...@yahoo.com>
> To: hbase-user@hadoop.apache.org
> Sent: Fri, January 15, 2010 2:30:34 PM
> Subject: Re: HBase on 1 box? how big?
> 
> Heh, I like the analogies! :)
> Yes, it makes no sense to use HBase for production data volumes, etc., but this 
> might be handy for development.
> Or for a demo that needs to consists of the same pieces (daemons, configs, etc.) 
> on 1 box, so that one can easily move it to a proper, big cluster, without 
> re-engineering or replacing any of the components.
> 
> For example, you may have an app that you want to demo to a customer, and you 
> can't ask them for N boxes for the demo.  But you can ask them for 1 box to 
> install something on.
> 
> Or maybe you can run everything from a memory stick? ;)  Hey, is there a 
> technical reason why having all jars, scripts, configs, etc. on a stick, and 
> have the configs point to dirs on the stick for holding data?  I'm not joking, 
> really! :)
> 
> Thanks,
> Otis


Re: HBase on 1 box? how big?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Heh, I like the analogies! :)
Yes, it makes no sense to use HBase for production data volumes, etc., but this might be handy for development.
Or for a demo that needs to consist of the same pieces (daemons, configs, etc.) on 1 box, so that one can easily move it to a proper, big cluster, without re-engineering or replacing any of the components.

For example, you may have an app that you want to demo to a customer, and you can't ask them for N boxes for the demo.  But you can ask them for 1 box to install something on.

Or maybe you can run everything from a memory stick? ;)  Hey, is there a technical reason why having all jars, scripts, configs, etc. on a stick, and having the configs point to dirs on the stick for holding data, wouldn't work?  I'm not joking, really! :)

Thanks,
Otis


----- Original Message ----
> From: Andrew Purtell <ap...@apache.org>
> To: hbase-user@hadoop.apache.org
> Sent: Fri, January 15, 2010 4:17:35 PM
> Subject: Re: HBase on 1 box? how big?
> 
> On that scale, why not use MySQL or Postgres?
> 
> "HBase in a box" is like "dynamic equilibrium", or "virtual reality", or
> "jumbo shrimp"... :-)
> 
>   - Andy


Re: HBase on 1 box? how big?

Posted by Chris Staszak <cs...@gmail.com>.
+1 for this feature.

I understand some of the questioning along the lines of "why not use
PostgreSQL/MySQL" for a data store that just runs on one host.

However, the driver for me (and I suspect for a growing number of
people) is to write one piece of code that runs at any scale. For some
uses a single host/jvm makes perfect sense: development, demos or
limited production data size and transaction volume.

Furthermore, this could greatly simplify demos or small scale
deployments on Windows (removing the ssh requirement).

On Fri, Jan 15, 2010 at 2:42 PM, stack <st...@duboce.net> wrote:
> How about we add a 'standalone' argument to bin/hbase?  It'd check the
> hbase-site.xml to see it has right standalone basic config. and then it'd
> pass switches to start all up in the one JVM?
> St.Ack
>
> On Fri, Jan 15, 2010 at 2:27 PM, Ryan Rawson <ry...@gmail.com> wrote:
>
>> i hadda resolve like 40 files of conflicts :-/
>>
>> what i really need though is a tool so that start-hbase.sh wont do the
>> 'normal' thing and just do hbase-daemon.sh start master when running
>> in standalone mode.
>>
>> -ryan
>>
>> On Fri, Jan 15, 2010 at 2:23 PM, Otis Gospodnetic
>> <ot...@yahoo.com> wrote:
>> > Sounds like a yummy patch, Ryan, if you need another nudge. :)
>> >
>> > Otis
>> >
>> >
>> >
>> > ----- Original Message ----
>> >> From: Ryan Rawson <ry...@gmail.com>
>> >> To: hbase-user@hadoop.apache.org
>> >> Sent: Fri, January 15, 2010 5:00:42 PM
>> >> Subject: Re: HBase on 1 box? how big?
>> >>
>> >> Yes I do plan on releasing a patch, but i need to rebase it to trunk.
>> >> It moves a class from test -> java (ie; the ZK in JVM startup class).
>> >>
>> >> maybe soon?
>> >> -ryan
>> >>
>> >> On Fri, Jan 15, 2010 at 1:57 PM, Andrew Purtell wrote:
>> >> > That would be good for developing disconnected against the API. Any
>> >> > plan on releasing a patch Ryan?
>> >> >
>> >> >   - Andy
>> >> >
>> >> >
>> >> >
>> >> > ----- Original Message ----
>> >> >> From: Ryan Rawson
>> >> >> To: hbase-user@hadoop.apache.org
>> >> >> Sent: Fri, January 15, 2010 1:50:32 PM
>> >> >> Subject: Re: HBase on 1 box? how big?
>> >> >>
>> >> >> You can run HBase on any size of machine all single node, by default
>> >> >> when you start hbase it will store files in /tmp and everything is in
>> >> >> 1 JVM.  How much data can you jam in there?  I'm not totally sure,
>> >> >> probably a lot more than you might think, but again limited by the
>> >> >> disk.  I run it on my mac laptop for example.
>> >> >>
>> >> >> I have a patch that will allow a single JVM including zookeeper, but
>> >> >> it is locked up in my private git for now. This would get rid of the
>> >> >> need to ssh localhost just to start local hbase.
>> >> >>
>> >> >> -ryan
>> >> >>
>> >> >> On Fri, Jan 15, 2010 at 1:22 PM, Seth Ladd wrote:
>> >> >> > I agree.  HBase in a box is essentially MySQL.  HBase is built for
>> >> >> > a cluster.
>> >> >> >
>> >> >> > On Fri, Jan 15, 2010 at 11:17 AM, Andrew Purtell wrote:
>> >> >> >> On that scale, why not use MySQL or Postgres?
>> >> >> >>
>> >> >> >> "HBase in a box" is like "dynamic equilibrium", or "virtual
>> >> >> >> reality", or "jumbo shrimp"... :-)
>> >> >> >>
>> >> >> >>  - Andy

Re: HBase on 1 box? how big?

Posted by stack <st...@duboce.net>.
How about we add a 'standalone' argument to bin/hbase?  It'd check
hbase-site.xml to see that it has the right basic standalone config, and then
it'd pass switches to start everything up in the one JVM?
St.Ack
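
The config check in that proposal could look roughly like the sketch below. It is not part of bin/hbase; it only parses an hbase-site.xml and reports whether it looks like a standalone setup. It relies on `hbase.cluster.distributed` being the property that switches HBase out of standalone mode, with the assumption that an absent property means standalone. The function name and sample XML are made up.

```python
import xml.etree.ElementTree as ET


def looks_standalone(site_xml_text):
    """Return True if the given hbase-site.xml text does not enable distributed mode."""
    root = ET.fromstring(site_xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "hbase.cluster.distributed":
            value = (prop.findtext("value") or "").strip().lower()
            return value != "true"
    # ASSUMPTION: if the property is absent, the default is standalone.
    return True


# Hypothetical sample config for illustration.
sample = """
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
</configuration>
"""
print(looks_standalone(sample))
```

A real 'standalone' switch would presumably also have to decide how to start the master and ZooKeeper in one JVM, which this sketch does not attempt.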

On Fri, Jan 15, 2010 at 2:27 PM, Ryan Rawson <ry...@gmail.com> wrote:

> i hadda resolve like 40 files of conflicts :-/
>
> what i really need though is a tool so that start-hbase.sh wont do the
> 'normal' thing and just do hbase-daemon.sh start master when running
> in standalone mode.
>
> -ryan

Re: HBase on 1 box? how big?

Posted by Ryan Rawson <ry...@gmail.com>.
I had to resolve like 40 files of conflicts :-/

What I really need, though, is a tool so that start-hbase.sh won't do the
'normal' thing and instead just does hbase-daemon.sh start master when
running in standalone mode.

-ryan

On Fri, Jan 15, 2010 at 2:23 PM, Otis Gospodnetic
<ot...@yahoo.com> wrote:
> Sounds like a yummy patch, Ryan, if you need another nudge. :)
>
> Otis

Re: HBase on 1 box? how big?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Sounds like a yummy patch, Ryan, if you need another nudge. :)

Otis



----- Original Message ----
> From: Ryan Rawson <ry...@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Fri, January 15, 2010 5:00:42 PM
> Subject: Re: HBase on 1 box? how big?
> 
> Yes I do plan on releasing a patch, but i need to rebase it to trunk.
> It moves a class from test -> java (ie; the ZK in JVM startup class).
> 
> maybe soon?
> -ryan


Re: HBase on 1 box? how big?

Posted by Ryan Rawson <ry...@gmail.com>.
Yes, I do plan on releasing a patch, but I need to rebase it onto trunk.
It moves a class from test -> java (i.e., the ZK-in-JVM startup class).

maybe soon?
-ryan

On Fri, Jan 15, 2010 at 1:57 PM, Andrew Purtell <ap...@apache.org> wrote:
> That would be good for developing disconnected against the API. Any
> plan on releasing a patch Ryan?
>
>   - Andy

Re: HBase on 1 box? how big?

Posted by Andrew Purtell <ap...@apache.org>.
That would be good for developing against the API while disconnected. Any
plans to release a patch, Ryan?

   - Andy



----- Original Message ----
> From: Ryan Rawson <ry...@gmail.com>
> To: hbase-user@hadoop.apache.org
> Sent: Fri, January 15, 2010 1:50:32 PM
> Subject: Re: HBase on 1 box? how big?
> 
> You can run HBase on any size of machine all single node, by default
> when you start hbase it will store files in /tmp and everything is in
> 1 JVM.  How much data can you jam in there?  I'm not totally sure,
> probably a lot more than you might think, but again limited by the
> disk.  I run it on my mac laptop for example.
> 
> I have a patch that will allow a single JVM including zookeeper, but
> it is locked up in my private git for now. This would get rid of the
> need to ssh localhost just to start local hbase.
> 
> -ryan

Re: HBase on 1 box? how big?

Posted by Jean-Daniel Cryans <jd...@apache.org>.
What a tease Ryan! ;)

J-D

On Fri, Jan 15, 2010 at 1:50 PM, Ryan Rawson <ry...@gmail.com> wrote:
> You can run HBase on any size of machine all single node, by default
> when you start hbase it will store files in /tmp and everything is in
> 1 JVM.  How much data can you jam in there?  I'm not totally sure,
> probably a lot more than you might think, but again limited by the
> disk.  I run it on my mac laptop for example.
>
> I have a patch that will allow a single JVM including zookeeper, but
> it is locked up in my private git for now. This would get rid of the
> need to ssh localhost just to start local hbase.
>
> -ryan

Re: HBase on 1 box? how big?

Posted by Ryan Rawson <ry...@gmail.com>.
You can run HBase on any size of machine, all on a single node. By default,
when you start HBase it stores its files in /tmp and everything runs in
1 JVM.  How much data can you jam in there?  I'm not totally sure,
probably a lot more than you might think, but again it's limited by the
disk.  I run it on my Mac laptop, for example.
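
As a rough way to reason about that disk limit, here is a back-of-envelope
sketch in Python. The 500 bytes per row, the 3x on-disk overhead, and the
50% headroom for compactions are all made-up placeholder assumptions, not
measurements; only the row count and the two disk sizes come from the
question. Plug in your own numbers.

```python
# Back-of-envelope disk estimate for a single-node HBase table.
# Every constant marked "assumed" is a placeholder, not a measurement.

ROWS = 100_000_000        # from the question: 100M rows
AVG_ROW_BYTES = 500       # assumed: raw bytes of log data per row
STORAGE_OVERHEAD = 3.0    # assumed: multiplier for keys/timestamps stored with each cell
GB = 1000 ** 3


def estimated_gb(rows, avg_row_bytes, overhead):
    """Estimated on-disk size in GB under the assumptions above."""
    return rows * avg_row_bytes * overhead / GB


def fits(needed_gb, disk_gb, headroom=0.5):
    """True if the data needs no more than `headroom` of the disk.

    The 0.5 default is an assumption that leaves room for temporary
    files written while store files are being rewritten.
    """
    return needed_gb <= disk_gb * headroom


if __name__ == "__main__":
    needed = estimated_gb(ROWS, AVG_ROW_BYTES, STORAGE_OVERHEAD)
    # Disk sizes are the EC2 local storage figures quoted in the question.
    for name, disk_gb in (("EC2 Small", 160), ("EC2 Large", 850)):
        verdict = "fits" if fits(needed, disk_gb) else "too tight"
        print(f"{name}: ~{needed:.0f} GB needed of {disk_gb} GB -> {verdict}")
```

The answer swings entirely on the per-row size and overhead you assume, so
measure a sample of your real log data before trusting it.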

I have a patch that will allow a single JVM, including ZooKeeper, but
it is locked up in my private git for now. It would get rid of the
need to ssh to localhost just to start a local HBase.

-ryan

On Fri, Jan 15, 2010 at 1:22 PM, Seth Ladd <se...@gmail.com> wrote:
> I agree.  HBase in a box is essentially MySQL.  HBase is built for a cluster.

Re: HBase on 1 box? how big?

Posted by Seth Ladd <se...@gmail.com>.
I agree.  HBase in a box is essentially MySQL.  HBase is built for a cluster.

On Fri, Jan 15, 2010 at 11:17 AM, Andrew Purtell <ap...@apache.org> wrote:
> On that scale, why not use MySQL or Postgres?
>
> "HBase in a box" is like "dynamic equilibrium", or "virtual reality", or
> "jumbo shrimp"... :-)
>
>  - Andy

Re: HBase on 1 box? how big?

Posted by stack <st...@duboce.net>.
On Fri, Jan 15, 2010 at 1:17 PM, Andrew Purtell <ap...@apache.org> wrote:

>
> "HBase in a box" is like "dynamic equilibrium", or "virtual reality", or
> "jumbo shrimp"... :-)
>
>
Andrew, that's funny.
St.Ack

Re: HBase on 1 box? how big?

Posted by Andrew Purtell <ap...@apache.org>.
On that scale, why not use MySQL or Postgres?

"HBase in a box" is like "dynamic equilibrium", or "virtual reality", or
"jumbo shrimp"... :-)

  - Andy



----- Original Message ----
> From: Otis Gospodnetic <ot...@yahoo.com>
> To: hbase-user@hadoop.apache.org
> Sent: Fri, January 15, 2010 12:54:42 PM
> Subject: HBase on 1 box? how big?
> 
> Hello,
> 
> I understand running HBase on a single box is kind of
> pointless (thanks Andrew Purtell for the reply about numbers of
> boxes)... but I was wondering what kind of box might one need to
> host/run various HBase/Hadoop processes?
> 
> Imagine I just need to have "HBase in a box", so to speak. :)
> 
> I understand it depends on the volume on data, DB structure, request rates...
> I don't have those numbers, but say I want HBase to have 100M rows with 
> data from Apache logs and want to run the common web analytics/stats 
> reports on a nightly basis.
> 
> * Would an EC2 Large Instance suffice?
> -- Large Instance 7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores
> with 2 EC2 Compute Units each), 850 GB of local instance storage, 64-bit 
> platform
> 
> * How about EC2 Small Instance?
> -- Small Instance (Default) 1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core
> with 1 EC2 Compute Unit), 160 GB of local instance storage, 32-bit platform
> 
> Thanks,
> Otis
> P.S.
> hw specs from http://aws.amazon.com/ec2/#instance