Posted to user@hbase.apache.org by Julian Wissmann <ju...@gmail.com> on 2014/11/21 15:19:51 UTC

Current Deployment Sizes

Hi,

I'm currently writing my thesis, part of which is about HBase. I was wondering
if there are some current numbers for large deployments, e.g. Facebook or
Yahoo. I'm particularly interested in things like number of nodes, amount
of data managed and (if available) query throughput.


The most recent information I was able to find is from 2012. If anyone has
some more recent numbers, or knows where to find them, I'd be happy for
some hints in the right direction (or a "let me google that for you" link, if
in fact I was just too stupid to search the right thing ;-) ).

Regards,
Julian

Re: Current Deployment Sizes

Posted by Julian Wissmann <ju...@gmail.com>.
Exactly!

Regards,

Julian

2014-11-21 17:16 GMT+01:00 Birdsall, Dave <da...@hp.com>:

> Hi Julian,
>
> I don't have an answer to your question, but I want to better understand
> your question: You are looking for data on the largest HBase deployments in
> practice, correct?
>
> Regards,
>
> Dave
>
> -----Original Message-----
> From: Julian Wissmann [mailto:julianwissmann@gmail.com]
> Sent: Friday, November 21, 2014 7:43 AM
> To: user@hbase.apache.org
> Subject: Re: Current Deployment Sizes
>
> Hi,
>
> thank you! The meetup link comes in handy. However this is not the answer
> to the question I asked (or maybe I wasn't clear enough).
>
> I am well aware of the sizing notes etc. However what I am looking for are
> some hard numbers considering actual scale in the real world. I can write a
> lot about how far in theory hbase could scale, however I'd like to have
> some hard numbers as to how far big deployments push this, today. For
> example the number of nodes Facebook or Yahoo are running and what amount
> of data they are crunching on those. This is where I have trouble finding
> current information!
>
> Regards
> Julian
>
> 2014-11-21 16:20 GMT+01:00 Ted Yu <yu...@gmail.com>:
>
> > Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?
> >
> > I think the following talks are relevant to your thesis:
> > HBase-at-twitter
> > <http://files.meetup.com/1350427/HBase-at-twitter.pdf>
> > HBase Sizing Notes
> > <http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf>
> >
> > On Fri, Nov 21, 2014 at 6:19 AM, Julian Wissmann
> > <julianwissmann@gmail.com
> > >
> > wrote:
> >
> > > Hi,
> > >
> > > I'm currently writing my thesis, in part it is about HBase. I was
> > wondering
> > > if there are some current numbers for large deployments, e.g.
> > > Facebook or Yahoo. I'm particularly interested in things like number
> > > of nodes, amount of data managed and (if available) query throughput.
> > >
> > >
> > > The most recent information I was able to find is from 2012. If
> > > anyone
> > has
> > > some more recent numbers, or knows where to find them, I'd be happy
> > > for some hints in the right direction (or a let me google that for
> > > you link,
> > if
> > > in fact I was just too stupid to search the right thing ;-) ).
> > >
> > > Regards,
> > > Julian
> > >
> >
>

Re: Current Deployment Sizes

Posted by Shahab Yunus <sh...@gmail.com>.
I think your best bet, to get the latest and accurate as possible data,
would be to directly contact the companies (through their Engineering
channels) which are known to host large clusters. Most of these companies
have public blogs and such so should not be hard to find an appropriate
contact.

Regards,
Shahab

On Fri, Nov 21, 2014 at 11:16 AM, Birdsall, Dave <da...@hp.com>
wrote:

> Hi Julian,
>
> I don't have an answer to your question, but I want to better understand
> your question: You are looking for data on the largest HBase deployments in
> practice, correct?
>
> Regards,
>
> Dave
>
> -----Original Message-----
> From: Julian Wissmann [mailto:julianwissmann@gmail.com]
> Sent: Friday, November 21, 2014 7:43 AM
> To: user@hbase.apache.org
> Subject: Re: Current Deployment Sizes
>
> Hi,
>
> thank you! The meetup link comes in handy. However this is not the answer
> to the question I asked (or maybe I wasn't clear enough).
>
> I am well aware of the sizing notes etc. However what I am looking for are
> some hard numbers considering actual scale in the real world. I can write a
> lot about how far in theory hbase could scale, however I'd like to have
> some hard numbers as to how far big deployments push this, today. For
> example the number of nodes Facebook or Yahoo are running and what amount
> of data they are crunching on those. This is where I have trouble finding
> current information!
>
> Regards
> Julian
>
> 2014-11-21 16:20 GMT+01:00 Ted Yu <yu...@gmail.com>:
>
> > Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?
> >
> > I think the following talks are relevant to your thesis:
> > HBase-at-twitter
> > <http://files.meetup.com/1350427/HBase-at-twitter.pdf>
> > HBase Sizing Notes
> > <http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf>
> >
> > On Fri, Nov 21, 2014 at 6:19 AM, Julian Wissmann
> > <julianwissmann@gmail.com
> > >
> > wrote:
> >
> > > Hi,
> > >
> > > I'm currently writing my thesis, in part it is about HBase. I was
> > wondering
> > > if there are some current numbers for large deployments, e.g.
> > > Facebook or Yahoo. I'm particularly interested in things like number
> > > of nodes, amount of data managed and (if available) query throughput.
> > >
> > >
> > > The most recent information I was able to find is from 2012. If
> > > anyone
> > has
> > > some more recent numbers, or knows where to find them, I'd be happy
> > > for some hints in the right direction (or a let me google that for
> > > you link,
> > if
> > > in fact I was just too stupid to search the right thing ;-) ).
> > >
> > > Regards,
> > > Julian
> > >
> >
>

RE: Current Deployment Sizes

Posted by "Birdsall, Dave" <da...@hp.com>.
Hi Julian,

I don't have an answer to your question, but I want to better understand your question: You are looking for data on the largest HBase deployments in practice, correct?

Regards,

Dave

-----Original Message-----
From: Julian Wissmann [mailto:julianwissmann@gmail.com] 
Sent: Friday, November 21, 2014 7:43 AM
To: user@hbase.apache.org
Subject: Re: Current Deployment Sizes

Hi,

thank you! The meetup link comes in handy. However this is not the answer to the question I asked (or maybe I wasn't clear enough).

I am well aware of the sizing notes etc. However what I am looking for are some hard numbers considering actual scale in the real world. I can write a lot about how far in theory hbase could scale, however I'd like to have some hard numbers as to how far big deployments push this, today. For example the number of nodes Facebook or Yahoo are running and what amount of data they are crunching on those. This is where I have trouble finding current information!

Regards
Julian

2014-11-21 16:20 GMT+01:00 Ted Yu <yu...@gmail.com>:

> Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?
>
> I think the following talks are relevant to your thesis:
> HBase-at-twitter 
> <http://files.meetup.com/1350427/HBase-at-twitter.pdf>
> HBase Sizing Notes
> <http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf>
>
> On Fri, Nov 21, 2014 at 6:19 AM, Julian Wissmann 
> <julianwissmann@gmail.com
> >
> wrote:
>
> > Hi,
> >
> > I'm currently writing my thesis, in part it is about HBase. I was
> wondering
> > if there are some current numbers for large deployments, e.g.
> > Facebook or Yahoo. I'm particularly interested in things like number 
> > of nodes, amount of data managed and (if available) query throughput.
> >
> >
> > The most recent information I was able to find is from 2012. If 
> > anyone
> has
> > some more recent numbers, or knows where to find them, I'd be happy 
> > for some hints in the right direction (or a let me google that for 
> > you link,
> if
> > in fact I was just too stupid to search the right thing ;-) ).
> >
> > Regards,
> > Julian
> >
>

Re: Current Deployment Sizes

Posted by Julian Wissmann <ju...@gmail.com>.
Cool!

Thanks a lot. That is exactly what I was looking for.

Cheers and a nice weekend

Julian
On 21.11.2014 18:29, "Ted Yu" <yu...@gmail.com> wrote:

> Take a look at slide #4 in this talk:
> http://www.slideshare.net/ddlatham/hbase-at-flurry
>
> Cheers
>
> On Fri, Nov 21, 2014 at 7:43 AM, Julian Wissmann <julianwissmann@gmail.com
> >
> wrote:
>
> > Hi,
> >
> > thank you! The meetup link comes in handy. However this is not the answer
> > to the question I asked (or maybe I wasn't clear enough).
> >
> > I am well aware of the sizing notes etc. However what I am looking for
> are
> > some hard numbers considering actual scale in the real world. I can
> write a
> > lot about how far in theory hbase could scale, however I'd like to have
> > some hard numbers as to how far big deployments push this, today. For
> > example the number of nodes Facebook or Yahoo are running and what amount
> > of data they are crunching on those. This is where I have trouble finding
> > current information!
> >
> > Regards
> > Julian
> >
> > 2014-11-21 16:20 GMT+01:00 Ted Yu <yu...@gmail.com>:
> >
> > > Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?
> > >
> > > I think the following talks are relevant to your thesis:
> > > HBase-at-twitter
> > > <http://files.meetup.com/1350427/HBase-at-twitter.pdf>
> > > HBase Sizing Notes
> > > <http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf>
> > >
> > > On Fri, Nov 21, 2014 at 6:19 AM, Julian Wissmann <
> > julianwissmann@gmail.com
> > > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I'm currently writing my thesis, in part it is about HBase. I was
> > > wondering
> > > > if there are some current numbers for large deployments, e.g. Facebook
> > or
> > > > Yahoo. I'm particularly interested in things like number of nodes,
> > amount
> > > > of data managed and (if available) query throughput.
> > > >
> > > >
> > > > The most recent information I was able to find is from 2012. If
> anyone
> > > has
> > > > some more recent numbers, or knows where to find them, I'd be happy
> for
> > > > some hints in the right direction (or a let me google that for you
> > link,
> > > if
> > > > in fact I was just too stupid to search the right thing ;-) ).
> > > >
> > > > Regards,
> > > > Julian
> > > >
> > >
> >
>

Re: Current Deployment Sizes

Posted by Ted Yu <yu...@gmail.com>.
Take a look at slide #4 in this talk:
http://www.slideshare.net/ddlatham/hbase-at-flurry

Cheers

On Fri, Nov 21, 2014 at 7:43 AM, Julian Wissmann <ju...@gmail.com>
wrote:

> Hi,
>
> thank you! The meetup link comes in handy. However this is not the answer
> to the question I asked (or maybe I wasn't clear enough).
>
> I am well aware of the sizing notes etc. However what I am looking for are
> some hard numbers considering actual scale in the real world. I can write a
> lot about how far in theory hbase could scale, however I'd like to have
> some hard numbers as to how far big deployments push this, today. For
> example the number of nodes Facebook or Yahoo are running and what amount
> of data they are crunching on those. This is where I have trouble finding
> current information!
>
> Regards
> Julian
>
> 2014-11-21 16:20 GMT+01:00 Ted Yu <yu...@gmail.com>:
>
> > Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?
> >
> > I think the following talks are relevant to your thesis:
> > HBase-at-twitter <http://files.meetup.com/1350427/HBase-at-twitter.pdf>
> > HBase Sizing Notes
> > <http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf>
> >
> > On Fri, Nov 21, 2014 at 6:19 AM, Julian Wissmann <
> julianwissmann@gmail.com
> > >
> > wrote:
> >
> > > Hi,
> > >
> > > I'm currently writing my thesis, in part it is about HBase. I was
> > wondering
> > > if there are some current numbers for large deployments, e.g. Facebook
> or
> > > Yahoo. I'm particularly interested in things like number of nodes,
> amount
> > > of data managed and (if available) query throughput.
> > >
> > >
> > > The most recent information I was able to find is from 2012. If anyone
> > has
> > > some more recent numbers, or knows where to find them, I'd be happy for
> > > some hints in the right direction (or a let me google that for you
> link,
> > if
> > > in fact I was just too stupid to search the right thing ;-) ).
> > >
> > > Regards,
> > > Julian
> > >
> >
>

Re: Current Deployment Sizes

Posted by Julian Wissmann <ju...@gmail.com>.
Hi,

thank you! The meetup link comes in handy. However this is not the answer
to the question I asked (or maybe I wasn't clear enough).

I am well aware of the sizing notes etc. However what I am looking for are
some hard numbers considering actual scale in the real world. I can write a
lot about how far in theory hbase could scale, however I'd like to have
some hard numbers as to how far big deployments push this, today. For
example the number of nodes Facebook or Yahoo are running and what amount
of data they are crunching on those. This is where I have trouble finding
current information!

Regards
Julian

2014-11-21 16:20 GMT+01:00 Ted Yu <yu...@gmail.com>:

> Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?
>
> I think the following talks are relevant to your thesis:
> HBase-at-twitter <http://files.meetup.com/1350427/HBase-at-twitter.pdf>
> HBase Sizing Notes
> <http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf>
>
> On Fri, Nov 21, 2014 at 6:19 AM, Julian Wissmann <julianwissmann@gmail.com
> >
> wrote:
>
> > Hi,
> >
> > I'm currently writing my thesis, in part it is about HBase. I was
> wondering
> > if there are some current numbers for large deployments, e.g. Facebook or
> > Yahoo. I'm particularly interested in things like number of nodes, amount
> > of data managed and (if available) query throughput.
> >
> >
> > The most recent information I was able to find is from 2012. If anyone
> has
> > some more recent numbers, or knows where to find them, I'd be happy for
> > some hints in the right direction (or a let me google that for you link,
> if
> > in fact I was just too stupid to search the right thing ;-) ).
> >
> > Regards,
> > Julian
> >
>

Re: Current Deployment Sizes

Posted by Ted Yu <yu...@gmail.com>.
Have you looked at http://www.meetup.com/hbaseusergroup/files/ ?

I think the following talks are relevant to your thesis:
HBase-at-twitter <http://files.meetup.com/1350427/HBase-at-twitter.pdf>
HBase Sizing Notes
<http://files.meetup.com/1350427/HBase%20Sizing%20Notes.pdf>

On Fri, Nov 21, 2014 at 6:19 AM, Julian Wissmann <ju...@gmail.com>
wrote:

> Hi,
>
> I'm currently writing my thesis, in part it is about HBase. I was wondering
> if there are some current numbers for large deployments, e.g. Facebook or
> Yahoo. I'm particularly interested in things like number of nodes, amount
> of data managed and (if available) query throughput.
>
>
> The most recent information I was able to find is from 2012. If anyone has
> some more recent numbers, or knows where to find them, I'd be happy for
> some hints in the right direction (or a let me google that for you link, if
> in fact I was just too stupid to search the right thing ;-) ).
>
> Regards,
> Julian
>