Posted to user@hbase.apache.org by Tousif <to...@gmail.com> on 2012/12/17 07:28:17 UTC

merge local hbase data with production hbase

Hi,

Can anyone help me identify a tool or the best method to append local HBase
table data to an existing production HBase table?


Thanks in advance.

-- 


Regards
Tousif Khazi


Re: merge local hbase data with production hbase

Posted by 周梦想 <ab...@gmail.com>.
Hi Tousif,
Maybe you can first export the data to HDFS and then import it into your
production HBase table. Alternatively, write some code to do this, or just
use Hive to do it.
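
A minimal sketch of the export/import route, assuming the stock Export and
Import MapReduce jobs that ship with HBase plus a distcp step to move the dump
between clusters. The table name and both HDFS URIs are hypothetical
placeholders, and the script only prints the commands instead of running them.
Import expects the target table to exist already and writes the exported cells
into it, so existing production rows should remain unless a cell has the same
row, column, and timestamp.

```python
import shlex

# MapReduce driver classes bundled with HBase.
EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"
IMPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Import"


def merge_plan(table, local_dir, prod_dir):
    """Return the three command lines for an export -> copy -> import merge."""
    return [
        # 1. On the local cluster: dump the table to HDFS.
        ["hbase", EXPORT_CLASS, table, local_dir],
        # 2. Copy the dump to the production cluster's HDFS.
        ["hadoop", "distcp", local_dir, prod_dir],
        # 3. On the production cluster: load the dump into the existing table.
        ["hbase", IMPORT_CLASS, table, prod_dir],
    ]


# Hypothetical table name and namenode URIs, for illustration only.
plan = merge_plan(
    "mytable",
    "hdfs://local-nn:8020/export/mytable",
    "hdfs://prod-nn:8020/import/mytable",
)

for cmd in plan:
    print(" ".join(shlex.quote(part) for part in cmd))
```

Trying the three steps against a scratch table first is a cheap way to confirm
the behaviour on the HBase version actually in use.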

best regards!
andy

2012/12/17 Tousif <to...@gmail.com>

> Hi,
>
> Can anyone help me identify a tool or best method to append local hbase
> table data to existing production hbase table.
>
>
> Thanks in advance.
>
> --
>
>
> Regards
> Tousif Khazi
>
>  .
>

what is the max size for one region and what is the max size of region for one server

Posted by tgh <gu...@ia.ac.cn>.
Hi,

I am trying to use HBase 0.90 to store 100 billion messages. I have set up
HBase and use the API to store messages in it, and it seems OK.

But my colleagues tell me that for HBase the maximum size of one region is
4GB, and the maximum number of regions for one server is 100, so one server
can store only 400GB of data. Is that right?

If not, what is the maximum size of one region, and how much region data can
one server hold?

For 100 billion messages, I need to store more data on one server.
Could you help me?

Thank you
--------------------
Tian Guanhua





Re: what is the max size for one region and what is the max size of region for one server

Posted by tgh <gu...@ia.ac.cn>.
Thank you for your reply. I visited the web page and it is helpful.
Following it, can I use 500 regions on ONE server?
And if I use 500 regions on ONE server and each region is 40GB, then one
server will store 20TB. Is that OK?



Thank you 
---------------------
Tian Guanhua



-----Original Message-----
From: user-return-32520-guanhua.tian=ia.ac.cn@hbase.apache.org
[mailto:user-return-32520-guanhua.tian=ia.ac.cn@hbase.apache.org] On Behalf Of
Nicolas Liochon
Sent: December 17, 2012 16:28
To: user@hbase.apache.org
Subject: Re: what is the max size for one region and what is the max size of
region for one server

This should help:
http://hbase.apache.org/book/important_configurations.html#bigger.regions

On Mon, Dec 17, 2012 at 9:11 AM, tgh <gu...@ia.ac.cn> wrote:

>
>         Or what about the max size for one region and what about the 
> max size of region for one server?
>



Re: Re: what is the max size for one region and what is the max size of region for one server

Posted by tgh <gu...@ia.ac.cn>.
Thank you for your reply.

I am writing because I want to make sure: if the number of regions on ONE
server exceeds 300 or 500, will HBase fail or something? Or what is the
maximum number of regions for ONE server?
I use HBase 0.90.

Could you help me?


Thank you
------------------------
Tian Guanhua









-----Original Message-----
From: user-return-32523-guanhua.tian=ia.ac.cn@hbase.apache.org
[mailto:user-return-32523-guanhua.tian=ia.ac.cn@hbase.apache.org] On Behalf Of
Nicolas Liochon
Sent: December 17, 2012 17:42
To: user@hbase.apache.org
Subject: Re: Re: what is the max size for one region and what is the max size
of region for one server

You're reading correctly. It's a little bit extreme however. More extreme
than necessary imho: if you have 20TB of HBase data, this leads to 60TB of
hdfs data, plus the WALs. That's a lot for a single machine.

On Mon, Dec 17, 2012 at 9:56 AM, tgh <gu...@ia.ac.cn> wrote:

> is it?
> And then if I use 500 region in ONE server, and one region
>



Re: Re: Re: what is the max size for one region and what is the max size of region for one server

Posted by lars hofhansl <lh...@yahoo.com>.
Here's some back of the envelope math:
Say you have six 1T drives per machine. That gives you about 2T of usable space (considering HDFS 3-way replication).
A reasonable max size for regions is 20gb. That's 100 regions for 2T.
If you set the flushsize to 128mb, you'd need ~13gb RAM in the worst case, just for the memstores. (Not all of them will be full all the time, though.)
Then you also want a block cache, plus the normal amount of memory just needed to run HBase.
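
The estimate above can be reproduced with a few lines of Python. This is only
a sketch: the function name is made up for illustration, it uses decimal units
(1T = 1000gb), and it assumes one memstore per region (a single column
family), which is how the ~13gb worst-case figure comes out.

```python
def size_region_server(drives, drive_tb, replication, region_gb, flush_mb):
    """Back-of-the-envelope sizing for one region server.

    Returns (usable_tb, regions, worst_case_memstore_gb).
    Assumes one memstore per region and decimal units.
    """
    raw_tb = drives * drive_tb
    # HDFS stores `replication` copies, so only a fraction is usable.
    usable_tb = raw_tb / replication
    # Number of max-size regions that fit into the usable space.
    regions = int(usable_tb * 1000 / region_gb)
    # Worst case: every region's memstore is full at the flush size.
    memstore_gb = regions * flush_mb / 1000
    return usable_tb, regions, memstore_gb


usable_tb, regions, memstore_gb = size_region_server(
    drives=6, drive_tb=1, replication=3, region_gb=20, flush_mb=128
)
print(f"usable: {usable_tb}T, regions: {regions}, memstores: {memstore_gb}gb")
```

Block cache and the base memory needed to run HBase come on top of the
memstore figure, which is why the heap estimate ends up well above it.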

You'll reach the JVM's reasonable memory limit pretty quickly; in the case above you'd probably want a 24gb JVM heap at least.


I find that with current JVM technology, 6T per machine is about as much as you can reasonably do with HBase.
As Andy said, you should upgrade; in HBase < 0.92 the max region size is somewhere around 4gb.

-- Lars



________________________________
 From: Andrew Purtell <ap...@apache.org>
To: "user@hbase.apache.org" <us...@hbase.apache.org> 
Cc: tgh <gu...@ia.ac.cn> 
Sent: Monday, December 17, 2012 11:26 AM
Subject: Re: Re: Re: what is the max size for one region and what is the max size of region for one server
 
Don't use HBase 0.90. Our current release is 0.94. You will find the
community is able to help you much more satisfactorily if you start with
the current release.




-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

Posted by Andrew Purtell <ap...@apache.org>.
Don't use HBase 0.90. Our current release is 0.94. You will find the
community is able to help you much more satisfactorily if you start with
the current release.


On Mon, Dec 17, 2012 at 2:26 AM, tgh <gu...@ia.ac.cn> wrote:

> Thank you for your reply ,
>
> but I write for I want to make sure, if the number of region in ONE server
> exceed 300 or 500, the hbase will fail or something, or what is the max
> number of region for ONE server?
> And I use hbase 0.90,
>
> Could you help me
>
>
> Thank you
> ------------------------
> Tian Guanhua
>
>
>
>
>
>
>
>
>
> -----Original Message-----
> From: user-return-32523-guanhua.tian=ia.ac.cn@hbase.apache.org
> [mailto:user-return-32523-guanhua.tian=ia.ac.cn@hbase.apache.org] On Behalf Of
> Nicolas Liochon
> Sent: December 17, 2012 17:42
> To: user@hbase.apache.org
> Subject: Re: Re: what is the max size for one region and what is the max size
> of region for one server
>
> You're reading correctly. It's a little bit extreme however. More extreme
> than necessary imho: if you have 20TB of HBase data, this leads to 60TB of
> hdfs data, plus the WALs. That's a lot for a single machine.
>
> On Mon, Dec 17, 2012 at 9:56 AM, tgh <gu...@ia.ac.cn> wrote:
>
> > is it?
> > And then if I use 500 region in ONE server, and one region
> >
>
>
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

Posted by Bryan Beaudreault <bb...@hubspot.com>.
0.90.x supports up to 4GB region sizes max, not 40. You would need to upgrade to 0.92.x at least to go higher than that. 

Sent from iPhone.

On Dec 17, 2012, at 9:31 AM, Doug Meil <do...@explorysmedical.com> wrote:

> 
> Hi there,
> 
> When sizing your data, don't forget to read this...
> 
> http://hbase.apache.org/book.html#schema.creation
> 
> and
> 
> http://hbase.apache.org/book.html#regions.arch
> 
> "9.7.5.4. KeyValue"
> 
> You need to understand how Hbase stores data internally on initial design
> to avoid problems down the line.  Keep the keys as small as reasonable,
> likewise CF name, and column names.
> 
> 
> 
> 
> On 12/17/12 6:07 AM, "Nicolas Liochon" <nk...@gmail.com> wrote:
> 
>> I think it's safer to use a newer version (0.94): there are a lot of
>> things
>> around performances & volumes in the 0.92 & 0.94. As well, there are much
>> more bug fixes releases on the 0.94.
>> 
>> For the number of region, there is no maximum written in stone. Having too
>> many regions will essentially impact the performances. As I said, having
>> 60TB of data per machine is not standard today (points are: that's a lot
>> of
>> disk a single machine; what's the impact if you lose a node; what will be
>> the network load, ...). I suppose all this is documented in the usual
>> books
>> on HBase.
>> 
>> 
>> On Mon, Dec 17, 2012 at 11:26 AM, tgh <gu...@ia.ac.cn> wrote:
>> 
>>> number of region for ONE server?
> 

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

Posted by Doug Meil <do...@explorysmedical.com>.
Hi there,

When sizing your data, don't forget to read this...

http://hbase.apache.org/book.html#schema.creation

and

http://hbase.apache.org/book.html#regions.arch

"9.7.5.4. KeyValue"

You need to understand how HBase stores data internally at initial design
time to avoid problems down the line.  Keep the keys as small as reasonable,
and likewise the CF name and column names.
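
To make the point about key sizes concrete, here is a rough per-cell size
estimate. It is a sketch based on my understanding of the KeyValue layout in
the referenced section (length prefixes, row, family, qualifier, timestamp,
type, value); it ignores compression and any block encoding, and the row key,
family, and qualifier names are invented for illustration.

```python
# Fixed-size fields stored with every cell (bytes).
KEY_LENGTH_FIELD = 4
VALUE_LENGTH_FIELD = 4
ROW_LENGTH_FIELD = 2
FAMILY_LENGTH_FIELD = 1
TIMESTAMP_FIELD = 8
KEY_TYPE_FIELD = 1
FIXED_OVERHEAD = (KEY_LENGTH_FIELD + VALUE_LENGTH_FIELD + ROW_LENGTH_FIELD
                  + FAMILY_LENGTH_FIELD + TIMESTAMP_FIELD + KEY_TYPE_FIELD)


def keyvalue_size(row, family, qualifier, value):
    """Approximate uncompressed size of one cell, in bytes.

    Row key, family, and qualifier are repeated in every cell,
    so their lengths are paid once per stored value.
    """
    return FIXED_OVERHEAD + len(row) + len(family) + len(qualifier) + len(value)


value = bytes(100)  # a hypothetical 100-byte message body
short_names = keyvalue_size(b"user00000001", b"d", b"m", value)
long_names = keyvalue_size(b"user00000001", b"message_data", b"message_body", value)

cells = 100_000_000_000  # the 100 billion messages mentioned in this thread
extra_bytes = (long_names - short_names) * cells
print(short_names, long_names, extra_bytes)
```

Under these assumptions, the longer family and qualifier names alone add about
2.2TB before replication across 100 billion cells.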




On 12/17/12 6:07 AM, "Nicolas Liochon" <nk...@gmail.com> wrote:

>I think it's safer to use a newer version (0.94): there are a lot of
>things
>around performances & volumes in the 0.92 & 0.94. As well, there are much
>more bug fixes releases on the 0.94.
>
>For the number of region, there is no maximum written in stone. Having too
>many regions will essentially impact the performances. As I said, having
>60TB of data per machine is not standard today (points are: that's a lot
>of
>disk a single machine; what's the impact if you lose a node; what will be
>the network load, ...). I suppose all this is documented in the usual
>books
>on HBase.
>
>
>On Mon, Dec 17, 2012 at 11:26 AM, tgh <gu...@ia.ac.cn> wrote:
>
>> number of region for ONE server?


Re: Re: Re: what is the max size for one region and what is the max size of region for one server

Posted by Nicolas Liochon <nk...@gmail.com>.
I think it's safer to use a newer version (0.94): there is a lot of work
around performance and data volumes in 0.92 and 0.94. Also, there are many
more bug-fix releases on 0.94.

For the number of regions, there is no maximum written in stone. Having too
many regions will essentially impact performance. As I said, having 60TB of
data per machine is not standard today (the points are: that's a lot of disk
for a single machine; what's the impact if you lose a node; what will the
network load be, ...). I suppose all this is documented in the usual books
on HBase.


On Mon, Dec 17, 2012 at 11:26 AM, tgh <gu...@ia.ac.cn> wrote:

> number of region for ONE server?

Re: Re: what is the max size for one region and what is the max size of region for one server

Posted by Nicolas Liochon <nk...@gmail.com>.
You're reading correctly. It's a little bit extreme however. More extreme
than necessary imho: if you have 20TB of HBase data, this leads to 60TB of
hdfs data, plus the WALs. That's a lot for a single machine.
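
The arithmetic behind those figures, as a small sketch. It assumes the HDFS
default replication factor of 3 and decimal units, leaves the WALs out, and
uses a helper name invented for this example.

```python
def hdfs_footprint_tb(hbase_tb, replication=3):
    """Raw HDFS space needed for a given amount of HBase data (WALs excluded)."""
    return hbase_tb * replication


regions_per_server = 500
region_gb = 40

# 500 regions of 40GB each on one server.
hbase_tb = regions_per_server * region_gb / 1000
raw_tb = hdfs_footprint_tb(hbase_tb)

print(f"HBase data: {hbase_tb}TB, raw HDFS: {raw_tb}TB")
```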

On Mon, Dec 17, 2012 at 9:56 AM, tgh <gu...@ia.ac.cn> wrote:

> is it?
> And then if I use 500 region in ONE server, and one region
>

Re: what is the max size for one region and what is the max size of region for one server

Posted by Nicolas Liochon <nk...@gmail.com>.
This should help:
http://hbase.apache.org/book/important_configurations.html#bigger.regions

On Mon, Dec 17, 2012 at 9:11 AM, tgh <gu...@ia.ac.cn> wrote:

>
>         Or what about the max size for one region and what about the max
> size of region for one server?
>