Posted to user@hbase.apache.org by Sam Seigal <se...@yahoo.com> on 2011/10/24 10:27:39 UTC

pre splitting tables

According to the HBase book, pre-splitting tables and doing manual
splits is a better long-term strategy than letting HBase handle it.

I have done a lot of offline testing with HBase and I am at a stage
now where I would like to hook my cluster into the production queue
feeding data into our systems.

Since I do not know what the keys from the prod system are going to
look like, I am adding a machine number prefix to the row keys
and pre-splitting the tables based on the prefix (prefix 0 goes to
machine A, prefix 1 goes to machine B, etc.).

Once I decide to add more machines, I can always do a rolling split
and add more prefixes.

Is this a good strategy for pre splitting the tables ?
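
For concreteness, here is a minimal sketch (not from the thread) of what that
pre-split table creation could look like with the Java admin API of that era,
assuming a hypothetical table "events" with one column family "d", one
single-byte prefix per machine, and five machines:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class PreSplitByMachinePrefix {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Hypothetical table "events" with one column family "d".
        HTableDescriptor desc = new HTableDescriptor("events");
        desc.addFamily(new HColumnDescriptor("d"));

        // One single-byte prefix per machine: N machines => N-1 split keys => N regions.
        int numMachines = 5;                           // assumed cluster size
        byte[][] splits = new byte[numMachines - 1][];
        for (int i = 1; i < numMachines; i++) {
            splits[i - 1] = new byte[] { (byte) i };   // region boundaries at 0x01, 0x02, ...
        }
        admin.createTable(desc, splits);
    }
}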

Re: pre splitting tables

Posted by Sam Seigal <se...@yahoo.com>.
On Tue, Oct 25, 2011 at 1:02 PM, Nicolas Spiegelberg
<ns...@fb.com> wrote:
>>According to my understanding, the way that HBase works is that on a
>>brand new system, all keys will start going to a single region i.e. a
>>single region server. Once that region
>>reaches a max region size, it will split and then move to another
>>region server, and so on and so forth.
>
> Basically, the default table create is 1 region per table that can only go
> to 1 RS.  Splits happen on that region when it gets large enough, but
> balancing the new region to another server is an asynchronous event and
> doesn't happen immediately after the first split because of
> "hbase.regions.slop".  The idea is to create the table with R regions
> across S servers so each server has R/S regions and puts will be roughly
> uniformly distributed across the R regions, keeping every server equally
> busy.  Sounds like you have a good handle on this behavior.
>
>>My strategy is to take the incoming data, calculate the hash and then
>>mod the hash with the number of machines I have. I will split the
>>regions according to the prefix # .
>>This should , I think provide for better data distribution when the
>>cluster first starts up with one region / region server.
>
> The problem with this strategy: so say you split into 256 regions.  Region
> splits are basically memcmp() ranges, so they would look like this:
>
> Key prefix      Region
> 0x00 - 0x01     1
> 0x01 - 0x02     2
> 0x02 - 0x03     3
> ...
>

Aren't the region boundaries going to be something like this:

0x000000000000 ...
0x0fffffffffffffffffffffffffff ...
0x100000000000 ...
0x1fffffffffffffffffffffffff ...

The idea is that once the regions start growing, I continue doing
manual splits, so the data gets split at boundaries at a finer grain
than just the machine prefix, and once good distribution is achieved,
I basically stop doing the splits.

When adding more machines, a rolling split will have to be performed
in any case, right ?
The application logic can also be modified for adding more machine prefixes.
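
As an illustration of the finer-grained splits described above (the sub-split
count and the helper name are assumptions, not something from the thread),
split keys inside a single machine prefix could be generated like this:

public class FinerSplitsWithinPrefix {
    // Sub-split one machine prefix into evenly spaced two-byte split keys,
    // e.g. prefix 0x01 split into 4 pieces at 0x0140, 0x0180, 0x01c0.
    static byte[][] subSplits(byte machinePrefix, int piecesPerPrefix) {
        byte[][] splits = new byte[piecesPerPrefix - 1][];
        for (int i = 1; i < piecesPerPrefix; i++) {
            int secondByte = (i * 256) / piecesPerPrefix;   // evenly spaced over 0x00-0xff
            splits[i - 1] = new byte[] { machinePrefix, (byte) secondByte };
        }
        return splits;
    }

    public static void main(String[] args) {
        for (byte[] key : subSplits((byte) 0x01, 4)) {
            System.out.printf("split key: %02x%02x%n", key[0] & 0xff, key[1] & 0xff);
        }
    }
}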


> Let's say you prefix your key with the machine ID#.  You are probably
> using a UINT32 for the machine ID, but let's assume that your using a
> UINT8 for this discussion.  Your M machine IDs would map to exactly 1
> region each.  So only M out of 256 regions would contain any data.  Every
> time you add a new machine, all the puts will only go to one region.  By
> prefixing your key with a hash (MD5, SHA1, whatevs), you'll get random
> distribution on your prefix and will populate all 256 regions evenly.

For my use case, I am generating indexes for the reads, so prefixing a hash
should not be a problem.

But there is also a broader question. By prefixing the hash, I am
losing a lot in data locality, as well as the ability to query on the
leading position of the key.
Doesn't this make a predictable set of machine IDs a better candidate ?
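
For reference, the kind of leading-position query a hashed prefix gives up is
a plain row-key range scan. A minimal sketch, assuming the same hypothetical
table name "events", of scanning everything written under machine prefix 0x01:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScanOneMachinePrefix {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "events");   // hypothetical table name

        Scan scan = new Scan();
        scan.setStartRow(new byte[] { 0x01 });       // first row with machine prefix 0x01
        scan.setStopRow(new byte[] { 0x02 });        // stop before prefix 0x02 (exclusive)

        ResultScanner scanner = table.getScanner(scan);
        for (Result row : scanner) {
            // process one row belonging to machine 0x01
        }
        scanner.close();
        table.close();
    }
}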

> (FYI: If you were using a monotonically increasing UINT32, this would be
> worse because they'd probably be low numbers and all map to the 0x00
> region).
>
>>These regions should then grow fairly uniformly. Once they reach a
>>size like ~ 5G, I can do a rolling split.
>
> I think you want to do rolling splits very infrequently.  We have PB of
> data and only rolling split twice.  The more regions / server, the faster
> server recovery happens because there's more parallelism for distributed
> log splitting.  If you have too many regions / cluster, you can overwork
> the load balancer on the master and increase startup time.  We're running
> at 20 regions/server * 95 regionservers ~= 1900 regions on the master.
> How many servers do you have?  If you have M servers, I'd try to split
> into M*(M-1) regions but keep that value lower than 1000 if possible.
> It's also nice to split on log2 boundaries for readability, but
> technically the optimal scenario is to have 1 server die and have every
> other server split & add exactly the same amount of its regions. M*(M-1)
> would give every other server 1 of the regions.
>
>>Also, I want to make sure my regions do not grow too much in size that
>>when I end up adding more machines, it does not take a very long time
>>to perform a rolling split.
>
> Meh.  This happens.  This is also what rolling split is designed/optimized
> for.  It persists the split plan to HDFS so you can stop it and restart
> later.  It also round robin's the splits across servers & prioritizes
> splitting low-loaded servers to minimize region-based load balancing
> during the process.
>
>>What I do not understand is the advantages/disasvantages of having
>>regions that are too big vs regions that are too thin. What does this
>>impact ? Compaction time ? Split time ? What is the
>>concern when it comes to how the architecture works. I think if I
>>understand this better, I can manage my regions more efficiently.
>
>
> Are you IO-bound or CPU-bound?  The default HBase setup is geared towards
> the CPU-bound use case and you should have acceptable performance out of
> the box after pre-splitting and a couple well-documented configs (ulimit
> comes to mind).  In our case, we have medium sized data and are fronted by
> an application cache, so we're IO-bound.  Because of this, we need to
> tweak the compaction algorithm to minimize IO writes at the expense of
> StoreFiles and use optional features like BloomFilters & the TimeRange
> Filter to minimize the number of StoreFiles we need to query on a get.

The main purpose of the cluster is generating aggregations that can be
tied back to the
transaction level details.

So let's say I have a few transactions come in of the form:

eventid_eventKey1 : <attr1: 5, attr2 : 10 ..>
eventid_eventKey1 : <attr1: 2, attr2 : 9 ..>
eventid_eventKey1 : <attr1: 1, attr2 : 2 ..>

Our clients are interested in aggregations at various dimensions of
the incoming data.
So, a map reduce job will run and produce an index on these attributes:

attr1:5
attr1:2
attr1:1

another job will then produce the aggregation:

attr1: 8

(There are plans to capture these job dependencies in some sort of a
DAG job workflow framework as requirements get complex)

Our customers want to be able to tie back the aggregations to the
transaction level details. This is why HBase is
a great choice. We can query an "aggregations" table to get the
aggregations and then use the same key to query a
"transaction detail table" to figure out the attributes that make up
the aggregation.

So long story short, the read volume from concurrent users is going to
be very low. We expect no more than 20 people to be using the system
at a given time.
The write volume, on the other hand, is going to be high, and there
will be stress on the cluster mostly due to map reduce jobs creating
indexes and performing aggregations on those indexes.

What config parameters would you suggest looking at for this sort of
scenario ?
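
One possible starting point (a sketch only; the keys are standard HBase
settings of that era, normally set in hbase-site.xml, but the values here are
placeholders rather than recommendations from the thread) for write-heavy
tuning knobs:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class WriteHeavyTuningSketch {
    public static Configuration build() {
        Configuration conf = HBaseConfiguration.create();
        // Region size: matches the 2 GB figure mentioned later in this thread.
        conf.setLong("hbase.hregion.max.filesize", 2L * 1024 * 1024 * 1024);
        // Bigger memstore flushes mean fewer, larger HFiles under a write-heavy load.
        conf.setLong("hbase.hregion.memstore.flush.size", 128L * 1024 * 1024);
        // Tolerate more StoreFiles per store before puts get blocked.
        conf.setInt("hbase.hstore.blockingStoreFiles", 15);
        // More RPC handlers for many concurrent MR writers.
        conf.setInt("hbase.regionserver.handler.count", 30);
        // Keep the block cache modest when the concurrent read volume is low.
        conf.setFloat("hfile.block.cache.size", 0.2f);
        return conf;
    }
}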

To start off with, a shadow HBase cluster will only take up some
percentage of the production traffic, which is around 200,000
messages / hr. If we keep on adding more machines and start
transferring more and more traffic, this can reach around 500,000
messages / hr.

Right now, the hardware I have available to do my testing is 5
machines: 8 GB memory, 2.26 GHz quad core, 120 GB disk space.

Thanks a lot for your help !

>
>

Re: pre splitting tables

Posted by Li Pi <lp...@ucsd.edu>.
It'll lower it. Remember that each region server has its own block
cache of a given size. If you increase the region size, then you
lower the cache-size/region-size ratio.
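
A quick worked example (the numbers are illustrative only): a region server
with a 1.6 GB block cache hosting ten 2 GB regions can cache at most about 8%
of the data it serves; grow those regions to 4 GB each without adding cache
and the same server covers only about 4%, so the hit ratio for evenly spread
reads roughly halves.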

On Tue, Oct 25, 2011 at 1:53 AM, Sam Seigal <se...@yahoo.com> wrote:
> On Mon, Oct 24, 2011 at 9:22 PM, Karthik Ranganathan
> <kr...@fb.com> wrote:
>>
>>
>> << ...mod the hash with the number of machines I have... >>
>> This means that the data will change with the number of machines - so all
>> your data will map to different regions if you add a new machine to your
>> cluster.
>>
>>
>> << What I do not understand is the advantages/disasvantages of having
>> regions that are too big vs regions that are too thin. >>
>> The disadvantage is that some regions (and consequently nodes) will have a
>> lot of data which will adversely affect things like storage (if dfs is
>> local to that node), block cache hit ratio, etc.
>
> Can you please explain a bit more on how a bigger region size will
> affect the block cache hit ratio ?
>
>>
>> In general - per our experience using Hbase, its much more desirable to
>> disperse data up-front. If you are building indexes using MR, then you
>> probably don't need range scan ability on your keys.
>>
>> Thanks
>> Karthik
>>
>>
>>
>> On 10/24/11 4:48 PM, "Sam Seigal" <se...@yahoo.com> wrote:
>>
>>>According to my understanding, the way that HBase works is that on a
>>>brand new system, all keys will start going to a single region i.e. a
>>>single region server. Once that region
>>>reaches a max region size, it will split and then move to another
>>>region server, and so on and so forth.
>>>
>>>Initially hooking up HBase to a prod system, I am concerned about this
>>>behaviour, since a clean HBase cluster is going to experience a surge
>>>of traffic all going into one region server initially.
>>>This is the motivation behind pre-defining the regions, so the initial
>>>surge of traffic is distributed evenly.
>>>
>>>My strategy is to take the incoming data, calculate the hash and then
>>>mod the hash with the number of machines I have. I will split the
>>>regions according to the prefix # .
>>>This should , I think provide for better data distribution when the
>>>cluster first starts up with one region / region server.
>>>
>>>These regions should then grow fairly uniformly. Once they reach a
>>>size like ~ 5G, I can do a rolling split.
>>>
>>>Also, I want to make sure my regions do not grow too much in size that
>>>when I end up adding more machines, it does not take a very long time
>>>to perform a rolling split.
>>>
>>>What I do not understand is the advantages/disasvantages of having
>>>regions that are too big vs regions that are too thin. What does this
>>>impact ? Compaction time ? Split time ? What is the
>>>concern when it comes to how the architecture works. I think if I
>>>understand this better, I can manage my regions more efficiently.
>>>
>>>
>>>
>>>On Mon, Oct 24, 2011 at 3:23 PM, Nicolas Spiegelberg
>>><ns...@fb.com> wrote:
>>>> Isn't a better strategy to create the HBase keys as
>>>>
>>>> Key = hash(MySQL_key) + MySQL_key
>>>>
>>>> That way you'll know your key distribution and can add new machines
>>>> seamlessly.  I'm assuming that your rows don't overlap between any 2
>>>> machines.  If so, you could append the MACHINE_ID to the key (not
>>>> prepend).  I don't think you want the machine # as the first dimension
>>>>on
>>>> your rows, because you want the data from new machines to be evenly
>>>>spread
>>>> out across the existing regions.
>>>>
>>>>
>>>> On 10/24/11 9:07 AM, "Stack" <st...@duboce.net> wrote:
>>>>
>>>>>On Mon, Oct 24, 2011 at 1:27 AM, Sam Seigal <se...@yahoo.com> wrote:
>>>>>> According to the HBase book , pre splitting tables and doing manual
>>>>>> splits is a better long term strategy than letting HBase handle it.
>>>>>>
>>>>>
>>>>>Its good for getting a table off the ground, yes.
>>>>>
>>>>>
>>>>>> Since I do not know what the keys from the prod system are going to
>>>>>> look like , I am adding a machine number prefix to the the row keys
>>>>>> and pre splitting the tables  based on the prefix (prefix 0 goes to
>>>>>> machine A, prefix 1 goes to machine b etc).
>>>>>>
>>>>>
>>>>>You don't need to do inorder scan of the data?  Whats the rest of your
>>>>>row key look like?
>>>>>
>>>>>
>>>>>> Once I decide to add more machines, I can always do a rolling split
>>>>>> and add more prefixes.
>>>>>>
>>>>>
>>>>>Yes.
>>>>>
>>>>>> Is this a good strategy for pre splitting the tables ?
>>>>>>
>>>>>
>>>>>So, you'll start out with one region per server?
>>>>>
>>>>>What do you think the rate of splitting will be like?  Are you using
>>>>>default region size or have you bumped this up?
>>>>>
>>>>>St.Ack
>>>>
>>>>
>>
>>
>

Re: pre splitting tables

Posted by Sam Seigal <se...@yahoo.com>.
On Mon, Oct 24, 2011 at 9:22 PM, Karthik Ranganathan
<kr...@fb.com> wrote:
>
>
> << ...mod the hash with the number of machines I have... >>
> This means that the data will change with the number of machines - so all
> your data will map to different regions if you add a new machine to your
> cluster.
>
>
> << What I do not understand is the advantages/disasvantages of having
> regions that are too big vs regions that are too thin. >>
> The disadvantage is that some regions (and consequently nodes) will have a
> lot of data which will adversely affect things like storage (if dfs is
> local to that node), block cache hit ratio, etc.

Can you please explain a bit more about how a bigger region size will
affect the block cache hit ratio ?

>
> In general - per our experience using Hbase, its much more desirable to
> disperse data up-front. If you are building indexes using MR, then you
> probably don't need range scan ability on your keys.
>
> Thanks
> Karthik
>
>
>
> On 10/24/11 4:48 PM, "Sam Seigal" <se...@yahoo.com> wrote:
>
>>According to my understanding, the way that HBase works is that on a
>>brand new system, all keys will start going to a single region i.e. a
>>single region server. Once that region
>>reaches a max region size, it will split and then move to another
>>region server, and so on and so forth.
>>
>>Initially hooking up HBase to a prod system, I am concerned about this
>>behaviour, since a clean HBase cluster is going to experience a surge
>>of traffic all going into one region server initially.
>>This is the motivation behind pre-defining the regions, so the initial
>>surge of traffic is distributed evenly.
>>
>>My strategy is to take the incoming data, calculate the hash and then
>>mod the hash with the number of machines I have. I will split the
>>regions according to the prefix # .
>>This should , I think provide for better data distribution when the
>>cluster first starts up with one region / region server.
>>
>>These regions should then grow fairly uniformly. Once they reach a
>>size like ~ 5G, I can do a rolling split.
>>
>>Also, I want to make sure my regions do not grow too much in size that
>>when I end up adding more machines, it does not take a very long time
>>to perform a rolling split.
>>
>>What I do not understand is the advantages/disasvantages of having
>>regions that are too big vs regions that are too thin. What does this
>>impact ? Compaction time ? Split time ? What is the
>>concern when it comes to how the architecture works. I think if I
>>understand this better, I can manage my regions more efficiently.
>>
>>
>>
>>On Mon, Oct 24, 2011 at 3:23 PM, Nicolas Spiegelberg
>><ns...@fb.com> wrote:
>>> Isn't a better strategy to create the HBase keys as
>>>
>>> Key = hash(MySQL_key) + MySQL_key
>>>
>>> That way you'll know your key distribution and can add new machines
>>> seamlessly.  I'm assuming that your rows don't overlap between any 2
>>> machines.  If so, you could append the MACHINE_ID to the key (not
>>> prepend).  I don't think you want the machine # as the first dimension
>>>on
>>> your rows, because you want the data from new machines to be evenly
>>>spread
>>> out across the existing regions.
>>>
>>>
>>> On 10/24/11 9:07 AM, "Stack" <st...@duboce.net> wrote:
>>>
>>>>On Mon, Oct 24, 2011 at 1:27 AM, Sam Seigal <se...@yahoo.com> wrote:
>>>>> According to the HBase book , pre splitting tables and doing manual
>>>>> splits is a better long term strategy than letting HBase handle it.
>>>>>
>>>>
>>>>Its good for getting a table off the ground, yes.
>>>>
>>>>
>>>>> Since I do not know what the keys from the prod system are going to
>>>>> look like , I am adding a machine number prefix to the the row keys
>>>>> and pre splitting the tables  based on the prefix (prefix 0 goes to
>>>>> machine A, prefix 1 goes to machine b etc).
>>>>>
>>>>
>>>>You don't need to do inorder scan of the data?  Whats the rest of your
>>>>row key look like?
>>>>
>>>>
>>>>> Once I decide to add more machines, I can always do a rolling split
>>>>> and add more prefixes.
>>>>>
>>>>
>>>>Yes.
>>>>
>>>>> Is this a good strategy for pre splitting the tables ?
>>>>>
>>>>
>>>>So, you'll start out with one region per server?
>>>>
>>>>What do you think the rate of splitting will be like?  Are you using
>>>>default region size or have you bumped this up?
>>>>
>>>>St.Ack
>>>
>>>
>
>

Re: pre splitting tables

Posted by Nicolas Spiegelberg <ns...@fb.com>.
>According to my understanding, the way that HBase works is that on a
>brand new system, all keys will start going to a single region i.e. a
>single region server. Once that region
>reaches a max region size, it will split and then move to another
>region server, and so on and so forth.

Basically, the default table create is 1 region per table that can only go
to 1 RS.  Splits happen on that region when it gets large enough, but
balancing the new region to another server is an asynchronous event and
doesn't happen immediately after the first split because of
"hbase.regions.slop".  The idea is to create the table with R regions
across S servers so each server has R/S regions and puts will be roughly
uniformly distributed across the R regions, keeping every server equally
busy.  Sounds like you have a good handle on this behavior.

>My strategy is to take the incoming data, calculate the hash and then
>mod the hash with the number of machines I have. I will split the
>regions according to the prefix # .
>This should , I think provide for better data distribution when the
>cluster first starts up with one region / region server.

The problem with this strategy: so say you split into 256 regions.  Region
splits are basically memcmp() ranges, so they would look like this:

Key prefix	Region
0x00 - 0x01 	1
0x01 - 0x02	2
0x02 - 0x03 	3
...

Let's say you prefix your key with the machine ID#.  You are probably
using a UINT32 for the machine ID, but let's assume that you're using a
UINT8 for this discussion.  Your M machine IDs would map to exactly 1
region each.  So only M out of 256 regions would contain any data.  Every
time you add a new machine, all the puts will only go to one region.  By
prefixing your key with a hash (MD5, SHA1, whatevs), you'll get random
distribution on your prefix and will populate all 256 regions evenly.
(FYI: If you were using a monotonically increasing UINT32, this would be
worse because they'd probably be low numbers and all map to the 0x00
region).
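
A minimal sketch of the hash-prefixed key construction being described here
(the one-byte prefix length and the helper name are assumptions):

import java.security.MessageDigest;

public class HashPrefixedKey {
    // Build rowkey = first MD5 byte of the original key + the original key,
    // so puts spread uniformly over 256 prefixes while the original key
    // remains recoverable from the row key itself.
    static byte[] buildRowKey(byte[] originalKey) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest(originalKey);
        byte[] rowKey = new byte[1 + originalKey.length];
        rowKey[0] = digest[0];
        System.arraycopy(originalKey, 0, rowKey, 1, originalKey.length);
        return rowKey;
    }
}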

>These regions should then grow fairly uniformly. Once they reach a
>size like ~ 5G, I can do a rolling split.

I think you want to do rolling splits very infrequently.  We have PB of
data and only rolling split twice.  The more regions / server, the faster
server recovery happens because there's more parallelism for distributed
log splitting.  If you have too many regions / cluster, you can overwork
the load balancer on the master and increase startup time.  We're running
at 20 regions/server * 95 regionservers ~= 1900 regions on the master.
How many servers do you have?  If you have M servers, I'd try to split
into M*(M-1) regions but keep that value lower than 1000 if possible.
It's also nice to split on log2 boundaries for readability, but
technically the optimal scenario is that when 1 server dies, every
other server picks up exactly the same number of its regions. M*(M-1)
regions would give every other server exactly 1 of the dead server's regions.
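
(Worked example with assumed numbers: M = 5 servers gives M*(M-1) = 20
regions, i.e. 4 per server; if one server dies, its 4 regions can be handed
out one apiece to the 4 survivors, keeping the load even.)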

>Also, I want to make sure my regions do not grow too much in size that
>when I end up adding more machines, it does not take a very long time
>to perform a rolling split.

Meh.  This happens.  This is also what rolling split is designed/optimized
for.  It persists the split plan to HDFS so you can stop it and restart
later.  It also round-robins the splits across servers & prioritizes
splitting low-loaded servers to minimize region-based load balancing
during the process.

>What I do not understand is the advantages/disasvantages of having
>regions that are too big vs regions that are too thin. What does this
>impact ? Compaction time ? Split time ? What is the
>concern when it comes to how the architecture works. I think if I
>understand this better, I can manage my regions more efficiently.


Are you IO-bound or CPU-bound?  The default HBase setup is geared towards
the CPU-bound use case and you should have acceptable performance out of
the box after pre-splitting and a couple of well-documented configs (ulimit
comes to mind).  In our case, we have medium sized data and are fronted by
an application cache, so we're IO-bound.  Because of this, we need to
tweak the compaction algorithm to minimize IO writes at the expense of
more StoreFiles, and use optional features like BloomFilters & the
TimeRange filter to minimize the number of StoreFiles we need to query on a get.
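
As a hedged illustration of the TimeRange point (table, family and qualifier
names are hypothetical), restricting a Get to a time window lets the server
skip StoreFiles whose timestamps fall entirely outside that window:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class TimeRangeGetSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "events");             // hypothetical table

        long now = System.currentTimeMillis();
        Get get = new Get(Bytes.toBytes("some-row-key"));       // hypothetical row key
        get.addColumn(Bytes.toBytes("d"), Bytes.toBytes("attr1"));
        get.setTimeRange(now - 3600 * 1000L, now);              // only consider the last hour
        Result result = table.get(get);                         // cells outside the range are skipped

        table.close();
    }
}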



Re: pre splitting tables

Posted by Karthik Ranganathan <kr...@fb.com>.

<< ...mod the hash with the number of machines I have... >>
This means that the key-to-region mapping will change with the number of
machines - so all your data will map to different regions if you add a
new machine to your cluster.


<< What I do not understand is the advantages/disasvantages of having
regions that are too big vs regions that are too thin. >>
The disadvantage is that some regions (and consequently nodes) will have a
lot of data which will adversely affect things like storage (if dfs is
local to that node), block cache hit ratio, etc.

In general - per our experience using HBase, it's much more desirable to
disperse data up-front. If you are building indexes using MR, then you
probably don't need range scan ability on your keys.

Thanks
Karthik



On 10/24/11 4:48 PM, "Sam Seigal" <se...@yahoo.com> wrote:

>According to my understanding, the way that HBase works is that on a
>brand new system, all keys will start going to a single region i.e. a
>single region server. Once that region
>reaches a max region size, it will split and then move to another
>region server, and so on and so forth.
>
>Initially hooking up HBase to a prod system, I am concerned about this
>behaviour, since a clean HBase cluster is going to experience a surge
>of traffic all going into one region server initially.
>This is the motivation behind pre-defining the regions, so the initial
>surge of traffic is distributed evenly.
>
>My strategy is to take the incoming data, calculate the hash and then
>mod the hash with the number of machines I have. I will split the
>regions according to the prefix # .
>This should , I think provide for better data distribution when the
>cluster first starts up with one region / region server.
>
>These regions should then grow fairly uniformly. Once they reach a
>size like ~ 5G, I can do a rolling split.
>
>Also, I want to make sure my regions do not grow too much in size that
>when I end up adding more machines, it does not take a very long time
>to perform a rolling split.
>
>What I do not understand is the advantages/disasvantages of having
>regions that are too big vs regions that are too thin. What does this
>impact ? Compaction time ? Split time ? What is the
>concern when it comes to how the architecture works. I think if I
>understand this better, I can manage my regions more efficiently.
>
>
>
>On Mon, Oct 24, 2011 at 3:23 PM, Nicolas Spiegelberg
><ns...@fb.com> wrote:
>> Isn't a better strategy to create the HBase keys as
>>
>> Key = hash(MySQL_key) + MySQL_key
>>
>> That way you'll know your key distribution and can add new machines
>> seamlessly.  I'm assuming that your rows don't overlap between any 2
>> machines.  If so, you could append the MACHINE_ID to the key (not
>> prepend).  I don't think you want the machine # as the first dimension
>>on
>> your rows, because you want the data from new machines to be evenly
>>spread
>> out across the existing regions.
>>
>>
>> On 10/24/11 9:07 AM, "Stack" <st...@duboce.net> wrote:
>>
>>>On Mon, Oct 24, 2011 at 1:27 AM, Sam Seigal <se...@yahoo.com> wrote:
>>>> According to the HBase book , pre splitting tables and doing manual
>>>> splits is a better long term strategy than letting HBase handle it.
>>>>
>>>
>>>Its good for getting a table off the ground, yes.
>>>
>>>
>>>> Since I do not know what the keys from the prod system are going to
>>>> look like , I am adding a machine number prefix to the the row keys
>>>> and pre splitting the tables  based on the prefix (prefix 0 goes to
>>>> machine A, prefix 1 goes to machine b etc).
>>>>
>>>
>>>You don't need to do inorder scan of the data?  Whats the rest of your
>>>row key look like?
>>>
>>>
>>>> Once I decide to add more machines, I can always do a rolling split
>>>> and add more prefixes.
>>>>
>>>
>>>Yes.
>>>
>>>> Is this a good strategy for pre splitting the tables ?
>>>>
>>>
>>>So, you'll start out with one region per server?
>>>
>>>What do you think the rate of splitting will be like?  Are you using
>>>default region size or have you bumped this up?
>>>
>>>St.Ack
>>
>>


Re: pre splitting tables

Posted by Sam Seigal <se...@yahoo.com>.
According to my understanding, the way that HBase works is that on a
brand new system, all keys will start going to a single region i.e. a
single region server. Once that region
reaches a max region size, it will split and then move to another
region server, and so on and so forth.

Initially hooking up HBase to a prod system, I am concerned about this
behaviour, since a clean HBase cluster is going to experience a surge
of traffic all going into one region server initially.
This is the motivation behind pre-defining the regions, so the initial
surge of traffic is distributed evenly.

My strategy is to take the incoming data, calculate the hash, and then
mod the hash with the number of machines I have. I will split the
regions according to the prefix #.
This should, I think, provide for better data distribution when the
cluster first starts up with one region per region server.

These regions should then grow fairly uniformly. Once they reach a
size like ~ 5G, I can do a rolling split.

Also, I want to make sure my regions do not grow so much in size that,
when I end up adding more machines, it takes a very long time to
perform a rolling split.

What I do not understand is the advantages/disadvantages of having
regions that are too big vs. regions that are too thin. What does this
impact ? Compaction time ? Split time ? What is the concern when it
comes to how the architecture works ? I think if I understand this
better, I can manage my regions more efficiently.



On Mon, Oct 24, 2011 at 3:23 PM, Nicolas Spiegelberg
<ns...@fb.com> wrote:
> Isn't a better strategy to create the HBase keys as
>
> Key = hash(MySQL_key) + MySQL_key
>
> That way you'll know your key distribution and can add new machines
> seamlessly.  I'm assuming that your rows don't overlap between any 2
> machines.  If so, you could append the MACHINE_ID to the key (not
> prepend).  I don't think you want the machine # as the first dimension on
> your rows, because you want the data from new machines to be evenly spread
> out across the existing regions.
>
>
> On 10/24/11 9:07 AM, "Stack" <st...@duboce.net> wrote:
>
>>On Mon, Oct 24, 2011 at 1:27 AM, Sam Seigal <se...@yahoo.com> wrote:
>>> According to the HBase book , pre splitting tables and doing manual
>>> splits is a better long term strategy than letting HBase handle it.
>>>
>>
>>Its good for getting a table off the ground, yes.
>>
>>
>>> Since I do not know what the keys from the prod system are going to
>>> look like , I am adding a machine number prefix to the the row keys
>>> and pre splitting the tables  based on the prefix (prefix 0 goes to
>>> machine A, prefix 1 goes to machine b etc).
>>>
>>
>>You don't need to do inorder scan of the data?  Whats the rest of your
>>row key look like?
>>
>>
>>> Once I decide to add more machines, I can always do a rolling split
>>> and add more prefixes.
>>>
>>
>>Yes.
>>
>>> Is this a good strategy for pre splitting the tables ?
>>>
>>
>>So, you'll start out with one region per server?
>>
>>What do you think the rate of splitting will be like?  Are you using
>>default region size or have you bumped this up?
>>
>>St.Ack
>
>

Re: pre splitting tables

Posted by Sam Seigal <se...@yahoo.com>.
Hi Stack,

Inline.

>> According to the HBase book , pre splitting tables and doing manual
>> splits is a better long term strategy than letting HBase handle it.
>>
>
> Its good for getting a table off the ground, yes.
>
>
>> Since I do not know what the keys from the prod system are going to
>> look like , I am adding a machine number prefix to the the row keys
>> and pre splitting the tables  based on the prefix (prefix 0 goes to
>> machine A, prefix 1 goes to machine b etc).
>>
>
> You don't need to do inorder scan of the data?  Whats the rest of your
> row key look like?

I need to be able to do this on 5-6 types of keys/dimensions.
I have a map reduce job that runs periodically and creates the indexes
on separate tables for querying the data.

>
>> Once I decide to add more machines, I can always do a rolling split
>> and add more prefixes.
>>
>
> Yes.
>
>> Is this a good strategy for pre splitting the tables ?
>>
>
> So, you'll start out with one region per server?
>
> What do you think the rate of splitting will be like?  Are you using
> default region size or have you bumped this up?

This prefix strategy should, I think, create one region per region server.
I have configured the region size to 2 GB right now; that is just a
number I picked.

This is a small cluster as a proof of concept, running in parallel with
some of the other monolithic reporting infrastructures we have, and it
will only be serving a fraction of the prod traffic to start off with.

The machines in the cluster look like this: 120 GB of disk space; 8 GB of
memory; quad core, 2.66 GHz. I am going to allocate around 80 GB of memory
for HBase use.

On a side note, I don't think I really understand how to decide how
many regions per region server I need.

If I were to create one region per region server and set
hbase.hregion.max.filesize to Long.MAX, why is that a bad thing ? What
kind of problems can I run into ? If I were to err on the side of too
many regions, what are the advantages/disadvantages there ?




> St.Ack
>

Re: pre splitting tables

Posted by Nicolas Spiegelberg <ns...@fb.com>.
Isn't a better strategy to create the HBase keys as

Key = hash(MySQL_key) + MySQL_key

That way you'll know your key distribution and can add new machines
seamlessly.  I'm assuming that your rows don't overlap between any 2
machines.  If so, you could append the MACHINE_ID to the key (not
prepend).  I don't think you want the machine # as the first dimension on
your rows, because you want the data from new machines to be evenly spread
out across the existing regions.


On 10/24/11 9:07 AM, "Stack" <st...@duboce.net> wrote:

>On Mon, Oct 24, 2011 at 1:27 AM, Sam Seigal <se...@yahoo.com> wrote:
>> According to the HBase book , pre splitting tables and doing manual
>> splits is a better long term strategy than letting HBase handle it.
>>
>
>Its good for getting a table off the ground, yes.
>
>
>> Since I do not know what the keys from the prod system are going to
>> look like , I am adding a machine number prefix to the the row keys
>> and pre splitting the tables  based on the prefix (prefix 0 goes to
>> machine A, prefix 1 goes to machine b etc).
>>
>
>You don't need to do inorder scan of the data?  Whats the rest of your
>row key look like?
>
>
>> Once I decide to add more machines, I can always do a rolling split
>> and add more prefixes.
>>
>
>Yes.
>
>> Is this a good strategy for pre splitting the tables ?
>>
>
>So, you'll start out with one region per server?
>
>What do you think the rate of splitting will be like?  Are you using
>default region size or have you bumped this up?
>
>St.Ack


Re: pre splitting tables

Posted by Stack <st...@duboce.net>.
On Mon, Oct 24, 2011 at 1:27 AM, Sam Seigal <se...@yahoo.com> wrote:
> According to the HBase book , pre splitting tables and doing manual
> splits is a better long term strategy than letting HBase handle it.
>

It's good for getting a table off the ground, yes.


> Since I do not know what the keys from the prod system are going to
> look like , I am adding a machine number prefix to the the row keys
> and pre splitting the tables  based on the prefix (prefix 0 goes to
> machine A, prefix 1 goes to machine b etc).
>

You don't need to do in-order scans of the data?  What's the rest of your
row key look like?


> Once I decide to add more machines, I can always do a rolling split
> and add more prefixes.
>

Yes.

> Is this a good strategy for pre splitting the tables ?
>

So, you'll start out with one region per server?

What do you think the rate of splitting will be like?  Are you using
default region size or have you bumped this up?

St.Ack