Posted to user@hbase.apache.org by Li Li <fa...@gmail.com> on 2014/05/08 14:58:01 UTC
how to pre split a table whose row key is MD5(url)?
Say I have 4 region servers. How do I pre-split a table that uses MD5 as the row key?
Re: how to pre split a table whose row key is MD5(url)?
Posted by Michael Segel <mi...@hotmail.com>.
Meh.
Did this before my morning coffee.
You get the idea. ;-)
On May 12, 2014, at 2:37 AM, Li Li <fa...@gmail.com> wrote:
> thanks. I will try this.
> by the way, byte range is -128 - 127
Re: how to pre split a table whose row key is MD5(url)?
Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Or even easier, just use this:
create 't1', 'f1', {NUMREGIONS => 16, SPLITALGO => 'HexStringSplit'}
2014-05-14 10:50 GMT-04:00 Jean-Daniel Cryans <jd...@apache.org>:
> On Tue, May 13, 2014 at 9:58 AM, Liam Slusser <ls...@gmail.com> wrote:
>
> > You can also create a table via the hbase shell with pre-split tables like
> > this...
> >
> > Here is a 32-byte split into 16 different regions, using base16 (ie a md5
> > hash) for the key-type.
> >
> > create 't1', {NAME => 'f1'},
> > {SPLITS=> ['10000000000000000000000000000000',
> > '20000000000000000000000000000000',
> > '30000000000000000000000000000000',
> > '40000000000000000000000000000000',
> > '50000000000000000000000000000000',
> > '60000000000000000000000000000000',
> > '70000000000000000000000000000000',
> > '80000000000000000000000000000000',
> > '90000000000000000000000000000000',
> > 'a0000000000000000000000000000000',
> > 'b0000000000000000000000000000000',
> > 'c0000000000000000000000000000000',
> > 'd0000000000000000000000000000000',
> > 'e0000000000000000000000000000000',
> > 'f0000000000000000000000000000000']}
> >
>
> To make this easier to type, you don't even need the 0 padding. Just '1',
> '2', '3', ... 'f' is enough :)
Re: how to pre split a table whose row key is MD5(url)?
Posted by Jean-Daniel Cryans <jd...@apache.org>.
On Tue, May 13, 2014 at 9:58 AM, Liam Slusser <ls...@gmail.com> wrote:
> You can also create a table via the hbase shell with pre-split tables like
> this...
>
> Here is a 32-byte split into 16 different regions, using base16 (ie a md5
> hash) for the key-type.
>
> create 't1', {NAME => 'f1'},
> {SPLITS=> ['10000000000000000000000000000000',
> '20000000000000000000000000000000',
> '30000000000000000000000000000000',
> '40000000000000000000000000000000',
> '50000000000000000000000000000000',
> '60000000000000000000000000000000',
> '70000000000000000000000000000000',
> '80000000000000000000000000000000',
> '90000000000000000000000000000000',
> 'a0000000000000000000000000000000',
> 'b0000000000000000000000000000000',
> 'c0000000000000000000000000000000',
> 'd0000000000000000000000000000000',
> 'e0000000000000000000000000000000',
> 'f0000000000000000000000000000000']}
>
To make this easier to type, you don't even need the 0 padding. Just '1',
'2', '3', ... 'f' is enough :)
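The point about dropping the 0 padding can be checked with a small sketch. It assumes the row keys are 32-character lowercase hex MD5 strings and that regions are chosen by plain lexicographic comparison against the split keys; the class and method names are made up for illustration and are not HBase APIs.

```java
class ShortSplitKeys {
    static final String HEX = "0123456789abcdef";

    /** Split keys '1'..'f', right-padded with '0' up to the given width. */
    static String[] splits(int width) {
        String[] out = new String[15];
        for (int i = 1; i < 16; i++) {
            StringBuilder sb = new StringBuilder().append(HEX.charAt(i));
            while (sb.length() < width) {
                sb.append('0');
            }
            out[i - 1] = sb.toString();
        }
        return out;
    }

    /** Index of the region whose [start, end) range contains rowKey. */
    static int regionIndex(String rowKey, String[] splits) {
        int idx = 0;
        while (idx < splits.length && rowKey.compareTo(splits[idx]) >= 0) {
            idx++;
        }
        return idx;
    }

    public static void main(String[] args) {
        String[] shortKeys = splits(1);    // '1', '2', ..., 'f'
        String[] paddedKeys = splits(32);  // '1000...0', ..., 'f000...0'
        String[] samples = {
            "0fffffffffffffffffffffffffffffff",
            "10000000000000000000000000000000",
            "d41d8cd98f00b204e9800998ecf8427e"
        };
        for (String key : samples) {
            System.out.println(key + " short=" + regionIndex(key, shortKeys)
                    + " padded=" + regionIndex(key, paddedKeys));
        }
    }
}
```

The two split lists only disagree for keys shorter than 32 characters (for example the bare key '1'), which a fixed-width MD5 hex key never produces.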
Re: how to pre split a table whose row key is MD5(url)?
Posted by Liam Slusser <ls...@gmail.com>.
You can also create a pre-split table via the HBase shell like this.
Here is a split of 32-character keys into 16 regions, using base16 (i.e. an MD5
hash in hex) as the key type.
create 't1', {NAME => 'f1'},
{SPLITS=> ['10000000000000000000000000000000',
'20000000000000000000000000000000',
'30000000000000000000000000000000',
'40000000000000000000000000000000',
'50000000000000000000000000000000',
'60000000000000000000000000000000',
'70000000000000000000000000000000',
'80000000000000000000000000000000',
'90000000000000000000000000000000',
'a0000000000000000000000000000000',
'b0000000000000000000000000000000',
'c0000000000000000000000000000000',
'd0000000000000000000000000000000',
'e0000000000000000000000000000000',
'f0000000000000000000000000000000']}
thanks,
liam
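For a region count other than 16, the shell statement above could be generated rather than typed. The following is only a sketch: it assumes hex row keys, uses two-hex-digit prefixes as split keys (so it supports up to 256 regions), and the class and method names are hypothetical. It builds text in the same form as the statement in this message; it does not talk to HBase.

```java
import java.util.StringJoiner;

class ShellSplitStatement {
    /** Builds a shell 'create' statement with numRegions - 1 hex-prefix split keys. */
    static String createStatement(String table, String family, int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be between 2 and 256");
        }
        StringJoiner splits = new StringJoiner(", ", "[", "]");
        for (int i = 1; i < numRegions; i++) {
            int prefix = 256 * i / numRegions;  // evenly spaced over 0x00..0xff
            splits.add("'" + String.format("%02x", prefix) + "'");
        }
        return "create '" + table + "', {NAME => '" + family + "'}, {SPLITS => " + splits + "}";
    }

    public static void main(String[] args) {
        // Four regions, matching the four region servers in the original question.
        System.out.println(createStatement("t1", "f1", 4));
    }
}
```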
Re: how to pre split a table whose row key is MD5(url)?
Posted by sudhakara st <su...@gmail.com>.
You can pre-split the table using hex strings for the start key and end key,
plus the number of regions to split into.
**************************************************************************************************************
HTableDescriptor tableDes = new HTableDescriptor(tableName);
tableDes.setValue(HTableDescriptor.SPLIT_POLICY,
    KeyPrefixRegionSplitPolicy.class.getName());
byte[][] splits =
    getHexSplits(SPLIT_START_KEY, SPLIT_END_KEY, NUM_OF_REGION_SPLIT);
admin.createTable(tableDes, splits);
******************************************************************************************************************
private byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
    byte[][] splits = new byte[numRegions - 1][];
    // Parse the keys as hexadecimal (radix 16). Only the first 8 bytes
    // (16 hex characters) of the key are considered for splitting.
    BigInteger lowestKey = new BigInteger(startKey, 16);
    BigInteger highestKey = new BigInteger(endKey, 16);
    BigInteger range = highestKey.subtract(lowestKey);
    BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
    // The first split point sits one increment above the start key.
    lowestKey = lowestKey.add(regionIncrement);
    for (int i = 0; i < numRegions - 1; i++) {
        BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
        // Zero-pad to 16 hex characters so the split keys sort correctly as strings.
        byte[] b = String.format("%016x", key).getBytes();
        splits[i] = b;
    }
    return splits;
}
*************************************************************************************************************
--
Regards,
...sudhakara
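As a self-contained illustration of the same idea, here is a sketch that spreads split keys over the full 128-bit MD5 space and shows which region a given URL would land in. It assumes the row key is the lowercase 32-character hex form of MD5(url) (the thread does not say whether the key is hex or raw bytes); the class name, method names, and sample URL are hypothetical, and nothing here calls HBase.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5HexSplits {
    static final int HEX_WIDTH = 32; // an MD5 digest is 16 bytes = 32 hex characters

    /** Evenly spaced, zero-padded lowercase hex split keys over [0, 2^128). */
    static String[] hexSplits(int numRegions) {
        BigInteger keySpace = BigInteger.ONE.shiftLeft(4 * HEX_WIDTH);
        String[] splits = new String[numRegions - 1];
        for (int i = 1; i < numRegions; i++) {
            BigInteger boundary = keySpace.multiply(BigInteger.valueOf(i))
                                          .divide(BigInteger.valueOf(numRegions));
            splits[i - 1] = String.format("%0" + HEX_WIDTH + "x", boundary);
        }
        return splits;
    }

    /** Assumed row key format: lowercase hex MD5 of the URL's UTF-8 bytes. */
    static String md5Hex(String url) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                                         .digest(url.getBytes(StandardCharsets.UTF_8));
            return String.format("%0" + HEX_WIDTH + "x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Index of the region whose [start, end) range contains rowKey. */
    static int regionIndex(String rowKey, String[] splits) {
        int idx = 0;
        while (idx < splits.length && rowKey.compareTo(splits[idx]) >= 0) {
            idx++;
        }
        return idx;
    }

    public static void main(String[] args) {
        String[] splits = hexSplits(4);
        String rowKey = md5Hex("http://example.com/");
        System.out.println(String.join(",", splits));
        System.out.println(rowKey + " -> region " + regionIndex(rowKey, splits));
    }
}
```

Because MD5 output is close to uniformly distributed, evenly spaced split keys like these should give roughly equal regions once enough rows are loaded.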
Re: how to pre split a table whose row key is MD5(url)?
Posted by Li Li <fa...@gmail.com>.
Thanks, I will try this.
By the way, a Java byte ranges from -128 to 127.
Re: how to pre split a table whose row key is MD5(url)?
Posted by Michael Segel <mi...@hotmail.com>.
Simple answer… you really can’t.
The best thing you can do is to pre-split the table into 4 regions by splitting the first byte into 4 equal ranges (0-63, 64-127, 128-191, 192-255).
And hope that you’ll have an even split.
In theory, over time you will.
On May 8, 2014, at 1:58 PM, Li Li <fa...@gmail.com> wrote:
> say I have 4 region server. How to pre split a table using MD5 as row key?
>
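The first-byte ranges above can be turned into split keys with a few lines of Java. This is a sketch for the case where the row key is the raw 16-byte MD5 digest rather than a hex string. It relies on the understanding that HBase orders row keys byte by byte as unsigned values, so a boundary of 128 is stored in a signed Java byte as -128 and still sorts after 127. The class and method names are hypothetical.

```java
class FirstByteSplits {
    /** Returns numRegions - 1 one-byte split keys dividing 0..255 evenly. */
    static byte[][] firstByteSplits(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            int boundary = 256 * i / numRegions;  // 64, 128, 192 for 4 regions
            splits[i - 1] = new byte[] { (byte) boundary };  // 128 is stored as -128
        }
        return splits;
    }

    /** Lexicographic comparison that treats every byte as unsigned (0..255). */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Index of the region whose [start, end) range contains rowKey. */
    static int regionIndex(byte[] rowKey, byte[][] splits) {
        int idx = 0;
        while (idx < splits.length && compareUnsigned(rowKey, splits[idx]) >= 0) {
            idx++;
        }
        return idx;
    }

    public static void main(String[] args) {
        for (byte[] split : firstByteSplits(4)) {
            // Print both the signed Java value and the unsigned value used for ordering.
            System.out.println(split[0] + " (unsigned " + (split[0] & 0xff) + ")");
        }
    }
}
```

With 4 regions this gives three split keys, one at the start of each of the last three ranges; the first region has no explicit start key.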
Re: how to pre split a table whose row key is MD5(url)?
Posted by Ted Yu <yu...@gmail.com>.
Please see http://hbase.apache.org/book.html#rowkey.regionsplits
On Thu, May 8, 2014 at 5:58 AM, Li Li <fa...@gmail.com> wrote:
> say I have 4 region server. How to pre split a table using MD5 as row key?
>