Posted to user@hbase.apache.org by anil gupta <an...@gmail.com> on 2013/01/17 18:24:05 UTC

RegionSplitter command

Hi All,

I am using HBase 0.92.1 and trying to run YCSB tests on my HBase cluster.
While running the load test with 40 threads, I am getting hotspots on
the RS that hosts the region of 'usertable'. Currently, the empty
table has only one region, so I want to pre-split 'usertable' using the
RegionSplitter utility.
HexStringSplit is the splitting algorithm and 'c' is the name of the column
family. I tried the following commands, but they didn't work:

   1. hbase org.apache.hadoop.hbase.util.RegionSplitter usertable
   HexStringSplit -c 60 -f c
   2. hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f c
   usertable HexStringSplit

The following is the output of the above commands:

usage: RegionSplitter <TABLE>
 -c <region count>        Create a new table with a pre-split number of
                          regions
 -D <property=value>      Override HBase Configuration Settings
 -f <family:family:...>   Column Families to create with new table.
                          Required with -c
 -h                       Print this usage help
 -o <count>               Max outstanding splits that have unfinished
                          major compactions
 -r                       Perform a rolling split of an existing region
    --risky               Skip verification steps to complete
                          quickly.STRONGLY DISCOURAGED for production
                          systems.


Please let me know the proper command for the RegionSplitter utility.

-- 
Thanks & Regards,
Anil Gupta
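
Editor's sketch: to make concrete what a pre-split with "-c 60" is asking
for, the snippet below computes evenly spaced boundary keys over an
8-character hex key space. It assumes a HexStringSplit-style algorithm that
divides 00000000..FFFFFFFF into equal ranges; that is an assumption about the
algorithm, not something stated in this thread, and the function name
hex_split_points is hypothetical. It is not the RegionSplitter code itself.

```python
def hex_split_points(region_count, key_width=8):
    """Return the region_count - 1 boundary keys of an evenly split hex key space.

    Assumption: keys are fixed-width hex strings, e.g. 8 characters.
    """
    if region_count < 2:
        return []  # a single region needs no split points
    key_space = 16 ** key_width       # number of distinct keys
    step = key_space // region_count  # width of each region's key range
    return [format(i * step, "0{}x".format(key_width))
            for i in range(1, region_count)]


points = hex_split_points(60)
print(len(points))   # 59 boundaries for 60 regions
print(points[:3])    # the first few boundary keys
```

Sixty regions need 59 boundary keys. Note also that the usage text above lists
only <TABLE> as a positional argument, which may mean that this version does
not take the algorithm name on the command line; the thread does not confirm
this.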

confused about Data/Disk ratio

Posted by tgh <gu...@ia.ac.cn>.
Hi,
	I use HBase to store data, and I have made the following observation:
	when HBase stores 1 GB of data, HDFS uses 10 GB of disk space; when the data
is 60 GB, HDFS uses 180 GB of disk; and when the data is about 2 TB, HDFS uses 3 TB of disk.

	That is, the ratio of data to disk usage is not linear. Why is that?

	Could you help me?


Thank you
---------------------
Guanhua Tian
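
Editor's sketch: the three measurements are easier to compare as disk used
per unit of data. The snippet below uses only the figures from the message
above; 1 TB is taken as 1024 GB, which is an assumption (it does not change
the last ratio). The name disk_per_data is hypothetical.

```python
# (data stored in HBase, disk used in HDFS), both in GB, as reported above.
# Assumption: 1 TB = 1024 GB.
observations = [(1, 10), (60, 180), (2 * 1024, 3 * 1024)]


def disk_per_data(data_gb, disk_gb):
    """Disk space consumed per unit of stored data."""
    return disk_gb / data_gb


ratios = [disk_per_data(data, disk) for data, disk in observations]
for (data, disk), ratio in zip(observations, ratios):
    print("{} GB data -> {} GB disk, ratio {:.1f}".format(data, disk, ratio))
```

A replication factor on its own multiplies every byte by the same constant,
so it would give the same ratio in all three cases. The falling ratio points
to some overhead that does not grow in proportion to the data; the thread
does not identify what it is.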





Re: confused about Data/Disk ratio

Posted by varun kumar <va...@gmail.com>.
Hi Tian,

What replication factor have you configured in HDFS?

Regards,
Varun Kumar.P

On Mon, Jan 21, 2013 at 12:17 PM, tgh <gu...@ia.ac.cn> wrote:

> Hi
>         I use hbase to store Data, and I have an observation, that is,
>         When hbase store 1Gb data, hdfs use 10Gb disk space, and when data
> is 60Gb, hdfs use 180Gb disk, and when data is about 2Tb, hdfs use 3Tb
> disk,
>
>         That is, the ratio of data/disk is not a linear one, and why,
>
>         Could you help me
>
>
> Thank you
> ---------------------
> Guanhua Tian
>
>
>
>
>


-- 
Regards,
Varun Kumar.P

RE: ValueFilter and VERSIONS

Posted by "Li, Min" <mi...@microstrategy.com>.
Thanks for your explanation.

Min

-----Original Message-----
From: Anoop Sam John [mailto:anoopsj@huawei.com] 
Sent: Friday, January 18, 2013 2:44 PM
To: user@hbase.apache.org
Subject: RE: ValueFilter and VERSIONS


ValueFilter works only on the KVs not at a row level . So something similar is not possible.
Setting versions to 1 will make only one version (latest) version getting back to the client. But the filtering is done prior to the versioning decision and filters will see all the version values.

-Anoop-
________________________________________
From: Li, Min [mili@microstrategy.com]
Sent: Friday, January 18, 2013 12:00 PM
To: user@hbase.apache.org
Subject: RE: ValueFilter and VERSIONS

Hi Anoop,

Thanks for your reply. But I have to use value filter here, because in some of my use case, I can't identify the qualifier.

Thanks,
Min

-----Original Message-----
From: Anoop Sam John [mailto:anoopsj@huawei.com]
Sent: Friday, January 18, 2013 2:28 PM
To: user@hbase.apache.org
Subject: RE: ValueFilter and VERSIONS

Can you make use of SingleColumnValueFilter.  In this you can specify whether the condition to be checked only on the latest version or not.
SCVF#setLatestVersionOnly ( true)

-Anoop-
________________________________________
From: Li, Min [mili@microstrategy.com]
Sent: Friday, January 18, 2013 11:47 AM
To: user@hbase.apache.org
Subject: ValueFilter and VERSIONS

Hi all,

As you know, ValueFilter will filter data from all versions, so I create a table and indicate it has only 1 version. However, the old version record still can be gotten by ValueFilter? Does anyone know how to create a table with only one version record?

BTW, I am using hbase 0.92.1. Following is my testing commands:


hbase(main):016:0> create 'testUser',  {NAME => 'F', VERSIONS => 1}
0 row(s) in 1.0630 seconds

hbase(main):017:0> put 'testUser', '123', 'F:f', '3'
0 row(s) in 0.0120 seconds

hbase(main):018:0> put 'testUser', '123', 'F:f', '1'
0 row(s) in 0.0060 seconds

hbase(main):019:0> scan 'testUser'
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489113213, value=1
1 row(s) in 0.0110 seconds

hbase(main):020:0> scan 'testUser',{FILTER => "(PrefixFilter ('123') AND ValueFilter (>,'binary:1'))"}
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489110172, value=3
1 row(s) in 0.1790 seconds


Thanks,
Min

RE: ValueFilter and VERSIONS

Posted by Anoop Sam John <an...@huawei.com>.
ValueFilter works only on individual KVs (KeyValues), not at the row level, so something similar is not possible.
Setting VERSIONS to 1 means that only one version (the latest) is returned to the client. But the filtering is done before the versioning decision, so filters see the values of all versions.

-Anoop-
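
Editor's sketch: the ordering described above (filter first, version limit
second) can be reproduced with a toy model. This is not HBase code; the
function names scan and greater_than_one are hypothetical, and the cell
values and timestamps are the ones from the shell session quoted below.

```python
# Toy model only. Cells of one column, newest first: (timestamp, value).
cells = [
    (1358489113213, b"1"),  # latest put
    (1358489110172, b"3"),  # older put, still stored
]


def scan(cells_newest_first, max_versions, value_filter=None):
    """Apply the filter to every cell first, then cap the number of versions."""
    passed = [cell for cell in cells_newest_first
              if value_filter is None or value_filter(cell[1])]
    return passed[:max_versions]


def greater_than_one(value):
    """Stand-in for ValueFilter (>, 'binary:1'): byte-wise comparison."""
    return value > b"1"


plain = scan(cells, max_versions=1)
filtered = scan(cells, max_versions=1, value_filter=greater_than_one)
print(plain)     # the latest cell, value 1
print(filtered)  # the older cell, value 3
```

Because the latest cell is rejected by the filter before versions are
counted, the older cell becomes the first one that passes, which matches
the output of the two scans in the original question.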
________________________________________
From: Li, Min [mili@microstrategy.com]
Sent: Friday, January 18, 2013 12:00 PM
To: user@hbase.apache.org
Subject: RE: ValueFilter and VERSIONS

Hi Anoop,

Thanks for your reply. But I have to use value filter here, because in some of my use case, I can't identify the qualifier.

Thanks,
Min

-----Original Message-----
From: Anoop Sam John [mailto:anoopsj@huawei.com]
Sent: Friday, January 18, 2013 2:28 PM
To: user@hbase.apache.org
Subject: RE: ValueFilter and VERSIONS

Can you make use of SingleColumnValueFilter.  In this you can specify whether the condition to be checked only on the latest version or not.
SCVF#setLatestVersionOnly ( true)

-Anoop-
________________________________________
From: Li, Min [mili@microstrategy.com]
Sent: Friday, January 18, 2013 11:47 AM
To: user@hbase.apache.org
Subject: ValueFilter and VERSIONS

Hi all,

As you know, ValueFilter will filter data from all versions, so I create a table and indicate it has only 1 version. However, the old version record still can be gotten by ValueFilter? Does anyone know how to create a table with only one version record?

BTW, I am using hbase 0.92.1. Following is my testing commands:


hbase(main):016:0> create 'testUser',  {NAME => 'F', VERSIONS => 1}
0 row(s) in 1.0630 seconds

hbase(main):017:0> put 'testUser', '123', 'F:f', '3'
0 row(s) in 0.0120 seconds

hbase(main):018:0> put 'testUser', '123', 'F:f', '1'
0 row(s) in 0.0060 seconds

hbase(main):019:0> scan 'testUser'
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489113213, value=1
1 row(s) in 0.0110 seconds

hbase(main):020:0> scan 'testUser',{FILTER => "(PrefixFilter ('123') AND ValueFilter (>,'binary:1'))"}
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489110172, value=3
1 row(s) in 0.1790 seconds


Thanks,
Min

RE: ValueFilter and VERSIONS

Posted by "Li, Min" <mi...@microstrategy.com>.
Hi Anoop,

Thanks for your reply. But I have to use ValueFilter here, because in some of my use cases I can't identify the qualifier.

Thanks,
Min

-----Original Message-----
From: Anoop Sam John [mailto:anoopsj@huawei.com] 
Sent: Friday, January 18, 2013 2:28 PM
To: user@hbase.apache.org
Subject: RE: ValueFilter and VERSIONS

Can you make use of SingleColumnValueFilter.  In this you can specify whether the condition to be checked only on the latest version or not.
SCVF#setLatestVersionOnly ( true)

-Anoop-
________________________________________
From: Li, Min [mili@microstrategy.com]
Sent: Friday, January 18, 2013 11:47 AM
To: user@hbase.apache.org
Subject: ValueFilter and VERSIONS

Hi all,

As you know, ValueFilter will filter data from all versions, so I create a table and indicate it has only 1 version. However, the old version record still can be gotten by ValueFilter? Does anyone know how to create a table with only one version record?

BTW, I am using hbase 0.92.1. Following is my testing commands:


hbase(main):016:0> create 'testUser',  {NAME => 'F', VERSIONS => 1}
0 row(s) in 1.0630 seconds

hbase(main):017:0> put 'testUser', '123', 'F:f', '3'
0 row(s) in 0.0120 seconds

hbase(main):018:0> put 'testUser', '123', 'F:f', '1'
0 row(s) in 0.0060 seconds

hbase(main):019:0> scan 'testUser'
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489113213, value=1
1 row(s) in 0.0110 seconds

hbase(main):020:0> scan 'testUser',{FILTER => "(PrefixFilter ('123') AND ValueFilter (>,'binary:1'))"}
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489110172, value=3
1 row(s) in 0.1790 seconds


Thanks,
Min

RE: ValueFilter and VERSIONS

Posted by Anoop Sam John <an...@huawei.com>.
Can you make use of SingleColumnValueFilter? With it you can specify whether the condition is checked only against the latest version or not:
SCVF#setLatestVersionOnly(true)

-Anoop-
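
Editor's sketch: the effect of a "latest version only" switch can be shown
with a toy model. This is not the SingleColumnValueFilter implementation;
the function row_matches and its parameter names are hypothetical, and the
model assumes that a row qualifies when any checked version of the column
satisfies the condition.

```python
# Toy model only. Cells of one column, newest first: (timestamp, value).
cells = [
    (1358489113213, b"1"),  # latest put
    (1358489110172, b"3"),  # older put
]


def greater_than_one(value):
    """Condition under test: byte-wise value > '1'."""
    return value > b"1"


def row_matches(cells_newest_first, predicate, latest_version_only=True):
    """Decide whether the row qualifies, based on the checked versions."""
    if latest_version_only:
        checked = cells_newest_first[:1]  # only the newest cell is examined
    else:
        checked = cells_newest_first      # every stored version is examined
    return any(predicate(value) for _, value in checked)


print(row_matches(cells, greater_than_one, latest_version_only=True))
print(row_matches(cells, greater_than_one, latest_version_only=False))
```

With the switch on, the older value 3 no longer makes the row qualify,
because only the latest value 1 is compared against the condition.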
________________________________________
From: Li, Min [mili@microstrategy.com]
Sent: Friday, January 18, 2013 11:47 AM
To: user@hbase.apache.org
Subject: ValueFilter and VERSIONS

Hi all,

As you know, ValueFilter will filter data from all versions, so I create a table and indicate it has only 1 version. However, the old version record still can be gotten by ValueFilter? Does anyone know how to create a table with only one version record?

BTW, I am using hbase 0.92.1. Following is my testing commands:


hbase(main):016:0> create 'testUser',  {NAME => 'F', VERSIONS => 1}
0 row(s) in 1.0630 seconds

hbase(main):017:0> put 'testUser', '123', 'F:f', '3'
0 row(s) in 0.0120 seconds

hbase(main):018:0> put 'testUser', '123', 'F:f', '1'
0 row(s) in 0.0060 seconds

hbase(main):019:0> scan 'testUser'
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489113213, value=1
1 row(s) in 0.0110 seconds

hbase(main):020:0> scan 'testUser',{FILTER => "(PrefixFilter ('123') AND ValueFilter (>,'binary:1'))"}
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489110172, value=3
1 row(s) in 0.1790 seconds


Thanks,
Min

ValueFilter and VERSIONS

Posted by "Li, Min" <mi...@microstrategy.com>.
Hi all,

As you know, ValueFilter filters data from all versions, so I created a table and specified that it keeps only 1 version. However, the old version of the record can still be retrieved by ValueFilter. Does anyone know how to create a table that keeps only one version of a record?

BTW, I am using HBase 0.92.1. The following are my test commands:


hbase(main):016:0> create 'testUser',  {NAME => 'F', VERSIONS => 1}
0 row(s) in 1.0630 seconds

hbase(main):017:0> put 'testUser', '123', 'F:f', '3'
0 row(s) in 0.0120 seconds

hbase(main):018:0> put 'testUser', '123', 'F:f', '1'
0 row(s) in 0.0060 seconds

hbase(main):019:0> scan 'testUser'
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489113213, value=1
1 row(s) in 0.0110 seconds

hbase(main):020:0> scan 'testUser',{FILTER => "(PrefixFilter ('123') AND ValueFilter (>,'binary:1'))"}
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489110172, value=3
1 row(s) in 0.1790 seconds


Thanks,
Min

Re: RegionSplitter command

Posted by anil gupta <an...@gmail.com>.
Hi Jean,

Yes, I am stuck with 0.92.1. Thanks for your response. I think I will need
to dig deeper into this.

~Anil

On Thu, Jan 17, 2013 at 3:02 PM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Hi Anil,
>
> bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f f1 test
> HexStringSplit
>
> is working for me.
>
> So your 2nd line should work. I'm not sure if it's because of the
> version since I'm using 0.94.4.
>
> Can you upgrade your version and retry? Or you need to stay with 0.92.1?
>
> JM
>
> 2013/1/17, anil gupta <an...@gmail.com>:
> > Hi All,
> >
> > I am using HBase0.92.1. I am trying to run ycsb tests on my hbase
> cluster.
> > But while running the load test with 40 threads i am getting hotspots on
> > the RS that is hosting the Region of 'usertable'. Currently, the empty
> > table only has one region. So, i want to pre-split my 'usertable' using
> the
> > RegionSplitter utility.
> > HexStringSplit is the splitting algorithm, 'c' is the name of the column
> > family. I tried the following commands but they didnt work:
> >
> >    1. hbase org.apache.hadoop.hbase.util.RegionSplitter usertable
> >    HexStringSplit -c 60 -f c
> >    2. hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f c
> >    usertable HexStringSplit
> >
> > Following is the output of above commands:
> >
> > usage: RegionSplitter <TABLE>
> >  -c <region count>        Create a new table with a pre-split number of
> >                           regions
> >  -D <property=value>      Override HBase Configuration Settings
> >  -f <family:family:...>   Column Families to create with new table.
> >                           Required with -c
> >  -h                       Print this usage help
> >  -o <count>               Max outstanding splits that have unfinished
> >                           major compactions
> >  -r                       Perform a rolling split of an existing region
> >     --risky               Skip verification steps to complete
> >                           quickly.STRONGLY DISCOURAGED for production
> >                           systems.
> >
> >
> > Please let me know the proper command for RegionSplitter utility.
> >
> > --
> > Thanks & Regards,
> > Anil Gupta
> >
>



-- 
Thanks & Regards,
Anil Gupta

Re: RegionSplitter command

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Anil,

bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f f1 test
HexStringSplit

is working for me.

So your 2nd line should work. I'm not sure if it's because of the
version since I'm using 0.94.4.

Can you upgrade your version and retry? Or do you need to stay with 0.92.1?

JM

2013/1/17, anil gupta <an...@gmail.com>:
> Hi All,
>
> I am using HBase0.92.1. I am trying to run ycsb tests on my hbase cluster.
> But while running the load test with 40 threads i am getting hotspots on
> the RS that is hosting the Region of 'usertable'. Currently, the empty
> table only has one region. So, i want to pre-split my 'usertable' using the
> RegionSplitter utility.
> HexStringSplit is the splitting algorithm, 'c' is the name of the column
> family. I tried the following commands but they didnt work:
>
>    1. hbase org.apache.hadoop.hbase.util.RegionSplitter usertable
>    HexStringSplit -c 60 -f c
>    2. hbase org.apache.hadoop.hbase.util.RegionSplitter -c 60 -f c
>    usertable HexStringSplit
>
> Following is the output of above commands:
>
> usage: RegionSplitter <TABLE>
>  -c <region count>        Create a new table with a pre-split number of
>                           regions
>  -D <property=value>      Override HBase Configuration Settings
>  -f <family:family:...>   Column Families to create with new table.
>                           Required with -c
>  -h                       Print this usage help
>  -o <count>               Max outstanding splits that have unfinished
>                           major compactions
>  -r                       Perform a rolling split of an existing region
>     --risky               Skip verification steps to complete
>                           quickly.STRONGLY DISCOURAGED for production
>                           systems.
>
>
> Please let me know the proper command for RegionSplitter utility.
>
> --
> Thanks & Regards,
> Anil Gupta
>