Posted to user@hbase.apache.org by "Li, Min" <mi...@microstrategy.com> on 2013/01/18 07:17:52 UTC

ValueFilter and VERSIONS

Hi all,

As you know, ValueFilter filters data from all versions, so I created a table and specified that it keeps only 1 version. However, the old version of the record can still be retrieved by ValueFilter. Does anyone know how to create a table that keeps only one version of a record?

BTW, I am using HBase 0.92.1. The following are my test commands:


hbase(main):016:0> create 'testUser',  {NAME => 'F', VERSIONS => 1}
0 row(s) in 1.0630 seconds

hbase(main):017:0> put 'testUser', '123', 'F:f', '3'
0 row(s) in 0.0120 seconds

hbase(main):018:0> put 'testUser', '123', 'F:f', '1'
0 row(s) in 0.0060 seconds

hbase(main):019:0> scan 'testUser'
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489113213, value=1
1 row(s) in 0.0110 seconds

hbase(main):020:0> scan 'testUser', {FILTER => "(PrefixFilter ('123') AND ValueFilter (>,'binary:1'))"}
ROW                                                COLUMN+CELL
 123                column=F:f, timestamp=1358489110172, value=3
1 row(s) in 0.1790 seconds


Thanks,
Min

confused about Data/Disk ratio

Posted by tgh <gu...@ia.ac.cn>.
Hi,
	I use HBase to store data, and I have observed the following:
	when HBase stores 1 GB of data, HDFS uses 10 GB of disk space; when the data
is 60 GB, HDFS uses 180 GB of disk; and when the data is about 2 TB, HDFS uses 3 TB of disk.

	That is, the ratio of data to disk usage is not linear. Why is that?

	Could you help me?


Thank you
---------------------
Guanhua Tian
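
For reference, the three observations above work out to the following disk-to-data ratios. This is only the arithmetic on the reported figures, as a minimal sketch; the `observations` list and its labels are illustrative names, and nothing here explains the cause of the overhead.

```python
# Disk-to-data ratios for the three observations reported above.
# Each entry is (label, data stored in HBase, disk used in HDFS),
# with both sizes expressed in the same unit.
observations = [
    ("1 GB", 1, 10),
    ("60 GB", 60, 180),
    ("2 TB", 2, 3),
]

ratios = {label: disk / data for label, data, disk in observations}

for label, ratio in ratios.items():
    print(f"{label}: HDFS uses {ratio:.1f}x the data size")
```

The ratio falls from 10x to 3x to 1.5x as the data grows, which is the non-linearity being asked about.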





Re: confused about Data/Disk ratio

Posted by varun kumar <va...@gmail.com>.
Hi Tian,

What replication factor have you configured in HDFS?

Regards,
Varun Kumar.P


RE: ValueFilter and VERSIONS

Posted by "Li, Min" <mi...@microstrategy.com>.
Thanks for your explanation.

Min


RE: ValueFilter and VERSIONS

Posted by Anoop Sam John <an...@huawei.com>.
ValueFilter works only on individual KVs (KeyValues), not at the row level, so something similar is not possible with it.
Setting VERSIONS to 1 means that only one version (the latest) is returned to the client. But filtering is done prior to the versioning decision, so filters see the values of all versions.

-Anoop-
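
The ordering described above can be modelled with a small in-memory sketch. This is not HBase code: `cells`, `value_filter`, and `limit_versions` are hypothetical names, and the sketch assumes both puts are still physically stored when the scan runs. The only behaviour taken from the thread is that the filter runs before the version limit is applied.

```python
# Simplified model of the scan path described in the thread (not HBase code).
# Each cell is (row, column, timestamp, value). Assumption for this sketch:
# both puts are still stored when the scan runs.
cells = [
    ("123", "F:f", 1358489110172, "3"),  # older put
    ("123", "F:f", 1358489113213, "1"),  # newer put
]

def value_filter(cells, predicate):
    """Keep every cell whose value passes the predicate, whatever its version."""
    return [c for c in cells if predicate(c[3])]

def limit_versions(cells, max_versions):
    """Keep the newest max_versions cells per (row, column)."""
    kept, seen = [], {}
    for c in sorted(cells, key=lambda c: c[2], reverse=True):
        key = (c[0], c[1])
        if seen.get(key, 0) < max_versions:
            kept.append(c)
            seen[key] = seen.get(key, 0) + 1
    return kept

# Plain scan: only the version limit applies, so the newest value comes back.
plain_scan = limit_versions(cells, 1)

# Filtered scan: the filter sees both versions first, then the limit applies.
filtered_scan = limit_versions(value_filter(cells, lambda v: v > "1"), 1)

print(plain_scan)
print(filtered_scan)
```

In this model the plain scan returns the cell with value 1, while the filtered scan returns the older cell with value 3, matching the two scans in the original question.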

RE: ValueFilter and VERSIONS

Posted by "Li, Min" <mi...@microstrategy.com>.
Hi Anoop,

Thanks for your reply. But I have to use ValueFilter here, because in some of my use cases I can't identify the qualifier.

Thanks,
Min


RE: ValueFilter and VERSIONS

Posted by Anoop Sam John <an...@huawei.com>.
Can you make use of SingleColumnValueFilter? With it you can specify whether the condition should be checked only against the latest version or not:
SingleColumnValueFilter#setLatestVersionOnly(true)

-Anoop-
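
A rough model of what checking only the latest version means, assuming a simplified in-memory representation rather than the HBase API. `row_matches_latest` and `f_versions` are hypothetical names, and the handling of a missing column is an assumption of this sketch only. Like the real filter, it needs the column to be known in advance.

```python
# Simplified model of a latest-version-only column check (not HBase code).
# versions maps timestamp -> value for one known column of one row.
def row_matches_latest(versions, predicate):
    """Return True if the newest version of the column passes the predicate."""
    if not versions:
        return False  # assumption of this sketch: a missing column never matches
    latest_ts = max(versions)
    return predicate(versions[latest_ts])

f_versions = {1358489110172: "3", 1358489113213: "1"}

# The newest value is "1", so a "> 1" check fails even though "3" is stored.
print(row_matches_latest(f_versions, lambda v: v > "1"))  # False
```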