Posted to user@hbase.apache.org by William Shen <wi...@marinsoftware.com> on 2018/10/16 20:15:01 UTC
Scanning table using partial row key match returns unexpected results
Hi there,
I am trying to scan using a partial match on the row key (derived from the
Phoenix primary key). However, the hbase shell is returning results that do
not look like a match. Can someone help me understand why the following row
keys are considered a match and returned?
In addition, I am not sure how to interpret values like \xF3^ and \x14'
that are supposed to be hex values...
hbase(main):001:0> import org.apache.hadoop.hbase.filter.CompareFilter
=> Java::OrgApacheHadoopHbaseFilter::CompareFilter
hbase(main):002:0> import org.apache.hadoop.hbase.filter.SubstringComparator
=> Java::OrgApacheHadoopHbaseFilter::SubstringComparator
hbase(main):003:0> import org.apache.hadoop.hbase.filter.RowFilter
=> Java::OrgApacheHadoopHbaseFilter::RowFilter
hbase(main):004:0> scan 'TEST_SCHEMA.TEST_TABLE', {COLUMNS => 'TG:_0',
LIMIT => 5, FILTER =>
RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),SubstringComparator.new("\x80\x00\x00\x00\x00\x00\x14\x27\x00\x07\x80\x00\x00\x00\x00\xC7\xE5\x87"))}
ROW                                                                         COLUMN+CELL
 \x00\x80\x00\x00\x00\x00\x00\x14'\x00\x07\x80\x00\x00\x00\x00\xBC\xEA\x81  column=TG:_0, timestamp=1481844289334, value=
 \x00\x80\x00\x00\x00\x00\x00\x14'\x00\x07\x80\x00\x00\x00\x00\xBC\xEA\xE5  column=TG:_0, timestamp=1481844289334, value=
 \x00\x80\x00\x00\x00\x00\x00\x14'\x00\x07\x80\x00\x00\x00\x00\xBC\xF3^     column=TG:_0, timestamp=1481844289334, value=
 \x00\x80\x00\x00\x00\x00\x00\x14'\x00\x07\x80\x00\x00\x00\x00\xBC\xF8\xA5  column=TG:_0, timestamp=1481844289334, value=
 \x00\x80\x00\x00\x00\x00\x00\x14'\x00\x07\x80\x00\x00\x00\x00\xBC\xF9b     column=TG:_0, timestamp=1481844289334, value=
5 row(s) in 0.9430 seconds
Thanks in advance!
- Will
Re: Scanning table using partial row key match returns unexpected results
Posted by William Shen <wi...@marinsoftware.com>.
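Actually, they are correctly matched. After further investigation, it turns
out that the row keys are printed differently because the shell uses a
binary string representation: bytes that correspond to printable ASCII
characters are shown as those characters, and the remaining bytes are shown
as \xNN escapes. For example, \x14' is the two bytes 0x14 0x27, and \xF3^
is 0xF3 0x5E. See
https://stackoverflow.com/questions/42353013/what-are-the-non-hex-characters-in-hbase-shell-rowkey
for details.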
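As a rough illustration of that representation, here is a plain Ruby sketch
(no HBase classes needed). The helper names to_string_binary and to_hex are
made up for this example, and the printable range used (0x20 to 0x7E,
excluding the backslash) is a simplifying assumption rather than a copy of
what HBase's Bytes class does:

```ruby
# Simplified emulation of a "binary string" rendering: printable ASCII bytes
# are emitted as characters, everything else as an uppercase \xNN escape.
# Assumption: printable means 0x20..0x7E except the backslash (0x5C).
def to_string_binary(bytes)
  bytes.map do |b|
    if b >= 0x20 && b <= 0x7E && b != 0x5C
      b.chr
    else
      format('\\x%02X', b)
    end
  end.join
end

# Plain lowercase hex rendering of the same bytes, two digits per byte.
def to_hex(bytes)
  bytes.map { |b| format('%02x', b) }.join
end

fragment = [0x14, 0x27, 0x00, 0x07]
puts to_string_binary(fragment)       # \x14'\x00\x07
puts to_hex(fragment)                 # 14270007
puts to_string_binary([0xF3, 0x5E])   # \xF3^
```

So the apostrophe and the caret in the scan output are ordinary bytes (0x27
and 0x5E) that happen to be printable.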
After converting them back to hex, you can see they are actually correctly
matched:
hbase(main):015:0> Bytes.toHex(Bytes.toBytesBinary("\x80\x00\x00\x00\x00\x00\x14\x27\x00\x07\x80\x00\x00\x00\x00\xC7\xE5\x87"))
=> "fd000000000014270007fd00000000fdfd"
is indeed contained in:
hbase(main):016:0> Bytes.toHex(Bytes.toBytesBinary("\x00\x80\x00\x00\x00\x00\x00\x14'\x00\x07\x80\x00\x00\x00\x00\xBC\xEA\x81"))
=> "00fd000000000014270007fd00000000fdfd"
and
hbase(main):018:0> Bytes.toHex(Bytes.toBytesBinary("\x00\x80\x00\x00\x00\x00\x00\x14'\x00\x07\x80\x00\x00\x00\x00\xBC\xF3^"))
=> "00fd000000000014270007fd00000000fdfd5e"
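The containment check can also be done mechanically instead of by eye. A
small Ruby sketch, using the hex strings exactly as printed in the session
above; byte_offset is a hypothetical helper written for this example, and it
only accepts matches that start on a byte boundary (an even hex index):

```ruby
# Returns the byte offset at which needle_hex occurs in haystack_hex,
# or nil if it does not occur on a byte boundary.
def byte_offset(haystack_hex, needle_hex)
  pos = 0
  while (idx = haystack_hex.index(needle_hex, pos))
    return idx / 2 if idx.even?   # two hex digits per byte
    pos = idx + 1                 # skip matches that straddle a byte
  end
  nil
end

# Hex strings copied from the shell output above.
needle = 'fd000000000014270007fd00000000fdfd'
rows = [
  '00fd000000000014270007fd00000000fdfd',
  '00fd000000000014270007fd00000000fdfd5e'
]

rows.each do |row_hex|
  offset = byte_offset(row_hex, needle)
  puts "#{row_hex}: #{offset ? "contained at byte offset #{offset}" : 'not contained'}"
end
```

Both rows report the filter value at byte offset 1, right after the leading
00 byte.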