Posted to issues@hbase.apache.org by "ClownfishYang (Jira)" <ji...@apache.org> on 2020/12/02 09:59:00 UTC
[jira] [Updated] (HBASE-25350) The scan command gives a row, but get does not have this row.
[ https://issues.apache.org/jira/browse/HBASE-25350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ClownfishYang updated HBASE-25350:
----------------------------------
Description:
HBase version: 1.1.2
We use HBase as a real-time database and query it through Hive external tables, but we found a problem when querying one table.
{code:java}
-- SQL: returns ids such as 12045075, 12045076, ...
SELECT id FROM t1 LIMIT 10
-- returns no rows
SELECT id FROM t1 WHERE id = '12045075' LIMIT 10
-- table definition
CREATE EXTERNAL TABLE `t1`(
`__key` string COMMENT '',
`id` string COMMENT '主键ID')
COMMENT ''
ROW FORMAT SERDE
'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
'hbase.columns.mapping'=':key,f:id',
'serialization.format'='1')
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
'hbase.table.name'='t1',
'numFiles'='0',
'numRows'='0',
'rawDataSize'='0',
'totalSize'='0',
'transient_lastDdlTime'='1606804842'){code}
While investigating, I tried padding the value with spaces and matching with LIKE, but could not pin down the cause. I began to suspect that the problem was in HBase.
{code:java}
// hbase table desc
describe 't1'
Table t1 is ENABLED
t1, {TABLE_ATTRIBUTES => {METADATA => {'COMPACTION_ENABLED' => 'true'}} COLUMN FAMILIES DESCRIPTION {NAME => 'f', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => '604800 SECONDS (7 DAYS)', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
// scan row
scan 't1', {COLUMNS => 'f:id', LIMIT => 10}
// result
ROW COLUMN+CELL
12044083 column=f:id, timestamp=1606293182000, value=12044083
12044084 column=f:id, timestamp=1606293183000, value=12044084
12044085 column=f:id, timestamp=1606293185000, value=12044085
12044086 column=f:id, timestamp=1606293190000, value=12044086
12044087 column=f:id, timestamp=1606293192000, value=12044087
12044088 column=f:id, timestamp=1606293197000, value=12044088
12044089 column=f:id, timestamp=1606293198000, value=12044089
12044090 column=f:id, timestamp=1606293204000, value=12044090
12044091 column=f:id, timestamp=1606293207000, value=12044091
12044092 column=f:id, timestamp=1606293208000, value=12044092
// get the first row from the scan above: returns no result
get 't1', "12044083" , {COLUMNS => 'f:id'}{code}
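One thing that may be worth ruling out, given the 7-day TTL in the describe output, is whether the scanned cells had simply reached their expiry by the time the get was issued. The following is a minimal sketch of that arithmetic, assuming the cell timestamps shown by the scan are epoch milliseconds (this is only an assumption; clients can write arbitrary timestamps). The class and method names are hypothetical and it uses only the JDK.

```java
import java.time.Duration;
import java.time.Instant;

public class TtlCheck {
    /** Returns the instant at which a cell written at the given timestamp passes its TTL. */
    static Instant expiry(long cellTimestampMillis, long ttlSeconds) {
        return Instant.ofEpochMilli(cellTimestampMillis).plus(Duration.ofSeconds(ttlSeconds));
    }

    public static void main(String[] args) {
        long cellTimestampMillis = 1606293182000L; // timestamp of row 12044083 in the scan output
        long ttlSeconds = 604800L;                 // TTL of column family 'f' from describe 't1'

        Instant expiresAt = expiry(cellTimestampMillis, ttlSeconds);
        System.out.println("written:  " + Instant.ofEpochMilli(cellTimestampMillis));
        System.out.println("expires:  " + expiresAt);
        // Compares against the current clock of the machine running this check.
        System.out.println("past TTL: " + Instant.now().isAfter(expiresAt));
    }
}
```

If the get was issued after the computed expiry and the scan before it, the difference could be explained by the TTL rather than by the row key. If both were issued before the expiry, this explanation does not apply.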
So far, only lookups by row key or by ID show this problem; queries on other columns behave normally. I suspect the data contains invisible characters (control or escape characters, for example), but how can I confirm that?
The worst part is that I also fetched the rows through the Java API, and I could not find any invisible characters in the returned row key or ID.
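One way to check for invisible characters is to look at the raw bytes of the row key rather than its printed string form. The sketch below is only an illustration under stated assumptions: it uses the JDK alone, the class and method names are hypothetical, and the "suspect" key with a trailing NUL byte is made-up sample data, not something observed in this table. In a real check the byte array would come from the HBase client, for example from `Result.getRow()`.

```java
import java.nio.charset.StandardCharsets;

public class RowKeyDump {
    /** Renders bytes so that anything outside printable ASCII appears as \xNN. */
    static String toEscaped(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 0x20 && v < 0x7F && v != '\\') {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The key as typed in the get command.
        byte[] expected = "12044083".getBytes(StandardCharsets.UTF_8);
        // Hypothetical stored key with a trailing NUL byte (illustrative only).
        byte[] suspect = "12044083\u0000".getBytes(StandardCharsets.UTF_8);

        // Brackets and the length make leading or trailing whitespace visible too.
        System.out.println("[" + toEscaped(expected) + "] length=" + expected.length);
        System.out.println("[" + toEscaped(suspect) + "] length=" + suspect.length);
    }
}
```

Comparing the byte length of a scanned row key with the length of the key passed to get is often enough to show whether the two really differ. The HBase client's `Bytes.toStringBinary` produces a similar escaped rendering.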
was:
HBase version: 1.1.2
We use HBase as a real-time database and query it through Hive external tables, but we found a problem when querying one table.
{code:java}
-- SQL: returns ids such as 12045075, 12045076, ...
SELECT id FROM t1 LIMIT 10
-- returns no rows
SELECT id FROM t1 WHERE id = '12045075' LIMIT 10
-- table definition
CREATE EXTERNAL TABLE `t1`(
`__key` string COMMENT '',
`id` string COMMENT '主键ID')
COMMENT ''
ROW FORMAT SERDE
'org.apache.hadoop.hive.hbase.HBaseSerDe'
STORED BY
'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
'hbase.columns.mapping'=':key,f:id',
'serialization.format'='1')
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
'hbase.table.name'='t1',
'numFiles'='0',
'numRows'='0',
'rawDataSize'='0',
'totalSize'='0',
'transient_lastDdlTime'='1606804842'){code}
While investigating, I tried padding the value with spaces and matching with LIKE, but could not pin down the cause. I began to suspect that the problem was in HBase.
{code:java}
// scan row
scan 't1', {COLUMNS => 'f:id', LIMIT => 10}
// result
ROW COLUMN+CELL
12044083 column=f:id, timestamp=1606293182000, value=12044083
12044084 column=f:id, timestamp=1606293183000, value=12044084
12044085 column=f:id, timestamp=1606293185000, value=12044085
12044086 column=f:id, timestamp=1606293190000, value=12044086
12044087 column=f:id, timestamp=1606293192000, value=12044087
12044088 column=f:id, timestamp=1606293197000, value=12044088
12044089 column=f:id, timestamp=1606293198000, value=12044089
12044090 column=f:id, timestamp=1606293204000, value=12044090
12044091 column=f:id, timestamp=1606293207000, value=12044091
12044092 column=f:id, timestamp=1606293208000, value=12044092
// get the first row from the scan above: returns no result
get 't1', "12044083" , {COLUMNS => 'f:id'}{code}
So far, only lookups by row key or by ID show this problem; queries on other columns behave normally. I suspect the data contains invisible characters (control or escape characters, for example), but how can I confirm that?
The worst part is that I also fetched the rows through the Java API, and I could not find any invisible characters in the returned row key or ID.
> The scan command gives a row, but get does not have this row.
> -------------------------------------------------------------
>
> Key: HBASE-25350
> URL: https://issues.apache.org/jira/browse/HBASE-25350
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.2.11
> Reporter: ClownfishYang
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)