Posted to issues@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2020/12/02 10:54:00 UTC
[jira] [Commented] (HBASE-25350) The scan command gives a row, but get does not have this row.
[ https://issues.apache.org/jira/browse/HBASE-25350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242249#comment-17242249 ]
Wellington Chevreuil commented on HBASE-25350:
----------------------------------------------
Have you tried writing a test that uses the Java Client API to scan this table, checking the actual byte array representation to make sure there is no hidden character that the hbase shell output is not showing?
It may also be another manifestation of the issues fixed in HBASE-23238/HBASE-17489.
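The byte-level check suggested above could look roughly like the sketch below. It is only an illustration and uses plain Java with no HBase dependency: the class name RowKeyInspector and its two helpers are hypothetical, and the "scanned" key is a made-up value with a trailing NUL byte. In a real test the byte arrays would presumably come from Result.getRow() on each row returned by the scan, and the HBase client's Bytes.toStringBinary(byte[]) could serve the same purpose as the hex helper here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RowKeyInspector {

    /** Renders every byte as two hex digits, so non-printable bytes cannot hide. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Returns the index of the first differing byte, or -1 if the arrays are identical. */
    static int firstDifference(byte[] expected, byte[] actual) {
        int min = Math.min(expected.length, actual.length);
        for (int i = 0; i < min; i++) {
            if (expected[i] != actual[i]) {
                return i;
            }
        }
        // Same common prefix: they differ only if one array is longer.
        return expected.length == actual.length ? -1 : min;
    }

    public static void main(String[] args) {
        // The key as typed in the shell `get` command.
        byte[] expected = "12044083".getBytes(StandardCharsets.UTF_8);
        // Hypothetical stored key: the same digits followed by one NUL byte.
        byte[] scanned = Arrays.copyOf(expected, expected.length + 1);

        System.out.println("expected: " + toHex(expected));
        System.out.println("scanned : " + toHex(scanned));
        System.out.println("first difference at byte "
                + firstDifference(expected, scanned));
    }
}
```

With the made-up values above, the two hex strings differ only in a trailing "00", and the reported index is 8, the position just past the visible digits. A result of -1 for every scanned row would suggest the keys really are byte-identical and the cause lies elsewhere.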
> The scan command gives a row, but get does not have this row.
> -------------------------------------------------------------
>
> Key: HBASE-25350
> URL: https://issues.apache.org/jira/browse/HBASE-25350
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.2.11
> Reporter: ClownfishYang
> Priority: Major
>
> HBase version: 1.1.2
>
> We use HBase as a real-time database and query it through Hive external tables, but we found a problem when querying one table.
>
> {code:java}
> // SQL; returns ids such as 12045075, 12045076, ...
> SELECT id FROM t1 LIMIT 10
> // returns no rows
> SELECT id FROM t1 WHERE id = '12045075' LIMIT 10
> // create table
> CREATE EXTERNAL TABLE `t1`(
> `__key` string COMMENT '',
> `id` string COMMENT '主键ID')
> COMMENT ''
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.hbase.HBaseSerDe'
> STORED BY
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> 'hbase.columns.mapping'=':key,f:id',
> 'serialization.format'='1')
> TBLPROPERTIES (
> 'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
> 'hbase.table.name'='t1',
> 'numFiles'='0',
> 'numRows'='0',
> 'rawDataSize'='0',
> 'totalSize'='0',
> 'transient_lastDdlTime'='1606804842'){code}
> While investigating, I tried padding the value with spaces and matching with LIKE, but could not identify the cause of the problem. I began to suspect that it was HBase.
> {code:java}
> // hbase table desc
> describe 't1'
> Table t1 is ENABLED
> t1, {TABLE_ATTRIBUTES => {METADATA => {'COMPACTION_ENABLED' => 'true'}} COLUMN FAMILIES DESCRIPTION {NAME => 'f', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => '604800 SECONDS (7 DAYS)', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> // scan row
> scan 't1', {COLUMNS => 'f:id', LIMIT => 10}
> // result
> ROW COLUMN+CELL
> 12044083 column=f:id, timestamp=1606293182000, value=12044083
> 12044084 column=f:id, timestamp=1606293183000, value=12044084
> 12044085 column=f:id, timestamp=1606293185000, value=12044085
> 12044086 column=f:id, timestamp=1606293190000, value=12044086
> 12044087 column=f:id, timestamp=1606293192000, value=12044087
> 12044088 column=f:id, timestamp=1606293197000, value=12044088
> 12044089 column=f:id, timestamp=1606293198000, value=12044089
> 12044090 column=f:id, timestamp=1606293204000, value=12044090
> 12044091 column=f:id, timestamp=1606293207000, value=12044091
> 12044092 column=f:id, timestamp=1606293208000, value=12044092
> // get row, no result
> get 't1', "12044083" , {COLUMNS => 'f:id'}{code}
> First of all, only queries on the row key and the ID have this problem; queries on other columns work normally. I think we have reason to suspect that the data contains invisible escape characters or something similar, but how can I confirm that?
> The worst part is that I have already made the call through the Java API, and I could not find any invisible escape characters in the row key or the ID in the returned data.
>
>