Posted to issues@hbase.apache.org by "Wellington Chevreuil (Jira)" <ji...@apache.org> on 2020/12/02 10:54:00 UTC

[jira] [Commented] (HBASE-25350) The scan command gives a row, but get does not have this row.

    [ https://issues.apache.org/jira/browse/HBASE-25350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242249#comment-17242249 ] 

Wellington Chevreuil commented on HBASE-25350:
----------------------------------------------

Have you tried writing a test that uses the Java client API to scan this table, checking the actual byte array representation of the row keys to make sure there is no hidden character that the hbase shell output does not show?
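
As a rough sketch of the kind of check I mean (not a definitive implementation): the helper below prints every byte of a key so that non-printable bytes become visible, and compares the key byte-for-byte against the expected text. It is plain Java with no HBase dependency so it can be run on its own; the class and method names are hypothetical, and the sample key with a trailing NUL byte is made up for illustration. In a real test the bytes would come from `Result.getRow()` while iterating a scan, and HBase's own `Bytes.toStringBinary(byte[])` gives a similar escaped rendering.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper: renders row-key bytes so that hidden characters become visible. */
final class RowKeyInspector {

    private RowKeyInspector() {}

    /** Printable ASCII (except backslash) is kept as-is; every other byte is written as \xNN. */
    static String toEscaped(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 0x20 && v <= 0x7E && v != '\\') {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02X", v));
            }
        }
        return sb.toString();
    }

    /** True when the stored key is byte-for-byte the UTF-8 encoding of the expected text. */
    static boolean matches(byte[] storedKey, String expected) {
        return Arrays.equals(storedKey, expected.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Made-up key for illustration: reads as "12044083" in most output,
        // but carries a trailing NUL byte.
        byte[] stored = {'1', '2', '0', '4', '4', '0', '8', '3', 0x00};

        // The length is printed as well, because a trailing space would not be escaped.
        System.out.println(toEscaped(stored) + " length=" + stored.length);
        System.out.println("matches \"12044083\": " + matches(stored, "12044083"));
    }
}
```

If the escaped form or the length of a scanned key differs from what you pass to get, that would explain why the get returns nothing.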

It may also be another manifestation of the issues fixed in HBASE-23238/HBASE-17489.

> The scan command gives a row, but get does not have this row.
> -------------------------------------------------------------
>
>                 Key: HBASE-25350
>                 URL: https://issues.apache.org/jira/browse/HBASE-25350
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.11
>            Reporter: ClownfishYang
>            Priority: Major
>
> HBase version: 1.1.2
>  
> We use HBase as a real-time database and query it through Hive external tables, but we found a problem with queries against one table.
>  
> {code:java}
> // SQL: returns ids such as (12045075, 12045076, ...)
> SELECT id FROM t1 LIMIT 10
> // returns no rows
> SELECT id FROM t1 WHERE id = '12045075' LIMIT 10
> // table definition
> CREATE EXTERNAL TABLE `t1`(
> `__key` string COMMENT '', 
> `id` string COMMENT '主键ID')
> COMMENT ''
> ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ( 
> 'hbase.columns.mapping'=':key,f:id', 
> 'serialization.format'='1')
> TBLPROPERTIES (
> 'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}', 
> 'hbase.table.name'='t1', 
> 'numFiles'='0', 
> 'numRows'='0', 
> 'rawDataSize'='0', 
> 'totalSize'='0', 
> 'transient_lastDdlTime'='1606804842'){code}
> While investigating, I tried padding the value with spaces and using LIKE, but could not pin down the cause of the problem. I began to suspect that it was HBase.
> {code:java}
> // hbase table desc
> describe 't1'
> Table t1 is ENABLED
> t1, {TABLE_ATTRIBUTES => {METADATA => {'COMPACTION_ENABLED' => 'true'}} COLUMN FAMILIES DESCRIPTION {NAME => 'f', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => '604800 SECONDS (7 DAYS)', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> // scan row
> scan 't1', {COLUMNS => 'f:id', LIMIT => 10}
> // result
> ROW                                   COLUMN+CELL                                                                                                 
>  12044083                             column=f:id, timestamp=1606293182000, value=12044083                                                        
>  12044084                             column=f:id, timestamp=1606293183000, value=12044084                                                        
>  12044085                             column=f:id, timestamp=1606293185000, value=12044085                                                        
>  12044086                             column=f:id, timestamp=1606293190000, value=12044086                                                        
>  12044087                             column=f:id, timestamp=1606293192000, value=12044087                                                        
>  12044088                             column=f:id, timestamp=1606293197000, value=12044088                                                        
>  12044089                             column=f:id, timestamp=1606293198000, value=12044089                                                        
>  12044090                             column=f:id, timestamp=1606293204000, value=12044090                                                        
>  12044091                             column=f:id, timestamp=1606293207000, value=12044091                                                        
>  12044092                             column=f:id, timestamp=1606293208000, value=12044092   
> // get row: returns no result
> get 't1', "12044083" , {COLUMNS => 'f:id'}{code}
> First of all, only queries on the row key and the id column have this problem; queries on other columns work normally. I now think there is reason to suspect invisible escape characters or something similar in the data, but how can I confirm that?
> The worst part is that I have already made the call through the Java API, and I could not find any invisible escape characters in the returned row key or id.
>  
>  


