Posted to issues@hbase.apache.org by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org> on 2014/09/03 08:31:51 UTC
[jira] [Commented] (HBASE-11728) Data loss while scanning using
PREFIX_TREE DATA-BLOCK-ENCODING
[ https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119447#comment-14119447 ]
ramkrishna.s.vasudevan commented on HBASE-11728:
------------------------------------------------
Should we also do this for some of the previous() cases, as done in the patch? Maybe that is the reason the IT fails.
[~bdifn]
Did you get an opportunity to try this patch, and do you still see data loss while scanning?
> Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
> --------------------------------------------------------------
>
> Key: HBASE-11728
> URL: https://issues.apache.org/jira/browse/HBASE-11728
> Project: HBase
> Issue Type: Bug
> Components: Scanners
> Affects Versions: 0.96.1.1, 0.98.4
> Environment: ubuntu12
> hadoop-2.2.0
> Hbase-0.96.1.1
> SUN-JDK(1.7.0_06-b24)
> Reporter: wuchengzhi
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 0.99.0, 2.0.0, 0.98.6
>
> Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java
>
> Original Estimate: 72h
> Remaining Estimate: 72h
>
> In the Scan case, I prepared some data as below:
> Table Desc (Using the prefix-tree encoding) :
> 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', TTL => '15552000'}
> and I put 5 rows as:
> (RowKey , Qualifier, Value)
> 'a-b-0-0', 'qf_1', 'c1-value'
> 'a-b-A-1', 'qf_1', 'c1-value'
> 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
> 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
> 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3'
> Then I scanned the row keys between 'a-b-A-1' and 'a-b-A-1:', and I got the correct result:
> Test 1:
> Scan scan = new Scan();
> scan.setStartRow("a-b-A-1".getBytes());
> scan.setStopRow("a-b-A-1:".getBytes());
> ------------------------------------------------------
> 'a-b-A-1', 'qf_1', 'c1-value'
> 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
> 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
> Next, I tried the same scan with addColumn:
> Test 2:
> Scan scan = new Scan();
> scan.addColumn(Bytes.toBytes("cf_1") , Bytes.toBytes("qf_2"));
> scan.setStartRow("a-b-A-1".getBytes());
> scan.setStopRow("a-b-A-1:".getBytes());
> ----------------------------------------------
> expected:
> 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
> 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
> but actually I got nothing. Then I changed the addColumn call to scan.addColumn(Bytes.toBytes("cf_1") , Bytes.toBytes("qf_1")); and I got the expected result 'a-b-A-1', 'qf_1', 'c1-value'.
> Then I did more testing. I changed the case so that the startRow is greater than 'a-b-A-1':
> Test 3:
> Scan scan = new Scan();
> scan.setStartRow("a-b-A-1-".getBytes());
> scan.setStopRow("a-b-A-1:".getBytes());
> ------------------------------------------------------
> expected:
> 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value'
> 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
> but actually I got nothing again. Then I changed the start row to be greater than 'a-b-A-1-1402329600-1402396277':
> Scan scan = new Scan();
> scan.setStartRow("a-b-A-1-140239".getBytes());
> scan.setStopRow("a-b-A-1:".getBytes());
> and I got the expected row:
> 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2'
> So I think it may be a bug in the prefix-tree encoding. It happens after the data is flushed to the storefile; it is fine while the data is still in the memstore.
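
For reference, the expected results quoted above can be double-checked without a cluster. The following is only a minimal sketch, not HBase code: it assumes a scan returns cells whose row key falls in the half-open range [startRow, stopRow) under unsigned byte-wise lexicographic ordering, optionally restricted to one qualifier. The class and method names (RowRangeSketch, scan, compareBytes) are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class RowRangeSketch {
    // The five cells from the report: {rowKey, qualifier, value}.
    static final String[][] CELLS = {
        {"a-b-0-0", "qf_1", "c1-value"},
        {"a-b-A-1", "qf_1", "c1-value"},
        {"a-b-A-1-1402329600-1402396277", "qf_2", "c2-value"},
        {"a-b-A-1-1402397227-1402415999", "qf_2", "c2-value-2"},
        {"a-b-B-2-1402397300-1402416535", "qf_2", "c2-value-3"},
    };

    // Unsigned lexicographic comparison of two byte arrays;
    // a shorter array that is a prefix of the other sorts first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Returns the row keys in [startRow, stopRow); if qualifier is
    // non-null, only cells with that qualifier are kept.
    static List<String> scan(String startRow, String stopRow, String qualifier) {
        byte[] start = startRow.getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRow.getBytes(StandardCharsets.UTF_8);
        List<String> rows = new ArrayList<String>();
        for (String[] cell : CELLS) {
            byte[] row = cell[0].getBytes(StandardCharsets.UTF_8);
            boolean inRange = compareBytes(row, start) >= 0 && compareBytes(row, stop) < 0;
            boolean qualifierMatches = qualifier == null || qualifier.equals(cell[1]);
            if (inRange && qualifierMatches) {
                rows.add(cell[0]);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(scan("a-b-A-1", "a-b-A-1:", null));        // Test 1
        System.out.println(scan("a-b-A-1", "a-b-A-1:", "qf_2"));      // Test 2
        System.out.println(scan("a-b-A-1-", "a-b-A-1:", null));       // Test 3
        System.out.println(scan("a-b-A-1-140239", "a-b-A-1:", null)); // last case
    }
}
```

Under these assumptions the sketch selects three rows for Test 1, the two qf_2 rows for Test 2 and Test 3, and only 'a-b-A-1-1402397227-1402415999' for the last case, which agrees with the expected results in the description. Any difference from this on a flushed PREFIX_TREE store file is the data loss being reported.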