Posted to issues@hbase.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2019/02/28 13:14:00 UTC

[jira] [Commented] (HBASE-21920) Ignoring 'empty' end_key while calculating end_key for new region in HBCK -fixHdfsOverlaps command can cause data loss

    [ https://issues.apache.org/jira/browse/HBASE-21920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780497#comment-16780497 ] 

Hadoop QA commented on HBASE-21920:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 32s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 37s{color} | {color:green} branch-1 passed with JDK v1.8.0_202 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 40s{color} | {color:green} branch-1 passed with JDK v1.7.0_211 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 28s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 48s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 33s{color} | {color:green} branch-1 passed with JDK v1.8.0_202 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 36s{color} | {color:green} branch-1 passed with JDK v1.7.0_211 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 37s{color} | {color:green} the patch passed with JDK v1.8.0_202 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 38s{color} | {color:green} the patch passed with JDK v1.7.0_211 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 22s{color} | {color:red} hbase-server: The patch generated 1 new + 112 unchanged - 1 fixed = 113 total (was 113) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 45s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  1m 37s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} the patch passed with JDK v1.8.0_202 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 35s{color} | {color:green} the patch passed with JDK v1.7.0_211 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}114m 42s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}153m  1s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21920 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12960582/HBASE-21920.branch-1.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux acd3e6007a25 4.4.0-131-generic #157~14.04.1-Ubuntu SMP Fri Jul 13 08:53:17 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-1 / b5c50b5 |
| maven | version: Apache Maven 3.0.5 |
| Default Java | 1.7.0_211 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-openjdk-amd64:1.8.0_202 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_211 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/16179/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/16179/testReport/ |
| Max. process+thread count | 3789 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/16179/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Ignoring 'empty' end_key while calculating end_key for new region in HBCK -fixHdfsOverlaps command can cause data loss
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21920
>                 URL: https://issues.apache.org/jira/browse/HBASE-21920
>             Project: HBase
>          Issue Type: Bug
>          Components: hbck
>    Affects Versions: 1.0.0
>            Reporter: Syeda Arshiya Tabreen
>            Assignee: Syeda Arshiya Tabreen
>            Priority: Major
>         Attachments: HBASE-21920.branch-1.patch
>
>
> When the *-fixHdfsOverlaps* command is run because regions of a table overlap, it moves all the hfiles of the overlapping regions into a new region whose start_key and end_key are calculated from the minimum start_key and the maximum end_key of all the overlapping regions.
> When the start_key and end_key for the new region are calculated, an 'empty' end_key is not considered, which leads to data loss when the table is scanned using *'startrow'*.
> *For example:*
>  1. Create table 't'.
>  2. Insert records \{00,111,200} into table 't' and flush the data.
>  3. Split table 't' with split key '100'.
>  4. Now we have three regions (one parent and two daughter regions):
>     1. *Region-1* ('Empty','Empty') => \{00,111,200}
>     2. *Region-2* ('Empty','100') => \{00}
>     3. *Region-3* ('100','Empty') => \{111,200}
>  5. Make sure the parent region is not deleted from the file system, then run the -*fixHdfsOverlaps* command.
> The -*fixHdfsOverlaps* command will move all the hfiles of the three regions
> {*Region-1, Region-2, Region-3*} into a new region (*Region-4*) created with start_key=*'Empty'* and end_key=*'100'*.
> This happens because it does not consider end_key=*'Empty'* and instead takes end_key=*'100'* as the maximum. As a result, all the hfiles of the three regions are moved into the new region even when an hfile contains rows greater than end_key=*'100'*, and one empty region, *Region-5* (100,Empty), is created because the end key of the table's last region was not empty.
> Now we have 2 regions:
> 1. *Region-4* (Empty,100) => \{00,111,200}
> 2. *Region-5* (100,Empty) => {}
> When a full table scan is done, all the records are displayed and no data appears to be lost, but when a scan with a start key is done, the results are as follows:
> 1. scan 't', \{ STARTROW => '00'} => \{00,111,200}
> 2. scan 't', \{ STARTROW => '100'} => {}
> The second scan returns an empty result because it searches for the rows in
> *Region-5* (100,Empty), which contains no records, while records \{111,200} are present in *Region-4* (Empty,100).
> The problem exists only when end_key=*'Empty'* is present in any of the overlapping regions. I think that if end_key=*'Empty'* is present in any of the overlapping regions, we have to consider it as the maximum end_key.
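
The end_key calculation described in the report can be sketched in plain Java, as an illustration only. The class OverlapRangeSketch, its Range type, and all method names are hypothetical and are not taken from HBaseFsck or from the attached patch. Keys are modelled as Strings compared lexicographically, with "" standing for an 'Empty' key, which is an assumption made to keep the sketch free of HBase dependencies.

```java
import java.util.List;

// Hypothetical sketch only; this is not the HBaseFsck code or the attached patch.
final class OverlapRangeSketch {

    /** Minimal stand-in for a region's key range; "" models an 'Empty' (unbounded) key. */
    static final class Range {
        final String startKey;
        final String endKey;

        Range(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Behaviour as described in the report: 'Empty' end keys are skipped, so the largest non-empty key wins. */
    static String endKeyIgnoringEmpty(List<Range> overlaps) {
        String max = null;
        for (Range r : overlaps) {
            if (r.endKey.isEmpty()) {
                continue; // the 'Empty' end_key is not considered
            }
            if (max == null || r.endKey.compareTo(max) > 0) {
                max = r.endKey;
            }
        }
        return max == null ? "" : max;
    }

    /** Behaviour suggested in the report: an 'Empty' end key means end of table, so it is the maximum. */
    static String endKeyTreatingEmptyAsMax(List<Range> overlaps) {
        String max = null;
        for (Range r : overlaps) {
            if (r.endKey.isEmpty()) {
                return ""; // nothing can sort after the end of the table
            }
            if (max == null || r.endKey.compareTo(max) > 0) {
                max = r.endKey;
            }
        }
        return max == null ? "" : max;
    }

    /** True if row lies in [startKey, endKey), treating "" as unbounded on either side. */
    static boolean contains(Range r, String row) {
        boolean afterStart = r.startKey.isEmpty() || row.compareTo(r.startKey) >= 0;
        boolean beforeEnd = r.endKey.isEmpty() || row.compareTo(r.endKey) < 0;
        return afterStart && beforeEnd;
    }
}
```

With the three regions from the example, ('Empty','Empty'), ('Empty','100') and ('100','Empty'), the first method yields '100' and the second yields 'Empty'. Row '111' does not fall inside a merged range of (Empty,100) under contains, which matches the empty result of the STARTROW => '100' scan.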


