Posted to issues@hbase.apache.org by "Francis Liu (JIRA)" <ji...@apache.org> on 2018/07/12 01:23:00 UTC

[jira] [Commented] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

    [ https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16540972#comment-16540972 ] 

Francis Liu commented on HBASE-20704:
-------------------------------------

OK, I did some testing. The NPE actually only occurs in 1.x. On the master branch it is one of two scenarios, depending on whether pread was used or not, since stream scanners open their own input stream, reader, etc.
 # pread - you'll get a "stream closed" IOException. In 1.x it's an NPE because the stream references are set to null after the streams are closed.
 # not pread - the HStore has no knowledge of the streams the scanner created, so it does not close them. Either the currently running scan request is processed normally, or it gets an IOException (replica not found), since the region close operation may have archived the file and the cleaner chore may have deleted it.

It sounds to me that #1 is good and #2 is acceptable? Let me know. If #2 is not acceptable, I'd have to add a map to keep track of the new readers created whenever a stream scanner is created, and close those readers on region close so that the exceptions look like #1.
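
A minimal sketch of the tracking idea described above, assuming nothing about the actual patch: StreamReaderRegistry and TrackedReader are hypothetical names, not HBase classes, and a concurrent set stands in for the map. Each reader opened on behalf of a stream scanner is registered, and closeAll() (which would be called on region close) closes them, so later reads fail with a "stream closed" IOException as in #1.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names are real HBase classes. */
final class StreamReaderRegistry {

  /** Stand-in for the reader a stream scanner opens for itself. */
  static final class TrackedReader implements Closeable {
    private volatile boolean closed = false;

    /** Pretends to read; fails the same way as scenario #1 once closed. */
    int read() throws IOException {
      if (closed) {
        throw new IOException("stream closed");
      }
      return 0; // placeholder for real cell data
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Readers that are currently open on behalf of stream scanners.
  private final Set<TrackedReader> openReaders = ConcurrentHashMap.newKeySet();

  /** Called where a stream scanner would create its own reader. */
  TrackedReader openStreamReader() {
    TrackedReader reader = new TrackedReader();
    openReaders.add(reader);
    return reader;
  }

  /** Called when a scanner finishes normally. */
  void release(TrackedReader reader) {
    if (openReaders.remove(reader)) {
      reader.close();
    }
  }

  /** Called on region close; returns how many readers were force-closed. */
  int closeAll() {
    int closedCount = 0;
    for (TrackedReader reader : openReaders) {
      if (openReaders.remove(reader)) {
        reader.close();
        closedCount++;
      }
    }
    return closedCount;
  }
}
```

The cost of this approach is the extra bookkeeping on every stream scanner open and close, which is why it is only worth doing if #2 is judged unacceptable.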

I've attached another patch which adds another test method to surface the exceptions.

> Sometimes some compacted storefiles are not archived on region close
> --------------------------------------------------------------------
>
>                 Key: HBASE-20704
>                 URL: https://issues.apache.org/jira/browse/HBASE-20704
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>            Priority: Critical
>         Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch
>
>
> During region close, compacted files which have not yet been archived by the discharger are archived as part of the region closing process. It is important that these files are wholly archived to ensure data consistency. Otherwise, a storefile containing delete tombstones can be archived while older storefiles containing cells that were supposed to be deleted are left unarchived, thereby undeleting those cells. 
> On region close, a compacted storefile is skipped from archiving if it has read references (i.e. open scanners). This behavior is correct when the discharger chore runs, but on region close consistency is more important, so we should add a special case that ignores any references on the storefile and goes ahead and archives it. 
> Attached patch contains a unit test that reproduces the problem and the proposed fix.
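
The selection rule proposed in the description can be illustrated roughly as follows. This is only a sketch under stated assumptions: CompactedFileArchiver, CompactedFile, refCount and selectForArchive are hypothetical names chosen for illustration and do not come from the attached patch or from HBase.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the archive selection rule; not HBase code. */
final class CompactedFileArchiver {

  /** Stand-in for a compacted storefile and its open-scanner count. */
  static final class CompactedFile {
    final String name;
    final int refCount;

    CompactedFile(String name, int refCount) {
      this.name = name;
      this.refCount = refCount;
    }
  }

  /**
   * Returns the names of the files to archive. During a normal discharger
   * run, files that still have read references are skipped. On region
   * close, references are ignored so the compacted set is archived whole.
   */
  static List<String> selectForArchive(List<CompactedFile> files, boolean regionClosing) {
    List<String> toArchive = new ArrayList<>();
    for (CompactedFile file : files) {
      if (regionClosing || file.refCount == 0) {
        toArchive.add(file.name);
      }
    }
    return toArchive;
  }
}
```

With a tombstone file that has no references and an older file that still has an open scanner, the normal run selects only the tombstone file, which is the inconsistent state described above, while the region-close case selects both.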



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)