Posted to issues@hbase.apache.org by "Andrew Purtell (JIRA)" <ji...@apache.org> on 2018/07/12 22:08:00 UTC

[jira] [Comment Edited] (HBASE-20704) Sometimes some compacted storefiles are not archived on region close

    [ https://issues.apache.org/jira/browse/HBASE-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542245#comment-16542245 ] 

Andrew Purtell edited comment on HBASE-20704 at 7/12/18 10:07 PM:
------------------------------------------------------------------

bq.  add a map to keep track of the new readers created when a stream scanner is created and close those on region close 

That does sound like the right thing to do. How difficult would it be? Leaving unclosed streams around and expecting eventual GC to call a finalizer that cleans them up is technically a leak, and it is also a source of elevated GC pause latency. OTOH, we wouldn't expect a high rate of region closes, and this wouldn't happen on every close, so the probability of occurrence is low. Not a must-do IMHO, but good if we can.
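
To make the quoted suggestion concrete, here is a minimal sketch of what such tracking could look like. It is not HBase code: StreamReaderRegistry, track, untrack and closeAll are hypothetical names, and readers are modeled as plain java.io.Closeable objects. It uses a concurrent set (backed by a ConcurrentHashMap) rather than a full map, and it assumes region close already prevents new scanners from being opened.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of HBase: remembers readers opened for
// stream scanners so they can be closed explicitly on region close
// instead of waiting for a finalizer.
final class StreamReaderRegistry {
    // Concurrent set backed by a ConcurrentHashMap; safe for concurrent scanners.
    private final Set<Closeable> openReaders = ConcurrentHashMap.newKeySet();

    /** Record a reader created for a stream scanner and hand it back. */
    <T extends Closeable> T track(T reader) {
        openReaders.add(reader);
        return reader;
    }

    /** Forget a reader that its scanner has already closed. */
    void untrack(Closeable reader) {
        openReaders.remove(reader);
    }

    /** Number of readers still tracked. */
    int size() {
        return openReaders.size();
    }

    /**
     * Close every reader that is still tracked and return how many were
     * closed successfully. Assumes the caller has already blocked the
     * creation of new scanners.
     */
    int closeAll() {
        int closed = 0;
        UncheckedIOException failure = null;
        for (Closeable reader : openReaders) {
            // Skip readers that a racing untrack() has already removed.
            if (!openReaders.remove(reader)) {
                continue;
            }
            try {
                reader.close();
                closed++;
            } catch (IOException e) {
                // Keep going so one bad reader does not leak the rest.
                if (failure == null) {
                    failure = new UncheckedIOException(e);
                } else {
                    failure.addSuppressed(e);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
        return closed;
    }
}
```

In this sketch a scanner would call track() when it opens its own reader and untrack() when it closes normally, so closeAll() on region close only touches the stragglers.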


was (Author: apurtell):
bq.  add a map to keep track of the new readers created when a stream scanner is created and close those on region close 

That does sound like the right thing to do. How difficult would it be? Leaving unclosed streams around and expecting eventual GC to call a finalizer that cleans them up is technically a leak, and it is also a source of elevated GC pause latency.

> Sometimes some compacted storefiles are not archived on region close
> --------------------------------------------------------------------
>
>                 Key: HBASE-20704
>                 URL: https://issues.apache.org/jira/browse/HBASE-20704
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: 3.0.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>            Priority: Critical
>         Attachments: HBASE-20704.001.patch, HBASE-20704.002.patch, HBASE-20704.003.patch
>
>
> During region close, compacted files that have not yet been archived by the discharger are archived as part of the region closing process. It is important that these files are wholly archived to ensure data consistency. Otherwise, a storefile containing delete tombstones can be archived while older storefiles containing cells that were supposed to be deleted are left unarchived, thereby undeleting those cells.
> On region close, a compacted storefile is skipped from archiving if it has read references (i.e., open scanners). This behavior is correct when the discharger chore runs, but on region close consistency is more important, so we should add a special case that ignores any references on the storefile and archives it anyway.
> Attached patch contains a unit test that reproduces the problem and the proposed fix.
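
The special case described in the issue can be illustrated with a small sketch. This is not the attached patch and does not use HBase's classes: CompactedFileSelector, CompactedFile, refCount and selectForArchive are hypothetical names chosen for illustration, and a storefile is reduced to a name plus a count of open read references.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not the actual HBase implementation.
final class CompactedFileSelector {

    /** Stand-in for a compacted storefile: a name and its open read references. */
    static final class CompactedFile {
        final String name;
        final int refCount;

        CompactedFile(String name, int refCount) {
            this.name = name;
            this.refCount = refCount;
        }
    }

    /**
     * Pick the compacted files to archive.
     * When the periodic chore runs (regionClosing == false), files that still
     * have read references are skipped. On region close, references are
     * ignored so that all compacted files are archived together.
     */
    static List<CompactedFile> selectForArchive(List<CompactedFile> compacted,
                                                boolean regionClosing) {
        List<CompactedFile> toArchive = new ArrayList<>();
        for (CompactedFile file : compacted) {
            if (regionClosing || file.refCount == 0) {
                toArchive.add(file);
            }
        }
        return toArchive;
    }
}
```

With one unreferenced file and one file that still has an open scanner, the chore path would select only the first, which is the partial archiving the issue warns about if it happened at close time; the region-close path selects both.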



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)