Posted to issues@hbase.apache.org by "Hudson (Jira)" <ji...@apache.org> on 2020/05/28 01:23:00 UTC

[jira] [Commented] (HBASE-24428) Priority compaction for recently split daughter regions

    [ https://issues.apache.org/jira/browse/HBASE-24428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118207#comment-17118207 ] 

Hudson commented on HBASE-24428:
--------------------------------

Results for branch branch-2.2
	[build #878 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/878/]: (/) *{color:green}+1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/878//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/878//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/878//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Priority compaction for recently split daughter regions
> -------------------------------------------------------
>
>                 Key: HBASE-24428
>                 URL: https://issues.apache.org/jira/browse/HBASE-24428
>             Project: HBase
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Andrew Kyle Purtell
>            Assignee: Viraj Jasani
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0, 1.7.0, 2.2.5
>
>
> We observe that under hotspotting conditions splitting proceeds very slowly and the "_Cannot split region due to reference files being there_" log line is logged excessively. (Observed in branch-1 based production.) This is because after a region is split it must be compacted before it can be split again: reference files must be replaced by real HFiles, which is normal housekeeping performed during compaction. However, if the regionserver is under excessive load, its compaction queues may become deep. The daughters of a recently split hotspotting region may themselves continue to hotspot and will rapidly need to split again. If the scheduled compaction work to remove/replace reference files is queued hundreds or thousands of elements behind the head of the compaction queue, the recently split daughter regions will not be able to split again for a long time and may grow very large, producing additional complications (very large regions, very deep replication queues).
> To help avoid this condition, we should prioritize the compaction of recently split daughter regions. Compaction requests include a {{priority}} field and CompactionRequest implements a comparator that sorts by this field. We already detect when a compaction request involves a region that has reference files, so that it is selected as eligible for compaction, but we do not seem to prioritize the requests for post-split housekeeping. Post-split compaction work should be placed at the top of the queue, and we should ensure that this is happening.
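
The prioritization idea described above can be illustrated with a minimal, standalone sketch. This is not HBase code: the class SplitAwareCompactionQueue, its Request type, the SPLIT_DAUGHTER_PRIORITY constant, and the convention that a lower priority value sorts first are all assumptions made for illustration only. It shows requests for regions that still hold reference files jumping ahead of ordinary compaction requests, with submission order as the tie-breaker.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the HBase implementation.
final class SplitAwareCompactionQueue {
    // Assumed convention: a lower value is served first.
    // Hypothetical constant reserved for post-split housekeeping.
    static final int SPLIT_DAUGHTER_PRIORITY = Integer.MIN_VALUE;

    // Simplified stand-in for a compaction request with a priority field.
    static final class Request implements Comparable<Request> {
        final String region;
        final int priority;
        final long seq; // submission order, used only to break ties

        Request(String region, int priority, long seq) {
            this.region = region;
            this.priority = priority;
            this.seq = seq;
        }

        @Override
        public int compareTo(Request other) {
            int byPriority = Integer.compare(priority, other.priority);
            return byPriority != 0 ? byPriority : Long.compare(seq, other.seq);
        }
    }

    private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();
    private final AtomicLong nextSeq = new AtomicLong();

    // Regions that still have reference files (recent split daughters)
    // are queued ahead of everything else, whatever their base priority.
    void submit(String region, int basePriority, boolean hasReferenceFiles) {
        int priority = hasReferenceFiles ? SPLIT_DAUGHTER_PRIORITY : basePriority;
        queue.add(new Request(region, priority, nextSeq.getAndIncrement()));
    }

    // Returns the region whose compaction should run next, or null if idle.
    String next() {
        Request r = queue.poll();
        return r == null ? null : r.region;
    }
}
```

With this ordering, a daughter region submitted behind a deep backlog is still compacted first, so its reference files are replaced and it becomes eligible to split again sooner. Whether a single fixed value or an offset from the base priority is the better choice is a design decision this sketch does not settle.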



--
This message was sent by Atlassian Jira
(v8.3.4#803005)