Posted to issues@hbase.apache.org by "Duo Zhang (Jira)" <ji...@apache.org> on 2022/08/09 08:03:00 UTC
[jira] [Resolved] (HBASE-25229) Instantiate BucketCache before RS creates a their ephemeral node when rolling-upgrade
[ https://issues.apache.org/jira/browse/HBASE-25229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang resolved HBASE-25229.
-------------------------------
Assignee: (was: Jeongdae Kim)
Resolution: Won't Fix
All 1.x release lines are EOL.
Feel free to reopen if this also affects 2.x and master.
> Instantiate BucketCache before RS creates a their ephemeral node when rolling-upgrade
> -------------------------------------------------------------------------------------
>
> Key: HBASE-25229
> URL: https://issues.apache.org/jira/browse/HBASE-25229
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.5.0, 1.6.0, 1.7.0, 1.4.13
> Reporter: Jeongdae Kim
> Priority: Minor
>
> We observed that many clients could not get region location information for tens of seconds during a rolling upgrade from 1.2.x to 1.4.x, and all requests to regions moved by the graceful restart failed.
>
> The reasons are:
> # Since HBASE-17931, system tables are assigned to the RS with the highest version.
> # Since HBASE-12034, bucket cache initialization has moved from RS instantiation to the RS initialization process that runs after reporting to the master. Moreover, the ephemeral node for the RS is created before the bucket cache is created.
> # When using an off-heap bucket cache, allocating its memory takes too long (18 seconds for 31GB in our case): [https://github.com/apache/hbase/blob/branch-1.4/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ByteBufferArray.java#L52-L72]
> # Once the ephemeral node is created, the master tries to move system regions to the RS with the highest version when the first RS restarts during the rolling-restart process. But because of 3), that RS is not yet ready to serve system regions, so moving system regions keeps failing until 3) is finished.
>
> I think this can happen only in branch-1, because in HBase 2.x the ephemeral node is created after the block caches are created. There is no need to create block caches after ephemeral node creation at all.
>
> I verified this issue could be resolved by just changing their creation order.
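
The ordering problem described in the quoted report can be illustrated with a small, self-contained Java sketch. This is only a toy model under stated assumptions: the class StartupOrderSketch, its methods, and the event strings are hypothetical names made up for illustration and are not HBase APIs, and the slow off-heap allocation is represented by a recorded event rather than a real allocation.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy model of region server startup ordering.
 * All names here are hypothetical; none of them are HBase classes or methods.
 */
public class StartupOrderSketch {

    static final String CACHE = "bucket cache allocated";
    static final String NODE = "ephemeral node created";
    static final String READY = "ready to serve regions";

    /** Records startup events in the order they would happen. */
    static List<String> startUp(boolean cacheFirst) {
        List<String> events = new ArrayList<>();
        // Stand-in for the slow off-heap allocation described in the report.
        Runnable allocateCache = () -> events.add(CACHE);
        // Stand-in for the step that makes the server visible to the master.
        Runnable createNode = () -> events.add(NODE);

        if (cacheFirst) {
            allocateCache.run();
            createNode.run();
        } else {
            createNode.run();
            allocateCache.run();
        }
        events.add(READY);
        return events;
    }

    /** True if the server is visible to the master before its cache exists. */
    static boolean visibleBeforeCacheReady(List<String> events) {
        return events.indexOf(NODE) < events.indexOf(CACHE);
    }

    public static void main(String[] args) {
        List<String> nodeFirst = startUp(false);
        List<String> cacheFirst = startUp(true);
        System.out.println(nodeFirst + " exposed early: " + visibleBeforeCacheReady(nodeFirst));
        System.out.println(cacheFirst + " exposed early: " + visibleBeforeCacheReady(cacheFirst));
    }
}
```

In this model, the node-first order leaves a window in which the server is already visible but its cache allocation has not finished, which corresponds to the period in the report when moving system regions kept failing. Swapping the two steps closes that window.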
--
This message was sent by Atlassian Jira
(v8.20.10#820010)