Posted to issues@hbase.apache.org by "Bryan Beaudreault (Jira)" <ji...@apache.org> on 2022/08/24 22:15:00 UTC

[jira] [Resolved] (HBASE-23340) hmaster /hbase/replication/rs session expired (hbase replication default value is true, we don't use it) causes logcleaner to be unable to clean oldWALs, which results in oldWALs growing too large (more than 2TB)

     [ https://issues.apache.org/jira/browse/HBASE-23340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault resolved HBASE-23340.
---------------------------------------
    Release Note: Previously, the LogCleaner chores had their own ZK client. If that client encountered a session expired error, the LogCleaner chores would never succeed again, even though the HMaster kept running. With this change, the LogCleaner chores share the HMaster's underlying ZK client (similar to the HFileCleaner chores). Now, if an unrecoverable session expiration occurs, the HMaster aborts and the cleaner chores are not left as zombies.
      Resolution: Fixed
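
The difference described in the release note can be sketched roughly as follows. This is a minimal, self-contained illustration, not HBase code: `Session`, `CleanerChore`, and `Master` are hypothetical stand-ins invented for this example, and the real change may be structured quite differently. The only point shown is that a chore on its own private session can fail silently forever, while a chore on the master's session takes the master down with it.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SharedSessionSketch {

    /** Hypothetical stand-in for a ZK session; not an HBase or ZooKeeper class. */
    static final class Session {
        private final AtomicBoolean expired = new AtomicBoolean(false);
        private final Runnable onExpire;

        Session(Runnable onExpire) {
            this.onExpire = onExpire;
        }

        /** Marks the session as expired and notifies its owner exactly once. */
        void expire() {
            if (expired.compareAndSet(false, true)) {
                onExpire.run();
            }
        }

        boolean isExpired() {
            return expired.get();
        }
    }

    /** Hypothetical cleaner chore: a pass can only succeed while its session is alive. */
    static final class CleanerChore {
        private final Session session;

        CleanerChore(Session session) {
            this.session = session;
        }

        /** Returns true if this cleaning pass could run. */
        boolean runOnce() {
            return !session.isExpired();
        }
    }

    /** Hypothetical master: it aborts when its own session expires. */
    static final class Master {
        private final AtomicBoolean aborted = new AtomicBoolean(false);
        final Session session = new Session(() -> aborted.set(true));

        boolean isAborted() {
            return aborted.get();
        }
    }

    public static void main(String[] args) {
        // Before: the chore owns a private session that nobody else watches.
        Master oldMaster = new Master();
        Session privateSession = new Session(() -> { });
        CleanerChore zombie = new CleanerChore(privateSession);
        privateSession.expire();
        System.out.println("private session: chore ok=" + zombie.runOnce()
            + ", master aborted=" + oldMaster.isAborted());

        // After: the chore shares the master's session, so expiry aborts the master.
        Master newMaster = new Master();
        CleanerChore chore = new CleanerChore(newMaster.session);
        newMaster.session.expire();
        System.out.println("shared session: chore ok=" + chore.runOnce()
            + ", master aborted=" + newMaster.isAborted());
    }
}
```

In the first case the chore keeps failing while the master looks healthy, which matches the "zombie" situation in the release note. In the second case the failure is visible because the master itself aborts.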

> hmaster /hbase/replication/rs session expired (hbase replication default value is true, we don't use it) causes logcleaner to be unable to clean oldWALs, which results in oldWALs growing too large (more than 2TB)
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23340
>                 URL: https://issues.apache.org/jira/browse/HBASE-23340
>             Project: HBase
>          Issue Type: Improvement
>          Components: master
>    Affects Versions: 3.0.0-alpha-1, 2.2.3
>            Reporter: jackylau
>            Assignee: Bo Cui
>            Priority: Major
>             Fix For: 2.5.0, 3.0.0-alpha-1
>
>         Attachments: Snipaste_2019-11-21_10-39-25.png, Snipaste_2019-11-21_14-10-36.png
>
>
> The hmaster session for /hbase/replication/rs expired (the hbase replication default value is true, although we don't use it). As a result, the logcleaner cannot clean oldWALs, and oldWALs grows too large (more than 2TB).
> !Snipaste_2019-11-21_10-39-25.png!
>  
> !Snipaste_2019-11-21_14-10-36.png!
>  
> We could solve it in one of the following ways:
> 1) Increase the session timeout. I do not think this is a good idea, because we do not know what value would be suitable.
> 2) Disable hbase replication. This is not a good idea either, since our users may use this feature.
> 3) Add a retry limit: for example, once the error has happened three times, stop the ReplicationLogCleaner and SnapShotCleaner.
> Those are all my ideas. I do not know whether they are suitable. If they are, could I submit a PR?
> Does anyone have a better idea?
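
For illustration only, the retry limit suggested in option 3 of the description might look something like the sketch below. Note that this is the reporter's proposal, not the fix described in the release note (which shares the HMaster's ZK client instead). `RetryLimitedCleaner` and its `BooleanSupplier` argument are hypothetical names made up for this example; they are not HBase APIs, and what a "stopped" cleaner should then do about oldWALs is left open here, as it is in the description.

```java
import java.util.function.BooleanSupplier;

class RetryLimitedCleaner {
    private final BooleanSupplier attempt; // hypothetical: returns true when a cleaning pass succeeds
    private final int maxFailures;         // e.g. 3, as suggested in the description
    private int consecutiveFailures = 0;
    private boolean stopped = false;

    RetryLimitedCleaner(BooleanSupplier attempt, int maxFailures) {
        this.attempt = attempt;
        this.maxFailures = maxFailures;
    }

    /** Runs one pass; returns true only if the pass was attempted and succeeded. */
    boolean runOnce() {
        if (stopped) {
            return false; // already given up, do not retry
        }
        if (attempt.getAsBoolean()) {
            consecutiveFailures = 0; // a success resets the counter
            return true;
        }
        consecutiveFailures++;
        if (consecutiveFailures >= maxFailures) {
            stopped = true; // limit reached: stop this cleaner
        }
        return false;
    }

    boolean isStopped() {
        return stopped;
    }

    public static void main(String[] args) {
        // Simulate a cleaner whose session has expired, so every pass fails.
        RetryLimitedCleaner cleaner = new RetryLimitedCleaner(() -> false, 3);
        for (int i = 0; i < 5; i++) {
            cleaner.runOnce();
        }
        System.out.println("stopped=" + cleaner.isStopped());
    }
}
```

The counter here tracks consecutive failures, so an intermittent error does not stop the cleaner; counting total failures instead would be an equally plausible reading of the proposal.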



--
This message was sent by Atlassian Jira
(v8.20.10#820010)