Posted to issues@hbase.apache.org by "Ted Yu (JIRA)" <ji...@apache.org> on 2018/06/13 09:24:00 UTC

[jira] [Commented] (HBASE-20727) Persist FlushedSequenceId to speed up WAL split after cluster restart

    [ https://issues.apache.org/jira/browse/HBASE-20727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510834#comment-16510834 ] 

Ted Yu commented on HBASE-20727:
--------------------------------

Have you tried this feature on a cluster?

If so, can you share some numbers on the savings in WAL replay?

Thanks

> Persist FlushedSequenceId to speed up WAL split after cluster restart
> ---------------------------------------------------------------------
>
>                 Key: HBASE-20727
>                 URL: https://issues.apache.org/jira/browse/HBASE-20727
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HBASE-20727.patch
>
>
> We use flushedSequenceIdByRegion and storeFlushedSequenceIdsByRegion in ServerManager to record the latest flushed seqids of regions and stores, so that during log split we can use the seqids stored in those maps to filter out edits that do not need to be replayed. However, those maps are not persisted: after a cluster restart or master restart, all flushed seqid info is lost.
> Here I offer a way to persist that info to HDFS, so that even if the master restarts, we can still use it to filter WAL edits and speed up replay.
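
For readers unfamiliar with the mechanism described above, here is a minimal sketch of the idea, not the attached patch. The class FlushedSeqIdSketch and all of its methods are hypothetical, and a local file accessed through java.nio stands in for HDFS. It only illustrates recording the last flushed seqid per region, persisting that map, reloading it after a restart, and skipping WAL edits whose seqid is already covered by a flush.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names are HBase APIs.
final class FlushedSeqIdSketch {

  // Remember the highest seqid known to be flushed for a region.
  static void recordFlush(Map<String, Long> flushed, String region, long seqId) {
    flushed.merge(region, seqId, Math::max);
  }

  // Write the map as "region<TAB>seqId" lines (local file as a stand-in for HDFS).
  static void persist(Map<String, Long> flushed, Path file) throws IOException {
    List<String> lines = new ArrayList<>();
    for (Map.Entry<String, Long> e : flushed.entrySet()) {
      lines.add(e.getKey() + "\t" + e.getValue());
    }
    Files.write(file, lines, StandardCharsets.UTF_8);
  }

  // Rebuild the map after a (simulated) master restart.
  static Map<String, Long> load(Path file) throws IOException {
    Map<String, Long> flushed = new HashMap<>();
    for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
      if (line.isEmpty()) {
        continue;
      }
      int tab = line.lastIndexOf('\t');
      flushed.put(line.substring(0, tab), Long.parseLong(line.substring(tab + 1)));
    }
    return flushed;
  }

  // An edit needs replay only if it is newer than the last flushed seqid.
  // With no recorded seqid for the region, replay conservatively.
  static boolean needsReplay(Map<String, Long> flushed, String region, long editSeqId) {
    Long lastFlushed = flushed.get(region);
    return lastFlushed == null || editSeqId > lastFlushed;
  }
}
```

In this sketch, losing the map (the situation the description complains about) makes needsReplay return true for every edit, which is the slow path; reloading the persisted map restores the filtering.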



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)