Posted to issues@hbase.apache.org by "Anastasia Braginsky (JIRA)" <ji...@apache.org> on 2017/09/03 12:48:00 UTC

[jira] [Commented] (HBASE-18748) Cache pre-warming upon replication

    [ https://issues.apache.org/jira/browse/HBASE-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151792#comment-16151792 ] 

Anastasia Braginsky commented on HBASE-18748:
---------------------------------------------

As explained in the description, we would like to add a feature to HBase replication. A failover from the primary cluster to the secondary should have no effect on read latency. Currently, read latency spikes upon failover because the cache on the secondary is cold. Simply redirecting reads to the secondary before failover (that is, having the user application duplicate them) resolves this issue. However, making the secondary process all the reads is a waste of resources. Therefore, the suggestion is to redirect only "relevant" reads. In other words, the suggested solution is to selectively replay read requests at the backup: namely, those reads that caused cache-ins at the primary.

We intend to use WAL replication as the transport protocol (hopefully as a black box) and, of course, add custom replay callbacks. That is, we would add a new "read" type of WAL entry, which would be rare because it is written only upon a cache-in. These read WAL entries would be replicated to the secondary cluster. Of course, the cached blocks on the primary and the secondary may diverge, but this is a good heuristic.
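
To make the idea concrete, here is a minimal, self-contained sketch of the flow in plain Java. It does not use any HBase API: the names CacheWarmingSketch, BlockCache, ReadMarker, and Cluster are hypothetical, a LinkedHashMap-based LRU map stands in for the block cache, and a plain list stands in for the WAL replication transport. It only illustrates the heuristic (emit a marker on cache-in at the primary, replay the marker as a read at the secondary), not how it would actually be wired into replication.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CacheWarmingSketch {

    /** Tiny LRU cache standing in for a region server's block cache (assumption). */
    static class BlockCache {
        private final Map<String, String> blocks;

        BlockCache(int capacity) {
            // accessOrder=true keeps entries in least-recently-used order.
            this.blocks = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > capacity;
                }
            };
        }

        boolean contains(String key) { return blocks.containsKey(key); }
        String get(String key) { return blocks.get(key); }
        void put(String key, String value) { blocks.put(key, value); }
    }

    /** Hypothetical "read" log entry: it records only which key was cached in. */
    static class ReadMarker {
        final String key;
        ReadMarker(String key) { this.key = key; }
    }

    /** One cluster: a backing store, a cache, and a list of markers to ship. */
    static class Cluster {
        final Map<String, String> store;
        final BlockCache cache;
        final List<ReadMarker> outbound = new ArrayList<>(); // stand-in for the WAL
        int storeReads = 0; // counts reads that missed the cache

        Cluster(Map<String, String> store, int cacheCapacity) {
            this.store = store;
            this.cache = new BlockCache(cacheCapacity);
        }

        /** Serves a read; on a cache-in, optionally records a marker for replication. */
        String read(String key, boolean emitMarker) {
            String cached = cache.get(key);
            if (cached != null) {
                return cached; // cache hit: nothing is logged
            }
            storeReads++;
            String value = store.get(key);
            if (value != null) {
                cache.put(key, value);
                if (emitMarker) {
                    outbound.add(new ReadMarker(key));
                }
            }
            return value;
        }

        /** Replay callback on the secondary: re-issue the read to warm the cache. */
        void replay(List<ReadMarker> markers) {
            for (ReadMarker marker : markers) {
                read(marker.key, false);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("row1", "a");
        data.put("row2", "b");
        data.put("row3", "c");

        Cluster primary = new Cluster(data, 8);
        Cluster secondary = new Cluster(new HashMap<>(data), 8);

        primary.read("row1", true); // miss: cache-in, marker emitted
        primary.read("row1", true); // hit: no marker
        primary.read("row2", true); // miss: cache-in, marker emitted

        secondary.replay(primary.outbound);

        System.out.println("markers shipped: " + primary.outbound.size());
        System.out.println("secondary has row1 cached: " + secondary.cache.contains("row1"));
        System.out.println("secondary has row3 cached: " + secondary.cache.contains("row3"));
    }
}
```

In this sketch only two markers cross to the secondary for three reads at the primary, and a read of row1 after failover would be served from the secondary's cache. Because the two caches evict independently, their contents can still diverge, as noted above.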

What do you think about this suggestion? [~stack] and everybody, we would like to hear from you! Maybe this is already implemented and we are not aware of it?

> Cache pre-warming upon replication
> ----------------------------------
>
>                 Key: HBASE-18748
>                 URL: https://issues.apache.org/jira/browse/HBASE-18748
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Anastasia Braginsky
>
> HBase's cluster replication is a very important and widely used feature. Let's assume a primary cluster is replicated to a secondary (backup) cluster using the WAL of the primary cluster to propagate the changes. Let's also assume the secondary cluster is the failover target and should become primary when needed.
> We suggest improving the way HBase cluster failover works today. Namely, upon failover, the backup RS's cache is cold. Warming it up to the right working set takes many minutes. The suggested solution is to selectively replay read requests at the backup: namely, those reads that caused cache-ins at the primary. We intend to use WAL replication as the transport protocol (hopefully as a black box) and, of course, add custom replay callbacks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)