Posted to issues@hbase.apache.org by "Duo Zhang (JIRA)" <ji...@apache.org> on 2016/11/01 12:39:58 UTC

[jira] [Commented] (HBASE-14004) [Replication] Inconsistency between Memstore and WAL may result in data in remote cluster that is not in the origin

    [ https://issues.apache.org/jira/browse/HBASE-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15625326#comment-15625326 ] 

Duo Zhang commented on HBASE-14004:
-----------------------------------

I think we need to pick this up.

With AsyncFSWAL, it is not safe to use a DFSInputStream to read a WAL file directly until EOF while the file is still open: the data we read may disappear later. FSHLog also has this problem, although it is much safer in practice. See this document for more details:

https://docs.google.com/document/d/11AyWtGhItQs6vsLRIx32PwTxmBY3libXwGXI25obVEY/edit#
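
To make the failure mode concrete, here is a minimal sketch (illustrative only; the path and the byte handling are made up) of the unsafe pattern: tailing a still-open WAL with a plain DFSInputStream and trusting EOF.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UnsafeOpenWalTail {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Hypothetical WAL path; a real ReplicationSource gets this from its queue.
    Path wal = new Path("/hbase/WALs/rs1/example.wal");
    byte[] buf = new byte[8192];
    try (FSDataInputStream in = fs.open(wal)) {
      // With AsyncFSWAL, bytes visible here may not yet be acknowledged by
      // enough DataNodes; after a pipeline recovery the tail we just read
      // can disappear. So an edit replicated from this tail may end up on
      // the peer cluster but never become durable at the origin.
      for (int n = in.read(buf); n != -1; n = in.read(buf)) {
        // feed the bytes to the WAL entry reader (elided)
      }
    }
  }
}
{code}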

The problem only happens while the WAL file is still open. AFAIK, as long as an RS is alive, its WAL is always replicated by the RS itself. So I think we could expose an API that tells the ReplicationSource the safe length to read for an opened WAL file. For a ReplicationSource that replicates the WALs of another RS, we can make sure that the RS is dead and that all of its WALs are closed (we can also guarantee this by calling recoverLease), so it is safe to read them until EOF with a DFSInputStream.
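
A rough sketch of what that API could look like. All the names here (WALFileLengthProvider, getSafeReadLength, readableLength) are hypothetical, not existing HBase code; only DistributedFileSystem#recoverLease and #getFileStatus are real HDFS calls:

{code:java}
import java.io.IOException;
import java.util.OptionalLong;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

/** Hypothetical: implemented by the WAL so the ReplicationSource can ask
 *  how far it may safely read a file this RS is still writing. */
interface WALFileLengthProvider {
  /** Safe read length, or empty if this RS is not currently writing the file. */
  OptionalLong getSafeReadLength(Path walPath);
}

class ReplicationReadLimit {
  private final WALFileLengthProvider lengthProvider;
  private final DistributedFileSystem dfs;

  ReplicationReadLimit(WALFileLengthProvider lengthProvider, DistributedFileSystem dfs) {
    this.lengthProvider = lengthProvider;
    this.dfs = dfs;
  }

  long readableLength(Path wal, boolean ownerDead) throws IOException {
    if (ownerDead) {
      // WAL of a dead RS: force the file closed so its length is final.
      // recoverLease returns false while recovery is still in progress,
      // so a real implementation would retry until it returns true.
      dfs.recoverLease(wal);
      return dfs.getFileStatus(wal).getLen();
    }
    // Our own, still-open WAL: never read past the acknowledged length.
    OptionalLong safe = lengthProvider.getSafeReadLength(wal);
    if (safe.isPresent()) {
      return safe.getAsLong();
    }
    // The writer has already closed the file, so its length is final.
    return dfs.getFileStatus(wal).getLen();
  }
}
{code}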

Any concerns? If not, let's start working!

Thanks.

> [Replication] Inconsistency between Memstore and WAL may result in data in remote cluster that is not in the origin
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-14004
>                 URL: https://issues.apache.org/jira/browse/HBASE-14004
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: He Liangliang
>            Priority: Critical
>              Labels: replication, wal
>
> Looks like the current write path can cause an inconsistency between the Memstore/HFile and the WAL, which can leave the slave cluster with more data than the master cluster.
> The simplified write path looks like:
> 1. insert record into Memstore
> 2. write record to WAL
> 3. sync WAL
> 4. rollback Memstore if 3 fails
> It's possible for the HDFS sync RPC call to fail even though the data has already been (perhaps partially) transported to the DataNodes and finally gets persisted there. As a result, the handler will roll back the Memstore, and the HFile flushed later will also skip this record.
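
For context, here is a minimal sketch of the race described in the quoted report. The types below are hypothetical stand-ins for the real HBase write path, not actual HBase classes:

{code:java}
import java.io.IOException;

// Hypothetical stand-ins for the real write-path types.
interface Memstore {
  long insert(byte[] record);   // returns a sequence id usable for rollback
  void rollback(long seqId);
}

interface Wal {
  void append(byte[] record) throws IOException;
  void sync() throws IOException; // the RPC can fail after bytes reached the DNs
}

class WritePathSketch {
  static void doPut(Memstore memstore, Wal wal, byte[] record) throws IOException {
    long seqId = memstore.insert(record); // 1. insert record into Memstore
    wal.append(record);                   // 2. write record to WAL
    try {
      wal.sync();                         // 3. sync WAL
    } catch (IOException e) {
      // 4. rollback Memstore if 3 fails. If only the client-side RPC failed,
      // the edit may still be persisted in the WAL: flushes skip it (it was
      // rolled back), but replication reads the WAL and ships it, so the
      // slave ends up with data the master never had.
      memstore.rollback(seqId);
      throw e;
    }
  }
}
{code}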



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)