Posted to issues@hbase.apache.org by "Pankaj Kumar (JIRA)" <ji...@apache.org> on 2019/06/25 18:46:00 UTC

[jira] [Created] (HBASE-22628) Data loss while migrating to custom WAL directory (hbase.wal.dir)

Pankaj Kumar created HBASE-22628:
------------------------------------

             Summary: Data loss while migrating to custom WAL directory (hbase.wal.dir)
                 Key: HBASE-22628
                 URL: https://issues.apache.org/jira/browse/HBASE-22628
             Project: HBase
          Issue Type: Bug
          Components: Recovery, wal
            Reporter: Pankaj Kumar
            Assignee: Pankaj Kumar


There is a data loss scenario when migrating to a custom WAL directory.

Steps to reproduce:
 # Set up an HBase cluster with the default settings (all WAL files are under the root directory, i.e. /hbase/WALs).
 # Create table 't1' and insert a few records.
 # Flush the meta table (so that the table's region entries are persisted to the filesystem).
 # Forcibly kill the HBase processes (HMaster and RegionServers).
 # Set hbase.wal.dir to a location outside the root directory (say /hbaseWAL).
 # Start the HBase servers.
 # Scan 't1'.

Ideally, HMaster should submit split tasks for the old RegionServers' WAL files (created under /hbase/WALs) so that the old data is replayed.

Currently, however, during HMaster startup the previously dead servers are populated only from the current WAL directory (hbase.wal.dir -> /hbaseWAL), so the WAL files left under /hbase/WALs are never picked up for splitting.

In MasterFileSystem.getFailedServersFromLogFolders(),
{code:java}
Set<ServerName> getFailedServersFromLogFolders() {
  boolean retrySplitting = !conf.getBoolean("hbase.hlog.split.skip.errors",
      WALSplitter.SPLIT_SKIP_ERRORS_DEFAULT);

  Set<ServerName> serverNames = new HashSet<ServerName>();
  Path logsDirPath = new Path(this.walRootDir, HConstants.HREGION_LOGDIR_NAME);

  do {
    if (master.isStopped()) {
      LOG.warn("Master stopped while trying to get failed servers.");
      break;
    }
    try {
      if (!this.walFs.exists(logsDirPath)) return serverNames;
      FileStatus[] logFolders = FSUtils.listStatus(this.walFs, logsDirPath, null);
{code}
For backward compatibility, we should also consider the default WAL directory path.
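
One possible shape of that check is sketched below. This is not the HBase implementation and not a proposed patch: it uses plain java.nio on a local filesystem instead of the Hadoop FileSystem API, and the class name WalDirScanSketch, the method failedServerDirs and the sample server directory name are all hypothetical. It only illustrates the idea of scanning the "WALs" folder under both the configured WAL root and the default root directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified illustration; not HBase code.
public class WalDirScanSketch {
    // Folder name taken from the report (/hbase/WALs).
    static final String LOG_DIR_NAME = "WALs";

    /**
     * Collects server directory names from the configured WAL root and,
     * when it differs, from the default root directory as well.
     */
    static Set<String> failedServerDirs(Path rootDir, Path walRootDir) throws IOException {
        // A set removes the duplicate when both roots are the same path.
        Set<Path> candidates = new LinkedHashSet<>();
        candidates.add(walRootDir.resolve(LOG_DIR_NAME));
        candidates.add(rootDir.resolve(LOG_DIR_NAME));

        Set<String> names = new TreeSet<>();
        for (Path logsDir : candidates) {
            if (!Files.isDirectory(logsDir)) {
                continue; // nothing was ever written under this root
            }
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(logsDir)) {
                for (Path entry : stream) {
                    if (Files.isDirectory(entry)) {
                        names.add(entry.getFileName().toString());
                    }
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Recreate the layout from the report in a temporary directory:
        // old WALs under <root>/WALs, new and still empty <walRoot>/WALs.
        Path base = Files.createTempDirectory("hbase-sketch");
        Path rootDir = base.resolve("hbase");
        Path walRootDir = base.resolve("hbaseWAL");
        Files.createDirectories(rootDir.resolve(LOG_DIR_NAME).resolve("rs1,16020,1561486000000"));
        Files.createDirectories(walRootDir.resolve(LOG_DIR_NAME));

        System.out.println(failedServerDirs(rootDir, walRootDir));
    }
}
```

With only the configured WAL root in the candidate set, the old server directory in this layout would not be found, which corresponds to the behaviour described above.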



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)