Posted to dev@hbase.apache.org by "Nihal Jain (JIRA)" <ji...@apache.org> on 2019/01/22 10:03:00 UTC

[jira] [Created] (HBASE-21755) RS aborts while performing replication with wal dir on s3, root dir on hdfs

Nihal Jain created HBASE-21755:
----------------------------------

             Summary: RS aborts while performing replication with wal dir on s3, root dir on hdfs
                 Key: HBASE-21755
                 URL: https://issues.apache.org/jira/browse/HBASE-21755
             Project: HBase
          Issue Type: Bug
          Components: Filesystem Integration, Replication
    Affects Versions: 2.1.3
            Reporter: Nihal Jain
            Assignee: Nihal Jain


*Environment/Configuration*
 - _hbase.wal.dir_ : Configured to be on s3
 - _hbase.rootdir_ : Configured to be on hdfs
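
For illustration, a minimal, hypothetical sketch of this split setup in code form (these properties normally live in hbase-site.xml; the namenode, bucket and paths below are placeholders, not values from this cluster):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SplitWalDirConfigExample {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Everything except the WALs stays on HDFS (placeholder namenode/path).
    conf.set("hbase.rootdir", "hdfs://namenode:8020/hbase");
    // WALs go to S3 through the s3a connector (placeholder bucket/path).
    conf.set("hbase.wal.dir", "s3a://example-bucket/hbase-wal");
    System.out.println("hbase.rootdir = " + conf.get("hbase.rootdir"));
    System.out.println("hbase.wal.dir = " + conf.get("hbase.wal.dir"));
  }
}
{code}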

In a replication scenario, while trying to locate an archived WAL (see getArchivedLog in [WALEntryStream.java#L315|https://github.com/apache/hbase/blob/b0131e19f4b9ced05f501c61596191cb8a86b660/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java#L315]), we get the following exception and the region server aborts:
{code:java}
2019-01-21 17:43:55,440 ERROR [RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2] regionserver.ReplicationSource: Unexpected exception in RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2 currentPath=hdfs://dummy_path/hbase/WALs/host2,22222,1548063439555/host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1.1548063492594
java.lang.IllegalArgumentException: Wrong FS: s3a://xxxxxx/hbase128/oldWALs/host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1.1548063492594, expected: hdfs://dummy_path
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:246)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1622)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1619)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1634)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:465)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1742)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.getArchivedLog(WALEntryStream.java:319)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.resetReader(WALEntryStream.java:404)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.reset(WALEntryStream.java:161)
	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:148)
2019-01-21 17:43:55,444 ERROR [RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2] regionserver.HRegionServer: ***** ABORTING region server host2,22222,1548063439555: Unexpected exception in RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2 *****
java.lang.IllegalArgumentException: Wrong FS: s3a://xxxxxx/hbase128/oldWALs/host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1.1548063492594, expected: hdfs://dummy_path
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:246)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1622)
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1619)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1634)
	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:465)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1742)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.getArchivedLog(WALEntryStream.java:319)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.resetReader(WALEntryStream.java:404)
	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.reset(WALEntryStream.java:161)
	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:148)


{code}
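
For context, the failure above is a purely client-side path check: FileSystem.checkPath rejects any path whose scheme/authority does not match the FileSystem handle it is asked about. A minimal, hypothetical illustration (assumes a reachable HDFS at the placeholder URI; names are not from this cluster):
{code:java}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WrongFsIllustration {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // An HDFS FileSystem handle (placeholder namenode URI).
    FileSystem hdfs = FileSystem.get(new URI("hdfs://namenode:8020"), conf);
    // Asking that handle about an s3a path fails checkPath() with
    // "IllegalArgumentException: Wrong FS ... expected: hdfs://namenode:8020",
    // the same failure mode as the replication WAL reader above.
    hdfs.exists(new Path("s3a://example-bucket/hbase/oldWALs/some-wal-file"));
  }
}
{code}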
 
The current code is:
{code}
  private Path getArchivedLog(Path path) throws IOException {
    Path rootDir = FSUtils.getRootDir(conf);

    // Try to find the log in the old dir
    Path oldLogDir = new Path(rootDir, HConstants.HREGION_OLDLOGDIR_NAME);
    Path archivedLogLocation = new Path(oldLogDir, path.getName());
    if (fs.exists(archivedLogLocation)) {
      LOG.info("Log " + path + " was moved to " + archivedLogLocation);
      return archivedLogLocation;
    }

    // Try to find the log in the separate old log dir
    oldLogDir =
        new Path(rootDir, new StringBuilder(HConstants.HREGION_OLDLOGDIR_NAME)
            .append(Path.SEPARATOR).append(serverName.getServerName()).toString());
    archivedLogLocation = new Path(oldLogDir, path.getName());
    if (fs.exists(archivedLogLocation)) {
      LOG.info("Log " + path + " was moved to " + archivedLogLocation);
      return archivedLogLocation;
    }

    LOG.error("Couldn't locate log: " + path);
    return path;
  }
{code}

The method resolves the archive location against the root dir (_hbase.rootdir_), while it should use the wal dir (_hbase.wal.dir_). With the two directories on different filesystems (HDFS and S3 here), the constructed oldWALs path and the FileSystem it is checked against no longer match, which produces the "Wrong FS" IllegalArgumentException above and aborts the region server.
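
A minimal sketch of one possible fix (not the committed patch), mirroring the method above: build the archive path from the WAL root dir and test existence with the filesystem that path belongs to. It assumes FSUtils.getWALRootDir(conf) is available on this branch and reuses the surrounding class members (conf, serverName, LOG) exactly like the snippet above:
{code:java}
  private Path getArchivedLog(Path path) throws IOException {
    // Resolve against hbase.wal.dir (falls back to hbase.rootdir when unset)
    // and use the matching filesystem, so the "Wrong FS" check cannot trip.
    Path walRootDir = FSUtils.getWALRootDir(conf);
    FileSystem walFs = walRootDir.getFileSystem(conf);

    // Try to find the log in the common oldWALs dir
    Path oldLogDir = new Path(walRootDir, HConstants.HREGION_OLDLOGDIR_NAME);
    Path archivedLogLocation = new Path(oldLogDir, path.getName());
    if (walFs.exists(archivedLogLocation)) {
      LOG.info("Log " + path + " was moved to " + archivedLogLocation);
      return archivedLogLocation;
    }

    // Try to find the log in the separate, per-server old log dir
    oldLogDir = new Path(oldLogDir, serverName.getServerName());
    archivedLogLocation = new Path(oldLogDir, path.getName());
    if (walFs.exists(archivedLogLocation)) {
      LOG.info("Log " + path + " was moved to " + archivedLogLocation);
      return archivedLogLocation;
    }

    LOG.error("Couldn't locate log: " + path);
    return path;
  }
{code}
Depending on how WALEntryStream obtains its FileSystem elsewhere, the real fix may instead need to pass the WAL filesystem in from the caller; the sketch only illustrates keeping the path and the filesystem on the same store.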



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)