Posted to issues@hbase.apache.org by "Nihal Jain (JIRA)" <ji...@apache.org> on 2019/01/22 23:58:00 UTC

[jira] [Comment Edited] (HBASE-21755) RS aborts while performing replication with wal dir on hdfs, root dir on s3

    [ https://issues.apache.org/jira/browse/HBASE-21755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16749307#comment-16749307 ] 

Nihal Jain edited comment on HBASE-21755 at 1/22/19 11:57 PM:
--------------------------------------------------------------

All branches, including master, still have the following (see [AbstractFSWALProvider.java#L434|https://github.com/apache/hbase/blob/fa3946fbeaaffd6acfbd8530e22f85e0bf3321eb/hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AbstractFSWALProvider.java#L434]):
{code:java}
  public static Path getArchivedLogPath(Path path, Configuration conf) throws IOException {
    Path rootDir = FSUtils.getRootDir(conf);  // <===== SHOULD BE WAL ROOT DIR
    Path oldLogDir = new Path(rootDir, HConstants.HREGION_OLDLOGDIR_NAME);
    if (conf.getBoolean(SEPARATE_OLDLOGDIR, DEFAULT_SEPARATE_OLDLOGDIR)) {
      ServerName serverName = getServerNameFromWALDirectoryName(path);
      if (serverName == null) {
        LOG.error("Couldn't locate log: " + path);
        return path;
      }
      oldLogDir = new Path(oldLogDir, serverName.getServerName());
    }
    Path archivedLogLocation = new Path(oldLogDir, path.getName());
    final FileSystem fs = FSUtils.getCurrentFileSystem(conf); // <===== SHOULD BE WAL FS

    if (fs.exists(archivedLogLocation)) {
      LOG.info("Log " + path + " was moved to " + archivedLogLocation);
      return archivedLogLocation;
    } else {
      LOG.error("Couldn't locate log: " + path);
      return path;
    }
  }
{code}
HBASE-21688 somehow missed it.
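
To illustrate why the two marked lines matter, here is a minimal, self-contained sketch in plain Java (no HBase or Hadoop classes). It is not the actual patch. {{checkPath}} is a simplified stand-in for the scheme/authority check behind the "Wrong FS" error in the stack trace, and the class name, URIs, and WAL file name are made up for illustration.

```java
import java.net.URI;
import java.util.Objects;

class ArchivedWalPathSketch {

    // Simplified stand-in for the filesystem ownership check: a path is
    // accepted only if its scheme and authority match the filesystem's.
    // Assumption: the real Hadoop check handles more cases than this.
    static void checkPath(URI fsUri, URI path) {
        boolean sameScheme = Objects.equals(fsUri.getScheme(), path.getScheme());
        boolean sameAuthority = Objects.equals(fsUri.getAuthority(), path.getAuthority());
        if (!sameScheme || !sameAuthority) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    // Builds <baseDir>/oldWALs/<walName>, mirroring how the archived
    // location is derived from a base directory and the WAL file name.
    static URI archivedLogPath(URI baseDir, String walName) {
        return URI.create(baseDir.toString() + "/oldWALs/" + walName);
    }

    public static void main(String[] args) {
        URI rootDir = URI.create("s3a://bucket/hbase");    // hypothetical hbase.rootdir
        URI walDir = URI.create("hdfs://namenode/hbase");  // hypothetical hbase.wal.dir
        URI walFs = URI.create("hdfs://namenode");         // filesystem backing the WAL dir
        String walName = "host2%2C22222%2C1548063439555.1548063492594";

        // Intended behaviour: resolve under the WAL dir, check with the WAL filesystem.
        checkPath(walFs, archivedLogPath(walDir, walName));
        System.out.println("wal dir: ok");

        // Current behaviour: resolve under the root dir, so the WAL filesystem rejects it.
        try {
            checkPath(walFs, archivedLogPath(rootDir, walName));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The first check passes because the path and the filesystem share a scheme and authority. The second fails the same way as the reported exception, because the archived path is built from the root dir (s3a) but checked against the WAL filesystem (hdfs).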


> RS aborts while performing replication with wal dir on hdfs, root dir on s3
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-21755
>                 URL: https://issues.apache.org/jira/browse/HBASE-21755
>             Project: HBase
>          Issue Type: Bug
>          Components: Filesystem Integration, Replication, wal
>    Affects Versions: 1.5.0, 2.1.3
>            Reporter: Nihal Jain
>            Assignee: Nihal Jain
>            Priority: Critical
>              Labels: s3
>
> *Environment/Configuration*
>  - _hbase.wal.dir_ : Configured to be on hdfs
>  - _hbase.rootdir_ : Configured to be on s3
> In a replication scenario, while trying to locate the archived log (using method [WALEntryStream.java#L314|https://github.com/apache/hbase/blob/da92b3e0061a7c67aa9a3e403d68f3b56bf59370/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java#L314]), we get the following exception:
> {code:java}
> 2019-01-21 17:43:55,440 ERROR [RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2] regionserver.ReplicationSource: Unexpected exception in RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2 currentPath=hdfs://dummy_path/hbase/WALs/host2,22222,1548063439555/host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1.1548063492594
> java.lang.IllegalArgumentException: Wrong FS: s3a://xxxxxx/hbase128/oldWALs/host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1.1548063492594, expected: hdfs://dummy_path
> 	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:246)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1622)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1619)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1634)
> 	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:465)
> 	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1742)
> 	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.getArchivedLog(WALEntryStream.java:319)
> 	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.resetReader(WALEntryStream.java:404)
> 	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.reset(WALEntryStream.java:161)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:148)
> 2019-01-21 17:43:55,444 ERROR [RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2] regionserver.HRegionServer: ***** ABORTING region server host2,22222,1548063439555: Unexpected exception in RS_REFRESH_PEER-regionserver/host2:22222-1.replicationSource,2.replicationSource.wal-reader.host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1,2 *****
> java.lang.IllegalArgumentException: Wrong FS: s3a://xxxxxx/hbase128/oldWALs/host2%2C22222%2C1548063439555.host2%2C22222%2C1548063439555.regiongroup-1.1548063492594, expected: hdfs://dummy_path
> 	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:246)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1622)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1619)
> 	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1634)
> 	at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:465)
> 	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1742)
> 	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.getArchivedLog(WALEntryStream.java:319)
> 	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.resetReader(WALEntryStream.java:404)
> 	at org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.reset(WALEntryStream.java:161)
> 	at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.run(ReplicationSourceWALReader.java:148)
> {code}
>  
>  The current code is:
> {code:java}
>   private Path getArchivedLog(Path path) throws IOException {
>     Path rootDir = FSUtils.getRootDir(conf);
>     // Try to find the log in the old dir
>     Path oldLogDir = new Path(rootDir, HConstants.HREGION_OLDLOGDIR_NAME);
>     Path archivedLogLocation = new Path(oldLogDir, path.getName());
>     if (fs.exists(archivedLogLocation)) {
>       LOG.info("Log " + path + " was moved to " + archivedLogLocation);
>       return archivedLogLocation;
>     }
>     .
>     .
>     .
>     return path;
>   }
> {code}
> It resolves the old log directory against the root dir, whereas it should use the WAL dir.


