Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/03/10 12:55:00 UTC

[jira] [Created] (IMPALA-10579) Deadloop in table metadata loading when using an invalid RemoteIterator

Quanlong Huang created IMPALA-10579:
---------------------------------------

             Summary: Deadloop in table metadata loading when using an invalid RemoteIterator
                 Key: IMPALA-10579
                 URL: https://issues.apache.org/jira/browse/IMPALA-10579
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Quanlong Huang


The file listing thread in catalogd enters an infinite loop if it gets a RemoteIterator on a non-existent path. The first call to RemoteIterator.hasNext() throws a FileNotFoundException. However, this exception is caught and the loop continues; because curFile_ is never set, the loop never terminates. Related code: [https://github.com/apache/impala/blob/d89c04bf806682d3449c566ce979632bd2ac5b29/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L789-L814]
{code:java}
  static class FilterIterator implements RemoteIterator<FileStatus> {
    ...
    public boolean hasNext() throws IOException {
      ...
      while (curFile_ == null) {
        FileStatus next;
        try {
          if (!baseIterator_.hasNext()) return false; // <---- throws FileNotFoundException
          ...
          next = baseIterator_.next();
        } catch (FileNotFoundException ex) {
          ...
          LOG.warn(ex.getMessage());
          continue;  // <--------- catch the exception and continue into a dead loop
        }
        if (!isInIgnoredDirectory(startPath_, next)) {
          curFile_ = next;
          return true;
        }
      }
      return true;
    }
{code}
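The failure mode can be reproduced without Hadoop. The following is only a minimal sketch, not the Impala code or a proposed patch: SimpleRemoteIterator is a hypothetical stand-in for Hadoop's RemoteIterator, MissingPathIterator is a hypothetical iterator whose hasNext() always throws, and the attempt cap exists only so the demo terminates. It contrasts the catch-and-continue pattern with one possible alternative that treats a FileNotFoundException from hasNext() as the end of the listing.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class DeadLoopSketch {
  /** Hypothetical stand-in for Hadoop's RemoteIterator (keeps the sketch dependency-free). */
  interface SimpleRemoteIterator<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /** Mimics an iterator on a path that no longer exists: every call fails. */
  static class MissingPathIterator implements SimpleRemoteIterator<String> {
    public boolean hasNext() throws IOException {
      throw new FileNotFoundException("No such file or directory (hypothetical path)");
    }
    public String next() throws IOException {
      throw new FileNotFoundException("No such file or directory (hypothetical path)");
    }
  }

  /** Adapts an in-memory list, standing in for a healthy listing. */
  static SimpleRemoteIterator<String> over(List<String> items) {
    final Iterator<String> it = items.iterator();
    return new SimpleRemoteIterator<String>() {
      public boolean hasNext() { return it.hasNext(); }
      public String next() { return it.next(); }
    };
  }

  /**
   * Replays the catch-and-continue pattern. Without the maxAttempts cap,
   * a MissingPathIterator would keep this loop spinning forever.
   * Returns the number of loop iterations performed.
   */
  static int attemptsWithCatchAndContinue(SimpleRemoteIterator<String> it, int maxAttempts) {
    int attempts = 0;
    while (attempts < maxAttempts) {
      attempts++;
      try {
        if (!it.hasNext()) break;
        it.next();
      } catch (FileNotFoundException ex) {
        continue; // nothing changed, so the next iteration fails the same way
      } catch (IOException ex) {
        throw new UncheckedIOException(ex);
      }
    }
    return attempts;
  }

  /** One possible alternative: a FileNotFoundException from hasNext() ends the listing. */
  static List<String> drain(SimpleRemoteIterator<String> it) {
    List<String> out = new ArrayList<>();
    try {
      while (true) {
        boolean more;
        try {
          more = it.hasNext();
        } catch (FileNotFoundException ex) {
          break; // the path is gone; retrying cannot succeed
        }
        if (!more) break;
        out.add(it.next());
      }
    } catch (IOException ex) {
      throw new UncheckedIOException(ex);
    }
    return out;
  }

  public static void main(String[] args) {
    // Hits the cap: every attempt fails in hasNext().
    System.out.println(attemptsWithCatchAndContinue(new MissingPathIterator(), 1000));
    // Terminates immediately with an empty result.
    System.out.println(drain(new MissingPathIterator()));
    // A healthy listing is unaffected.
    System.out.println(drain(over(Arrays.asList("a", "b"))));
  }
}
```

Whether a missing path should be reported as an empty listing or as an error is a design choice for the caller; the sketch only shows that the exception from hasNext() has to leave the loop one way or the other.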
*When does the path being loaded not exist?*
 It happens when the metadata (table/partition location) in HMS still references the path, but the path has actually been removed from storage.

*When does Impala get such an invalid RemoteIterator?*
 It happens with FileSystem implementations that don't override FileSystem#listStatusIterator(), e.g. S3AFileSystem before HADOOP-17281, AzureBlobFileSystem, and GoogleHadoopFileSystem.


