Posted to issues@impala.apache.org by "Sahil Takiar (Jira)" <ji...@apache.org> on 2020/04/05 16:31:00 UTC

[jira] [Resolved] (IMPALA-9120) Refreshing an ABFS table with a deleted directory fails

     [ https://issues.apache.org/jira/browse/IMPALA-9120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar resolved IMPALA-9120.
----------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

I haven't validated that this is fixed on master, but after the recent CDP upgrade it should be fixed when USE_CDP_HIVE=true.

> Refreshing an ABFS table with a deleted directory fails
> -------------------------------------------------------
>
>                 Key: IMPALA-9120
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9120
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Critical
>             Fix For: Impala 4.0
>
>
> The following fails on ABFS (but succeeds on HDFS):
> {code:java}
> hdfs dfs -mkdir /test-external-table
> ./bin/impala-shell.sh
> [localhost:21000] default> create external table test (col int) location '/test-external-table';
> [localhost:21000] default> select * from test;
> hdfs dfs -rm -r -skipTrash /test-external-table
> ./bin/impala-shell.sh
> [localhost:21000] default> refresh test;
> ERROR: TableLoadingException: Refreshing file and block metadata for 1 paths for table default.test: failed to load 1 paths. Check the catalog server log for more details.{code}
> This causes the test tests/query_test/test_hdfs_file_mods.py::TestHdfsFileMods::test_file_modifications[modification_type: delete_directory | ...] to fail on ABFS as well.
> The error from catalogd is:
> {code:java}
> E1104 22:38:53.748571 87486 ParallelFileMetadataLoader.java:102] Loading file and block metadata for 1 paths for table test_file_modifications_d0471c2c.t1 encountered an error loading data for path abfss://[]@[].dfs.core.windows.net/test-warehouse/test_file_modifications_d0471c2c
> Java exception follows:
> java.util.concurrent.ExecutionException: java.io.FileNotFoundException: GET https://[].dfs.core.windows.net/[]?resource=filesystem&maxResults=5000&directory=test-warehouse/test_file_modifications_d0471c2c&timeout=90&recursive=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:99)
>         at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:606)
>         at org.apache.impala.catalog.HdfsTable.loadAllPartitions(HdfsTable.java:547)
>         at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:973)
>         at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:896)
>         at org.apache.impala.catalog.TableLoader.load(TableLoader.java:83)
>         at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:244)
>         at org.apache.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:241)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: GET https://[].dfs.core.windows.net/[]?resource=filesystem&maxResults=5000&directory=test-warehouse/test_file_modifications_d0471c2c&timeout=90&recursive=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
>         at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:957)
>         at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:351)
>         at org.apache.hadoop.fs.FileSystem.listStatusBatch(FileSystem.java:1790)
>         at org.apache.hadoop.fs.FileSystem$DirListingIterator.fetchMore(FileSystem.java:2058)
>         at org.apache.hadoop.fs.FileSystem$DirListingIterator.hasNext(FileSystem.java:2047)
>         at org.apache.impala.common.FileSystemUtil$RecursingIterator.hasNext(FileSystemUtil.java:722)
>         at org.apache.impala.common.FileSystemUtil$FilterIterator.hasNext(FileSystemUtil.java:679)
>         at org.apache.impala.catalog.FileMetadataLoader.load(FileMetadataLoader.java:166)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.lambda$load$0(ParallelFileMetadataLoader.java:93)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:293)
>         at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:61)
>         at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:45)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:93)
>         ... 11 more
> Caused by: GET https://[].dfs.core.windows.net/[]?resource=filesystem&maxResults=5000&directory=test-warehouse/test_file_modifications_d0471c2c&timeout=90&recursive=false
> StatusCode=404
> StatusDescription=The specified path does not exist.
> ErrorCode=PathNotFound
> ErrorMessage=The specified path does not exist.
> RequestId:[]
> Time:2019-11-04T22:38:53.7469083Z
>         at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:134)
>         at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:180)
>         at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:526)
>         at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:348)
>         ... 23 more {code}
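
In the trace above, the FileNotFoundException surfaces from DirListingIterator.hasNext(), that is, while the listing is being iterated. As a rough illustration only, the sketch below shows one way a caller could tolerate that by treating a missing directory as an empty listing. It is not the actual Impala change (the resolution comment attributes the fix to the CDP upgrade). The names LazyListing, listOrEmpty and deletedDirectory are hypothetical, and LazyListing is a plain-Java stand-in for a remote listing iterator, not a Hadoop class.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TolerantListing {
  /** Hypothetical stand-in for a lazy directory listing whose calls may throw. */
  public interface LazyListing {
    boolean hasNext() throws IOException;
    String next() throws IOException;
  }

  /**
   * Drains the listing into a list of names. A FileNotFoundException raised
   * while iterating is treated as "the directory no longer exists" and
   * produces an empty result instead of failing the whole load.
   */
  public static List<String> listOrEmpty(LazyListing listing) throws IOException {
    List<String> files = new ArrayList<>();
    try {
      while (listing.hasNext()) {
        files.add(listing.next());
      }
    } catch (FileNotFoundException e) {
      // Assumption: a deleted directory should behave like one with no files.
      return new ArrayList<>();
    }
    return files;
  }

  /** Simulates listing a deleted directory: the first fetch fails with a 404-style error. */
  public static LazyListing deletedDirectory() {
    return new LazyListing() {
      @Override
      public boolean hasNext() throws IOException {
        throw new FileNotFoundException("The specified path does not exist.");
      }

      @Override
      public String next() {
        throw new IllegalStateException("no entries");
      }
    };
  }

  public static void main(String[] args) throws IOException {
    // The simulated deleted directory yields an empty listing rather than an error.
    System.out.println(listOrEmpty(deletedDirectory()).size());
  }
}
```

Whether an empty result is the right behaviour for a refresh is a design choice; other IOExceptions are deliberately left to propagate so that real I/O failures are not hidden.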


