You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2022/09/02 11:07:00 UTC

[jira] [Commented] (IMPALA-11464) hasNext() throws FileNotFoundException on staging files and breaks file metadata loading

    [ https://issues.apache.org/jira/browse/IMPALA-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17599415#comment-17599415 ] 

ASF subversion and git services commented on IMPALA-11464:
----------------------------------------------------------

Commit cf7490ccbdd8ce3ced2a3ae35974de6a7ef91db9 in impala's branch refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=cf7490ccb ]

IMPALA-11464: (Addendum) Skip tests in Ozone

Updates the skip for new recursive listing tests to match the comment so
that they're only run on HDFS. The previous skip only roughly matched
the set of all non-HDFS filesystems, and didn't automatically include
new filesystems.

Change-Id: I80de83d506138b57a969258b2f6dcf112dd2e44d
Reviewed-on: http://gerrit.cloudera.org:8080/18934
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> hasNext() throws FileNotFoundException on staging files and breaks file metadata loading
> ----------------------------------------------------------------------------------------
>
>                 Key: IMPALA-11464
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11464
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 4.0.0, Impala 4.1.0
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Critical
>             Fix For: Impala 4.2.0, Impala 4.1.1
>
>         Attachments: Selection_034.png, hdfs-touch-1002-files.sh
>
>
> HDFS provides a RemoteIterator for listing files. The two key interfaces are hasNext() and next(). We only handle the FileNotFoundException thrown in next(). However, hasNext() could also thrown FileNotFoundException for staging files:
>  * The HDFS client will do a partial listing when creating the RemoteIterator. By default it only list 1000 items under the dir. The limit is configured by "[dfs.ls.limit|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml#dfs.ls.limit]".
>  * Usually the hasNext() call just consume the partial list and it's enough. However, if the dir contains more than 1000 items. The 1001-th hasNext() call will do a continue listing. This could hit FileNotFoundException if the dir was removed.
> Such a FileNotFoundException thrown from hasNext() will break the metadata loading:
> https://github.com/apache/impala/blob/89c3e1f821ccd335c3c5507496bb53b80c1cc07a/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L948
> *Reproducing*
> To stably reproduce the issue, I use IntelliJ to stop catalogd at some point, and then manually delete the dir. Here are the detailed steps:
> Create a table and run a simple query on it to let catalogd cache its metadata.
> {code:sql}
> impala-shell> create table my_alltypes like functional.alltypes;
> impala-shell> desc my_alltypes;
> {code}
> Manually create staging dir with more than 1000 files in it.
> {code:bash}
> hdfs dfs -mkdir -p hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1/-ext-10001
> bash hdfs-touch-1002-files.sh
> {code}
> In IntelliJ, set a breakpoint at this line and attach to a running catalogd:
> https://github.com/apache/impala/blob/89c3e1f821ccd335c3c5507496bb53b80c1cc07a/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java#L987
> In Hive, trigger an INSERT on this table.
> {code:sql}
> $ beeline -u "jdbc:hive2://localhost:11050"
> beeline> insert into my_alltypes partition(year=2009, month=4) select id,bool_col, tinyint_col, smallint_col, int_col, bigint_col, float_col, double_col, date_string_col, string_col, timestamp_col from functional.alltypes where year=2009 and month=4;
> {code}
> IntelliJ will prompt out for debugging. Step over until the partial listing is done on the staging dir "-ext-10001", as the screenshot shows.
> Manually remove the staging dir "-ext-10001"
> {code}
> hdfs dfs -rm -r -skipTrash hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1/-ext-10001
> Deleted hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1/-ext-10001
> {code}
> Then stop debugging to resume the process. I can see the following errors in catalogd. Event processor is broken and requires a global INVALIDATE METADATA to recover.
> {noformat}
> I0730 10:36:27.031431  9452 FileSystemUtil.java:837] Ignoring hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1/-ext-10001/996 since it is either in a hidden directory or a temporary staging directory hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1
> I0730 10:36:27.031471  9452 FileSystemUtil.java:837] Ignoring hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1/-ext-10001/997 since it is either in a hidden directory or a temporary staging directory hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1
> E0730 10:36:27.034335  9452 ParallelFileMetadataLoader.java:171] Loading file and block metadata for 1 paths for table default.my_alltypes encountered an error loading data for path hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4
> Java exception follows:
> java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1/-ext-10001 does not exist.
>         at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552)
>         at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:513)
>         at com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:86)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.loadInternal(ParallelFileMetadataLoader.java:168)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:120)
>         at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:780)
>         at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:743)
>         at org.apache.impala.catalog.HdfsTable.createAndLoadPartitions(HdfsTable.java:904)
>         at org.apache.impala.service.CatalogOpExecutor.addHdfsPartitions(CatalogOpExecutor.java:1465)
>         at org.apache.impala.service.CatalogOpExecutor.addPartitionsIfNotRemovedLater(CatalogOpExecutor.java:4068)
>         at org.apache.impala.catalog.events.MetastoreEvents$AddPartitionEvent.process(MetastoreEvents.java:1907)
>         at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:511)
>         at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:935)
>         at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:833)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.FileNotFoundException: File hdfs://localhost:20500/test-warehouse/my_alltypes/year=2009/month=4/.hive-staging_hive_2022-07-30_09-34-05_894_3245867416204249845-1/-ext-10001 does not exist.
>         at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNextNoFilter(DistributedFileSystem.java:1317)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNext(DistributedFileSystem.java:1288)
>         at org.apache.impala.common.FileSystemUtil$RecursingIterator.hasNext(FileSystemUtil.java:948)
>         at org.apache.impala.common.FileSystemUtil$FilterIterator.hasNext(FileSystemUtil.java:888)
>         at org.apache.impala.catalog.FileMetadataLoader.load(FileMetadataLoader.java:202)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.lambda$loadInternal$1(ParallelFileMetadataLoader.java:162)
>         at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>         at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>         at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>         at com.google.common.util.concurrent.MoreExecutors$DirectExecutorService.execute(MoreExecutors.java:322)
>         at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
>         at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:66)
>         at com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:36)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.loadInternal(ParallelFileMetadataLoader.java:162)
>         ... 17 more
> E0730 10:36:27.034795  9452 MetastoreEventsProcessor.java:841] Event processing needs a invalidate command to resolve the state
> Java exception follows:
> org.apache.impala.catalog.events.MetastoreNotificationNeedsInvalidateException: EventId: 728392 EventType: ADD_PARTITION Failed to refresh newly added partitions of table {}. Event processing cannot continue. Issue an invalidate metadata command to reset event processor.
>         at org.apache.impala.catalog.events.MetastoreEvents$AddPartitionEvent.process(MetastoreEvents.java:1921)
>         at org.apache.impala.catalog.events.MetastoreEvents$MetastoreEvent.processIfEnabled(MetastoreEvents.java:511)
>         at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:935)
>         at org.apache.impala.catalog.events.MetastoreEventsProcessor.processEvents(MetastoreEventsProcessor.java:833)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.impala.catalog.TableLoadingException: Loading file and block metadata for 1 paths for table default.my_alltypes: failed to load 1 paths. Check the catalog server log for more details.
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.loadInternal(ParallelFileMetadataLoader.java:185)
>         at org.apache.impala.catalog.ParallelFileMetadataLoader.load(ParallelFileMetadataLoader.java:120)
>         at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:780)
>         at org.apache.impala.catalog.HdfsTable.loadFileMetadataForPartitions(HdfsTable.java:743)
>         at org.apache.impala.catalog.HdfsTable.createAndLoadPartitions(HdfsTable.java:904)
>         at org.apache.impala.service.CatalogOpExecutor.addHdfsPartitions(CatalogOpExecutor.java:1465)
>         at org.apache.impala.service.CatalogOpExecutor.addPartitionsIfNotRemovedLater(CatalogOpExecutor.java:4068)
>         at org.apache.impala.catalog.events.MetastoreEvents$AddPartitionEvent.process(MetastoreEvents.java:1907)
>         ... 10 more
> I0730 10:36:27.622951  9547 catalog-server.cc:400] A catalog update with 1 entries is assembled. Catalog version: 1697 Last sent catalog version: 1696
> I0730 10:36:27.625458  9536 catalog-server.cc:813] Collected update: 1:CATALOG_SERVICE_ID, version=1698, original size=60, compressed size=58
> W0730 10:36:28.035472  9452 MetastoreEventsProcessor.java:826] Event processing is skipped since status is NEEDS_INVALIDATE. Last synced event id is 728391
> W0730 10:36:29.035904  9452 MetastoreEventsProcessor.java:826] Event processing is skipped since status is NEEDS_INVALIDATE. Last synced event id is 728391
> I0730 10:36:29.623289  9547 catalog-server.cc:400] A catalog update with 1 entries is assembled. Catalog version: 1698 Last sent catalog version: 1697
> W0730 10:36:30.036418  9452 MetastoreEventsProcessor.java:826] Event processing is skipped since status is NEEDS_INVALIDATE. Last synced event id is 728391
> W0730 10:36:31.036957  9452 MetastoreEventsProcessor.java:826] Event processing is skipped since status is NEEDS_INVALIDATE. Last synced event id is 728391
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org