Posted to reviews@impala.apache.org by "Vihang Karajgaonkar (Code Review)" <ge...@cloudera.org> on 2019/12/04 00:31:59 UTC

[Impala-ASF-CR] IMPALA-9122 : Ignore FileNotFoundException when loading a table

Vihang Karajgaonkar has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/14806 )

Change subject: IMPALA-9122 : Ignore FileNotFoundException when loading a table
......................................................................

IMPALA-9122 : Ignore FileNotFoundException when loading a table

It is possible that when the file metadata of a table or partition is
being loaded, some temporary files (like the ones in the .hive-staging
directory) are deleted by external engines like Hive. This causes a
FileNotFoundException during the load, which fails the reload command.
In general, this should not be a problem since users are careful not to
modify a table from Hive or Spark while Impala is reading it. In the
worst case, the refresh command currently fails and can be retried by
the user. However, this does not work well when event processing is
turned on. EventProcessor tries to reload the table as soon as it sees
an INSERT_EVENT from the metastore. Hive may still be cleaning up the
staging directories when EventProcessor issues a reload, causing it to
go into an error state.

Ideally, we should have some sort of inter-engine synchronization
semantics to avoid such issues, but that is a much more complex
architectural change. For now, we should ignore such errors and skip
loading the deleted file.
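
The idea of skipping a vanished file can be illustrated with a small,
self-contained sketch. This is not the code in FileSystemUtil.java: the
class TolerantListing, the ThrowingIterator interface, and the
simulate() helper are hypothetical names invented for illustration, and
the sketch assumes the iterator has already advanced past the failing
entry when it throws. It only shows the general pattern of catching
FileNotFoundException per entry while letting any other IOException
propagate.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; all names here are hypothetical. */
class TolerantListing {

  /** Hypothetical stand-in for a listing iterator whose calls may throw IOException. */
  interface ThrowingIterator<T> {
    boolean hasNext() throws IOException;
    T next() throws IOException;
  }

  /**
   * Drains the iterator and skips entries that fail with FileNotFoundException.
   * Any other IOException still propagates to the caller.
   * Assumption: the iterator advances past the entry that threw, so the loop
   * always makes progress.
   */
  static <T> List<T> listSkippingMissing(ThrowingIterator<T> it) throws IOException {
    List<T> result = new ArrayList<>();
    while (true) {
      try {
        if (!it.hasNext()) break;
        result.add(it.next());
      } catch (FileNotFoundException e) {
        // The entry was deleted between listing and loading; skip it.
      }
    }
    return result;
  }

  /** Simulates a listing in which the given names disappear before they are read. */
  static ThrowingIterator<String> simulate(List<String> names, Set<String> deleted) {
    Iterator<String> inner = names.iterator();
    return new ThrowingIterator<String>() {
      @Override
      public boolean hasNext() {
        return inner.hasNext();
      }

      @Override
      public String next() throws IOException {
        String name = inner.next(); // advance first, then fail if the file is gone
        if (deleted.contains(name)) {
          throw new FileNotFoundException(name);
        }
        return name;
      }
    };
  }

  public static void main(String[] args) throws IOException {
    List<String> names = Arrays.asList("part-0", ".hive-staging/tmp-1", "part-1");
    Set<String> deleted = Collections.singleton(".hive-staging/tmp-1");
    // The staging file is skipped; the two data files are still returned.
    System.out.println(listSkippingMissing(simulate(names, deleted)));
  }
}
```

Catching only FileNotFoundException keeps other I/O failures visible, so
a real outage would still fail the load instead of silently producing a
partial file list.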

Testing: Unfortunately, this error is hard to reproduce locally. I
tried creating multiple threads that delete some files while multiple
FileMetadataLoaders are loading concurrently, but the load didn't fail
for me. Ran TestEventProcessing.test_insert_events in a loop for 20
iterations and didn't see any failures.

Change-Id: Iecf6b193b0d57de27d41ad6ef6e1719005d9e908
---
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
1 file changed, 16 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/14806/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14806
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iecf6b193b0d57de27d41ad6ef6e1719005d9e908
Gerrit-Change-Number: 14806
Gerrit-PatchSet: 2
Gerrit-Owner: Vihang Karajgaonkar <vi...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <hu...@gmail.com>