Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/03/27 21:57:00 UTC

[jira] [Commented] (IMPALA-11330) Handle missing Iceberg data/metadata gracefully

    [ https://issues.apache.org/jira/browse/IMPALA-11330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705691#comment-17705691 ] 

ASF subversion and git services commented on IMPALA-11330:
----------------------------------------------------------

Commit 2c779939dc302be9ee5dd97ddf374bb043040891 in impala's branch refs/heads/master from Andrew Sherman
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2c779939d ]

IMPALA-11509: Prevent queries hanging when Iceberg metadata is missing.

Traditionally, table metadata is loaded by the catalog and sent as
thrift to the Impala daemons. With Iceberg tables, some metadata, for
example the org.apache.iceberg.Table, is loaded in the Coordinator at
the same time as the thrift description is being deserialized. If the
loading of the org.apache.iceberg.Table fails, perhaps because of
missing Iceberg metadata, then the loading of the table fails. This can
cause an infinite loop as StmtMetadataLoader.loadTables() waits
indefinitely for the catalog to send a new version of the table.

Change some Iceberg table loading methods to throw
IcebergTableLoadingException when a failure occurs. Prevent the hang by
substituting in an IncompleteTable if an IcebergTableLoadingException
occurs.
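
The pattern is roughly the following. This is a minimal sketch, not the
actual Impala frontend code: IcebergTableLoadingException and
IncompleteTable are named in this commit, but the surrounding classes
and method names here are hypothetical stand-ins.

{code:java}
// Hedged sketch of the catch-and-substitute pattern described above.
// Only IcebergTableLoadingException and IncompleteTable come from the
// commit message; everything else is illustrative.

class IcebergTableLoadingException extends Exception {
    IcebergTableLoadingException(String msg, Throwable cause) { super(msg, cause); }
}

interface FeTable { String getName(); }

// A table whose load failed; it carries the error instead of full metadata.
class IncompleteTable implements FeTable {
    private final String name;
    private final Throwable loadError;
    IncompleteTable(String name, Throwable loadError) {
        this.name = name;
        this.loadError = loadError;
    }
    public String getName() { return name; }
    public Throwable getLoadError() { return loadError; }
}

class TableLoaderSketch {
    // Instead of leaving the statement waiting for a table that never
    // arrives, a load failure is surfaced as an IncompleteTable so that
    // planning can fail promptly with the underlying cause.
    static FeTable loadIcebergTable(String name) {
        try {
            return loadIcebergMetadata(name);
        } catch (IcebergTableLoadingException e) {
            return new IncompleteTable(name, e);
        }
    }

    // Placeholder for loading the org.apache.iceberg.Table; here it always
    // fails, simulating a missing metadata JSON file.
    private static FeTable loadIcebergMetadata(String name)
            throws IcebergTableLoadingException {
        throw new IcebergTableLoadingException(
            "Failed to open metadata for " + name,
            new java.io.FileNotFoundException(name + "/metadata/v1.metadata.json"));
    }

    public static void main(String[] args) {
        FeTable t = loadIcebergTable("test2");
        System.out.println(t.getName() + " resolved as " + t.getClass().getSimpleName());
    }
}
{code}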

The test test_drop_incomplete_table had previously been disabled
because of IMPALA-11509. Re-enabling it required a second change. The
way that DROP TABLE is executed on an Iceberg table depends on which
Iceberg catalog is being used. If the Iceberg catalog is not a Hive
catalog, the execution happens in two parts: first the Iceberg table
is dropped, then the table is dropped in HMS. In this case, if the
drop fails in Iceberg, we should still continue on to perform the
drop in HMS.
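
A minimal sketch of that two-phase drop, assuming illustrative
interfaces for the Iceberg catalog and HMS operations (these names are
hypothetical, not Impala's actual catalog classes):

{code:java}
// Hedged sketch only: the interfaces and method names below are
// illustrative stand-ins for the real catalog executor code.
class DropTableSketch {
    interface IcebergCatalogOps { void dropIcebergTable(String tableName) throws Exception; }
    interface HmsOps { void dropHmsTable(String tableName) throws Exception; }

    static void dropTable(String tableName, IcebergCatalogOps iceberg, HmsOps hms)
            throws Exception {
        try {
            // Part 1: drop the table from the (non-Hive) Iceberg catalog.
            // With missing metadata this step can fail.
            iceberg.dropIcebergTable(tableName);
        } catch (Exception e) {
            // Do not stop here: if we did, the HMS entry would never be
            // removed and the broken table would stay stuck in the system.
            System.err.println("Iceberg drop failed for " + tableName + ": " + e);
        }
        // Part 2: always continue on to drop the table in HMS.
        hms.dropHmsTable(tableName);
    }
}
{code}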

TESTING

- Add a new test, originally developed for IMPALA-11330, which tests
  failures after deleting Iceberg metadata.
- Re-enable test_drop_incomplete_table().

Change-Id: I695559e21c510615918a51a4b5057bc616ee5421
Reviewed-on: http://gerrit.cloudera.org:8080/19509
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Handle missing Iceberg data/metadata gracefully
> -----------------------------------------------
>
>                 Key: IMPALA-11330
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11330
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.1.0
>            Reporter: Tamas Mate
>            Assignee: Andrew Sherman
>            Priority: Major
>              Labels: impala-iceberg
>
> If the data/metadata directory is not available for an Iceberg table, queries fail with a NotFoundException, see below. This affects DROP TABLE as well, which means that an Iceberg table can get stuck in the system if an administrator moves the data.
> {code:none}
> ERROR: NotFoundException: Failed to open input stream for file: hdfs://localhost:20500/test-warehouse/test2/metadata/00001-398886ba-f6eb-4b72-b755-f1be10ac99c5.metadata.json
> CAUSED BY: FileNotFoundException: File does not exist: /test-warehouse/test2/metadata/00001-398886ba-f6eb-4b72-b755-f1be10ac99c5.metadata.json
> 	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:87)
> 	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:77)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:159)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2035)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:737)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:454)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
> CAUSED BY: RemoteException: File does not exist: /test-warehouse/test2/metadata/00001-398886ba-f6eb-4b72-b755-f1be10ac99c5.metadata.json
> 	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:87)
> 	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:77)
> 	at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:159)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2035)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:737)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:454)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:533)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:989)
> 	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:917)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2894)
> {code}


