Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/24 05:17:12 UTC

[GitHub] [iceberg] coolderli opened a new pull request #2860: HiveMetastore: throw CommitedUnknowException when commit socket timeout.

coolderli opened a new pull request #2860:
URL: https://github.com/apache/iceberg/pull/2860


   When committing to the Hive metastore, a socket timeout can occasionally occur because of network problems. The current implementation throws a RuntimeException, after which the metadata file is deleted. A socket timeout leaves the commit in an unknown state (the metastore may still have applied the change), so the metadata file should be kept.
    
   The socket timeout exception is as follows:
   ```
   2021-07-20 08:09:20.224 ERROR org.apache.iceberg.hive.HiveTableOperations                  - Cannot tell if commit to nrdc.fs_test_ts_v1 succeeded, attempting to reconnect and check.
   java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
           at org.apache.iceberg.relocated.com.google.common.base.Throwables.propagate(Throwables.java:241)
           at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:80)
           at org.apache.iceberg.hive.HiveTableOperations.lambda$persistTable$3(HiveTableOperations.java:308)
           at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
           at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:77)
           at org.apache.iceberg.hive.HiveTableOperations.persistTable(HiveTableOperations.java:304)
           at org.apache.iceberg.hive.HiveTableOperations.doCommit(HiveTableOperations.java:259)
           at org.apache.iceberg.BaseMetastoreTableOperations.commit(BaseMetastoreTableOperations.java:128)
           at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:307)
           at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
           at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:213)
           at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:197)
           at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:189)
           at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:289)
           at org.apache.iceberg.flink.sink.IcebergFilesCommitter.commitOperation(IcebergFilesCommitter.java:308)
           at org.apache.iceberg.flink.sink.IcebergFilesCommitter.commitDeltaTxn(IcebergFilesCommitter.java:277)
           at org.apache.iceberg.flink.sink.IcebergFilesCommitter.commitUpToCheckpoint(IcebergFilesCommitter.java:219)
           at org.apache.iceberg.flink.sink.IcebergFilesCommitter.notifyCheckpointComplete(IcebergFilesCommitter.java:189)
           at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.notifyCheckpointComplete(StreamOperatorWrapper.java:99)
           at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.notifyCheckpointComplete(SubtaskCheckpointCoordinatorImpl.java:319)
           at org.apache.flink.streaming.runtime.tasks.StreamTask.notifyCheckpointComplete(StreamTask.java:1089)
           at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointCompleteAsync$11(StreamTask.java:1054)
           at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$notifyCheckpointOperation$13(StreamTask.java:1077)
           at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
           at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
           at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317)
           at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:189)
           at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:619)
           at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:583)
           at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:758)
           at org.apache.flink.runtime.taskmanager.Task.run(Task.java:573)
           at java.lang.Thread.run(Thread.java:748)
   Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
           at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
           at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
           at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376)
           at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453)
           at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435)
           at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
           at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
           at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
           at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
           at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
           at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
           at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
           at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_alter_table_with_environment_context(ThriftHiveMetastore.java:1375)
           at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.alter_table_with_environment_context(ThriftHiveMetastore.java:1359)
           at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:370)
           at sun.reflect.GeneratedMethodAccessor115.invoke(Unknown Source)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:65)
           at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:77)
           ... 30 more
   Caused by: java.net.SocketTimeoutException: Read timed out
           at java.net.SocketInputStream.socketRead0(Native Method)
           at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
           at java.net.SocketInputStream.read(SocketInputStream.java:171)
           at java.net.SocketInputStream.read(SocketInputStream.java:141)
           at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
           at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
           at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
           at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
           ... 49 more
   ```
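
The cleanup rule described above can be sketched roughly as follows. This is only an illustration under assumptions: `CommitCleanupSketch`, `CommitOutcome`, and `shouldDeleteMetadata` are hypothetical names invented for this example and are not taken from the Iceberg code base. The point is that only a definite failure makes it safe to delete the newly written metadata file.

```java
// Hypothetical sketch: none of these names come from the Iceberg code base.
final class CommitCleanupSketch {

  // Possible results of trying to persist the new table metadata.
  enum CommitOutcome { SUCCESS, FAILURE, UNKNOWN }

  // The new metadata file may only be deleted when the commit definitely failed.
  // After a socket timeout the metastore may still have applied the change,
  // so the file has to stay in place for UNKNOWN as well as for SUCCESS.
  static boolean shouldDeleteMetadata(CommitOutcome outcome) {
    return outcome == CommitOutcome.FAILURE;
  }

  public static void main(String[] args) {
    for (CommitOutcome outcome : CommitOutcome.values()) {
      System.out.println(outcome + " -> delete metadata file: " + shouldDeleteMetadata(outcome));
    }
  }
}
```

Treating `UNKNOWN` like `SUCCESS` for cleanup purposes may leave an orphaned file behind if the commit did not go through, but that is recoverable, whereas deleting a file the table now points to is not.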


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on pull request #2860: HiveMetastore: throw CommitedUnknowException when commit socket timeout.

Posted by GitBox <gi...@apache.org>.
openinx commented on pull request #2860:
URL: https://github.com/apache/iceberg/pull/2860#issuecomment-919662733


   FYI @pvary 




[GitHub] [iceberg] pvary commented on a change in pull request #2860: HiveMetastore: throw CommitedUnknowException when commit socket timeout.

Posted by GitBox <gi...@apache.org>.
pvary commented on a change in pull request #2860:
URL: https://github.com/apache/iceberg/pull/2860#discussion_r708842942



##########
File path: hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java
##########
@@ -275,10 +275,15 @@ protected void doCommit(TableMetadata base, TableMetadata metadata) {
       throw new AlreadyExistsException("Table already exists: %s.%s", database, tableName);
 
     } catch (TException | UnknownHostException e) {
-      if (e.getMessage() != null && e.getMessage().contains("Table/View 'HIVE_LOCKS' does not exist")) {
-        throw new RuntimeException("Failed to acquire locks from metastore because 'HIVE_LOCKS' doesn't " +
-            "exist, this probably happened when using embedded metastore or doesn't create a " +
-            "transactional meta table. To fix this, use an alternative metastore", e);
+      if (e.getMessage() != null) {
+        if (e.getMessage().contains("Table/View 'HIVE_LOCKS' does not exist")) {
+          throw new RuntimeException("Failed to acquire locks from metastore because 'HIVE_LOCKS' doesn't " +
+              "exist, this probably happened when using embedded metastore or doesn't create a " +
+              "transactional meta table. To fix this, use an alternative metastore", e);
+        } else if (e.getMessage().contains("java.net.SocketTimeoutException: Read timed out")) {

Review comment:
       Could we just check the `cause` field instead of the message string?
   Or are we in the same situation as with the MetaExceptions, where only a string is available?
   
   Thanks, Peter 
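
A cause-based check along the lines suggested in this comment might look like the sketch below. It is an assumption-laden illustration, not the change made in the PR: `TimeoutCauseCheck` and `causedBySocketTimeout` are hypothetical names, and the approach only works if the `SocketTimeoutException` is actually preserved in the cause chain (as it is in the stack trace from the PR description) rather than flattened into a message string.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

// Hypothetical helper: the class and method names are illustrative only.
final class TimeoutCauseCheck {

  // Upper bound on the walk so a cyclic cause chain cannot loop forever.
  private static final int MAX_DEPTH = 32;

  // Returns true if the error itself, or any of its causes, is a SocketTimeoutException.
  static boolean causedBySocketTimeout(Throwable error) {
    int depth = 0;
    for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
      if (t instanceof SocketTimeoutException) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // IOException stands in for the transport-level wrapper seen in the trace.
    Throwable wrapped = new RuntimeException(
        new IOException(new SocketTimeoutException("Read timed out")));
    System.out.println(causedBySocketTimeout(wrapped));

    Throwable unrelated = new RuntimeException("Table/View 'HIVE_LOCKS' does not exist");
    System.out.println(causedBySocketTimeout(unrelated));
  }
}
```

If some code path reports the timeout only as text in the message, this check would miss it, and a string match would still be needed as a fallback.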



