Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/12/03 16:05:28 UTC

[GitHub] [iceberg] gallushi opened a new pull request #1871: Add Atomic Write support for Hadoop Tables

gallushi opened a new pull request #1871:
URL: https://github.com/apache/iceberg/pull/1871


   This PR introduces support for file systems with atomic write guarantees in Hadoop tables.
   Until now, Hadoop tables required atomic rename from the underlying FS; now they can also be configured to use atomic write.
   When using atomic write, the final metadata file is written (atomically) in place, instead of being written to a temp file and atomically renamed (the temp-file-plus-rename path is still used when atomic rename, which remains the default, is in effect).
   
   To enable atomic write (assuming this guarantee is indeed provided by the FS), a new config param is added:
   `iceberg.engine.hadoop.<SCHEME>.atomic.write` (defaults to `false`), where `SCHEME` is the FS scheme (e.g., `file`, `hdfs`, `cos`).
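   For example, assuming a COS file system that does provide atomic write, the property could be set in the Hadoop configuration. The snippet below is illustrative (property name per this PR; placement in `core-site.xml` is one common way to supply Hadoop configuration):
   
   ```xml
   <!-- core-site.xml: opt the cos:// scheme into atomic write commits -->
   <property>
     <name>iceberg.engine.hadoop.cos.atomic.write</name>
     <value>true</value>
   </property>
   ```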
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #1871: Add Atomic Write support for Hadoop Tables

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1871:
URL: https://github.com/apache/iceberg/pull/1871#discussion_r535533762



##########
File path: site/docs/configuration.md
##########
@@ -96,6 +96,13 @@ The following properties from the Hadoop configuration are used by the Hive Meta
 | iceberg.hive.client-pool-size      | 5                | The size of the Hive client pool when tracking tables in HMS  |
 | iceberg.hive.lock-timeout-ms       | 180000 (3 min)   | Maximum time in milliseconds to acquire a lock                |
 
+The following properties from the Hadoop configuration are used by Hadoop Tables
+
+| Property                                      | Default  | Description                                                            |
+| --------------------------------------------- | -------- | ---------------------------------------------------------------------- |
+| iceberg.engine.hadoop.`SCHEME`.atomic.write   | false    | Controls whether atomic write will be used instead of atomic rename. `SCHEME` is the scheme used, e.g. `{hdfs, cos}`    |

Review comment:
       I'm not sure that this is a good way to configure this feature. We avoid using Hadoop `Configuration` because we don't want to be dependent on it for configuration other than when we use Hadoop components.
   
   Also, I don't think that users should primarily need to add these configurations. If we know that a FS has some feature, then we should default to using that feature for the FS. A good example is locality in HDFS. We detect HDFS URIs and get locality for that file system, but skip it for S3.
   
   I would expect this to do the same, where the `HadoopTableOperations` implementation detects certain file systems and chooses the right atomic operation for them, based on the table location. We can also add configuration to set that list of file systems that is passed in when the `HadoopTableOperations` is created.
   
   Configuring the `TableOperations` is primarily how we want customization to happen, because we have good ways of injecting your own table operations or reusing existing ones.
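
   To make the detection idea concrete, here is a rough sketch of choosing the commit strategy from the table location's scheme rather than from a per-scheme `Configuration` flag. This is illustrative only: the class name, the returned strategy strings, and the scheme sets are hypothetical, not the actual Iceberg API.

   ```java
   import java.net.URI;
   import java.util.Set;

   // Hypothetical sketch: pick the atomic commit operation from the table
   // location's URI scheme, defaulting to rename-based commits (the current
   // behavior) for any scheme not known to support atomic write.
   public class CommitStrategyDetector {
     // Schemes assumed here, for illustration, to offer atomic single-file writes.
     private static final Set<String> ATOMIC_WRITE_SCHEMES = Set.of("cos");

     public static String strategyFor(String tableLocation) {
       // A location like "hdfs://nn/warehouse/db/t" yields scheme "hdfs";
       // a bare path like "/warehouse/db/t" yields null.
       String scheme = URI.create(tableLocation).getScheme();
       if (scheme != null && ATOMIC_WRITE_SCHEMES.contains(scheme)) {
         return "atomic-write";
       }
       return "atomic-rename";
     }
   }
   ```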






[GitHub] [iceberg] gallushi commented on a change in pull request #1871: Add Atomic Write support for Hadoop Tables

Posted by GitBox <gi...@apache.org>.
gallushi commented on a change in pull request #1871:
URL: https://github.com/apache/iceberg/pull/1871#discussion_r693815318



##########
File path: site/docs/configuration.md
##########
@@ -96,6 +96,13 @@ The following properties from the Hadoop configuration are used by the Hive Meta
 | iceberg.hive.client-pool-size      | 5                | The size of the Hive client pool when tracking tables in HMS  |
 | iceberg.hive.lock-timeout-ms       | 180000 (3 min)   | Maximum time in milliseconds to acquire a lock                |
 
+The following properties from the Hadoop configuration are used by Hadoop Tables
+
+| Property                                      | Default  | Description                                                            |
+| --------------------------------------------- | -------- | ---------------------------------------------------------------------- |
+| iceberg.engine.hadoop.`SCHEME`.atomic.write   | false    | Controls whether atomic write will be used instead of atomic rename. `SCHEME` is the scheme used, e.g. `{hdfs, cos}`    |

Review comment:
       @rdblue sorry for getting back to this just now.
   > I'm not sure that this is a good way to configure this feature. We avoid using Hadoop Configuration because we don't want to be dependent on it for configuration other than when we use Hadoop components.
   
   > Also, I don't think that users should primarily need to add these configurations. If we know that a FS has some feature, then we should default to using that feature for the FS. A good example is locality in HDFS. We detect HDFS URIs and get locality for that file system, but skip it for S3.
   
   I see your point. The only issue I have in mind arises from the fact that schemes are actually configurable: one can, for example, use Stocator with IBM COS and the scheme would be `s3` (or any other arbitrary scheme). So my thought here is to have a list of file systems (more precisely, schemes) that we know support atomic write (all others implicitly use atomic rename), and if someone wishes to override this, they can add a config to the Hadoop conf (which is used only for a Hadoop component, the FS). What do you think?
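
   The known-schemes-plus-override idea described above could look roughly like this. It is only a sketch: the class name is hypothetical, and a plain `Map` stands in for the Hadoop `Configuration` to keep the example self-contained.

   ```java
   import java.util.Map;
   import java.util.Set;

   // Hypothetical sketch: a built-in list of schemes known to support atomic
   // write, with an explicit per-scheme user override taking precedence. The
   // override covers cases like Stocator mapping an arbitrary scheme (e.g.
   // s3) onto IBM COS.
   public class AtomicWriteSchemes {
     // Assumed default list; all other schemes fall back to atomic rename.
     private static final Set<String> KNOWN_ATOMIC_WRITE = Set.of("cos");

     public static boolean useAtomicWrite(String scheme, Map<String, String> conf) {
       String override = conf.get("iceberg.engine.hadoop." + scheme + ".atomic.write");
       if (override != null) {
         // An explicit user setting wins over the built-in list.
         return Boolean.parseBoolean(override);
       }
       return KNOWN_ATOMIC_WRITE.contains(scheme);
     }
   }
   ```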






[GitHub] [iceberg] gallushi commented on a change in pull request #1871: Add Atomic Write support for Hadoop Tables

Posted by GitBox <gi...@apache.org>.
gallushi commented on a change in pull request #1871:
URL: https://github.com/apache/iceberg/pull/1871#discussion_r538459629



##########
File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java
##########
@@ -161,8 +163,24 @@ public void commit(TableMetadata base, TableMetadata metadata) {
           "Failed to check if next version exists: %s", finalMetadataFile);
     }
 
-    // this rename operation is the atomic commit operation
-    renameToFinal(fs, tempMetadataFile, finalMetadataFile);
+    if (useAtomicWrite) {
+      // using atomic write
+      try {
+        TableMetadataParser.write(metadata, io().newOutputFile(finalMetadataFile.toString()));
+      } catch (RuntimeIOException e) {
+        CommitFailedException cfe = new CommitFailedException(
+                "Failed to commit changes using atomic write: %s", finalMetadataFile.toString());

Review comment:
       > How do you know that the `RuntimeIOException` indicates a write conflict?
   
   we can't really know that.
   if the write fails due to a conflict, an `IOException` will be thrown in `FSDataOutputStream.close(...)`, it will be caught in `TableMetadataParser.internalWrite(...)` and a `RuntimeIOException` will be thrown.
   it will be tricky to identify that this is due to a conflict, since:
   1) other problems might have caused the `IOException` to be thrown.
   2) the exception text will vary based on underlying FS implementation (for example, for IBM COS if using Stocator as the connector, the exception message will look something like
    `Caused by: java.io.IOException: saving output gal/test/v1.metadata.json com.amazonaws.services.s3.model.AmazonS3Exception: At least one of the preconditions you specified did not hold. (Service: Amazon S3; Status Code: 412; Error Code: PreconditionFailed; Request ID.....`)
   
   > Should we support a conflict exception in the `FileIO` that can be thrown instead?
   
   the exception might be thrown by the output stream (e.g., during `close(...)`) - so IMHO changing the `FileIO` or even the `OutputFile` won't be enough to allow identifying conflicts.
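
   To illustrate why cause-chain inspection is fragile here, consider this minimal sketch. All names are illustrative, and `UncheckedIOException` stands in for Iceberg's `RuntimeIOException`; the simulated message mimics an object-store precondition failure.

   ```java
   import java.io.IOException;
   import java.io.UncheckedIOException;

   // Sketch: by the time the commit path sees the failure, the FS-specific
   // IOException from close() has been wrapped, and its message text (which
   // differs per connector) is the only hint that a conflict occurred.
   public class ConflictDetection {
     // Simulated close() failing the way an object-store connector might.
     static void closeStream(boolean conflict) throws IOException {
       if (conflict) {
         throw new IOException(
             "saving output ... Status Code: 412; Error Code: PreconditionFailed");
       }
     }

     // Mirrors the write path wrapping the IOException; returns the wrapped
     // exception (or null on success) instead of throwing, for clarity.
     public static RuntimeException commitAttempt(boolean conflict) {
       try {
         closeStream(conflict);
         return null;
       } catch (IOException e) {
         return new UncheckedIOException("Failed to write metadata file", e);
       }
     }

     // Fragile by design: message matching only works for connectors that
     // happen to surface this exact text, and other failures also arrive as
     // IOException, as the comment above notes.
     public static boolean looksLikeConflict(RuntimeException e) {
       Throwable cause = e.getCause();
       return cause instanceof IOException
           && cause.getMessage() != null
           && cause.getMessage().contains("PreconditionFailed");
     }
   }
   ```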
   
   
   
   
   
   
   
   






[GitHub] [iceberg] rdblue commented on a change in pull request #1871: Add Atomic Write support for Hadoop Tables

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #1871:
URL: https://github.com/apache/iceberg/pull/1871#discussion_r535431178



##########
File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java
##########
@@ -161,8 +163,24 @@ public void commit(TableMetadata base, TableMetadata metadata) {
           "Failed to check if next version exists: %s", finalMetadataFile);
     }
 
-    // this rename operation is the atomic commit operation
-    renameToFinal(fs, tempMetadataFile, finalMetadataFile);
+    if (useAtomicWrite) {
+      // using atomic write
+      try {
+        TableMetadataParser.write(metadata, io().newOutputFile(finalMetadataFile.toString()));
+      } catch (RuntimeIOException e) {
+        CommitFailedException cfe = new CommitFailedException(
+                "Failed to commit changes using atomic write: %s", finalMetadataFile.toString());

Review comment:
       How do you know that the `RuntimeIOException` indicates a write conflict? Should we support a conflict exception in the `FileIO` that can be thrown instead?






[GitHub] [iceberg] gallushi commented on pull request #1871: Add Atomic Write support for Hadoop Tables

Posted by GitBox <gi...@apache.org>.
gallushi commented on pull request #1871:
URL: https://github.com/apache/iceberg/pull/1871#issuecomment-738142747


   Hi @rdblue, following our discussion in #1655, I'd appreciate your review, thanks!

