Posted to issues@iceberg.apache.org by "dramaticlly (via GitHub)" <gi...@apache.org> on 2023/02/20 22:06:27 UTC

[GitHub] [iceberg] dramaticlly commented on a diff in pull request #6849: Docs: Update documentation for Spark AddFiles procedure

dramaticlly commented on code in PR #6849:
URL: https://github.com/apache/iceberg/pull/6849#discussion_r1112356623


##########
docs/spark-procedures.md:
##########
@@ -462,16 +462,29 @@ will then treat these files as if they are part of the set of files  owned by Ic
 
 #### Usage
 
-| Argument Name | Required? | Type | Description |
-|---------------|-----------|------|-------------|
-| `table`       | ✔️  | string | Table which will have files added to|
-| `source_table`| ✔️  | string | Table where files should come from, paths are also possible in the form of \`file_format\`.\`path\` |
-| `partition_filter`  | ️   | map<string, string> | A map of partitions in the source table to import from |
+| Argument Name | Required? | Type | Description                                                                                                                                            |
+|---------------|-----------|------|--------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `table`       | ✔️  | string | Table which will have files added to                                                                                                                   |
+| `source_table`| ✔️  | string | Table where files should come from, paths are also possible in the form of \`file_format\`.\`path\`                                                    |
+| `partition_filter`  | ️   | map<string, string> | A map of partitions in the source table to import from                                                                                                 |
+| `check_duplicate_files`  | ️   | boolean | When true, will throw a exception if files added will result in duplicate (on by default, it's checking against files path in entries metadata table). |
 
 Warning : Schema is not validated, adding files with different schema to the Iceberg table will cause issues.
 
 Warning : Files added by this method can be physically deleted by Iceberg operations
 
+Warning : SQL delete followed by this add_files procedure immediately might not add files back due to deleted files are still tracked in manifest entry with status = 2: DELETED

Review Comment:
   Thank you @szehon-ho. I opened #6889 to fix the problem directly instead of calling it out as a warning here, so I will drop this line.
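
For reference, here is a minimal sketch of how the arguments documented in the diff above might be passed to the `add_files` procedure. The helper `build_add_files_call`, the catalog name `spark_catalog`, and the table names `db.tbl` and `db.src_tbl` are assumptions made for illustration and do not come from this PR. The helper only builds the SQL text, so it runs without a Spark session.

```python
from typing import Dict, Optional


def build_add_files_call(
    catalog: str,
    table: str,
    source_table: str,
    partition_filter: Optional[Dict[str, str]] = None,
    check_duplicate_files: Optional[bool] = None,
) -> str:
    """Build a Spark SQL CALL statement for add_files (hypothetical helper)."""

    def quote(value: str) -> str:
        # No escaping is attempted; reject values that contain a single quote.
        if "'" in value:
            raise ValueError(f"unsupported character in {value!r}")
        return f"'{value}'"

    # The two required arguments from the documentation table.
    args = [
        f"table => {quote(table)}",
        f"source_table => {quote(source_table)}",
    ]
    # Optional: restrict the import to matching partitions of the source table.
    if partition_filter:
        pairs = ", ".join(
            f"{quote(key)}, {quote(val)}" for key, val in partition_filter.items()
        )
        args.append(f"partition_filter => map({pairs})")
    # Optional: leave unset to rely on the documented default (on).
    if check_duplicate_files is not None:
        flag = "true" if check_duplicate_files else "false"
        args.append(f"check_duplicate_files => {flag}")
    return f"CALL {catalog}.system.add_files({', '.join(args)})"


if __name__ == "__main__":
    print(
        build_add_files_call(
            "spark_catalog",
            "db.tbl",
            "db.src_tbl",
            partition_filter={"part_col_1": "A"},
            check_duplicate_files=True,
        )
    )
```

The resulting string could then be handed to a Spark session, for example via `spark.sql(...)`, in an environment where an Iceberg catalog is configured.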



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


For additional commands, e-mail: issues-help@iceberg.apache.org