Posted to issues@iceberg.apache.org by "RussellSpitzer (via GitHub)" <gi...@apache.org> on 2023/05/22 14:52:42 UTC

[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4325: Spark: Skip corrupt files in Spark Procedures and Actions

RussellSpitzer commented on code in PR #4325:
URL: https://github.com/apache/iceberg/pull/4325#discussion_r1200630013


##########
docs/spark-procedures.md:
##########
@@ -432,11 +432,12 @@ By default, the original table is retained with the name `table_BACKUP_`.
 
 #### Usage
 
-| Argument Name | Required? | Type | Description |
-|---------------|-----------|------|-------------|
-| `table`       | ✔️  | string | Name of the table to migrate |
-| `properties`  | ️   | map<string, string> | Properties for the new Iceberg table |
-| `drop_backup` |   | boolean | When true, the original table will not be retained as backup (defaults to false) |
+| Argument Name     | Required? | Type                | Description                                                                        |
+|-------------------|-----------|---------------------|------------------------------------------------------------------------------------|
+| `table`           | ✔️        | string              | Name of the table to migrate                                                       |
+| `properties`      | ️         | map<string, string> | Properties for the new Iceberg table                                               |
+| `drop_backup`     | ️         | boolean             | When true, the original table will not be retained as backup (defaults to false)   |
+| `skip_on_error`   | ️         | boolean             | If true, skip files which cannot be imported into Iceberg (false by default)       |

Review Comment:
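   Line 439 says "defaults to false". I would pick just one wording, and "defaults to ..." seems to be the more common one in this doc, so this row could read "(defaults to false)" instead of "(false by default)".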



##########
docs/spark-procedures.md:
##########
@@ -475,6 +476,7 @@ will then treat these files as if they are part of the set of files  owned by Ic
 | `source_table`          | ✔️        | string              | Table where files should come from, paths are also possible in the form of \`file_format\`.\`path\` |
 | `partition_filter`      | ️         | map<string, string> | A map of partitions in the source table to import from                                              |
 | `check_duplicate_files` | ️         | boolean             | Whether to prevent files existing in the table from being added (defaults to true)                  |
+| `skip_on_error`         | ️         | boolean             | If true, skip files which cannot be imported into Iceberg (false by default)                        |

Review Comment:
   Same comment as above: prefer the "defaults to xxx" wording here too.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

