You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/08/12 07:20:58 UTC

[GitHub] [iceberg] skambha opened a new pull request #1325: Fix the schemachanges update to add a required column based on its nullability property

skambha opened a new pull request #1325:
URL: https://github.com/apache/iceberg/pull/1325


   Fixes https://github.com/apache/iceberg/issues/1324
   
   - Changes to add a required column or a optional column based on nullability
   - Added a new unit test to add a required column
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] skambha commented on a change in pull request #1325: Fix the schemachanges update to add a required column based on its nullability property

Posted by GitBox <gi...@apache.org>.

skambha commented on a change in pull request #1325:
URL: https://github.com/apache/iceberg/pull/1325#discussion_r470348526



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -178,8 +178,12 @@ private static void apply(UpdateSchema pendingUpdate, TableChange.UpdateColumnPo
 
   private static void apply(UpdateSchema pendingUpdate, TableChange.AddColumn add) {
     Type type = SparkSchemaUtil.convert(add.dataType());
-    pendingUpdate.addColumn(parentName(add.fieldNames()), leafName(add.fieldNames()), type, add.comment());
-
+    if (add.isNullable()) {
+      pendingUpdate.addColumn(parentName(add.fieldNames()), leafName(add.fieldNames()), type, add.comment());
+    } else {
+      pendingUpdate.allowIncompatibleChanges()

Review comment:
       Thanks for the review, @rdblue.  Sure, that makes sense. 
   
   I will update it to throw an error and will use the same error message as SchemaUpdate.addRequiredColumn would have returned without the `allowIncompatibleChanges()`. 
    ie ` Incompatible change: cannot add required column: %s`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on pull request #1325: Throw error when adding non-null columns using the alter table ddl

Posted by GitBox <gi...@apache.org>.

rdblue commented on pull request #1325:
URL: https://github.com/apache/iceberg/pull/1325#issuecomment-674155563


   Looks great. Thank you for fixing this, @skambha!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on pull request #1325: Fix the schemachanges update to add a required column based on its nullability property

Posted by GitBox <gi...@apache.org>.

rdblue commented on pull request #1325:
URL: https://github.com/apache/iceberg/pull/1325#issuecomment-673034138


   Thanks for working on this, @skambha! I think adding the column as nullable is definitely a bug, but I'm not sure that we can add non-null columns. I think we should probably fail the ALTER TABLE command.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue merged pull request #1325: Throw error when adding non-null columns using the alter table ddl

Posted by GitBox <gi...@apache.org>.

rdblue merged pull request #1325:
URL: https://github.com/apache/iceberg/pull/1325


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on a change in pull request #1325: Fix the schemachanges update to add a required column based on its nullability property

Posted by GitBox <gi...@apache.org>.

rdblue commented on a change in pull request #1325:
URL: https://github.com/apache/iceberg/pull/1325#discussion_r469452497



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -178,8 +178,12 @@ private static void apply(UpdateSchema pendingUpdate, TableChange.UpdateColumnPo
 
   private static void apply(UpdateSchema pendingUpdate, TableChange.AddColumn add) {
     Type type = SparkSchemaUtil.convert(add.dataType());
-    pendingUpdate.addColumn(parentName(add.fieldNames()), leafName(add.fieldNames()), type, add.comment());
-
+    if (add.isNullable()) {
+      pendingUpdate.addColumn(parentName(add.fieldNames()), leafName(add.fieldNames()), type, add.comment());
+    } else {
+      pendingUpdate.allowIncompatibleChanges()

Review comment:
       I'm not sure that this should call `allowIncompatibleChanges()` because adding a required column when there are no existing values will break reading the new column in any table with data in it. The only time it is safe to add a required column is if there is no data in the table.
   
   What about throwing an exception here instead? I agree that the column should not be optional if NOT NULL was specified.
   
   Another alternative is to check whether the table has data and allow the incompatible change if it doesn't have any rows.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org