You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "aokolnychyi (via GitHub)" <gi...@apache.org> on 2023/03/22 23:56:48 UTC

[GitHub] [iceberg] aokolnychyi opened a new pull request, #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

aokolnychyi opened a new pull request, #7177:
URL: https://github.com/apache/iceberg/pull/7177

   This PR migrates `AncestorsOfProcedure` to use `ProcedureInput`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] flyrain commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "flyrain (via GitHub)" <gi...@apache.org>.
flyrain commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1148460180


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public Boolean asBoolean(ProcedureParameter param, Boolean defaultValue) {
     validateParamType(param, DataTypes.BooleanType);
     int ordinal = ordinal(param);
-    return args.isNullAt(ordinal) ? defaultValue : args.getBoolean(ordinal);
+    return args.isNullAt(ordinal) ? defaultValue : (Boolean) args.getBoolean(ordinal);

Review Comment:
   Looks like the literal `null` is special. It doesn't unbox. +1 for your solution. 
   ```
   Boolean bv = null;
   Boolean b = true? bv: true;  //throw NPE
   Boolean b = true? bv: (Boolean)true;  // not throw NPE
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#issuecomment-1489170083

   Thanks for reviewing, @flyrain @amogh-jahagirdar!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1145535106


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public boolean asBoolean(ProcedureParameter param, boolean defaultValue) {

Review Comment:
   I overlooked initially that we won't be able to have reasonable naming for retrieving longs (`long` is reserved). I switched to `asXXX` like in `PropertyUtil` and renamed all existing methods.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1145535106


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public boolean asBoolean(ProcedureParameter param, boolean defaultValue) {

Review Comment:
   I overlooked initially that we won't be able to have reasonable naming for retrieving longs (`long` is reserved). I switched to `asXXX` like in `PropertyUtil` and renamed all existing methods.



##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public Boolean asBoolean(ProcedureParameter param, Boolean defaultValue) {

Review Comment:
   I overlooked initially that we won't be able to have reasonable naming for retrieving longs (long is reserved). I switched to asXXX like in PropertyUtil and renamed all existing methods.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#issuecomment-1480402640

   cc @szehon-ho @amogh-jahagirdar @flyrain @singhpk234 @ajantha-bhat


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1145612841


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public Boolean asBoolean(ProcedureParameter param, Boolean defaultValue) {

Review Comment:
   I overlooked initially that we won't be able to have reasonable naming for retrieving longs (long is reserved). I switched to `asXXX` like in `PropertyUtil` and renamed all existing methods.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1151408416


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -153,7 +166,7 @@ private <K, V> Map<K, V> map(
     return convertedMap;
   }
 
-  public Identifier ident(ProcedureParameter param) {
+  public Identifier asIdent(ProcedureParameter param) {

Review Comment:
   That makes sense, will reduce the amount of changes too. I'll update.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "amogh-jahagirdar (via GitHub)" <gi...@apache.org>.
amogh-jahagirdar commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1146572529


##########
spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestAncestorsOfProcedure.java:
##########
@@ -156,7 +156,7 @@ public void testInvalidAncestorOfCases() {
     AssertHelpers.assertThrows(
         "Should reject calls with empty table identifier",
         IllegalArgumentException.class,
-        "Cannot handle an empty identifier for argument table",
+        "Cannot handle an empty identifier for parameter 'table'",

Review Comment:
   Does the exception message need to change? 



##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -153,7 +166,7 @@ private <K, V> Map<K, V> map(
     return convertedMap;
   }
 
-  public Identifier ident(ProcedureParameter param) {
+  public Identifier asIdent(ProcedureParameter param) {

Review Comment:
   I'm not sure we actually want to use `asIdent` here, I think it makes sense for obtaining the other types but ident by  itself should be sufficient



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1151430707


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -153,7 +166,7 @@ private <K, V> Map<K, V> map(
     return convertedMap;
   }
 
-  public Identifier ident(ProcedureParameter param) {
+  public Identifier asIdent(ProcedureParameter param) {

Review Comment:
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] flyrain commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "flyrain (via GitHub)" <gi...@apache.org>.
flyrain commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1147849826


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public Boolean asBoolean(ProcedureParameter param, Boolean defaultValue) {
     validateParamType(param, DataTypes.BooleanType);
     int ordinal = ordinal(param);
-    return args.isNullAt(ordinal) ? defaultValue : args.getBoolean(ordinal);
+    return args.isNullAt(ordinal) ? defaultValue : (Boolean) args.getBoolean(ordinal);

Review Comment:
   I cannot reproduce the NPE locally. Run this script on Java8 and Java11. Neither triggered the NPE. Maybe Java has fixed the issue?
   ```
   jshell> int a = 1
   a ==> 1
   
   jshell> Boolean b = a== 1 ? null: true
   b ==> null
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi merged pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi merged PR #7177:
URL: https://github.com/apache/iceberg/pull/7177


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1151409241


##########
spark/v3.3/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestAncestorsOfProcedure.java:
##########
@@ -156,7 +156,7 @@ public void testInvalidAncestorOfCases() {
     AssertHelpers.assertThrows(
         "Should reject calls with empty table identifier",
         IllegalArgumentException.class,
-        "Cannot handle an empty identifier for argument table",
+        "Cannot handle an empty identifier for parameter 'table'",

Review Comment:
   We reworked error messages in `ProcedureInput` and since this procedure now consumes it, I had to update the test.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1145612369


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public Boolean asBoolean(ProcedureParameter param, Boolean defaultValue) {
     validateParamType(param, DataTypes.BooleanType);
     int ordinal = ordinal(param);
-    return args.isNullAt(ordinal) ? defaultValue : args.getBoolean(ordinal);
+    return args.isNullAt(ordinal) ? defaultValue : (Boolean) args.getBoolean(ordinal);

Review Comment:
   An explicit cast is needed to prevent a NPE, probably same as [here](https://stackoverflow.com/questions/25417438/java-autoboxing-and-ternary-operator-madness).



##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -62,31 +62,43 @@ public boolean isProvided(ProcedureParameter param) {
     return !args.isNullAt(ordinal);
   }
 
-  public boolean bool(ProcedureParameter param, boolean defaultValue) {
+  public Boolean asBoolean(ProcedureParameter param, Boolean defaultValue) {
     validateParamType(param, DataTypes.BooleanType);
     int ordinal = ordinal(param);
-    return args.isNullAt(ordinal) ? defaultValue : args.getBoolean(ordinal);
+    return args.isNullAt(ordinal) ? defaultValue : (Boolean) args.getBoolean(ordinal);

Review Comment:
   An explicit cast is needed to prevent a NPE when `defaultValue` is null, probably same as [here](https://stackoverflow.com/questions/25417438/java-autoboxing-and-ternary-operator-madness).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] amogh-jahagirdar commented on a diff in pull request #7177: Spark 3.3: Use ProcedureInput in AncestorsOfProcedure

Posted by "amogh-jahagirdar (via GitHub)" <gi...@apache.org>.
amogh-jahagirdar commented on code in PR #7177:
URL: https://github.com/apache/iceberg/pull/7177#discussion_r1146559185


##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/ProcedureInput.java:
##########
@@ -153,7 +166,7 @@ private <K, V> Map<K, V> map(
     return convertedMap;
   }
 
-  public Identifier ident(ProcedureParameter param) {
+  public Identifier asIdent(ProcedureParameter param) {

Review Comment:
   I'm not sure we actually want to use `asIdent` here, I think it makes sense for obtaining the other types but ident by  itself should be sufficient



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org