You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "stevenzwu (via GitHub)" <gi...@apache.org> on 2023/05/03 20:14:36 UTC

[GitHub] [iceberg] stevenzwu opened a new pull request, #7517: API: StructProjection returns null projection object for null nested struct value

stevenzwu opened a new pull request, #7517:
URL: https://github.com/apache/iceberg/pull/7517

   Adjusted the same behavior in Flink RowDataProjection.
   
   closes issue #7507 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1189229609


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   I agree with this change. The only issue is what happens when struct is null. That could still happen if `wrap` is never called.
   
   I think I would favor the original null check (rather than the precondition) and the new behavior to return null instead of a `StructProjection` that wraps null. That way if somehow a null slips in, we at least just produce null. That seems to fit the intent as well, which is to return `null` when there is no value.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1186480410


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   > This API is widely used and it does not seem like a totally weird use case to wrap top-level nulls.
   
   that is a fair point. will bring back the null check. but will 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1188874499


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -174,6 +174,7 @@ private StructProjection(StructType structType, StructType projection, boolean a
   }
 
   public StructProjection wrap(StructLike newStruct) {
+    Preconditions.checkState(newStruct != null, "Invalid root struct for wrapping: null");

Review Comment:
   question: Can this cause any performance degradation? Can we check places where this is invoked?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "rdblue (via GitHub)" <gi...@apache.org>.
rdblue commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1189230061


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -174,6 +174,7 @@ private StructProjection(StructType structType, StructType projection, boolean a
   }
 
   public StructProjection wrap(StructLike newStruct) {
+    Preconditions.checkState(newStruct != null, "Invalid root struct for wrapping: null");

Review Comment:
   The `struct` field is defaulted to `null`. Plus, a check here would create a failure rather than gracefully handling a null struct. If the struct is somehow null, then the projection should be as well right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1188912538


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -174,6 +174,7 @@ private StructProjection(StructType structType, StructType projection, boolean a
   }
 
   public StructProjection wrap(StructLike newStruct) {
+    Preconditions.checkState(newStruct != null, "Invalid root struct for wrapping: null");

Review Comment:
   I feel the removal of if-check and adding Preconditions check should go hand-to-hand. Either we do both or don't change it all. Otherwise, we can get NPE for wrapping root null object.
   
   One Preconditions check shouldn't cause performance degradation. Overall, we removed one if check and added a new check in Preconditions.
   
   Here are a few usages on the `StructProjection#wrap(StructLike)` 
   * core: transform on `CloseableIterable<StructLike>` for manifest or file entries
   * data: delete filter on record
   * flink: always wrap non-null `RowDataWrapper` object for EqualityFieldKeySelector
   * spark: always wrap non-null `InternalRowWrapper` object for wrapping partition data
   
   @RussellSpitzer @rdblue do you see any potential null root values for the core and data module usage of wrap.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1185471395


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   Even if we need to bring back line 192-196, it may still be good to return null projection 
   ```
   Struct(1, null)
   ->
   before this change: StructProjection(1, StructProjection(null))
   after this change:    StructProjection(1, null)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#issuecomment-1545137699

   I am going to merge this PR. the only change on handling the nested null struct is everyone have agreed upon. We have kept the old behavior regarding the null root struct and null check.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1185471395


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   Even if we need to bring back line 192-196, it may still be good to return null projection for nested null struct
   ```
   Struct(1, null)
   ->
   before this change: StructProjection(1, StructProjection(null))
   after this change:    StructProjection(1, null)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1189241221


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -174,6 +174,7 @@ private StructProjection(StructType structType, StructType projection, boolean a
   }
 
   public StructProjection wrap(StructLike newStruct) {
+    Preconditions.checkState(newStruct != null, "Invalid root struct for wrapping: null");

Review Comment:
   @rdblue you are saying wrapping a null root struct should be allowed (not throwing exception), right?
   
   ```
   StructProjection projection = StructProjection.create(schema, projectedSchema);
   projection.wrap(null); // allowed
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#issuecomment-1540544835

   > We probably want to double check places that rely on this class and see if the new precondition may cause any performance degradation in a critical path.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1186593846


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   @RussellSpitzer @aokolnychyi I have a second thought. The behaviors of wrapping null root value seems a little odd. Note that `StructProjection` is also a `StructLike`.
   
   ```
   struct = null
   projection = StructProjection(null)
   ```
   
   `struct == null` would be true. but `projection == null` would be false. `StructProjection(null)` behaves more like an empty struct.
   
   also note that line 192-196 were originally added to deal with wrapping of nested null struct (issue #2738 and PR #3240). And the wrapping of nested null struct is avoided in this PR.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #7517: API: StructProjection returns null projection object for null nested struct value

Posted by "RussellSpitzer (via GitHub)" <gi...@apache.org>.
RussellSpitzer commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1185449600


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   is there a potential npe here if "struct" is null?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1186593846


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   @RussellSpitzer @aokolnychyi I have a second thought. The behaviors of wrapping null root value seems a little odd. Note that `StructProjection` is also a `StructLike`.
   
   ```
   struct = null
   projection = StructProjection(null)
   ```
   
   `struct == null` would be true. but `projection == null` would be false. `StructProjection(null)` behaves more like an empty struct.
   
   also note that line 192-196 were originally added to deal with nested null struct (issue #2738 and PR #), which is avoided in the PR #3240.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1186593846


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   @RussellSpitzer @aokolnychyi I have a second thought. The behaviors of wrapping null root value seems a little odd. Note that `StructProjection` is also a `StructLike`.
   
   ```
   struct = null
   projection = StructProjection(null)
   ```
   
   `struct == null` would be true. but `projection == null` would be false. `StructProjection(null)` behaves more like an empty struct.
   
    line 192-196 were originally added to deal with wrapping nested null struct (issue #2738 and PR #3240). the wrapping of nested null struct is avoided in this PR.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1189241221


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -174,6 +174,7 @@ private StructProjection(StructType structType, StructType projection, boolean a
   }
 
   public StructProjection wrap(StructLike newStruct) {
+    Preconditions.checkState(newStruct != null, "Invalid root struct for wrapping: null");

Review Comment:
   ~@rdblue you are saying wrapping a null root struct should be allowed (not throwing exception), right?~
   
   ```
   StructProjection projection = StructProjection.create(schema, projectedSchema);
   projection.wrap(null); // allowed
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1185469804


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   good question. this change avoided the wrapping null at nested level, [which is the main reason we added the lines from 192-196](https://github.com/apache/iceberg/issues/2738) that are removed in this PR. 
   
   I guess it can potentially happen if we use `StructProjection` to wrap a null `struct` at the root level. If it is a valid scenario, we need to bring back line 192-196. If it is not a valid scenario, we should add `Preconditions` check in the wrap method.
   ```
     public StructProjection wrap(StructLike newStruct) {
       this.struct = newStruct;
       return this;
     }
   ```
   
   My take is that StructProjection shouldn't wrap a null root value.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1186429003


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   I agree about the behavior to return nested nulls without wrapping but I am not sure it is a good idea to remove lines 192-196. This API is widely used and it does not seem like a totally weird use case to wrap top-level nulls. Otherwise, every caller would have to check if the passed value is null. If someone has to do that check, better hide it inside this class.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1188912538


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -174,6 +174,7 @@ private StructProjection(StructType structType, StructType projection, boolean a
   }
 
   public StructProjection wrap(StructLike newStruct) {
+    Preconditions.checkState(newStruct != null, "Invalid root struct for wrapping: null");

Review Comment:
   I feel the removal of if-check and adding Preconditions check should go hand-to-hand. Either we do both or don't change it all. Otherwise, we can get NPE for wrapping root null object.
   
   One Preconditions check shouldn't cause performance degradation. Also, we removed one if check and added a new check in Preconditions. So it is net neutral in the end too.
   
   Here are the usages on the `StructProjection#wrap(StructLike)` inside Iceberg repo
   * core: transform on `CloseableIterable<StructLike>` for manifest or file entries
   * data: delete filter on record
   * flink: always wrap non-null `RowDataWrapper` object for EqualityFieldKeySelector
   * spark: always wrap non-null `InternalRowWrapper` object for wrapping partition data
   
   @RussellSpitzer @rdblue do you see any potential null root values for the core and data module usage of wrap.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu merged pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu merged PR #7517:
URL: https://github.com/apache/iceberg/pull/7517


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] chenjunjiedada commented on pull request #7517: API: StructProjection returns null projection object for null nested struct value

Posted by "chenjunjiedada (via GitHub)" <gi...@apache.org>.
chenjunjiedada commented on PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#issuecomment-1534901852

   LGTM, Thanks @stevenzwu for digging and fixing!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1185469804


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   good question. this change avoided the wrapping null at nested level, [which is the main reason we added the lines from 192-196](https://github.com/apache/iceberg/issues/2738) that are removed in this PR. 
   
   I guess it can potentially happen if we use `StructProjection` to wrap a null `struct` at the root level. If it is a valid scenario, we need to bring back line 192-196. If it is not a valid scenario, we should add `Preconditions` check in the wrap method.
   ```
     public StructProjection wrap(StructLike newStruct) {
       this.struct = newStruct;
       return this;
     }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1187755399


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   After a closer look, you are probably right. Let me check usages but seems reasonable to remove that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1187984395


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   I don't see any blockers. Let's remove those lines and get this in.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "aokolnychyi (via GitHub)" <gi...@apache.org>.
aokolnychyi commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1186431977


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);
+      return nestedStruct == null

Review Comment:
   minor: ternary operators split on multiple lines are bit hard to read, in my view. I'd consider using `if`.



##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);
+      return nestedStruct == null

Review Comment:
   minor: Ternary operators split on multiple lines are bit hard to read, in my view. I'd consider using `if`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1186480410


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   > This API is widely used and it does not seem like a totally weird use case to wrap top-level nulls.
   
   that is a fair point. will bring back the null check. will update comments to reflect the latest state.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on pull request #7517: API: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#issuecomment-1533816571

   @rdblue @aokolnychyi @RussellSpitzer @Reo-LEI can you please take a look? I think it is a bit cleaner. not much benefit to have a projection object wrapping a null nested struct value.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1188912538


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -174,6 +174,7 @@ private StructProjection(StructType structType, StructType projection, boolean a
   }
 
   public StructProjection wrap(StructLike newStruct) {
+    Preconditions.checkState(newStruct != null, "Invalid root struct for wrapping: null");

Review Comment:
   I feel the removal of if-check and adding Preconditions check should go hand-to-hand. Either we do both or don't change it all. Otherwise, we can get NPE for wrapping root null object.
   
   One Preconditions check shouldn't cause performance degradation. Also, we removed one if check and added a new check in Preconditions. So it is net neutral in the end too.
   
   Here are a few usages on the `StructProjection#wrap(StructLike)` 
   * core: transform on `CloseableIterable<StructLike>` for manifest or file entries
   * data: delete filter on record
   * flink: always wrap non-null `RowDataWrapper` object for EqualityFieldKeySelector
   * spark: always wrap non-null `InternalRowWrapper` object for wrapping partition data
   
   @RussellSpitzer @rdblue do you see any potential null root values for the core and data module usage of wrap.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7517: API, Flink: StructProjection returns null projection object for null nested struct value

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7517:
URL: https://github.com/apache/iceberg/pull/7517#discussion_r1189282347


##########
api/src/main/java/org/apache/iceberg/util/StructProjection.java:
##########
@@ -189,16 +189,12 @@ public int size() {
 
   @Override
   public <T> T get(int pos, Class<T> javaClass) {
-    if (struct == null) {
-      // Return a null struct when projecting a nested required field from an optional struct.
-      // See more details in issue #2738.
-      return null;
-    }
-
     int structPos = positionMap[pos];
-
     if (nestedProjections[pos] != null) {
-      return javaClass.cast(nestedProjections[pos].wrap(struct.get(structPos, StructLike.class)));
+      StructLike nestedStruct = struct.get(structPos, StructLike.class);

Review Comment:
   > The only issue is what happens when struct is null. That could still happen if wrap is never called.
   
   This is a good point. will revert to the original null check then.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org