Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/18 15:20:56 UTC

[GitHub] [iceberg] Fokko opened a new pull request, #6220: API: Pass in the types

Fokko opened a new pull request, #6220:
URL: https://github.com/apache/iceberg/pull/6220

   While debugging a regression that we are seeing in Trino, I found that this PR fixes part of it. There appears to be another code path that also creates PartitionSpecs outside of the builder. I'm looking into that now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] Fokko commented on pull request #6220: API: Make the PartitionSpec less lazy

Posted by GitBox <gi...@apache.org>.
Fokko commented on PR #6220:
URL: https://github.com/apache/iceberg/pull/6220#issuecomment-1320264231

   Trino is now passing without changes on their side:  
   ![image](https://user-images.githubusercontent.com/1134248/202756378-40a74538-704a-4ba5-8ed8-c26ff151660f.png)
   




[GitHub] [iceberg] rdblue commented on a diff in pull request #6220: API: Make the PartitionSpec less lazy

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6220:
URL: https://github.com/apache/iceberg/pull/6220#discussion_r1027344805


##########
api/src/main/java/org/apache/iceberg/transforms/ProjectionUtil.java:
##########
@@ -231,7 +231,9 @@ static <S, T> UnboundPredicate<T> truncateArrayStrict(
   static <T> UnboundPredicate<T> projectTransformPredicate(
       Transform<?, T> transform, String partitionName, BoundPredicate<?> pred) {
     if (pred.term() instanceof BoundTransform
-        && transform.equals(((BoundTransform<?, ?>) pred.term()).transform())) {
+        && transform

Review Comment:
   We should be able to roll this back once we deprecate the old `apply` method in favor of the `bind` approach. Then we don't need to keep two classes around for those.





[GitHub] [iceberg] rdblue commented on pull request #6220: API: Make the PartitionSpec less lazy

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #6220:
URL: https://github.com/apache/iceberg/pull/6220#issuecomment-1321229086

   Thanks, @Fokko! Good find.




[GitHub] [iceberg] rdblue commented on a diff in pull request #6220: API: Make the PartitionSpec less lazy

Posted by GitBox <gi...@apache.org>.
rdblue commented on code in PR #6220:
URL: https://github.com/apache/iceberg/pull/6220#discussion_r1027159134


##########
core/src/main/java/org/apache/iceberg/util/SortOrderUtil.java:
##########
@@ -67,16 +66,16 @@ public static SortOrder buildSortOrder(Schema schema, PartitionSpec spec, SortOr
 
     // make a map of the partition fields that need to be included in the clustering produced by the
     // sort order
-    Map<Pair<Transform<?, ?>, Integer>, PartitionField> requiredClusteringFields =
+    Map<Pair<String, Integer>, PartitionField> requiredClusteringFields =
         requiredClusteringFields(spec);
 
     // remove any partition fields that are clustered by the sort order by iterating over a prefix
     // in the sort order.
     // this will stop when a non-partition field is found, or when the sort field only satisfies the
     // partition field.
     for (SortField sortField : sortOrder.fields()) {
-      Pair<Transform<?, ?>, Integer> sourceAndTransform =
-          Pair.of(sortField.transform(), sortField.sourceId());
+      Pair<String, Integer> sourceAndTransform =
+          Pair.of(sortField.transform().dedupName(), sortField.sourceId());

Review Comment:
   I think this needs to be `toString` rather than `dedupName`. The `dedupName` method should not really be public and shouldn't be used for cases like this. We want the actual transform here.
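
   As an illustration of keying the lookup on the transform's string form plus the source id, here is a minimal, self-contained sketch. It does not use Iceberg classes: `Field`, `key`, and `uncovered` are hypothetical names invented for this example, and the loop is a simplified version of the prefix check described in the code comments above (it only stops at the first sort field that is not a partition field).

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ClusteringKeySketch {
  /** Hypothetical field: some transform object applied to a source column id. */
  static final class Field {
    final Object transform;
    final int sourceId;
    final String name;

    Field(Object transform, int sourceId, String name) {
      this.transform = transform;
      this.sourceId = sourceId;
      this.name = name;
    }
  }

  /** Builds a lookup key from the transform's toString() and the source column id. */
  static Map.Entry<String, Integer> key(Object transform, int sourceId) {
    return new SimpleImmutableEntry<>(transform.toString(), sourceId);
  }

  /** Returns the partition fields that are not matched by a prefix of the sort fields. */
  static Map<Map.Entry<String, Integer>, Field> uncovered(
      List<Field> partitionFields, List<Field> sortFields) {
    Map<Map.Entry<String, Integer>, Field> required = new LinkedHashMap<>();
    for (Field field : partitionFields) {
      required.put(key(field.transform, field.sourceId), field);
    }

    for (Field sortField : sortFields) {
      Map.Entry<String, Integer> sortKey = key(sortField.transform, sortField.sourceId);
      if (required.containsKey(sortKey)) {
        required.remove(sortKey);
      } else {
        break; // simplified: stop at the first sort field that is not a partition field
      }
    }

    return required;
  }

  public static void main(String[] args) {
    // Two different classes (StringBuilder and String) whose toString() is "hour";
    // they stand in for two implementations of the same transform.
    Object oneHour = new StringBuilder("hour");
    Object otherHour = "hour";

    List<Field> partitionFields =
        Arrays.asList(new Field(oneHour, 1, "ts_hour"), new Field("identity", 2, "category"));
    List<Field> sortFields = Arrays.asList(new Field(otherHour, 1, "ts"));

    // Only the identity partition field on source id 2 is left uncovered.
    for (Field field : uncovered(partitionFields, sortFields).values()) {
      System.out.println(field.name);
    }
  }
}
```

   Because the key is built from the string form, the two `hour` objects map to the same entry even though they are instances of different classes.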





[GitHub] [iceberg] rdblue merged pull request #6220: API: Make the PartitionSpec less lazy

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #6220:
URL: https://github.com/apache/iceberg/pull/6220




[GitHub] [iceberg] Fokko commented on a diff in pull request #6220: API: Make the PartitionSpec less lazy

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6220:
URL: https://github.com/apache/iceberg/pull/6220#discussion_r1027140078


##########
api/src/main/java/org/apache/iceberg/transforms/ProjectionUtil.java:
##########
@@ -231,7 +231,9 @@ static <S, T> UnboundPredicate<T> truncateArrayStrict(
   static <T> UnboundPredicate<T> projectTransformPredicate(
       Transform<?, T> transform, String partitionName, BoundPredicate<?> pred) {
     if (pred.term() instanceof BoundTransform
-        && transform.equals(((BoundTransform<?, ?>) pred.term()).transform())) {
+        && transform

Review Comment:
   I had to change this to make `testProjectionNames` pass. In the case of the `hour` transform, we would be comparing `transform`, which is an instance of `Timestamps`, with an instance of `Hours`. Both `toString()` methods produce `hour`.
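
   The situation can be reproduced in isolation. The sketch below uses hypothetical stand-in types (this `Transform` interface, `Timestamps` enum, and `Hours` class are simplified inventions for illustration, not the actual Iceberg classes) to show why `equals` fails across two implementations of the same transform while a comparison of the string forms succeeds.

```java
class TransformNameComparison {
  /** Hypothetical stand-in for a transform interface; not Iceberg's actual API. */
  interface Transform {}

  /** Hypothetical older-style implementation: one enum covering several granularities. */
  enum Timestamps implements Transform {
    DAY("day"),
    HOUR("hour");

    private final String name;

    Timestamps(String name) {
      this.name = name;
    }

    @Override
    public String toString() {
      return name;
    }
  }

  /** Hypothetical newer-style implementation: a dedicated class for the hour transform. */
  static final class Hours implements Transform {
    @Override
    public boolean equals(Object other) {
      return other instanceof Hours;
    }

    @Override
    public int hashCode() {
      return Hours.class.hashCode();
    }

    @Override
    public String toString() {
      return "hour";
    }
  }

  /** Compares two transforms by their string form instead of by equals(). */
  static boolean sameByName(Transform left, Transform right) {
    return left.toString().equals(right.toString());
  }

  public static void main(String[] args) {
    Transform legacy = Timestamps.HOUR;
    Transform current = new Hours();

    System.out.println(legacy.equals(current)); // prints "false": different classes
    System.out.println(sameByName(legacy, current)); // prints "true": both render as "hour"
  }
}
```

   Comparing string forms is a workaround for having two classes per transform; it relies on both implementations producing identical `toString()` output.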





[GitHub] [iceberg] Fokko commented on a diff in pull request #6220: API: Make the PartitionSpec less lazy

Posted by GitBox <gi...@apache.org>.
Fokko commented on code in PR #6220:
URL: https://github.com/apache/iceberg/pull/6220#discussion_r1027160417


##########
core/src/main/java/org/apache/iceberg/util/SortOrderUtil.java:
##########
@@ -67,16 +66,16 @@ public static SortOrder buildSortOrder(Schema schema, PartitionSpec spec, SortOr
 
     // make a map of the partition fields that need to be included in the clustering produced by the
     // sort order
-    Map<Pair<Transform<?, ?>, Integer>, PartitionField> requiredClusteringFields =
+    Map<Pair<String, Integer>, PartitionField> requiredClusteringFields =
         requiredClusteringFields(spec);
 
     // remove any partition fields that are clustered by the sort order by iterating over a prefix
     // in the sort order.
     // this will stop when a non-partition field is found, or when the sort field only satisfies the
     // partition field.
     for (SortField sortField : sortOrder.fields()) {
-      Pair<Transform<?, ?>, Integer> sourceAndTransform =
-          Pair.of(sortField.transform(), sortField.sourceId());
+      Pair<String, Integer> sourceAndTransform =
+          Pair.of(sortField.transform().dedupName(), sortField.sourceId());

Review Comment:
   Got it, I've updated the PR! 👍🏻 


