You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/04/06 00:26:05 UTC

[GitHub] [iceberg] flyrain opened a new pull request #2421: Spark: Add table sort-order to reserved table properties

flyrain opened a new pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421


   Unit test is added. The integration test is blocked by https://github.com/apache/iceberg/issues/2382.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814497120


   Thanks for doing this, @flyrain!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814583011


   Thanks for the PR, @flyrain! Thanks for reviewing, @RussellSpitzer!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#discussion_r607440106



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -880,4 +885,59 @@ public static TableIdentifier identifierToTableIdentifier(Identifier identifier)
     Option<String> database = namespace.length == 1 ? Option.apply(namespace[0]) : Option.empty();
     return org.apache.spark.sql.catalyst.TableIdentifier.apply(table, database);
   }
+
+  private static class DescribeSortOrderVisitor implements SortOrderVisitor<String> {
+    private static final DescribeSortOrderVisitor INSTANCE = new DescribeSortOrderVisitor();
+
+    private DescribeSortOrderVisitor() {
+    }
+
+    @Override
+    public String field(String sourceName, int sourceId,
+                        org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("%s %s %s", sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String bucket(String sourceName, int sourceId, int numBuckets,
+                         org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", numBuckets, sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String truncate(String sourceName, int sourceId, int width,
+                           org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", sourceName, width, direction, nullOrder);

Review comment:
       The tests don't cover all these cases, maybe we should add them in?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi edited a comment on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi edited a comment on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814496947


   This looks good apart from the point found by @RussellSpitzer. I'd be great to add a separate suite for all artificial properties we add in `SparkTable` where we could test different expression types but not a blocker for this PR. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#discussion_r608282465



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -880,4 +885,59 @@ public static TableIdentifier identifierToTableIdentifier(Identifier identifier)
     Option<String> database = namespace.length == 1 ? Option.apply(namespace[0]) : Option.empty();
     return org.apache.spark.sql.catalyst.TableIdentifier.apply(table, database);
   }
+
+  private static class DescribeSortOrderVisitor implements SortOrderVisitor<String> {
+    private static final DescribeSortOrderVisitor INSTANCE = new DescribeSortOrderVisitor();
+
+    private DescribeSortOrderVisitor() {
+    }
+
+    @Override
+    public String field(String sourceName, int sourceId,
+                        org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("%s %s %s", sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String bucket(String sourceName, int sourceId, int numBuckets,
+                         org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", numBuckets, sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String truncate(String sourceName, int sourceId, int width,
+                           org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", sourceName, width, direction, nullOrder);

Review comment:
       Looks correct to me.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] flyrain commented on a change in pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
flyrain commented on a change in pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#discussion_r608277608



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -880,4 +885,59 @@ public static TableIdentifier identifierToTableIdentifier(Identifier identifier)
     Option<String> database = namespace.length == 1 ? Option.apply(namespace[0]) : Option.empty();
     return org.apache.spark.sql.catalyst.TableIdentifier.apply(table, database);
   }
+
+  private static class DescribeSortOrderVisitor implements SortOrderVisitor<String> {
+    private static final DescribeSortOrderVisitor INSTANCE = new DescribeSortOrderVisitor();
+
+    private DescribeSortOrderVisitor() {
+    }
+
+    @Override
+    public String field(String sourceName, int sourceId,
+                        org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("%s %s %s", sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String bucket(String sourceName, int sourceId, int numBuckets,
+                         org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", numBuckets, sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String truncate(String sourceName, int sourceId, int width,
+                           org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", sourceName, width, direction, nullOrder);

Review comment:
       Fixed the comments in the new commits.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#discussion_r608239668



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -880,4 +885,59 @@ public static TableIdentifier identifierToTableIdentifier(Identifier identifier)
     Option<String> database = namespace.length == 1 ? Option.apply(namespace[0]) : Option.empty();
     return org.apache.spark.sql.catalyst.TableIdentifier.apply(table, database);
   }
+
+  private static class DescribeSortOrderVisitor implements SortOrderVisitor<String> {
+    private static final DescribeSortOrderVisitor INSTANCE = new DescribeSortOrderVisitor();
+
+    private DescribeSortOrderVisitor() {
+    }
+
+    @Override
+    public String field(String sourceName, int sourceId,
+                        org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("%s %s %s", sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String bucket(String sourceName, int sourceId, int numBuckets,
+                         org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", numBuckets, sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String truncate(String sourceName, int sourceId, int width,
+                           org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", sourceName, width, direction, nullOrder);

Review comment:
       Should we create a separate suite for artificial table properties like sort-order, current-snapshot-id?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#discussion_r607439629



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -880,4 +885,59 @@ public static TableIdentifier identifierToTableIdentifier(Identifier identifier)
     Option<String> database = namespace.length == 1 ? Option.apply(namespace[0]) : Option.empty();
     return org.apache.spark.sql.catalyst.TableIdentifier.apply(table, database);
   }
+
+  private static class DescribeSortOrderVisitor implements SortOrderVisitor<String> {
+    private static final DescribeSortOrderVisitor INSTANCE = new DescribeSortOrderVisitor();
+
+    private DescribeSortOrderVisitor() {
+    }
+
+    @Override
+    public String field(String sourceName, int sourceId,
+                        org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("%s %s %s", sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String bucket(String sourceName, int sourceId, int numBuckets,
+                         org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", numBuckets, sourceName, direction, nullOrder);
+    }
+
+    @Override
+    public String truncate(String sourceName, int sourceId, int width,
+                           org.apache.iceberg.SortDirection direction, NullOrder nullOrder) {
+      return String.format("bucket(%s, %s) %s %s", sourceName, width, direction, nullOrder);

Review comment:
       Truncate?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] flyrain commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
flyrain commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814561820


   @aokolnychyi, will submit a follow-up PR later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi edited a comment on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi edited a comment on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814496947


   This looks good apart from the point found by @RussellSpitzer. I'd be great to add a separate suite for all artificial properties we add in `SparkTable` where we could test different expression types but not a blocker for this PR.
   
   Something like `TestReservedTableProperties` or similar.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi merged pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi merged pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] flyrain commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
flyrain commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-813738522


   Can you take a look? @aokolnychyi , @RussellSpitzer , @karuppayya 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814496947


   This looks good apart from the point found by @RussellSpitzer. I'd be great to add a separate suite for all artificial properties we add in `SparkTable` where we could test different expression types. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814539174


   @flyrain, would you be interested to submit a follow-up PR to prohibit changes like you mentioned?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] flyrain commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
flyrain commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814535078


   Thanks @aokolnychyi and @RussellSpitzer for the review. I've resolved the comments in the new commits. Please take a look. Besides, user can use the command `ALTER TABLE local.db.student SET TBLPROPERTIES ('sort-order'='id asc')` to set the a `sort-order` properties. It is harmless since it won't take effect. But it is confusing. We should throw an error if user does it. We can include the change in another PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814486592


   Let me take a look at this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on pull request #2421: Spark: Add table sort-order to reserved table properties

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on pull request #2421:
URL: https://github.com/apache/iceberg/pull/2421#issuecomment-814486396


   I think the primary focus of this PR is to show the current sort order as a table property rather than to add sort-order to the list of reserved properties. We will need a separate PR to prohibit modifying reserved properties but that's separate.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org