Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/01/09 03:38:47 UTC

[GitHub] [iceberg] felixYyu opened a new pull request #3862: Spark: Supports partition management in V2 Catalog

felixYyu opened a new pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862


   This PR fixes https://github.com/apache/iceberg/issues/3558
   
   ```
   sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);
   
   sql("ALTER TABLE %s DROP IF EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);
   
   sql("SHOW PARTITIONS %s", tableName);
   +---------------------------------------------------+
   |partition                                          |
   +---------------------------------------------------+
   |id_bucket=2/data_trunc=2022/ts_hour=2022-01-08-23|
   +---------------------------------------------------+
   ```
   
   @RussellSpitzer @rdblue, please review this PR. Thanks very much.




[GitHub] [iceberg] felixYyu commented on pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#issuecomment-1008817701


   https://spark.apache.org/docs/latest/sql-ref-syntax-aux-show-partitions.html




[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r784094937



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    List<InternalRow> rows = Lists.newArrayList();
+    StructType schema = partitionSchema();
+    StructField[] fields = schema.fields();
+    Map<String, String> partitionFilter = Maps.newHashMap();
+    if (names.length > 0) {
+      int idx = 0;
+      while (idx < names.length) {
+        DataType dataType = schema.apply(names[idx]).dataType();
+        partitionFilter.put(names[idx], String.valueOf(ident.get(idx, dataType)));
+        idx += 1;
+      }
+    }
+    String fileFormat = icebergTable.properties()
+            .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+    List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),

Review comment:
       This method is just for getting the "partitions" of a file-based table. Iceberg should be using its own internal "partitions" table for this info. See https://github.com/apache/iceberg/blob/f932c55fd8c07c875984889ecfbb1fd7c219f726/core/src/main/java/org/apache/iceberg/PartitionsTable.java#L35
   
   Since this is Spark, we should be using
   
   https://github.com/RussellSpitzer/iceberg/blob/56839c8c07b83edcfd11765ffb7303da811a65fc/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java#L610
   
   or
   
   https://github.com/apache/iceberg/blob/827b6c86108eec7b6de25f61022c7f8b5dda481c/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java#L614-L618
   
   to use the distributed versions of this code.
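   
   A minimal sketch of the suggested direction, reusing the loadMetadataTable call that appears later in this thread (`sparkSession()` and `icebergTable` are members of SparkTable and are assumed here):
   
   ```
   import org.apache.iceberg.MetadataTableType;
   import org.apache.iceberg.spark.SparkTableUtil;
   import org.apache.spark.sql.Dataset;
   import org.apache.spark.sql.Row;
   
   // Read Iceberg's own PARTITIONS metadata table instead of listing data
   // files; Spark plans the metadata scan, so it runs distributed.
   Dataset<Row> partitions = SparkTableUtil
       .loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
       .select("partition");
   ```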






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r805112627



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/sql/TestPartitionedWrites.java
##########
@@ -176,4 +177,27 @@ public void testViewsReturnRecentResults() {
         ImmutableList.of(row(1L, "a"), row(1L, "a")),
         sql("SELECT * FROM tmp"));
   }
+
+  @Test
+  public void testAddPartition() {
+    // only check V2 command [IF NOT EXISTS] syntax
+    AssertHelpers.assertThrows("Cannot explicitly create partitions in Iceberg tables",
+            UnsupportedOperationException.class,
+            () -> sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_trunc=2)", tableName));
+  }
+
+  @Test
+  public void testDropPartition() {
+    // only check V2 command [IF EXISTS] syntax
+    AssertHelpers.assertThrows("Cannot explicitly drop partitions in Iceberg tables",
+            UnsupportedOperationException.class,
+            () -> sql("ALTER TABLE %s DROP IF EXISTS PARTITION (id_trunc=0)", tableName));
+  }
+
+  @Test
+  public void testShowPartitions() {

Review comment:
       Tests here should probably check the return values of these methods, and test on a table with multiple partition values.
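   
   A hedged sketch of such a test, following the sql(...) helper style used in this test class (the inserted values and expected row count are illustrative assumptions, not the actual fixture):
   
   ```
   sql("INSERT INTO %s VALUES (1, 'a'), (5, 'b')", tableName);
   // sql(...) returns List<Object[]> in these test bases.
   List<Object[]> rows = sql("SHOW PARTITIONS %s", tableName);
   // With the two values landing in different partitions, SHOW PARTITIONS
   // should return one row per distinct partition tuple.
   Assert.assertEquals("Should return one row per partition", 2, rows.size());
   ```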






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r806300891



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +283,65 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    return (StructType) SparkSchemaUtil.convert(Partitioning.partitionType(table()));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS);
+    if (names.length > 0) {
+      StructType schema = partitionSchema();
+      df.collectAsList().forEach(row -> {
+        GenericRowWithSchema genericRow = (GenericRowWithSchema) row.apply(0);
+        boolean exits = true;
+        int index = 0;
+        while (index < names.length) {
+          DataType dataType = schema.apply(names[index]).dataType();

Review comment:
       It looks like we are trying to align the metadata table schema with the current table schema. I think we should still just be displaying metadata table partition values as-is, but if we choose to go this route, I think we still have an issue here.
   
   Consider a table
   ```
   Add Partition Column Identity (a)
   Remove Partition Column identity (a)
   Drop Column a
   Add Column a
   Add partition Column Identity (a)
   ```
   
   This should result in a row that has multiple "a"s in the partition spec (at least I believe this is the current behavior). We should make sure we are correctly projecting columns in those cases. I think it is also OK for this to just be a light wrapper around the partitions metadata table and simply list the partitions in the extended schema it provides.
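   
   For concreteness, that evolution could be driven through the Iceberg SQL extensions roughly like this (a sketch using the sql(...) test helper; whether the final spec really ends up with two fields sourced from different "a" columns is exactly the behavior to verify):
   
   ```
   sql("ALTER TABLE %s ADD PARTITION FIELD a", tableName);   // identity(a)
   sql("ALTER TABLE %s DROP PARTITION FIELD a", tableName);
   sql("ALTER TABLE %s DROP COLUMN a", tableName);
   sql("ALTER TABLE %s ADD COLUMN a bigint", tableName);     // new field id
   sql("ALTER TABLE %s ADD PARTITION FIELD a", tableName);   // identity(a) again
   ```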






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r805631668



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +290,80 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Types.StructType structType = Partitioning.partitionType(table());
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(structType.fields().size());
+    structType.fields().forEach(nestedField -> {
+      if (nestedField.name().endsWith("hour") ||
+              nestedField.name().endsWith("month") ||
+              nestedField.name().endsWith("year")) {
+        structFields.add(Types.NestedField.optional(nestedField.fieldId(), nestedField.name(), Types.StringType.get()));
+      } else {
+        // do nothing
+        structFields.add(nestedField);
+      }
+    });
+
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
+            .select("partition");

Review comment:
       fixed






[GitHub] [iceberg] felixYyu commented on pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#issuecomment-1036944810


   Could you spare some time to review, @RussellSpitzer?




[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780740957



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +324,77 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    if (!icebergTable.spec().isUnpartitioned()){
+      if (names.length > 0){
+        return new InternalRow[]{ident};
+      }
+      else {
+        String fileFormat = icebergTable.properties()
+                .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+        List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),
+                new Path(icebergTable.location().concat("\\data")), fileFormat);
+        List<InternalRow> rows = Lists.newArrayList();
+        StructType schema = partitionSchema();
+        StructField[] fields = schema.fields();
+        partitions.forEach(p -> {
+          int i = 0;
+          Map<String, String> values = p.getValues();
+          List<Object> dataTypeVal = Lists.newArrayList();
+          while (i < fields.length) {
+            DataType dataType = schema.apply(fields[i].name()).dataType();
+            dataTypeVal.add(Spark3Util.convertPartitionType(values.get(fields[i].name()), dataType));
+            i += 1;
+          }
+          rows.add(new GenericInternalRow(dataTypeVal.toArray()));
+        });
+        return rows.toArray(new InternalRow[0]);
+      }
+    }
+    else{

Review comment:
       Also updated, thanks.






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r792335434



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions

Review comment:
       OK, fixed.






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r793199489



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +290,80 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Types.StructType structType = Partitioning.partitionType(table());
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(structType.fields().size());
+    structType.fields().forEach(nestedField -> {
+      if (nestedField.name().endsWith("hour") ||
+              nestedField.name().endsWith("month") ||
+              nestedField.name().endsWith("year")) {
+        structFields.add(Types.NestedField.optional(nestedField.fieldId(), nestedField.name(), Types.StringType.get()));
+      } else {
+        // do nothing
+        structFields.add(nestedField);
+      }
+    });
+
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
+            .select("partition");
+    StructType schema = partitionSchema();
+    if (names.length > 0) {

Review comment:
       Spark SQL's ShowPartitionsExec is the physical plan node for SHOW PARTITIONS. Without the transformation the test results come out as follows, so I think we should apply the transformation:
   
   1. Without the transformation:
   hours(ts):
   ```
   +--------------+
   |partition     |
   +--------------+
   |ts_hour=429536|
   +--------------+
   ```
   days(ts): error message
       `requirement failed: Literal must have a corresponding value to date, but class Date found.`
   months(ts):
   ```
   +------------+
   |partition   |
   +------------+
   |ts_month=588|
   +------------+
   ```
   years(ts):
   ```
   +----------+
   |partition |
   +----------+
   |ts_year=49|
   +----------+
   ```
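   
   For context, those raw numbers are bucket ordinals counted from the Unix epoch, which a quick java.time check confirms (assuming UTC):
   
   ```
   import java.time.Instant;
   import java.time.temporal.ChronoUnit;
   
   // hours(ts) stores hours since 1970-01-01T00:00Z, so ordinal 429536 is
   // 2019-01-01T08:00Z, i.e. the human-readable bucket ts_hour=2019-01-01-08.
   // Likewise ts_month=588 is 588 months after 1970-01 -> 2019-01, and
   // ts_year=49 is 1970 + 49 -> 2019.
   Instant hourBucket = Instant.EPOCH.plus(429536, ChronoUnit.HOURS);
   System.out.println(hourBucket);  // 2019-01-01T08:00:00Z
   ```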
    






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r792337116



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();

Review comment:
       Without the conversion, the result is not what we want:
   ```
   hours(ts):
       ts_hour=429536
   days(ts):
       requirement failed: Literal must have a corresponding value to date, but class Date found.
   months(ts):
       ts_month=588
   years(ts):
       ts_year=49
   ```






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r793199489



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +290,80 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Types.StructType structType = Partitioning.partitionType(table());
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(structType.fields().size());
+    structType.fields().forEach(nestedField -> {
+      if (nestedField.name().endsWith("hour") ||
+              nestedField.name().endsWith("month") ||
+              nestedField.name().endsWith("year")) {
+        structFields.add(Types.NestedField.optional(nestedField.fieldId(), nestedField.name(), Types.StringType.get()));
+      } else {
+        // do nothing
+        structFields.add(nestedField);
+      }
+    });
+
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
+            .select("partition");
+    StructType schema = partitionSchema();
+    if (names.length > 0) {

Review comment:
       Iceberg's timestamp and date transforms use IntegerType for the hour, month, and year partition fields, so Spark SQL's ShowPartitionsExec `row.get(i, dataType)` displays them as raw integers:
   ```
   public Type getResultType(Type sourceType) {
     if (granularity == ChronoUnit.DAYS) {
       return Types.DateType.get();
     }
     return Types.IntegerType.get();
   }
   ```
   ShowPartitionsExec is the physical plan node for SHOW PARTITIONS. Without the transformation the test results come out as follows, so I think we should apply the transformation:
   
   1. Without the transformation:
   hours(ts):
   ```
   +--------------+
   |partition     |
   +--------------+
   |ts_hour=429536|
   +--------------+
   ```
   days(ts): error message
       `requirement failed: Literal must have a corresponding value to date, but class Date found.`
   months(ts):
   ```
   +------------+
   |partition   |
   +------------+
   |ts_month=588|
   +------------+
   ```
   years(ts):
   ```
   +----------+
   |partition |
   +----------+
   |ts_year=49|
   +----------+
   ```
    






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r810732223



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +283,65 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    return (StructType) SparkSchemaUtil.convert(Partitioning.partitionType(table()));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS);
+    if (names.length > 0) {
+      StructType schema = partitionSchema();
+      df.collectAsList().forEach(row -> {
+        GenericRowWithSchema genericRow = (GenericRowWithSchema) row.apply(0);
+        boolean exits = true;
+        int index = 0;
+        while (index < names.length) {
+          DataType dataType = schema.apply(names[index]).dataType();

Review comment:
       cc @jackye1995 @szehon-ho, can you take a look?






[GitHub] [iceberg] felixYyu commented on pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#issuecomment-1011822539


   Updated the partition filter following #3745.




[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r792335677



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    List<InternalRow> rows = Lists.newArrayList();
+    StructType schema = partitionSchema();
+    StructField[] fields = schema.fields();
+    Map<String, String> partitionFilter = Maps.newHashMap();
+    if (names.length > 0) {
+      int idx = 0;
+      while (idx < names.length) {
+        DataType dataType = schema.apply(names[idx]).dataType();
+        partitionFilter.put(names[idx], String.valueOf(ident.get(idx, dataType)));
+        idx += 1;
+      }
+    }
+    String fileFormat = icebergTable.properties()
+            .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+    List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),

Review comment:
       fixed






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r784091744



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions

Review comment:
       Should also throw UnsupportedOperationException here.
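   
   The shape being asked for, which the later revision quoted elsewhere in this thread adopts, is roughly:
   
   ```
   @Override
   public boolean dropPartition(InternalRow ident) {
     throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
   }
   ```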






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r793201558



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +290,80 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Types.StructType structType = Partitioning.partitionType(table());
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(structType.fields().size());
+    structType.fields().forEach(nestedField -> {
+      if (nestedField.name().endsWith("hour") ||
+              nestedField.name().endsWith("month") ||
+              nestedField.name().endsWith("year")) {
+        structFields.add(Types.NestedField.optional(nestedField.fieldId(), nestedField.name(), Types.StringType.get()));
+      } else {
+        // do nothing
+        structFields.add(nestedField);
+      }
+    });
+
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
+            .select("partition");
+    StructType schema = partitionSchema();
+    if (names.length > 0) {

Review comment:
       2. With the transformation:
   hours(ts):
   ```
   +---------------------+
   |partition            |
   +---------------------+
   |ts_hour=2019-01-01-08|
   +---------------------+
   ```
   days(ts):
   ```
   +-----------------+
   |partition        |
   +-----------------+
   |ts_day=2019-01-01|
   +-----------------+
   ```
   months(ts):
   ```
   +----------------+
   |partition       |
   +----------------+
   |ts_month=2019-01|
   +----------------+
   ```
   years(ts):
   ```
   +------------+
   |partition   |
   +------------+
   |ts_year=2019|
   +------------+
   ```






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780742434



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -850,4 +865,47 @@ public String unknown(String sourceName, int sourceId, String transform,
       return String.format("%s(%s) %s %s", transform, sourceName, direction, nullOrder);
     }
   }
+
+  public static Object convertPartitionType(Object value, DataType dataType) {
+    if (value == null && dataType instanceof NullType) {
+      return null;
+    }
+    String old = String.valueOf(value);
+    if (dataType instanceof BooleanType) {
+      return DatatypeConverter.parseBoolean(old);
+    }
+    else if (dataType instanceof ByteType) {
+      return DatatypeConverter.parseByte(old);
+    }
+    else if (dataType instanceof ShortType) {
+      return DatatypeConverter.parseShort(old);
+    }
+    else if (dataType instanceof IntegerType) {
+      return DatatypeConverter.parseInt(old);
+    }
+    else if (dataType instanceof DateType) {
+      // days(ts) or date(ts) partition schema DataType
+      return DateTimeUtil.daysFromDate(LocalDate.parse(old));
+    }
+    else if (dataType instanceof FloatType) {
+      return DatatypeConverter.parseFloat(old);
+    }
+    else if (dataType instanceof DoubleType) {
+      return DatatypeConverter.parseDouble(old);
+    }
+    else if (dataType instanceof LongType) {
+      return DatatypeConverter.parseLong(old);
+    }
+    else if (dataType instanceof DecimalType) {
+      return DatatypeConverter.parseDecimal(old);
+    }
+    if (dataType instanceof BinaryType) {
+      return DatatypeConverter.parseHexBinary(old);
+    }
+    else if (dataType instanceof StringType) {
+      return UTF8String.fromString(old);
+    } else {
+      return value;
+    }

Review comment:
       fix






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r792786379



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +290,80 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Types.StructType structType = Partitioning.partitionType(table());
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(structType.fields().size());
+    structType.fields().forEach(nestedField -> {
+      if (nestedField.name().endsWith("hour") ||
+              nestedField.name().endsWith("month") ||
+              nestedField.name().endsWith("year")) {
+        structFields.add(Types.NestedField.optional(nestedField.fieldId(), nestedField.name(), Types.StringType.get()));
+      } else {
+        // do nothing
+        structFields.add(nestedField);
+      }
+    });
+
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
+            .select("partition");
+    StructType schema = partitionSchema();
+    if (names.length > 0) {

Review comment:
       I'm not sure I understand the transformation here; what is the goal?






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r805632664



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/sql/TestPartitionedWrites.java
##########
@@ -176,4 +177,27 @@ public void testViewsReturnRecentResults() {
         ImmutableList.of(row(1L, "a"), row(1L, "a")),
         sql("SELECT * FROM tmp"));
   }
+
+  @Test
+  public void testAddPartition() {
+    // only check V2 command [IF NOT EXISTS] syntax
+    AssertHelpers.assertThrows("Cannot explicitly create partitions in Iceberg tables",
+            UnsupportedOperationException.class,
+            () -> sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_trunc=2)", tableName));
+  }
+
+  @Test
+  public void testDropPartition() {
+    // only check V2 command [IF EXISTS] syntax
+    AssertHelpers.assertThrows("Cannot explicitly drop partitions in Iceberg tables",
+            UnsupportedOperationException.class,
+            () -> sql("ALTER TABLE %s DROP IF EXISTS PARTITION (id_trunc=0)", tableName));
+  }
+
+  @Test
+  public void testShowPartitions() {

Review comment:
       Added checks on the return values.






[GitHub] [iceberg] felixYyu commented on pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#issuecomment-1010685579


   cc @rdblue @jackye1995, could you help review? Thanks.




[GitHub] [iceberg] smallx commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
smallx commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780741610



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -850,4 +865,47 @@ public String unknown(String sourceName, int sourceId, String transform,
       return String.format("%s(%s) %s %s", transform, sourceName, direction, nullOrder);
     }
   }
+
+  public static Object convertPartitionType(Object value, DataType dataType) {
+    if (value == null && dataType instanceof NullType) {
+      return null;
+    }
+    String old = String.valueOf(value);
+    if (dataType instanceof BooleanType) {
+      return DatatypeConverter.parseBoolean(old);
+    }
+    else if (dataType instanceof ByteType) {
+      return DatatypeConverter.parseByte(old);
+    }
+    else if (dataType instanceof ShortType) {
+      return DatatypeConverter.parseShort(old);
+    }
+    else if (dataType instanceof IntegerType) {
+      return DatatypeConverter.parseInt(old);
+    }
+    else if (dataType instanceof DateType) {
+      // days(ts) or date(ts) partition schema DataType
+      return DateTimeUtil.daysFromDate(LocalDate.parse(old));
+    }
+    else if (dataType instanceof FloatType) {
+      return DatatypeConverter.parseFloat(old);
+    }
+    else if (dataType instanceof DoubleType) {
+      return DatatypeConverter.parseDouble(old);
+    }
+    else if (dataType instanceof LongType) {
+      return DatatypeConverter.parseLong(old);
+    }
+    else if (dataType instanceof DecimalType) {
+      return DatatypeConverter.parseDecimal(old);
+    }
+    if (dataType instanceof BinaryType) {
+      return DatatypeConverter.parseHexBinary(old);
+    }
+    else if (dataType instanceof StringType) {
+      return UTF8String.fromString(old);
+    } else {
+      return value;
+    }

Review comment:
       The code style in these lines is inconsistent again. Also, for the code above, this style is acceptable, but it would be better to keep it consistent with the rest of the project.

##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +324,75 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    if (!icebergTable.spec().isUnpartitioned()){
+      if (names.length > 0){
+        return new InternalRow[]{ident};
+      } else {
+        String fileFormat = icebergTable.properties()
+                .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+        List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),
+                new Path(icebergTable.location().concat("\\data")), fileFormat);
+        List<InternalRow> rows = Lists.newArrayList();
+        StructType schema = partitionSchema();
+        StructField[] fields = schema.fields();
+        partitions.forEach(p -> {
+          int i = 0;
+          Map<String, String> values = p.getValues();
+          List<Object> dataTypeVal = Lists.newArrayList();
+          while (i < fields.length) {
+            DataType dataType = schema.apply(fields[i].name()).dataType();
+            dataTypeVal.add(Spark3Util.convertPartitionType(values.get(fields[i].name()), dataType));
+            i += 1;
+          }
+          rows.add(new GenericInternalRow(dataTypeVal.toArray()));
+        });
+        return rows.toArray(new InternalRow[0]);
+      }
+    } else{

Review comment:
       Nit: `else{` -> `else {`






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r805112551



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -301,7 +393,7 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
       String snapshotIdFromOptions = options.get(SparkReadOptions.SNAPSHOT_ID);
       String value = snapshotId.toString();
       Preconditions.checkArgument(snapshotIdFromOptions == null || snapshotIdFromOptions.equals(value),
-          "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);
+              "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);

Review comment:
       Looks like the formatting here is still off?






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r793205518



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -301,7 +393,7 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
       String snapshotIdFromOptions = options.get(SparkReadOptions.SNAPSHOT_ID);
       String value = snapshotId.toString();
       Preconditions.checkArgument(snapshotIdFromOptions == null || snapshotIdFromOptions.equals(value),
-          "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);
+              "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);

Review comment:
       That change was accidental; it isn't wanted.






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r793199489



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +290,80 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Types.StructType structType = Partitioning.partitionType(table());
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(structType.fields().size());
+    structType.fields().forEach(nestedField -> {
+      if (nestedField.name().endsWith("hour") ||
+              nestedField.name().endsWith("month") ||
+              nestedField.name().endsWith("year")) {
+        structFields.add(Types.NestedField.optional(nestedField.fieldId(), nestedField.name(), Types.StringType.get()));
+      } else {
+        // do nothing
+        structFields.add(nestedField);
+      }
+    });
+
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
+            .select("partition");
+    StructType schema = partitionSchema();
+    if (names.length > 0) {

Review comment:
       Spark SQL's ShowPartitionsExec physical plan node renders the SHOW PARTITIONS output. Without the transformation, the test results come out as raw transform values (below), so I think the transformation is needed:
   
   1. Without the transformation:
   hours(ts):
   ```
   +--------------+
   |partition     |
   +--------------+
   |ts_hour=429536|
   +--------------+
   ```
   days(ts): error message
       `requirement failed: Literal must have a corresponding value to date, but class Date found.`
   months(ts):
   ```
   +------------+
   |partition   |
   +------------+
   |ts_month=588|
   +------------+
   ```
   years(ts):
   ```
   +----------+
   |partition |
   +----------+
   |ts_year=49|
   +----------+
   ```
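   
   One way to produce readable values here — a sketch only, assuming Iceberg's `Transform#toHumanString(T)`, which formats a transform's result value:
   ```
   // Illustrative; icebergTable is the table under test and the first
   // partition field is assumed to be hours(ts).
   PartitionField field = icebergTable.spec().fields().get(0);
   Transform<Object, Integer> hours = (Transform<Object, Integer>) field.transform();  // unchecked cast, sketch only
   String readable = hours.toHumanString(429536);  // e.g. a "yyyy-MM-dd-HH" string
   ```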
    






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780739988



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -850,4 +865,46 @@ public String unknown(String sourceName, int sourceId, String transform,
       return String.format("%s(%s) %s %s", transform, sourceName, direction, nullOrder);
     }
   }
+
+  public static Object convertPartitionType(Object value, DataType dataType) {
+    if (value == null && dataType instanceof NullType) {
+      return null;
+    }
+    String old = String.valueOf(value);
+    if (dataType instanceof BooleanType) {
+      return DatatypeConverter.parseBoolean(old);
+    }
+    else if (dataType instanceof ByteType) {

Review comment:
       ```
   if (...) {
     ...
   } else if (...) {
     ...
   }
   
   or
   
   if (...) {
     ...
   } else if (...) {
     ...
   }
   else{
     ...
   }
   ```
   Either of the two code styles above is fine.






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r806300891



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +283,65 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    return (StructType) SparkSchemaUtil.convert(Partitioning.partitionType(table()));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS);
+    if (names.length > 0) {
+      StructType schema = partitionSchema();
+      df.collectAsList().forEach(row -> {
+        GenericRowWithSchema genericRow = (GenericRowWithSchema) row.apply(0);
+        boolean exits = true;
+        int index = 0;
+        while (index < names.length) {
+          DataType dataType = schema.apply(names[index]).dataType();

Review comment:
       It looks like we are trying to align the metadata table schema with the current table schema. I think we should still just be displaying metadata table partition values as-is, but if we choose to go this route, I think we still have an issue here.
   
   Consider a table
   ```
   Add Partition Column Identity (a)
   Remove Partition Column identity (a)
   Drop Column a
   Add Column a
   Add partition Column Identity (a)
   ```
   
   This should result in a row which has multiple "a"'s in the partition spec (at least I believe this is the current behavior). We should make sure we are correctly projecting columns in those cases. I think it is also ok for this just to be a light wrapper around the Metadata Table for Partitions and just list the partitions in the extended schema it provides.
   
   I guess this may be a little odd for unpartitioned tables since they may still show that partitions do exist but this is probably more accurate ...
   @jackye1995 + @szehon-ho Any thoughts?
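   
   For reference, the evolution above would look roughly like this through the Iceberg API (a sketch; it assumes `a` is a long column):
   ```
   table.updateSpec().addField("a").commit();                           // identity partition on a
   table.updateSpec().removeField("a").commit();                        // remove the partition field
   table.updateSchema().deleteColumn("a").commit();                     // drop column a
   table.updateSchema().addColumn("a", Types.LongType.get()).commit();  // re-add column a
   table.updateSpec().addField("a").commit();                           // partition on the new a
   ```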






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r792784795



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -301,7 +393,7 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
       String snapshotIdFromOptions = options.get(SparkReadOptions.SNAPSHOT_ID);
       String value = snapshotId.toString();
       Preconditions.checkArgument(snapshotIdFromOptions == null || snapshotIdFromOptions.equals(value),
-          "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);
+              "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);

Review comment:
       stray whitespace change






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r836857611



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -283,6 +294,65 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    return (StructType) SparkSchemaUtil.convert(Partitioning.partitionType(table()));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS);

Review comment:
       I still think we should basically just return this object, collected and transformed into an array of `InternalRow`s.
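   
   A minimal sketch of that (assuming `executeCollect()` is acceptable here, since it yields `InternalRow[]` without a `Row` conversion):
   ```
   // Expand the partition struct and collect it directly as InternalRows.
   Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable,
       MetadataTableType.PARTITIONS).selectExpr("partition.*");
   return df.queryExecution().executedPlan().executeCollect();
   ```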






[GitHub] [iceberg] szehon-ho commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r813470752



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +283,65 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    return (StructType) SparkSchemaUtil.convert(Partitioning.partitionType(table()));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS);
+    if (names.length > 0) {
+      StructType schema = partitionSchema();
+      df.collectAsList().forEach(row -> {
+        GenericRowWithSchema genericRow = (GenericRowWithSchema) row.apply(0);
+        boolean exits = true;
+        int index = 0;
+        while (index < names.length) {
+          DataType dataType = schema.apply(names[index]).dataType();

Review comment:
       Yeah, I think it would be better if this could be a light wrapper around the 'partitions' metadata table (instead of using Partitioning.partitionType to get the schema). Is that possible here?
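   
   Sketching that "light wrapper" (hedged; it assumes the metadata table exposes a struct column named `partition`):
   ```
   // Reuse the partition struct's Spark schema straight from the metadata table.
   StructType metadataSchema = SparkTableUtil.loadMetadataTable(sparkSession(),
       icebergTable, MetadataTableType.PARTITIONS).schema();
   StructType partitionStruct = (StructType) metadataSchema.apply("partition").dataType();
   ```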






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780741049



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -850,4 +865,46 @@ public String unknown(String sourceName, int sourceId, String transform,
       return String.format("%s(%s) %s %s", transform, sourceName, direction, nullOrder);
     }
   }
+
+  public static Object convertPartitionType(Object value, DataType dataType) {
+    if (value == null && dataType instanceof NullType) {
+      return null;
+    }
+    String old = String.valueOf(value);
+    if (dataType instanceof BooleanType) {
+      return DatatypeConverter.parseBoolean(old);
+    }
+    else if (dataType instanceof ByteType) {

Review comment:
       Added this code, thanks:
   ```
   } else {
     return value;
   }
   ```






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780742455



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +324,75 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    if (!icebergTable.spec().isUnpartitioned()){
+      if (names.length > 0){
+        return new InternalRow[]{ident};
+      } else {
+        String fileFormat = icebergTable.properties()
+                .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+        List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),
+                new Path(icebergTable.location().concat("\\data")), fileFormat);
+        List<InternalRow> rows = Lists.newArrayList();
+        StructType schema = partitionSchema();
+        StructField[] fields = schema.fields();
+        partitions.forEach(p -> {
+          int i = 0;
+          Map<String, String> values = p.getValues();
+          List<Object> dataTypeVal = Lists.newArrayList();
+          while (i < fields.length) {
+            DataType dataType = schema.apply(fields[i].name()).dataType();
+            dataTypeVal.add(Spark3Util.convertPartitionType(values.get(fields[i].name()), dataType));
+            i += 1;
+          }
+          rows.add(new GenericInternalRow(dataTypeVal.toArray()));
+        });
+        return rows.toArray(new InternalRow[0]);
+      }
+    } else{

Review comment:
       Fixed.






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r784091487



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions

Review comment:
       I think we should throw an UnsupportedOperationException here, e.g. "Cannot explicitly create partitions in Iceberg tables" or something like that.
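   
   Concretely (using the reviewer's wording, which the later revisions of this PR adopt):
   ```
   @Override
   public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
     throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
   }
   ```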






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r805112473



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +290,80 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Types.StructType structType = Partitioning.partitionType(table());
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(structType.fields().size());
+    structType.fields().forEach(nestedField -> {
+      if (nestedField.name().endsWith("hour") ||
+              nestedField.name().endsWith("month") ||
+              nestedField.name().endsWith("year")) {
+        structFields.add(Types.NestedField.optional(nestedField.fieldId(), nestedField.name(), Types.StringType.get()));
+      } else {
+        // do nothing
+        structFields.add(nestedField);
+      }
+    });
+
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS)
+            .select("partition");

Review comment:
       I don't understand why we can't just display "partition" here using the schema of that field of the DataFrame.






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r805112582



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +405,29 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  private static Object[] transform(StructType schema, Object[] values) {
+    Object[] resultValues = new Object[values.length];

Review comment:
       I don't think we need this; we should be able to just use the rows as they come out of the partitions metadata table.






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r792335542



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions

Review comment:
       fixed






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r839327732



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -283,6 +294,65 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    return (StructType) SparkSchemaUtil.convert(Partitioning.partitionType(table()));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS);

Review comment:
       I think so. Will this PR continue to move forward?






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r784100319



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();

Review comment:
       we should probably get this from the Partitions Metadata table as well rather than converting it again here






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r784094937



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +284,67 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    List<InternalRow> rows = Lists.newArrayList();
+    StructType schema = partitionSchema();
+    StructField[] fields = schema.fields();
+    Map<String, String> partitionFilter = Maps.newHashMap();
+    if (names.length > 0) {
+      int idx = 0;
+      while (idx < names.length) {
+        DataType dataType = schema.apply(names[idx]).dataType();
+        partitionFilter.put(names[idx], String.valueOf(ident.get(idx, dataType)));
+        idx += 1;
+      }
+    }
+    String fileFormat = icebergTable.properties()
+            .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+    List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),

Review comment:
       This method is just for getting the "partitions" of a file-based table. Iceberg should be using its own internal "partitions" metadata table for this info. See https://github.com/apache/iceberg/blob/f932c55fd8c07c875984889ecfbb1fd7c219f726/core/src/main/java/org/apache/iceberg/PartitionsTable.java#L35
   
   Since this is Spark, we should be using
   
   https://github.com/RussellSpitzer/iceberg/blob/56839c8c07b83edcfd11765ffb7303da811a65fc/spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/SparkTableUtil.java#L610
   
   to use the distributed version of this code.
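
   A hedged sketch of that direction inside `listPartitionIdentifiers`, reusing the helpers already referenced elsewhere in this thread (it assumes the `sparkSession()` and `icebergTable` members of `SparkTable`):

   ```java
   // Sketch only: read the Iceberg "partitions" metadata table instead of
   // listing data files under the table location.
   Dataset<Row> partitionsDf = SparkTableUtil.loadMetadataTable(
       sparkSession(), icebergTable, MetadataTableType.PARTITIONS);
   // Each row's first column is a struct of partition values; convert those
   // values to InternalRows instead of parsing directory names.
   List<Row> partitionRows = partitionsDf.collectAsList();
   ```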






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780749274



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/sql/TestPartitionedWrites.java
##########
@@ -170,10 +170,25 @@ public void testViewsReturnRecentResults() {
         ImmutableList.of(row(1L, "a")),
         sql("SELECT * FROM tmp"));
 
-    sql("INSERT INTO TABLE %s VALUES (1, 'a')", tableName);
+    sql("INSERT INTO TABLE %s VALUES (1, 'a', CURRENT_TIMESTAMP())", tableName);
 
     assertEquals("View should have expected rows",
         ImmutableList.of(row(1L, "a"), row(1L, "a")),
         sql("SELECT * FROM tmp"));
   }
+
+  @Test
+  public void testAddPartition() {
+    sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);
+  }
+
+  @Test
+  public void testDropPartition() {
+    sql("ALTER TABLE %s DROP IF EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);

Review comment:
       fixed






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r810870656



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -301,7 +393,7 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
       String snapshotIdFromOptions = options.get(SparkReadOptions.SNAPSHOT_ID);
       String value = snapshotId.toString();
       Preconditions.checkArgument(snapshotIdFromOptions == null || snapshotIdFromOptions.equals(value),
-          "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);
+              "Cannot override snapshot ID more than once: %s", snapshotIdFromOptions);

Review comment:
       fixed






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780749222



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/sql/TestPartitionedWrites.java
##########
@@ -170,10 +170,25 @@ public void testViewsReturnRecentResults() {
         ImmutableList.of(row(1L, "a")),
         sql("SELECT * FROM tmp"));
 
-    sql("INSERT INTO TABLE %s VALUES (1, 'a')", tableName);
+    sql("INSERT INTO TABLE %s VALUES (1, 'a', CURRENT_TIMESTAMP())", tableName);
 
     assertEquals("View should have expected rows",
         ImmutableList.of(row(1L, "a"), row(1L, "a")),
         sql("SELECT * FROM tmp"));
   }
+
+  @Test
+  public void testAddPartition() {
+    sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);

Review comment:
       fixed






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780740945



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +324,77 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    if (!icebergTable.spec().isUnpartitioned()){
+      if (names.length > 0){
+        return new InternalRow[]{ident};
+      }
+      else {

Review comment:
       updated, thanks.






[GitHub] [iceberg] smallx commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
smallx commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r780732820



##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/sql/TestPartitionedWrites.java
##########
@@ -170,10 +170,25 @@ public void testViewsReturnRecentResults() {
         ImmutableList.of(row(1L, "a")),
         sql("SELECT * FROM tmp"));
 
-    sql("INSERT INTO TABLE %s VALUES (1, 'a')", tableName);
+    sql("INSERT INTO TABLE %s VALUES (1, 'a', CURRENT_TIMESTAMP())", tableName);
 
     assertEquals("View should have expected rows",
         ImmutableList.of(row(1L, "a"), row(1L, "a")),
         sql("SELECT * FROM tmp"));
   }
+
+  @Test
+  public void testAddPartition() {
+    sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);
+  }
+
+  @Test
+  public void testDropPartition() {
+    sql("ALTER TABLE %s DROP IF EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);

Review comment:
       Same as above.

##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +324,77 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    if (!icebergTable.spec().isUnpartitioned()){
+      if (names.length > 0){
+        return new InternalRow[]{ident};
+      }
+      else {

Review comment:
       code style:
   ```java
   } else {
   ```

##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +324,77 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  @Override
+  public StructType partitionSchema() {
+    Schema schema = icebergTable.spec().schema();
+    List<PartitionField> fields = icebergTable.spec().fields();
+    List<Types.NestedField> structFields = Lists.newArrayListWithExpectedSize(fields.size());
+    fields.forEach(f -> {
+      Type resultType = Types.StringType.get();
+      Type sourceType = schema.findType(f.sourceId());
+      if (!f.name().endsWith("hour") && !f.name().endsWith("month")) {
+        resultType = f.transform().getResultType(sourceType);
+      }
+      structFields.add(Types.NestedField.optional(f.fieldId(), f.name(), resultType));
+    });
+    return (StructType) SparkSchemaUtil.convert(Types.StructType.of(structFields));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    // use Iceberg SQL extensions
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    // use Iceberg SQL extensions
+    return false;
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support [show partitions] syntax
+    if (!icebergTable.spec().isUnpartitioned()){
+      if (names.length > 0){
+        return new InternalRow[]{ident};
+      }
+      else {
+        String fileFormat = icebergTable.properties()
+                .getOrDefault(TableProperties.DEFAULT_FILE_FORMAT, TableProperties.DEFAULT_FILE_FORMAT_DEFAULT);
+        List<SparkTableUtil.SparkPartition> partitions = Spark3Util.getPartitions(sparkSession(),
+                new Path(icebergTable.location().concat("\\data")), fileFormat);
+        List<InternalRow> rows = Lists.newArrayList();
+        StructType schema = partitionSchema();
+        StructField[] fields = schema.fields();
+        partitions.forEach(p -> {
+          int i = 0;
+          Map<String, String> values = p.getValues();
+          List<Object> dataTypeVal = Lists.newArrayList();
+          while (i < fields.length) {
+            DataType dataType = schema.apply(fields[i].name()).dataType();
+            dataTypeVal.add(Spark3Util.convertPartitionType(values.get(fields[i].name()), dataType));
+            i += 1;
+          }
+          rows.add(new GenericInternalRow(dataTypeVal.toArray()));
+        });
+        return rows.toArray(new InternalRow[0]);
+      }
+    }
+    else{

Review comment:
       Same as above.

##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/Spark3Util.java
##########
@@ -850,4 +865,46 @@ public String unknown(String sourceName, int sourceId, String transform,
       return String.format("%s(%s) %s %s", transform, sourceName, direction, nullOrder);
     }
   }
+
+  public static Object convertPartitionType(Object value, DataType dataType) {
+    if (value == null && dataType instanceof NullType) {
+      return null;
+    }
+    String old = String.valueOf(value);
+    if (dataType instanceof BooleanType) {
+      return DatatypeConverter.parseBoolean(old);
+    }
+    else if (dataType instanceof ByteType) {

Review comment:
       The code style should be consistent with the existing code:
   ```java
   if (...) {
     ...
   } else if (...) {
     ...
   }
   ```

##########
File path: spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/sql/TestPartitionedWrites.java
##########
@@ -170,10 +170,25 @@ public void testViewsReturnRecentResults() {
         ImmutableList.of(row(1L, "a")),
         sql("SELECT * FROM tmp"));
 
-    sql("INSERT INTO TABLE %s VALUES (1, 'a')", tableName);
+    sql("INSERT INTO TABLE %s VALUES (1, 'a', CURRENT_TIMESTAMP())", tableName);
 
     assertEquals("View should have expected rows",
         ImmutableList.of(row(1L, "a"), row(1L, "a")),
         sql("SELECT * FROM tmp"));
   }
+
+  @Test
+  public void testAddPartition() {
+    sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);

Review comment:
       It may be better to check that the newly added partition actually exists.
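
   For example (a sketch; it assumes the `sql(...)` helper returns the result rows, as it does elsewhere in these tests, and plain JUnit asserts):

   ```java
   sql("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (id_bucket=2, data_trunc='2022', ts_hour='2022-01-08-23')", tableName);
   // Assert the partition is actually listed, not just that the DDL succeeded.
   List<Object[]> partitions = sql("SHOW PARTITIONS %s", tableName);
   Assert.assertTrue("Added partition should be visible",
       partitions.stream().anyMatch(row -> String.valueOf(row[0]).contains("id_bucket=2")));
   ```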






[GitHub] [iceberg] felixYyu commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
felixYyu commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r805632126



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -313,4 +405,29 @@ private static CaseInsensitiveStringMap addSnapshotId(CaseInsensitiveStringMap o
 
     return options;
   }
+
+  private static Object[] transform(StructType schema, Object[] values) {
+    Object[] resultValues = new Object[values.length];

Review comment:
       deleted this code






[GitHub] [iceberg] RussellSpitzer commented on a change in pull request #3862: Spark: Supports partition management in V2 Catalog

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on a change in pull request #3862:
URL: https://github.com/apache/iceberg/pull/3862#discussion_r806300891



##########
File path: spark/v3.2/spark/src/main/java/org/apache/iceberg/spark/source/SparkTable.java
##########
@@ -272,6 +283,65 @@ public void deleteWhere(Filter[] filters) {
     }
   }
 
+  @Override
+  public StructType partitionSchema() {
+    return (StructType) SparkSchemaUtil.convert(Partitioning.partitionType(table()));
+  }
+
+  @Override
+  public void createPartition(InternalRow ident, Map<String, String> properties) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Cannot explicitly create partitions in Iceberg tables");
+  }
+
+  @Override
+  public boolean dropPartition(InternalRow ident) {
+    throw new UnsupportedOperationException("Cannot explicitly drop partitions in Iceberg tables");
+  }
+
+  @Override
+  public void replacePartitionMetadata(InternalRow ident, Map<String, String> properties)
+          throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public Map<String, String> loadPartitionMetadata(InternalRow ident) throws UnsupportedOperationException {
+    throw new UnsupportedOperationException("Iceberg partitions do not support metadata");
+  }
+
+  @Override
+  public InternalRow[] listPartitionIdentifiers(String[] names, InternalRow ident) {
+    // support show partitions
+    List<InternalRow> rows = Lists.newArrayList();
+    Dataset<Row> df = SparkTableUtil.loadMetadataTable(sparkSession(), icebergTable, MetadataTableType.PARTITIONS);
+    if (names.length > 0) {
+      StructType schema = partitionSchema();
+      df.collectAsList().forEach(row -> {
+        GenericRowWithSchema genericRow = (GenericRowWithSchema) row.apply(0);
+        boolean exits = true;
+        int index = 0;
+        while (index < names.length) {
+          DataType dataType = schema.apply(names[index]).dataType();

Review comment:
       It looks like we are trying to align the metadata table schema with the current table schema. I think we should still just display the metadata table partition values as-is, but if we do go this route, I think we still have an issue here.
   
   Consider a table
   ```
   Add Partition Column Identity (a)
   Remove Partition Column identity (a)
   Drop Column a
   Add Column a
   Add partition Column Identity (a)
   ```
   
   This should result in a row which has multiple "a"'s in the partition spec (at least I believe this is the current behavior). We should make sure we are correctly projecting columns in those cases. I think it is also OK for this to be a light wrapper around the partitions metadata table that simply lists the partitions in the extended schema it provides.
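
   In DDL terms, that sequence is roughly the following (a hedged sketch using the Iceberg SQL extension syntax and the `sql(...)` test helper; the column type is a placeholder):

   ```java
   sql("ALTER TABLE %s ADD PARTITION FIELD a", tableName);
   sql("ALTER TABLE %s DROP PARTITION FIELD a", tableName);
   sql("ALTER TABLE %s DROP COLUMN a", tableName);
   sql("ALTER TABLE %s ADD COLUMNS (a string)", tableName);
   sql("ALTER TABLE %s ADD PARTITION FIELD a", tableName);
   // The partition spec now holds two distinct fields that both render as "a",
   // which is the projection hazard described above.
   ```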
   
   I guess this may be a little odd for unpartitioned tables since they may still show that partitions do exist but this is probably more accurate ...
   @jackye1995 + @szehon-ho Any thoughts?



