You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/07/28 20:23:14 UTC

[GitHub] [iceberg] jackye1995 opened a new pull request #2887: Core: allow creating v2 tables through table property

jackye1995 opened a new pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887


   Based on the discussion on dev list, it would be good to have a way to directly create experimental v2 tables. This is my current proposal:
   
   When `format-version` is specifying in the table properties section of a CREATE TABLE DDL, we use the specific format version, otherwise use default version
   
   Some considerations I had:
   1. `format-version` is not a part of `TableProperties`, but just a property used at table creation time. The value is never stored as actual table property, and updating this property has no effect on table version.
   2. Because of 1, we do not use a traditional property syntax like `format.version` but instead use `table-format` to distinguish it from other actual table properties.
   3. I am trying to make this a long-term solution for future format versions, so I made the property a public variable that can be imported.
   4. The table properties section of a CREATE TABLE DDL is used because it exists in all SQL dialects, which means that all engines can have this functionality to test experimental feature through this change.
   
   It is a bit hard to test this in unit test because format version is not publicly accessible. I have tested with Spark on EMR and manually verified that the metadata file shows the correct version.
   
   @rdblue @openinx 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#issuecomment-892030925


   @rdblue fixed based on your comments, should be good for another look


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681976935



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -57,31 +58,29 @@
 
   private static final long ONE_MINUTE = TimeUnit.MINUTES.toMillis(1);
 
-  /**
-   * @deprecated will be removed in 0.9.0; use newTableMetadata(Schema, PartitionSpec, String, Map) instead.
-   */
-  @Deprecated
-  public static TableMetadata newTableMetadata(TableOperations ops,
-                                               Schema schema,
-                                               PartitionSpec spec,
-                                               String location,
-                                               Map<String, String> properties) {
-    return newTableMetadata(schema, spec, SortOrder.unsorted(), location, properties, DEFAULT_TABLE_FORMAT_VERSION);
-  }
-
   public static TableMetadata newTableMetadata(Schema schema,
                                                PartitionSpec spec,
                                                SortOrder sortOrder,
                                                String location,
                                                Map<String, String> properties) {
-    return newTableMetadata(schema, spec, sortOrder, location, properties, DEFAULT_TABLE_FORMAT_VERSION);
+    return newTableMetadata(schema, spec, sortOrder, location, unreservedProperties(properties),
+        PropertyUtil.propertyAsInt(properties, TableProperties.RESERVED_PROPERTY_FORMAT_VERSION,
+            DEFAULT_TABLE_FORMAT_VERSION));
   }
 
   public static TableMetadata newTableMetadata(Schema schema,
                                                PartitionSpec spec,
                                                String location,
                                                Map<String, String> properties) {
-    return newTableMetadata(schema, spec, SortOrder.unsorted(), location, properties, DEFAULT_TABLE_FORMAT_VERSION);
+    return newTableMetadata(schema, spec, SortOrder.unsorted(), location, unreservedProperties(properties),
+        PropertyUtil.propertyAsInt(properties, TableProperties.RESERVED_PROPERTY_FORMAT_VERSION,

Review comment:
       nit: I think defining some helper variables will make it a little bit easier to read. I usually try to avoid splitting code over multiple lines if possible.
   
   ```
   SortOrder sortOrder = SortOrder.unsorted();
   int formatVersion = PropertyUtil.propertyAsInt(properties,
       TableProperties.RESERVED_PROPERTY_FORMAT_VERSION,
       DEFAULT_TABLE_FORMAT_VERSION);
   return newTableMetadata(schema, spec, sortOrder, location, unreservedProperties(properties), formatVersion); 
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
openinx commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r679769740



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -679,10 +700,23 @@ private TableMetadata setCurrentSnapshotTo(Snapshot snapshot) {
 
   public TableMetadata replaceProperties(Map<String, String> newProperties) {
     ValidationException.check(newProperties != null, "Cannot set properties to null");
-    return new TableMetadata(null, formatVersion, uuid, location,
+    Map<String, String> validProperties = newProperties;
+    int newFormatVersion = formatVersion;
+    if (newProperties.containsKey(PROPERTY_FORMAT_VERSION)) {
+      validProperties = Maps.newHashMap(newProperties);
+      newFormatVersion = Integer.parseInt(validProperties.remove(PROPERTY_FORMAT_VERSION));
+    }

Review comment:
       I see lots of this code block in this PR, I'd like to have a small abstracted static method:
   
   ```java
     private static int parseFormatVersion(Map<String, String> properties, int defaultFormatVersion) {
       return properties.containsKey(PROPERTY_FORMAT_VERSION) ?
           Integer.parseInt(properties.remove(PROPERTY_FORMAT_VERSION)) : defaultFormatVersion;
     }
   ```
   
   Then we could simplify this block as: 
   
   ```java
       Map<String, String> validProperties = Maps.newHashMap(newProperties);
       int newFormatVersion = parseFormatVersion(validProperties, formatVersion);
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] aokolnychyi commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681979804



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -19,11 +19,38 @@
 
 package org.apache.iceberg;
 
+import java.util.Set;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+
 public class TableProperties {
 
   private TableProperties() {
   }
 
+  /**
+   * Reserved table property for table format version.
+   * <p>
+   * Iceberg will default a new table's format version to the latest stable and recommended version.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: incomplete or unstable versions cannot be selected using this property.
+   */
+  public static final String RESERVED_PROPERTY_FORMAT_VERSION = "format-version";

Review comment:
       nit: Do we add the `RESERVED_PROPERTY_` prefix to be explicit? I am wondering whether it is worth making the var name longer as we have `RESERVED_PROPERTIES` below.
   
   Just asking. I am fine either way.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r682133840



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -19,11 +19,38 @@
 
 package org.apache.iceberg;
 
+import java.util.Set;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+
 public class TableProperties {
 
   private TableProperties() {
   }
 
+  /**
+   * Reserved table property for table format version.
+   * <p>
+   * Iceberg will default a new table's format version to the latest stable and recommended version.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: incomplete or unstable versions cannot be selected using this property.

Review comment:
       I don't think that this is actually enforced. Should we add a test and validate the versions that come from table properties?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680574231



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -679,10 +700,23 @@ private TableMetadata setCurrentSnapshotTo(Snapshot snapshot) {
 
   public TableMetadata replaceProperties(Map<String, String> newProperties) {
     ValidationException.check(newProperties != null, "Cannot set properties to null");
-    return new TableMetadata(null, formatVersion, uuid, location,
+    Map<String, String> validProperties = newProperties;
+    int newFormatVersion = formatVersion;
+    if (newProperties.containsKey(PROPERTY_FORMAT_VERSION)) {
+      validProperties = Maps.newHashMap(newProperties);
+      newFormatVersion = Integer.parseInt(validProperties.remove(PROPERTY_FORMAT_VERSION));
+    }
+
+    TableMetadata metadata = new TableMetadata(null, formatVersion, uuid, location,
         lastSequenceNumber, System.currentTimeMillis(), lastColumnId, currentSchemaId, schemas, defaultSpecId, specs,
-        lastAssignedPartitionId, defaultSortOrderId, sortOrders, newProperties, currentSnapshotId, snapshots,
-        snapshotLog, addPreviousFile(file, lastUpdatedMillis, newProperties));
+        lastAssignedPartitionId, defaultSortOrderId, sortOrders, validProperties, currentSnapshotId, snapshots,
+        snapshotLog, addPreviousFile(file, lastUpdatedMillis, validProperties));
+
+    if (formatVersion != newFormatVersion) {
+      metadata = metadata.upgradeToFormatVersion(newFormatVersion);

Review comment:
       It seems a bit weird to call another instance method to produce a second `TableMetadata` here. I'm thinking that maybe the builder pattern would work better for complicated cases... Seems fine for now though.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680571693



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -47,6 +47,19 @@
  * Metadata for a table.
  */
 public class TableMetadata implements Serializable {
+
+  /**
+   * Iceberg always use the latest stable format version when creating a new table.

Review comment:
       This isn't correct after this commit. I think it should be "Reserved table property for table format version. Iceberg will default a new table's format version to the latest stable and recommended version."
   
   Also, I think this should note that incomplete or unstable versions cannot be selected using this property. When we add v3, this should not be able to set it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681897381



##########
File path: flink/src/test/java/org/apache/iceberg/flink/TestFlinkCatalogTable.java
##########
@@ -239,6 +241,29 @@ public void testCreatePartitionTable() throws TableNotExistException {
     Assert.assertEquals(Collections.singletonList("dt"), catalogTable.getPartitionKeys());
   }
 
+  @Test
+  public void testCreateTableWithFormatV2ThroughTableProperty() throws Exception {
+    sql("CREATE TABLE tl(id BIGINT) WITH ('format-version'='2')");
+
+    Table table = table("tl");
+    Assert.assertEquals("should create table using format v2",
+        2, ((BaseTable) table).operations().current().formatVersion());
+  }
+
+  @Test
+  public void testUpgradeTableWithFormatV2ThroughTableProperty() throws Exception {

Review comment:
       It would be nice to have a test that validates you can't downgrade a table using the table property.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681883365



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -19,11 +19,38 @@
 
 package org.apache.iceberg;
 
+import java.util.List;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableList;
+
 public class TableProperties {
 
   private TableProperties() {
   }
 
+  /**
+   * Reserved table property for table format version.
+   * <p>
+   * Iceberg will default a new table's format version to the latest stable and recommended version.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: incomplete or unstable versions cannot be selected using this property.
+   */
+  public static final String RESERVED_PROPERTY_FORMAT_VERSION = "format-version";
+
+  /**
+   * Reserved Iceberg table properties list.
+   * <p>
+   * Reserved table properties are only used to control behaviors during the construction of table metadata.
+   * The value of these properties are not persisted as a part of the table metadata.
+   */
+  public static final List<String> RESERVED_PROPERTIES = ImmutableList.of(

Review comment:
       The code to filter out reserved properties calls `contains`, so I think this should be an `ImmutableSet` instead of a list.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680571762



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -47,6 +47,19 @@
  * Metadata for a table.
  */
 public class TableMetadata implements Serializable {
+
+  /**
+   * Iceberg always use the latest stable format version when creating a new table.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: If you set the Iceberg format to an unreleased version, software stability is not guaranteed.

Review comment:
       This reserved property should not allow setting the format version to anything other than a stable and adopted version.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r682133599



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -19,11 +19,38 @@
 
 package org.apache.iceberg;
 
+import java.util.Set;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+
 public class TableProperties {
 
   private TableProperties() {
   }
 
+  /**
+   * Reserved table property for table format version.
+   * <p>
+   * Iceberg will default a new table's format version to the latest stable and recommended version.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.

Review comment:
       Nit: this sentence conflicts with the note that the property isn't used for incomplete or unstable versions.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681882937



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -752,9 +760,15 @@ public TableMetadata buildReplacement(Schema updatedSchema, PartitionSpec update
       sortOrdersBuilder.add(freshSortOrder);
     }
 
-    Map<String, String> newProperties = Maps.newHashMap();
-    newProperties.putAll(this.properties);
-    newProperties.putAll(updatedProperties);
+    // prepare new table properties
+    Map<String, String> newRawProperties = Maps.newHashMap();
+    newRawProperties.putAll(this.properties);
+    newRawProperties.putAll(updatedProperties);
+    Map<String, String> newProperties = unreservedProperties(newRawProperties);

Review comment:
       You could have `unreservedProperties` return something else, like a filtered `Stream` if you wanted to still reuse it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r682134272



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -19,11 +19,38 @@
 
 package org.apache.iceberg;
 
+import java.util.Set;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+
 public class TableProperties {
 
   private TableProperties() {
   }
 
+  /**
+   * Reserved table property for table format version.
+   * <p>
+   * Iceberg will default a new table's format version to the latest stable and recommended version.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: incomplete or unstable versions cannot be selected using this property.
+   */
+  public static final String FORMAT_VERSION = "format-version";
+
+  /**
+   * Reserved Iceberg table properties list.
+   * <p>
+   * Reserved table properties are only used to control behaviors during the construction of table metadata.

Review comment:
       Is this correct? I think that the properties are used when constructing table metadata, but also when updating it like in `replaceProperties`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r679450041



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -47,6 +47,19 @@
  * Metadata for a table.
  */
 public class TableMetadata implements Serializable {
+
+  /**
+   * Iceberg always use the latest stable format version when creating a new table.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: If you set the Iceberg format to an unreleased version, software stability is not guaranteed.
+   */
+  public static final String PROPERTY_FORMAT_VERSION = "format-version";

Review comment:
       Yes you are right, updated!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#issuecomment-888718478


   > If we are going to set the table version through a reserved property, then I think we should build the reserved property into TableMetadata so it is consistent across all catalogs.
   
   @rdblue thanks for the comment! I was thinking to only allow people to create table in this way and update property will not have any effect, that's why I did it in the table builder directly. But now I think about it I realize it will also be beneficial to use this syntax to migrate existing tables to v2:
   
   ```sql
   ALTER TABLE SET TBLPROPERTIES ('format-version'='2')
   ```
   
   So it does make sense to do it directly in `TableMetadata`. I have updated the code and also tested with Spark to verify it works, please let me know if there is any other concern.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
openinx commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r679780479



##########
File path: core/src/test/java/org/apache/iceberg/TestTableMetadata.java
##########
@@ -756,4 +756,54 @@ public void testUpdateSchema() {
     Assert.assertEquals("Should return expected last column id",
         6, threeSchemaTable.lastColumnId());
   }
+
+  @Test
+  public void testCreateV2MetadataThroughTableProperty() {

Review comment:
       > It is a bit hard to test this in unit test because format version is not publicly accessible. I have tested with Spark on EMR and manually verified that the metadata file shows the correct version.
   
   I think we could access the format version by using the following code: 
   
   ```java
   Table table = ...
   TableOperations ops = ((BaseTable) table).operations();
   TableMetadata meta = ops.current();
   int formatVersion = meta.formatVersion();
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681881767



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -764,7 +778,7 @@ public TableMetadata buildReplacement(Schema updatedSchema, PartitionSpec update
       schemasBuilder.add(new Schema(freshSchemaId, freshSchema.columns(), freshSchema.identifierFieldIds()));
     }
 
-    return new TableMetadata(null, formatVersion, uuid, newLocation,
+    return new TableMetadata(null, newFormatVersion, uuid, newLocation,

Review comment:
       I think that this also needs to call `upgradeToFormatVersion` to ensure that there is no version rollback.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680572652



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -90,6 +103,14 @@ static TableMetadata newTableMetadata(Schema schema,
                                         String location,
                                         Map<String, String> properties,
                                         int formatVersion) {
+    // check if there is format version override in properties
+    Map<String, String> validProperties = properties;

Review comment:
       It would be better to rename the method argument to result in fewer changes and minimize the chance that a reference was missed. Also, if we are introducing reserved table properties, then we should make this more generic, like this:
   
   ```java
       Map<String, String> requestedProperties = rawProperties.entrySet().stream()
           .filter(entry -> !RESERVED_PROPERTIES.contains(entry.getKey()))
           .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
       int formatVersion = PropertyUtil.propertyAsInt(
           rawProperties, PROPERTY_FORMAT_VERSION, DEFAULT_TABLE_FORMAT_VERSION);
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680573143



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -90,6 +103,14 @@ static TableMetadata newTableMetadata(Schema schema,
                                         String location,
                                         Map<String, String> properties,
                                         int formatVersion) {
+    // check if there is format version override in properties
+    Map<String, String> validProperties = properties;
+    int formatVersionOverride = formatVersion;

Review comment:
       I think this needs to handle the conflict between `formatVersion` and `properties`. Since all of these methods are internal, we can change them however we like, but we should not have 2 ways to set the table version. I think there are two options:
   
   1. Set `tableVersion` from properties in the 2 overrides and validate that properties does not contain `format-version` here.
   2. Move the implementation to the override above that doesn't accept `formatVersion`. Then update this override to convert `formatVersion` to a property and validate that the properties passed here don't contain `format-version`.
   
   I would go for option 2. That makes handling a set of reserved properties cleaner.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680573474



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -90,6 +103,14 @@ static TableMetadata newTableMetadata(Schema schema,
                                         String location,
                                         Map<String, String> properties,
                                         int formatVersion) {
+    // check if there is format version override in properties
+    Map<String, String> validProperties = properties;
+    int formatVersionOverride = formatVersion;

Review comment:
       One more option I just thought about. I've been considering adding a builder here. We could use a builder and then set some validation so that `formatVersion` can only be configured once.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r682142503



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -19,11 +19,38 @@
 
 package org.apache.iceberg;
 
+import java.util.Set;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+
 public class TableProperties {
 
   private TableProperties() {
   }
 
+  /**
+   * Reserved table property for table format version.
+   * <p>
+   * Iceberg will default a new table's format version to the latest stable and recommended version.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: incomplete or unstable versions cannot be selected using this property.

Review comment:
       This is currently enforced because `SUPPORTED_TABLE_FORMAT_VERSION` is 2 and there are no unsupported versions. When we introduce v3, I think we should add a flag to `upgradeToFormatVersion` that controls whether to do the `SUPPORTED_TABLE_FORMAT_VERSION` check. That way the existing check validates this and there is a way in the future to "manually" update the format version to unsupported versions for testing.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#issuecomment-888630004


   Thanks for working on this, @jackye1995!
   
   I think we want this to be more general than doing it in `BaseMetastoreCatalog`. If we are going to set the table version through a reserved property, then I think we should build the reserved property into `TableMetadata` so it is consistent across all catalogs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681329433



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -47,6 +48,7 @@
  * Metadata for a table.
  */
 public class TableMetadata implements Serializable {
+

Review comment:
       Nit: unnecessary whitespace change.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681329287



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -57,31 +59,29 @@
 
   private static final long ONE_MINUTE = TimeUnit.MINUTES.toMillis(1);
 
-  /**
-   * @deprecated will be removed in 0.9.0; use newTableMetadata(Schema, PartitionSpec, String, Map) instead.

Review comment:
       Thanks for removing this!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681879159



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -57,31 +58,29 @@
 
   private static final long ONE_MINUTE = TimeUnit.MINUTES.toMillis(1);
 
-  /**
-   * @deprecated will be removed in 0.9.0; use newTableMetadata(Schema, PartitionSpec, String, Map) instead.
-   */
-  @Deprecated
-  public static TableMetadata newTableMetadata(TableOperations ops,
-                                               Schema schema,
-                                               PartitionSpec spec,
-                                               String location,
-                                               Map<String, String> properties) {
-    return newTableMetadata(schema, spec, SortOrder.unsorted(), location, properties, DEFAULT_TABLE_FORMAT_VERSION);
-  }
-
   public static TableMetadata newTableMetadata(Schema schema,
                                                PartitionSpec spec,
                                                SortOrder sortOrder,
                                                String location,
                                                Map<String, String> properties) {
-    return newTableMetadata(schema, spec, sortOrder, location, properties, DEFAULT_TABLE_FORMAT_VERSION);
+    return newTableMetadata(schema, spec, sortOrder, location, unreservedProperties(properties),

Review comment:
       I think that the `newTableMetadata` method that accepts `formatVersion` needs to ensure that there is no `format-version` property set. Otherwise, there is still a case where someone can set both.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
openinx commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r679774697



##########
File path: core/src/test/java/org/apache/iceberg/TestTableMetadata.java
##########
@@ -756,4 +756,54 @@ public void testUpdateSchema() {
     Assert.assertEquals("Should return expected last column id",
         6, threeSchemaTable.lastColumnId());
   }
+
+  @Test
+  public void testCreateV2MetadataThroughTableProperty() {

Review comment:
       Should we also add the unit tests in flink/spark/hive module to verify the end-to-end DDL work ?  If so, I think we could make it to be a separate PR for this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue merged pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680573814



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -679,10 +700,23 @@ private TableMetadata setCurrentSnapshotTo(Snapshot snapshot) {
 
   public TableMetadata replaceProperties(Map<String, String> newProperties) {
     ValidationException.check(newProperties != null, "Cannot set properties to null");
-    return new TableMetadata(null, formatVersion, uuid, location,
+    Map<String, String> validProperties = newProperties;
+    int newFormatVersion = formatVersion;
+    if (newProperties.containsKey(PROPERTY_FORMAT_VERSION)) {
+      validProperties = Maps.newHashMap(newProperties);
+      newFormatVersion = Integer.parseInt(validProperties.remove(PROPERTY_FORMAT_VERSION));
+    }

Review comment:
       We also already have `PropertyUtil` that you can use to make this one line.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681981826



##########
File path: core/src/main/java/org/apache/iceberg/TableProperties.java
##########
@@ -19,11 +19,38 @@
 
 package org.apache.iceberg;
 
+import java.util.Set;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet;
+
 public class TableProperties {
 
   private TableProperties() {
   }
 
+  /**
+   * Reserved table property for table format version.
+   * <p>
+   * Iceberg will default a new table's format version to the latest stable and recommended version.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: incomplete or unstable versions cannot be selected using this property.
+   */
+  public static final String RESERVED_PROPERTY_FORMAT_VERSION = "format-version";

Review comment:
       Yeah that's a good point, I had the prefix because there was no `RESERVED_PROPERTIES` set. But now it becomes a bit redundant.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r680572652



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -90,6 +103,14 @@ static TableMetadata newTableMetadata(Schema schema,
                                         String location,
                                         Map<String, String> properties,
                                         int formatVersion) {
+    // check if there is format version override in properties
+    Map<String, String> validProperties = properties;

Review comment:
       It would be better to rename the method argument (maybe `rawProperties`) to result in fewer changes and minimize the chance that a reference was missed. Also, if we are introducing reserved table properties, then we should make this more generic, like this:
   
   ```java
       Map<String, String> requestedProperties = rawProperties.entrySet().stream()
           .filter(entry -> !RESERVED_PROPERTIES.contains(entry.getKey()))
           .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
       int formatVersion = PropertyUtil.propertyAsInt(
           rawProperties, PROPERTY_FORMAT_VERSION, DEFAULT_TABLE_FORMAT_VERSION);
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] openinx commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
openinx commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r678776709



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -47,6 +47,19 @@
  * Metadata for a table.
  */
 public class TableMetadata implements Serializable {
+
+  /**
+   * Iceberg always use the latest stable format version when creating a new table.
+   * This reserved property keyword allows users to override the Iceberg format version of the table metadata.
+   * <p>
+   * If this table property exists when creating a table, the table will use the specified format version.
+   * If a table updates this property, it will try to upgrade to the specified format version.
+   * This helps developers to try out new features in an unreleased version or migrate existing tables to a new version.
+   * <p>
+   * Note: If you set the Iceberg format to an unreleased version, software stability is not guaranteed.
+   */
+  public static final String PROPERTY_FORMAT_VERSION = "format-version";

Review comment:
       Do we miss to override the default format version to be 2 in the [replaceTransaction](https://github.com/apache/iceberg/blob/21e1922a8ddb93a82388ea86a5f500d9f23885b3/core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java#L256) code path ? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] jackye1995 commented on pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
jackye1995 commented on pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#issuecomment-891344911


   @openinx thanks for the suggestion, I added tests for Spark, Flink and Hive.
   
   @rdblue I agree with the suggestions for the reserved table properties, I was a bit hesitated about if I should generalize it at this stage or not. If we want to do that, I have moved those static variables to `TableProperties` which I think is a better place to expose it, please let me know if you agree with this placement or not.
   
   For the builder pattern, yes I absolutely agree that we should have a builder given the number of arguments in the static methods and the variation of inputs. I can do that with a separated PR, and I was hoping this can get into 0.12.0 since we have voted to finalize format v2. It would be a bit awkward if the spec is finalized but people cannot try it out. Please let me know if this is possible. If so I can focus on getting this PR merged first, and then I can do a best effort to get the builder in, which can be before or after 0.12.0.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#issuecomment-892207451


   The latest changes are only to Javadoc, so I'm going to merge this based on the test result for 34094f2.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on a change in pull request #2887: Core: allow creating v2 tables through table property

Posted by GitBox <gi...@apache.org>.
rdblue commented on a change in pull request #2887:
URL: https://github.com/apache/iceberg/pull/2887#discussion_r681880940



##########
File path: core/src/main/java/org/apache/iceberg/TableMetadata.java
##########
@@ -752,9 +760,15 @@ public TableMetadata buildReplacement(Schema updatedSchema, PartitionSpec update
       sortOrdersBuilder.add(freshSortOrder);
     }
 
-    Map<String, String> newProperties = Maps.newHashMap();
-    newProperties.putAll(this.properties);
-    newProperties.putAll(updatedProperties);
+    // prepare new table properties
+    Map<String, String> newRawProperties = Maps.newHashMap();
+    newRawProperties.putAll(this.properties);
+    newRawProperties.putAll(updatedProperties);
+    Map<String, String> newProperties = unreservedProperties(newRawProperties);

Review comment:
       I think it would be better to construct the properties directly rather than building a map and then rebuilding it. You should be able to use the same `filter` to add just the non-reserved properties from `updatedProperties`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org