Posted to gitbox@hive.apache.org by GitBox <gi...@apache.org> on 2021/05/26 08:07:47 UTC

[GitHub] [hive] lcspinter commented on a change in pull request #2316: HIVE-25161: Implement partitioned CTAS for iceberg tables

lcspinter commented on a change in pull request #2316:
URL: https://github.com/apache/hive/pull/2316#discussion_r639492832



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
##########
@@ -736,6 +740,21 @@ protected void initializeOp(Configuration hconf) throws HiveException {
     }
   }
 
+  private boolean skipPartitionCheck() {
+    return Optional.ofNullable(conf).map(FileSinkDesc::getTableInfo)
+        .map(TableDesc::getProperties)
+        .map(props -> props.getProperty(hive_metastoreConstants.META_TABLE_STORAGE))
+        .map(handler -> {
+          try {
+            return HiveUtils.getStorageHandler(hconf, handler);
+          } catch (HiveException e) {
+            return null;
+          }
+        })
+        .map(HiveStorageHandler::alwaysUnpartitioned)

Review comment:
       Wouldn't this end up in a `NullPointerException` when a `HiveException` is thrown?
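
For reference, `java.util.Optional.map` is documented to return an empty `Optional` when the mapping function returns `null`, so the later `map` calls in the chain are skipped rather than dereferencing `null`. The sketch below checks that behavior in isolation. `Handler` and `lookup` are hypothetical stand-ins for `HiveStorageHandler` and `HiveUtils.getStorageHandler`, and the trailing `orElse(false)` is an assumption, since the quoted hunk ends before the chain's terminal call.

```java
import java.util.Optional;

public class OptionalNullMapDemo {

  // Hypothetical stand-in for HiveStorageHandler; only the method under discussion is modeled.
  interface Handler {
    boolean alwaysUnpartitioned();
  }

  // Hypothetical stand-in for HiveUtils.getStorageHandler: throws a checked exception for unknown names.
  static Handler lookup(String name) throws Exception {
    if (!"iceberg".equals(name)) {
      throw new Exception("cannot load handler: " + name);
    }
    return () -> true;
  }

  public static boolean skipPartitionCheck(String handlerName) {
    return Optional.ofNullable(handlerName)
        .map(name -> {
          try {
            return lookup(name);
          } catch (Exception e) {
            return null; // map() turns a null result into Optional.empty()
          }
        })
        .map(Handler::alwaysUnpartitioned) // not invoked when the Optional is empty
        .orElse(false);                    // assumed terminal call, not shown in the diff
  }

  public static void main(String[] args) {
    System.out.println(skipPartitionCheck("iceberg")); // handler found
    System.out.println(skipPartitionCheck("unknown")); // lookup throws, null is returned from map()
    System.out.println(skipPartitionCheck(null));      // no handler name at all
  }
}
```

Under these assumptions the failed lookup falls through to the default value without throwing. The exception is swallowed silently though, so logging it inside the `catch` block may still be worth considering.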

##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java
##########
@@ -328,6 +345,7 @@ static void overlayTableProperties(Configuration configuration, TableDesc tableD
     map.put(InputFormatConfig.TABLE_IDENTIFIER, props.getProperty(Catalogs.NAME));
     map.put(InputFormatConfig.TABLE_LOCATION, table.location());
     map.put(InputFormatConfig.TABLE_SCHEMA, schemaJson);
+    props.put(InputFormatConfig.PARTITION_SPEC, PartitionSpecParser.toJson(table.spec()));

Review comment:
       This is not related to this change, but it seems to me that the javadoc and the name of the method are out of sync. Maybe we should separate the logic that strictly concerns storing serializable table data from the code that updates table properties.

##########
File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergSerDe.java
##########
@@ -151,7 +152,23 @@ public void initialize(@Nullable Configuration configuration, Properties serDePr
   private void createTableForCTAS(Configuration configuration, Properties serDeProperties) {
     serDeProperties.setProperty(TableProperties.ENGINE_HIVE_ENABLED, "true");
     serDeProperties.setProperty(InputFormatConfig.TABLE_SCHEMA, SchemaParser.toJson(tableSchema));
+
+    // build partition spec, if any
+    if (serDeProperties.getProperty(serdeConstants.LIST_PARTITION_COLUMNS) != null) {
+      String[] partCols = serDeProperties.getProperty(serdeConstants.LIST_PARTITION_COLUMNS).split(",");

Review comment:
       Are we certain that a partition column name cannot contain `,`?
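
To illustrate the concern, here is a minimal sketch of how `String.split(",")` behaves on a joined list of names. It uses plain strings only; the column names are hypothetical, and whether Hive can actually produce a partition column name containing a comma is exactly the open question, not something this sketch settles.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionColumnSplitDemo {

  // Mirrors the split(",") call from the diff on a plain string; no Hive classes are involved.
  public static List<String> splitOnComma(String joined) {
    return Arrays.asList(joined.split(","));
  }

  public static void main(String[] args) {
    // Two ordinary column names round-trip as expected.
    System.out.println(splitOnComma("dept,region"));
    // A hypothetical column named "a,b" joined with "c" cannot be told apart from three columns.
    System.out.println(splitOnComma("a,b,c"));
  }
}
```

If such names turn out to be possible, the split would yield more columns than were declared, so the partition spec would be built from names that do not exist in the schema.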

##########
File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java
##########
@@ -540,6 +540,43 @@ public void testCTASFromHiveTable() {
     Assert.assertArrayEquals(new Object[]{2L, "Linda", "Finance"}, objects.get(1));
   }
 
+  @Test
+  public void testCTASPartitionedFromHiveTable() throws TException, InterruptedException {
+    Assume.assumeTrue("CTAS target table is supported fully only for HiveCatalog tables." +

Review comment:
       Can we do a similar check in production code as well? It would be good to warn the end user about this limitation.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


