Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/06/13 02:34:59 UTC

[GitHub] [hudi] leesf commented on a change in pull request #1720: [HUDI-1003] Handle partitions correctly for syncing hudi non-partitioned table to hive

leesf commented on a change in pull request #1720:
URL: https://github.com/apache/hudi/pull/1720#discussion_r439701681



##########
File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -247,7 +247,13 @@ private[hudi] object HoodieSparkSqlWriter {
     hiveSyncConfig.hivePass = parameters(HIVE_PASS_OPT_KEY)
     hiveSyncConfig.jdbcUrl = parameters(HIVE_URL_OPT_KEY)
     hiveSyncConfig.partitionFields =
-      ListBuffer(parameters(HIVE_PARTITION_FIELDS_OPT_KEY).split(",").map(_.trim).filter(!_.isEmpty).toList: _*)
+      // Set partitionFields to empty, when the NonPartitionedExtractor is used
+      if (classOf[NonPartitionedExtractor].getName.equals(parameters(HIVE_PARTITION_EXTRACTOR_CLASS_OPT_KEY))) {
+        log.warn(s"Parameter '$HIVE_PARTITION_FIELDS_OPT_KEY' is ignored, since the NonPartitionedExtractor is used")
+        Array.empty[String].toList
+      } else {
+        ListBuffer(parameters(HIVE_PARTITION_FIELDS_OPT_KEY).split(",").map(_.trim).filter(!_.isEmpty).toList: _*)
+      }

Review comment:
       I think we should move this logic into the hudi-hive module. Writing data to hudi via the Spark datasource and syncing to hive is only one path; users may also sync to hive directly through the API (HiveSyncTool), and we should handle that case as well.
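
A minimal sketch of what the reviewer suggests: hoisting the "ignore partition fields when the NonPartitionedExtractor is configured" rule into a single helper (e.g. in the hudi-hive module) so that both the Spark datasource sync path and a direct HiveSyncTool invocation apply it consistently. The class and method names below are illustrative assumptions, not actual Hudi APIs; only the extractor class name comes from the patch above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper centralizing the partition-field resolution rule,
// so the Spark writer and HiveSyncTool can share it instead of each
// re-implementing the NonPartitionedExtractor special case.
public class PartitionFieldsResolver {

  // Fully-qualified name of Hudi's NonPartitionedExtractor, hard-coded
  // here so this sketch compiles without Hudi on the classpath.
  static final String NON_PARTITIONED_EXTRACTOR =
      "org.apache.hudi.hive.NonPartitionedExtractor";

  /**
   * Returns the effective partition fields for Hive sync: an empty list
   * when the NonPartitionedExtractor is configured (mirroring the patch
   * above), otherwise the trimmed, non-empty entries of the
   * comma-separated user setting.
   */
  static List<String> resolve(String extractorClass, String partitionFieldsCsv) {
    if (NON_PARTITIONED_EXTRACTOR.equals(extractorClass)) {
      // User-supplied partition fields are ignored for non-partitioned tables.
      return Collections.emptyList();
    }
    return Arrays.stream(partitionFieldsCsv.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toList());
  }
}
```

With a helper like this, the Scala branch in HoodieSparkSqlWriter and the standalone sync tool would both delegate to one place, so the two sync paths cannot drift apart.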




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org