Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/28 14:37:32 UTC

[GitHub] [hudi] lw309637554 commented on a change in pull request #2475: [HUDI-1527] automatically infer the data directory, users only need to specify the table directory

lw309637554 commented on a change in pull request #2475:
URL: https://github.com/apache/hudi/pull/2475#discussion_r584307500



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
##########
@@ -84,6 +88,26 @@ class DefaultSource extends RelationProvider
     val tablePath = DataSourceUtils.getTablePath(fs, globPaths.toArray)
     log.info("Obtained hudi table path: " + tablePath)
 
+    if (path.nonEmpty) {
+      val _path = path.get.stripSuffix("/")
+      val pathTmp = new Path(_path).makeQualified(fs.getUri, fs.getWorkingDirectory)
+      // If the user specifies the table path, the data path is automatically inferred
+      if (pathTmp.toString.equals(tablePath)) {
+        val sparkEngineContext = new HoodieSparkEngineContext(sqlContext.sparkContext)
+        val fsBackedTableMetadata =
+          new FileSystemBackedTableMetadata(sparkEngineContext, new SerializableConfiguration(fs.getConf), tablePath, false)
+        val partitionPaths = fsBackedTableMetadata.getAllPartitionPaths

Review comment:
       @teeyog Hello. Currently the partitions are inferred by calling getAllPartitionPaths on the table metadata.
   The partitioning is already set through hoodie.datasource.write.partitionpath.field when the Hudi table is written. Can we persist hoodie.datasource.write.partitionpath.field to the metadata table? Then the read path would only need to read that property instead of listing all partition paths. cc @vinothchandar
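
   A minimal sketch of the idea, not Hudi's actual API: it assumes the partition field list is stored under the same key as the write config in a plain properties blob, and that the reader only needs the number of partition fields to build a glob suffix. `PartitionFieldSketch`, `persist`, `partitionDepth` and `globSuffix` are hypothetical names used for illustration. It also assumes one directory level per partition field, which does not hold for key generators that expand a single field into several path segments.

```scala
import java.io.{StringReader, StringWriter}
import java.util.Properties

// Hypothetical helper; none of these names exist in Hudi.
object PartitionFieldSketch {
  // Key reused from the write config; storing it with the table is the proposal, not current behavior.
  val PartitionFieldKey = "hoodie.datasource.write.partitionpath.field"

  // Write side: record the partition field list and serialize the properties.
  def persist(props: Properties, partitionField: String): String = {
    props.setProperty(PartitionFieldKey, partitionField)
    val out = new StringWriter()
    props.store(out, null)
    out.toString
  }

  // Read side: recover the number of partition fields without listing any directories.
  def partitionDepth(serialized: String): Option[Int] = {
    val props = new Properties()
    props.load(new StringReader(serialized))
    Option(props.getProperty(PartitionFieldKey))
      .map(_.split(",").map(_.trim).count(_.nonEmpty))
  }

  // One "/*" per partition level plus one for the data files themselves.
  def globSuffix(depth: Int): String = "/*" * (depth + 1)
}
```

   With something like this, the read path could fall back to getAllPartitionPaths only when the property is missing, for example on tables written before the property was persisted.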




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org