Posted to commits@hudi.apache.org by "codope (via GitHub)" <gi...@apache.org> on 2023/04/13 06:48:08 UTC

[GitHub] [hudi] codope commented on a diff in pull request #8402: [HUDI-6048] Check if partition exists before list partition by path prefix

codope commented on code in PR #8402:
URL: https://github.com/apache/hudi/pull/8402#discussion_r1165076903


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkHoodieTableFileIndex.scala:
##########
@@ -299,7 +299,9 @@ class SparkHoodieTableFileIndex(spark: SparkSession,
       // prefix to try to reduce the scope of the required file-listing
       val relativePartitionPathPrefix = composeRelativePartitionPath(staticPartitionColumnNameValuePairs)
 
-      if (staticPartitionColumnNameValuePairs.length == partitionColumnNames.length) {
+      if (!metaClient.getFs.exists(new Path(getBasePath, relativePartitionPathPrefix))) {

Review Comment:
   The `fs.exists` call is costly and will add latency. How often do we run into this scenario? The FS cache is invalidated on each refresh anyway, so I am wondering whether we really need the `fs.exists` check every time.
   Can we not simply catch the exception and continue?
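
   To illustrate the catch-and-continue idea, here is a minimal sketch. It uses `java.nio.file` as a stand-in for the Hadoop `FileSystem` API so that it is self-contained. That substitution, the `ListOrEmpty` object, and the `listPartitionFiles` helper are assumptions for illustration only, not code from this PR or from Hudi.

```scala
import java.nio.file.{Files, NoSuchFileException, Path}
import scala.jdk.CollectionConverters._
import scala.util.Using

object ListOrEmpty {
  // Hypothetical helper: lists the entries under a partition path prefix and
  // treats a missing directory as "no files", without a separate exists() probe.
  def listPartitionFiles(basePath: Path, relativePrefix: String): Seq[Path] = {
    val dir = basePath.resolve(relativePrefix)
    try {
      // Files.list throws NoSuchFileException when `dir` does not exist.
      Using.resource(Files.list(dir)) { stream =>
        // Materialize the entries before the stream is closed.
        stream.iterator().asScala.toList
      }
    } catch {
      // A missing partition directory is not an error here: continue with no files.
      case _: NoSuchFileException => Seq.empty
    }
  }
}
```

   Hadoop's `FileSystem.listStatus(Path)` is documented to throw `java.io.FileNotFoundException` when the path does not exist, so the same shape should carry over. Whether it fits the listing call used in this code path would need to be confirmed.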


