Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/09/23 09:20:12 UTC

[GitHub] [hudi] danny0405 commented on a change in pull request #3703: [HUDI-2480] FileSlice after pending compaction-requested instant-time…

danny0405 commented on a change in pull request #3703:
URL: https://github.com/apache/hudi/pull/3703#discussion_r714617062



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/MergeOnReadSnapshotRelation.scala
##########
@@ -151,8 +151,9 @@ class MergeOnReadSnapshotRelation(val sqlContext: SQLContext,
       // Load files from the global paths if it has defined to be compatible with the original mode
       val inMemoryFileIndex = HoodieSparkUtils.createInMemoryFileIndex(sqlContext.sparkSession, globPaths.get)
       val fsView = new HoodieTableFileSystemView(metaClient,
-        metaClient.getActiveTimeline.getCommitsTimeline
-          .filterCompletedInstants, inMemoryFileIndex.allFiles().toArray)
+        // file-slice after pending compaction-requested instant-time is also considered valid
+        metaClient.getCommitsAndCompactionTimeline.filterCompletedAndCompactionInstants,
+        inMemoryFileIndex.allFiles().toArray)

Review comment:
       Hi @vinothchandar @nsivabalan, I need your help with this view. I dug into the code a little, and two lines confused me: https://github.com/apache/hudi/blob/5515a0d319cbac835c65f6d21898ac1399d77ea3/hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java#L443 and
   https://github.com/apache/hudi/blob/5515a0d319cbac835c65f6d21898ac1399d77ea3/hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java#L120.
   
   What confuses me is how we can tell that log files whose base commit time is a pending compaction instant were committed successfully. I see some code that compares timestamps, but that is not enough: intermediate or corrupt files may also include log files with a pending compaction instant time as their base commit time, right?
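   
   To make the concern concrete, here is a minimal sketch. It is not Hudi's code: `LogFile`, `writerInstant`, `validByTimestamp`, and `validByWriter` are hypothetical names, and the instant times are made up. It only shows that a check based on the base commit time alone cannot separate a log file written by a completed delta commit from one left behind by a failed write; the stricter rule is one possible answer, not necessarily what the file system view does.

```scala
// Hypothetical model, not Hudi's API: a log file records the base instant of
// its file slice and the instant of the delta commit that wrote it.
final case class LogFile(name: String, baseInstant: String, writerInstant: String)

object PendingCompactionCheck {
  // Base-time-only rule: accept a log file if its base instant is either a
  // completed commit or a pending compaction instant.
  def validByTimestamp(f: LogFile,
                       completed: Set[String],
                       pendingCompaction: Set[String]): Boolean =
    completed.contains(f.baseInstant) || pendingCompaction.contains(f.baseInstant)

  // Stricter rule (an assumption for illustration): also require that the
  // delta commit that wrote the log file has completed.
  def validByWriter(f: LogFile,
                    completed: Set[String],
                    pendingCompaction: Set[String]): Boolean =
    validByTimestamp(f, completed, pendingCompaction) &&
      completed.contains(f.writerInstant)

  def main(args: Array[String]): Unit = {
    val completed = Set("001", "003")   // completed instants
    val pendingCompaction = Set("002")  // compaction requested, not yet done

    // Written by completed delta commit 003 on top of pending compaction 002.
    val good = LogFile("good.log", baseInstant = "002", writerInstant = "003")
    // Written by delta commit 004, which never completed.
    val leftover = LogFile("leftover.log", baseInstant = "002", writerInstant = "004")

    for (f <- Seq(good, leftover)) {
      println(s"${f.name}: baseTimeOnly=${validByTimestamp(f, completed, pendingCompaction)}, " +
        s"withWriter=${validByWriter(f, completed, pendingCompaction)}")
    }
  }
}
```

   Both files pass the base-time-only rule because they share the pending compaction instant `002` as base commit time; only the rule that also looks at the writing instant rejects `leftover.log`.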



