Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/02 16:11:00 UTC

[GitHub] [hudi] nsivabalan commented on a change in pull request #5213: [HUDI-3776] Fix BloomIndex incorrectly using ColStats to lookup record location

nsivabalan commented on a change in pull request #5213:
URL: https://github.com/apache/hudi/pull/5213#discussion_r841092654



##########
File path: hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
##########
@@ -138,6 +134,28 @@ public HoodieBloomIndex(HoodieWriteConfig config, BaseHoodieBloomIndexHelper blo
         partitionRecordKeyPairs, fileComparisonPairs, partitionToFileInfo, recordsPerPartition);
   }
 
+  private List<Pair<String, BloomIndexFileInfo>> getBloomIndexFileInfoForPartitions(HoodieEngineContext context,
+                                                                                    HoodieTable hoodieTable,
+                                                                                    List<String> affectedPartitionPathList) {
+    List<Pair<String, BloomIndexFileInfo>> fileInfoList = new ArrayList<>();
+
+    if (config.getBloomIndexPruneByRanges()) {
+      // load column ranges from metadata index if column stats index is enabled and column_stats metadata partition is available
+      if (config.isMetadataColumnStatsIndexEnabled()

Review comment:
       Just so we are on the same page: we will call out in our release notes the right way to go about disabling certain partitions in the MDT.
   From what we discussed offline:
   We have to fix disabling of any MDT partition so that it works the same way as disabling the metadata table completely. At a minimum, we have to delete the directory and update the table config.
   That way, we keep the tableConfig in a pristine state. If it says a certain partition is good to use, that partition should be in a consumable state. If it is not (e.g., someone manually deleted one of the MDT partitions), the pipeline should fail.
   If the table config says a partition is not in a usable state, no writers or readers should ever use it.
   
   So, coming back to this patch: relying on completedMetadataPartitions alone should be good enough, in my opinion. But we can let this proceed; it is just an extra guard. In general, though, let's try to keep the table config in a good state at any point in time.
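
   To make the contract above concrete, here is a minimal sketch of the three cases. It is not Hudi's actual API: `MetadataPartitionGuard`, `decide`, and the two sets passed in are hypothetical names used only for illustration. It assumes the table config exposes the set of completed MDT partitions and that the caller can list which MDT partition directories actually exist on storage.

```java
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hudi.
final class MetadataPartitionGuard {

  enum Decision { USE_INDEX, SKIP_INDEX }

  /**
   * Treats the table config as the source of truth.
   *
   * @param completedInTableConfig partitions the table config marks as usable (assumed input)
   * @param presentOnStorage       partitions whose directories exist on storage (assumed input)
   * @param partition              the MDT partition being looked up, e.g. "column_stats"
   */
  static Decision decide(Set<String> completedInTableConfig,
                         Set<String> presentOnStorage,
                         String partition) {
    if (!completedInTableConfig.contains(partition)) {
      // Table config says the partition is not usable: no reader or writer touches it.
      return Decision.SKIP_INDEX;
    }
    if (!presentOnStorage.contains(partition)) {
      // Table config says usable, but the data is gone (e.g. manually deleted): fail loudly.
      throw new IllegalStateException(
          "Table config lists MDT partition '" + partition + "' as completed, but it is missing on storage");
    }
    // Table config says usable and the partition is consumable.
    return Decision.USE_INDEX;
  }

  public static void main(String[] args) {
    Set<String> completed = Set.of("files", "column_stats");
    Set<String> onStorage = Set.of("files", "column_stats");
    System.out.println(decide(completed, onStorage, "column_stats"));
    System.out.println(decide(completed, onStorage, "bloom_filters"));
  }
}
```

   With this shape, a check against the table config alone decides whether the index is used, and the storage check only exists to turn an inconsistent state into a failure instead of a silent fallback.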



