You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/02/17 11:34:52 UTC

[GitHub] [hudi] vinothchandar commented on a change in pull request #2494: [HUDI-1552] Improve performance of key lookups from base file in Metadata Table.

vinothchandar commented on a change in pull request #2494:
URL: https://github.com/apache/hudi/pull/2494#discussion_r577537335



##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java
##########
@@ -112,13 +113,59 @@ private void initIfNeeded() {
 
   @Override
   protected Option<HoodieRecord<HoodieMetadataPayload>> getRecordByKeyFromMetadata(String key) {
+    // This function can be called in parallel through multiple threads. For each thread, we determine the thread-local
+    // versions of the baseFile and logRecord readers to use.
+    // - If reuse is enabled, we use the same readers and dont close them
+    // - if reuse is disabled, we open new readers in each thread and close them
+    HoodieFileReader localFileReader = null;
+    HoodieMetadataMergedLogRecordScanner localLogRecordScanner = null;
+    synchronized (this) {

Review comment:
       if we synchronize access here, why synchronize at the lower levels?

##########
File path: hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java
##########
@@ -112,13 +113,59 @@ private void initIfNeeded() {
 
   @Override
   protected Option<HoodieRecord<HoodieMetadataPayload>> getRecordByKeyFromMetadata(String key) {
+    // This function can be called in parallel through multiple threads. For each thread, we determine the thread-local
+    // versions of the baseFile and logRecord readers to use.
+    // - If reuse is enabled, we use the same readers and dont close them
+    // - if reuse is disabled, we open new readers in each thread and close them
+    HoodieFileReader localFileReader = null;
+    HoodieMetadataMergedLogRecordScanner localLogRecordScanner = null;
+    synchronized (this) {
+      if (!metadataConfig.enableReuse()) {

Review comment:
       can we please go back to the previous style of lazily opening/closing depending on the config and hiding it from the flow of reading and processing the values. 
   
   Happy to jump of a call if needed. But typically, reusing fewer variables and fewer branching the better. This does add signficant tax when reading the code. Thats why I changed them this way




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org