Posted to commits@hudi.apache.org by "nsivabalan (via GitHub)" <gi...@apache.org> on 2023/03/17 20:20:01 UTC

[GitHub] [hudi] nsivabalan opened a new pull request, #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

nsivabalan opened a new pull request, #8223:
URL: https://github.com/apache/hudi/pull/8223

   ### Change Logs
   
   Fixes a corner-case bug where compaction in the metadata table (MDT) could be triggered while the data table (DT) has a partially failed commit.
   
   Currently, the logic to deduce pending instants in MDT is as follows:
   
   a = the latest completed delta commit in MDT.
   Find any inflight instants in the DT timeline **before** {a}.
   If there are none, we go ahead and may compact MDT.
   
   But what if the latest delta commit succeeded in MDT but failed in DT? The check above does not see that instant, so it could trigger compaction in MDT, which should not happen.
   
   So the right fix is:
   
   a = the latest completed delta commit in MDT.
   Find any inflight instants in the DT timeline **before or equal to** {a}.
   This prevents compaction from being triggered in MDT when there are inflight instants in DT that are already committed to MDT.
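   
   To make the boundary condition concrete, here is a minimal, self-contained sketch of the two checks. It does not use the Hudi API: the `Instant` class, the `pendingBefore` / `pendingBeforeOrEquals` helpers, and the sample timestamps are all hypothetical, and it assumes fixed-width instant timestamps that compare lexicographically.
   
   ```java
   import java.util.Arrays;
   import java.util.List;
   import java.util.stream.Collectors;

   // Illustrative sketch only; none of these types are Hudi classes.
   final class PendingInstantCheck {

       // Hypothetical stand-in for a timeline instant: a timestamp plus a completion flag.
       static final class Instant {
           final String timestamp;
           final boolean completed;

           Instant(String timestamp, boolean completed) {
               this.timestamp = timestamp;
               this.completed = completed;
           }
       }

       // Old check: pending DT instants strictly before the latest MDT delta commit.
       static List<Instant> pendingBefore(List<Instant> dtTimeline, String latestMdtDeltaCommit) {
           return dtTimeline.stream()
               .filter(i -> !i.completed && i.timestamp.compareTo(latestMdtDeltaCommit) < 0)
               .collect(Collectors.toList());
       }

       // Fixed check: pending DT instants before or equal to the latest MDT delta commit.
       static List<Instant> pendingBeforeOrEquals(List<Instant> dtTimeline, String latestMdtDeltaCommit) {
           return dtTimeline.stream()
               .filter(i -> !i.completed && i.timestamp.compareTo(latestMdtDeltaCommit) <= 0)
               .collect(Collectors.toList());
       }

       public static void main(String[] args) {
           // Assumed scenario: instant 003 completed in MDT but is still inflight in DT.
           List<Instant> dtTimeline = Arrays.asList(
               new Instant("001", true),
               new Instant("002", true),
               new Instant("003", false));
           String latestMdtDeltaCommit = "003";

           boolean oldCheckAllowsCompaction = pendingBefore(dtTimeline, latestMdtDeltaCommit).isEmpty();
           boolean newCheckAllowsCompaction = pendingBeforeOrEquals(dtTimeline, latestMdtDeltaCommit).isEmpty();

           System.out.println("old check allows MDT compaction: " + oldCheckAllowsCompaction);
           System.out.println("new check allows MDT compaction: " + newCheckAllowsCompaction);
       }
   }
   ```
   
   In this scenario the strict comparison misses instant 003 and lets compaction proceed, while the inclusive comparison finds it and blocks compaction.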
   
   ### Impact
   
   Stabilizes metadata table. 
   
   ### Risk level (write none, low medium or high below)
   
   medium
   
   ### Documentation Update
   
   N/A
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1476598790

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787",
       "triggerID" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4e5c7fc2d673c73abfab7929fdafc286d317b22",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15811",
       "triggerID" : "d4e5c7fc2d673c73abfab7929fdafc286d317b22",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dcd6dc67221858f64171b8799dd410df228c30a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787) 
   * d4e5c7fc2d673c73abfab7929fdafc286d317b22 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15811) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1474440744

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 20827b4d293981f971d183ad127d0f94cb82c2d0 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1474969326

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787",
       "triggerID" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dcd6dc67221858f64171b8799dd410df228c30a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1474434388

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 20827b4d293981f971d183ad127d0f94cb82c2d0 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] nsivabalan merged pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan merged PR #8223:
URL: https://github.com/apache/hudi/pull/8223




[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1476917518

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787",
       "triggerID" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4e5c7fc2d673c73abfab7929fdafc286d317b22",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15811",
       "triggerID" : "d4e5c7fc2d673c73abfab7929fdafc286d317b22",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * d4e5c7fc2d673c73abfab7929fdafc286d317b22 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15811) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1474930302

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787",
       "triggerID" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 20827b4d293981f971d183ad127d0f94cb82c2d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775) 
   * 8dcd6dc67221858f64171b8799dd410df228c30a Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] nsivabalan commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "nsivabalan (via GitHub)" <gi...@apache.org>.
nsivabalan commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1477158594

   CI is green
   <img width="1178" alt="image" src="https://user-images.githubusercontent.com/513218/226498623-fb6065ff-73fc-41d6-b6fa-ce4243f2becb.png">
   




[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1474565624

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 20827b4d293981f971d183ad127d0f94cb82c2d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] danny0405 commented on a diff in pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on code in PR #8223:
URL: https://github.com/apache/hudi/pull/8223#discussion_r1140992117


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1046,7 +1046,9 @@ protected void compactIfNecessary(BaseHoodieWriteClient writeClient, String inst
         .lastInstant().orElseThrow(() -> new HoodieMetadataException("No completed deltacommit in metadata table"))
         .getTimestamp();
     List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline().filterInflightsAndRequested()
-        .findInstantsBefore(latestDeltaCommitTimeInMetadataTable).getInstants();
+        // ignore pending indexing action

Review Comment:
   How about adding some documentation to help readers understand the use case, something like this:
   
   ```java
   delta_c1 (F3, F4) (MDT)
   delta_c1 (F1, F2) (DT)
         //
   c2.inflight (compaction triggers in DT)
   
   
   c2 (F7, F8) (compaction complete in MDT)
   c2 fails to commit to DT
         //
   delta_c4 (F9, F10) (MDT)
       -- can we trigger MDT compaction here? The answer is no!
       1. we have no instant filtering for HFile reader
       2. the avro log merge reader has instant filtering, but it can only filter out the invalid instants, not rollback
   delta_c4 (F11, F12) (DT)
         //
   r5 (to rollback c2) (MDT)
   -F7, -F8
   r5 (to rollback c2) (DT)
         //
   delta_c6 (F13, F14) (MDT) -- now the compaction can be triggered
   delta_c6 (F15, F16) (DT)
   ```





[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1474926470

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 20827b4d293981f971d183ad127d0f94cb82c2d0 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775) 
   * 8dcd6dc67221858f64171b8799dd410df228c30a UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #8223: [HUDI-5950] Fixing pending instant deduction to trigger compaction in MDT

Posted by "hudi-bot (via GitHub)" <gi...@apache.org>.
hudi-bot commented on PR #8223:
URL: https://github.com/apache/hudi/pull/8223#issuecomment-1476587128

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15775",
       "triggerID" : "20827b4d293981f971d183ad127d0f94cb82c2d0",
       "triggerType" : "PUSH"
     }, {
       "hash" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787",
       "triggerID" : "8dcd6dc67221858f64171b8799dd410df228c30a",
       "triggerType" : "PUSH"
     }, {
       "hash" : "d4e5c7fc2d673c73abfab7929fdafc286d317b22",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "d4e5c7fc2d673c73abfab7929fdafc286d317b22",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 8dcd6dc67221858f64171b8799dd410df228c30a Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=15787) 
   * d4e5c7fc2d673c73abfab7929fdafc286d317b22 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

