Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/10/19 08:24:06 UTC

[GitHub] [hudi] Zouxxyy opened a new pull request, #6999: [5057] Fix msck repair hudi table

Zouxxyy opened a new pull request, #6999:
URL: https://github.com/apache/hudi/pull/6999

   ### Change Logs
   
   _Describe context and summary for this change. Highlight if any code was copied._
   
   ### Impact
   
   _Describe any public API or user-facing feature change or any performance impact._
   
   ### Risk level (write none, low medium or high below)
   
   _If medium or high, explain what verification was done to mitigate the risks._
   
   ### Documentation Update
   
   _Describe any necessary documentation update if there is any new feature, config, or user-facing change_
   
   - _The config description must be updated if new configs are added or the default value of the configs are changed_
   - _Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
     ticket number here and follow the [instruction](https://hudi.apache.org/contribute/developer-setup#website) to make
     changes to the website._
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1287599781

   ## CI report:
   
   * a5e62636e794ec2c0eeb7ff3293d510ed0dd802e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429) 
   * 19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] YannByron merged pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron merged PR #6999:
URL: https://github.com/apache/hudi/pull/6999




[GitHub] [hudi] YannByron commented on a diff in pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron commented on code in PR #6999:
URL: https://github.com/apache/hudi/pull/6999#discussion_r1002863123


##########
hudi-spark-datasource/hudi-spark2/src/main/scala/org/apache/spark/sql/HoodieSpark2CatalystPlanUtils.scala:
##########
@@ -74,4 +74,15 @@ object HoodieSpark2CatalystPlanUtils extends HoodieCatalystPlansUtils {
   override def getRelationTimeTravel(plan: LogicalPlan): Option[(LogicalPlan, Option[Expression], Option[String])] = {
     throw new IllegalStateException(s"Should not call getRelationTimeTravel for spark2")
   }
+
+  override def isRepairTable(plan: LogicalPlan): Boolean = {
+    plan.isInstanceOf[AlterTableRecoverPartitionsCommand]
+  }
+
+  override def getRepairTableChildren(plan: LogicalPlan): Option[(TableIdentifier, Boolean, Boolean, String)] = {
+    plan match {
+      case c: AlterTableRecoverPartitionsCommand =>
+        Some((c.tableName, true, false, c.cmd))

Review Comment:
   Please explain in a code comment why `true` is used as the default value of the 2nd parameter and `false` as that of the 3rd.
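
One way to address this request is to name the two flags instead of passing bare literals. The sketch below is illustrative only: `TableIdSketch`, `RepairCommandSketch`, and `RepairFlagsSketch` are hypothetical stand-ins, not code from the PR. It assumes that the 2nd and 3rd tuple elements map onto the `enableAddPartitions` and `enableDropPartitions` parameters of `RepairHoodieTableCommand`, and that the legacy recover-partitions command only registers missing partitions and never drops any.

```scala
// Simplified stand-ins for Spark's TableIdentifier and the PR's
// RepairHoodieTableCommand. Both are hypothetical and exist only for illustration.
final case class TableIdSketch(table: String, database: Option[String] = None)

final case class RepairCommandSketch(tableName: TableIdSketch,
                                     enableAddPartitions: Boolean,
                                     enableDropPartitions: Boolean,
                                     cmd: String = "MSCK REPAIR TABLE")

object RepairFlagsSketch {
  // Assumption: the legacy command only adds partitions that are missing
  // from the catalog, so "add" is on and "drop" is off.
  val AddMissingPartitions: Boolean = true
  val DropStalePartitions: Boolean = false

  // Mirrors the shape of getRepairTableChildren's result tuple in the diff.
  def repairChildren(tableName: TableIdSketch,
                     cmd: String): (TableIdSketch, Boolean, Boolean, String) =
    (tableName, AddMissingPartitions, DropStalePartitions, cmd)

  // Shows how the tuple positions line up with the command's parameters.
  def toCommand(children: (TableIdSketch, Boolean, Boolean, String)): RepairCommandSketch = {
    val (tableName, add, drop, cmd) = children
    RepairCommandSketch(tableName, add, drop, cmd)
  }
}
```

Named constants carry the explanation the reviewer asks for at the call site, so the tuple no longer needs a separate comment per Spark version.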



##########
hudi-spark-datasource/hudi-spark3.1.x/src/main/scala/org/apache/spark/sql/HoodieSpark31CatalystPlanUtils.scala:
##########
@@ -31,4 +33,15 @@ object HoodieSpark31CatalystPlanUtils extends HoodieSpark3CatalystPlanUtils {
   }
 
   override def projectOverSchema(schema: StructType, output: AttributeSet): ProjectionOverSchema = ProjectionOverSchema(schema)
+
+  override def isRepairTable(plan: LogicalPlan): Boolean = {
+    plan.isInstanceOf[AlterTableRecoverPartitionsCommand]
+  }
+
+  override def getRepairTableChildren(plan: LogicalPlan): Option[(TableIdentifier, Boolean, Boolean, String)] = {
+    plan match {
+      case c: AlterTableRecoverPartitionsCommand =>
+        Some((c.tableName, true, false, c.cmd))

Review Comment:
   Ditto: please explain the `true` and `false` values in a code comment here as well.





[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1288538266

   ## CI report:
   
   * 19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448) 
   * 5b091c27f98965150dbe479e39c948806ca02786 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1290338068

   ## CI report:
   
   * cc6cc698b62de048b296b58d430245486a560476 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12548) 
   * fdb6da0cc26ab9801b8227814866f49227fd36bc Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12566) 
   


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1290331355

   ## CI report:
   
   * cc6cc698b62de048b296b58d430245486a560476 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530) Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12548) 
   * fdb6da0cc26ab9801b8227814866f49227fd36bc UNKNOWN
   


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1288531311

   ## CI report:
   
   * 19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448) 
   * 5b091c27f98965150dbe479e39c948806ca02786 UNKNOWN
   


[GitHub] [hudi] YannByron commented on a diff in pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron commented on code in PR #6999:
URL: https://github.com/apache/hudi/pull/6999#discussion_r1002869098


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/RepairHoodieTableCommand.scala:
##########
@@ -0,0 +1,221 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hudi.command
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.{FileSystem, Path, PathFilter}
+import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
+
+import org.apache.hudi.common.table.HoodieTableConfig
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec
+import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.execution.command.PartitionStatistics
+import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
+import org.apache.spark.util.{SerializableConfiguration, ThreadUtils}
+
+import java.util.concurrent.TimeUnit.MILLISECONDS
+
+import scala.util.control.NonFatal
+
+/**
+ * Command for repair hudi table's partitions.
+ * Use hoodieCatalogTable.getPartitionPaths() to get partitions instead of scanning the file system.
+ */
+case class RepairHoodieTableCommand(tableName: TableIdentifier,
+                                    enableAddPartitions: Boolean,
+                                    enableDropPartitions: Boolean,
+                                    cmd: String = "MSCK REPAIR TABLE") extends HoodieLeafRunnableCommand {
+
+  // These are list of statistics that can be collected quickly without requiring a scan of the data
+  // see https://github.com/apache/hive/blob/master/
+  //   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java
+  val NUM_FILES = "numFiles"
+  val TOTAL_SIZE = "totalSize"
+  val DDL_TIME = "transient_lastDdlTime"
+
+  private def getPathFilter(hadoopConf: Configuration): PathFilter = {
+    // Dummy jobconf to get to the pathFilter defined in configuration
+    // It's very expensive to create a JobConf(ClassUtil.findContainingJar() is slow)
+    val jobConf = new JobConf(hadoopConf, this.getClass)
+    val pathFilter = FileInputFormat.getInputPathFilter(jobConf)
+    new PathFilter {
+      override def accept(path: Path): Boolean = {
+        val name = path.getName
+        if (name != "_SUCCESS" && name != "_temporary" && !name.startsWith(".")) {
+          pathFilter == null || pathFilter.accept(path)
+        } else {
+          false
+        }
+      }
+    }
+  }
+
+  override def run(spark: SparkSession): Seq[Row] = {
+    val catalog = spark.sessionState.catalog
+    val table = catalog.getTableMetadata(tableName)
+    val tableIdentWithDB = table.identifier.quotedString
+    if (table.partitionColumnNames.isEmpty) {
+      throw new AnalysisException(
+        s"Operation not allowed: $cmd only works on partitioned tables: $tableIdentWithDB")
+    }
+
+    if (table.storage.locationUri.isEmpty) {
+      throw new AnalysisException(s"Operation not allowed: $cmd only works on table with " +
+        s"location provided: $tableIdentWithDB")
+    }
+
+    val root = new Path(table.location)
+    logInfo(s"Recover all the partitions in $root")
+
+    val hoodieCatalogTable = HoodieCatalogTable(spark, table.identifier)
+    val isHiveStyledPartitioning = hoodieCatalogTable.catalogProperties.
+      getOrElse(HoodieTableConfig.HIVE_STYLE_PARTITIONING_ENABLE.key, "true").equals("true")
+    val partitionSpecsAndLocs: Seq[(TablePartitionSpec, Path)] = hoodieCatalogTable.getPartitionPaths.map(partitionPath => {
+      var values = partitionPath.split('/')
+      if (isHiveStyledPartitioning) {
+        values = values.map(_.split('=')(1))
+      }
+      (table.partitionColumnNames.zip(values).toMap, new Path(root, partitionPath))
+    })
+
+    val droppedAmount = if (enableDropPartitions) {
+      dropPartitions(catalog, partitionSpecsAndLocs)
+    } else 0
+    val addedAmount = if (enableAddPartitions) {
+      val hadoopConf = spark.sessionState.newHadoopConf()
+      val fs = root.getFileSystem(hadoopConf)
+      val pathFilter = getPathFilter(hadoopConf)
+      val threshold = spark.sparkContext.conf.getInt("spark.rdd.parallelListingThreshold", 10)
+      val total = partitionSpecsAndLocs.length
+      val partitionStats = if (spark.sqlContext.conf.gatherFastStats) {
+        gatherPartitionStats(spark, partitionSpecsAndLocs, fs, pathFilter, threshold)
+      } else {
+        Map.empty[String, PartitionStatistics]
+      }
+      logInfo(s"Finished to gather the fast stats for all $total partitions.")
+      addPartitions(spark, table, partitionSpecsAndLocs, partitionStats)
+      total
+    } else 0
+    // Updates the table to indicate that its partition metadata is stored in the Hive metastore.
+    // This is always the case for Hive format tables, but is not true for Datasource tables created
+    // before Spark 2.1 unless they are converted via `msck repair table`.
+    spark.sessionState.catalog.alterTable(table.copy(tracksPartitionsInCatalog = true))
+    try {
+      spark.catalog.refreshTable(tableIdentWithDB)
+    } catch {
+      case NonFatal(e) =>
+        logError(s"Cannot refresh the table '$tableIdentWithDB'. A query of the table " +
+          "might return wrong result if the table was cached. To avoid such issue, you should " +
+          "uncache the table manually via the UNCACHE TABLE command after table recovering will " +
+          "complete fully.", e)
+    }
+    logInfo(s"Recovered all partitions: added ($addedAmount), dropped ($droppedAmount).")
+    Seq.empty[Row]
+  }
+
+  private def gatherPartitionStats(spark: SparkSession,

Review Comment:
   This method is not suitable for Hudi either. It should gather the stats of the current snapshot, not of all the files in the FileSystem.
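
As background for the diff quoted above, the step in `run` that turns each Hudi partition path into a partition spec can be sketched in isolation. This is a minimal sketch, not the PR's code: `PartitionSpecSketch` and `toSpec` are hypothetical names. Unlike the quoted `_.split('=')(1)`, it splits each segment on the first `=` only, an assumption made so that values containing `=` or segments without `=` do not throw.

```scala
object PartitionSpecSketch {
  // Hypothetical helper: maps partition column names to the values encoded
  // in a relative partition path such as "year=2022/month=10" or "2022/10".
  def toSpec(partitionColumns: Seq[String],
             partitionPath: String,
             hiveStyle: Boolean): Map[String, String] = {
    val segments = partitionPath.split('/').toSeq
    val values =
      if (hiveStyle) {
        // Hive-style segments look like "column=value"; keep what follows
        // the first '=' and fall back to the whole segment if none is present.
        segments.map { segment =>
          val idx = segment.indexOf('=')
          if (idx >= 0) segment.substring(idx + 1) else segment
        }
      } else {
        segments
      }
    // Pair column names with values positionally, as the quoted code does.
    partitionColumns.zip(values).toMap
  }
}
```

Both path layouts yield the same spec, which is what lets the command register partitions from Hudi's own partition listing rather than from a file system scan.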





[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1288634673

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448) 
   * 5b091c27f98965150dbe479e39c948806ca02786 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514) 
   * 2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1283747050

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4da3b7d783a84d0dfcb7662ce224bfec2f74e018 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330) 
   * 515e22fc935dec773329391538bc21ea8a026139 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1287644758

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] YannByron commented on a diff in pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron commented on code in PR #6999:
URL: https://github.com/apache/hudi/pull/6999#discussion_r1002864497


##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestRepairTable.scala:
##########
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hudi
+
+import org.apache.hudi.DataSourceWriteOptions.{PARTITIONPATH_FIELD, PRECOMBINE_FIELD, RECORDKEY_FIELD}
+import org.apache.hudi.HoodieSparkUtils
+import org.apache.hudi.common.table.HoodieTableConfig.HIVE_STYLE_PARTITIONING_ENABLE
+import org.apache.hudi.config.HoodieWriteConfig.TBL_NAME
+
+import org.apache.spark.sql.SaveMode
+
+class TestRepairTable extends HoodieSparkSqlTestBase {
+
+  test("Test msck repair non-partitioned table") {
+    Seq("true", "false").foreach { hiveStylePartitionEnable =>
+      withTempDir { tmp =>
+        val tableName = generateTableName
+        val basePath = s"${tmp.getCanonicalPath}/$tableName"
+        spark.sql(
+          s"""
+             | create table $tableName (
+             |  id int,
+             |  name string,
+             |  ts long,
+             |  dt string,
+             |  hh string
+             | ) using hudi
+             | location '$basePath'
+             | tblproperties (
+             |  primaryKey = 'id',
+             |  preCombineField = 'ts',
+             |  hoodie.datasource.write.hive_style_partitioning = '$hiveStylePartitionEnable'
+             | )
+        """.stripMargin)
+
+        checkExceptionContain(s"msck repair table $tableName")(
+          s"Operation not allowed")
+      }
+    }
+  }
+
+  test("Test msck repair partitioned table") {
+    Seq("true", "false").foreach { hiveStylePartitionEnable =>
+      withTempDir { tmp =>
+        val tableName = generateTableName
+        val basePath = s"${tmp.getCanonicalPath}/$tableName"
+        spark.sql(
+          s"""
+             | create table $tableName (
+             |  id int,
+             |  name string,
+             |  ts long,
+             |  dt string,
+             |  hh string
+             | ) using hudi
+             | partitioned by (dt, hh)
+             | location '$basePath'
+             | tblproperties (
+             |  primaryKey = 'id',
+             |  preCombineField = 'ts',
+             |  hoodie.datasource.write.hive_style_partitioning = '$hiveStylePartitionEnable'
+             | )
+        """.stripMargin)
+        val table = spark.sessionState.sqlParser.parseTableIdentifier(tableName)
+
+        import spark.implicits._
+        val df = Seq((1, "a1", 1000, "2022-10-06", "11"), (2, "a2", 1001, "2022-10-06", "12"))
+          .toDF("id", "name", "ts", "dt", "hh")
+        df.write.format("hudi")
+          .option(RECORDKEY_FIELD.key, "id")
+          .option(PRECOMBINE_FIELD.key, "ts")
+          .option(PARTITIONPATH_FIELD.key, "dt, hh")
+          .option(HIVE_STYLE_PARTITIONING_ENABLE.key, hiveStylePartitionEnable)
+          .mode(SaveMode.Append)
+          .save(basePath)
+
+        assertResult(Seq())(spark.sessionState.catalog.listPartitionNames(table))
+        spark.sql(s"msck repair table $tableName")
+        spark.sql(s"msck repair table $tableName")

Review Comment:
   Why is this SQL statement executed twice?
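
The thread does not say why the statement is repeated. One possible reading, and it is only an assumption, is that the second call checks that the repair is idempotent. The sketch below shows that property on a toy model; `Catalog` and `repair` are hypothetical names and do not correspond to Spark or Hudi APIs.

```scala
object RepairIdempotenceSketch {

  // Hypothetical stand-in for a catalog: only the set of registered partition names.
  final case class Catalog(partitions: Set[String])

  // Registers every partition found on storage that the catalog does not know yet.
  // Returns the updated catalog and the number of partitions that were added.
  def repair(catalog: Catalog, onStorage: Set[String]): (Catalog, Int) = {
    val missing = onStorage.diff(catalog.partitions)
    (Catalog(catalog.partitions ++ missing), missing.size)
  }
}
```

In this model, running `repair` a second time against the same storage listing adds nothing and returns an unchanged catalog, which is the behaviour a repeated `msck repair table` call would exercise.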





[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1289176501

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc6cc698b62de048b296b58d430245486a560476",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "cc6cc698b62de048b296b58d430245486a560476",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517) 
   * cc6cc698b62de048b296b58d430245486a560476 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1283739622

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4da3b7d783a84d0dfcb7662ce224bfec2f74e018 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330) 
   * 515e22fc935dec773329391538bc21ea8a026139 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1284272527

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4da3b7d783a84d0dfcb7662ce224bfec2f74e018 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330) 
   * eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1289185689

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc6cc698b62de048b296b58d430245486a560476",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530",
       "triggerID" : "cc6cc698b62de048b296b58d430245486a560476",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517) 
   * cc6cc698b62de048b296b58d430245486a560476 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1289898037

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc6cc698b62de048b296b58d430245486a560476",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530",
       "triggerID" : "cc6cc698b62de048b296b58d430245486a560476",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc6cc698b62de048b296b58d430245486a560476",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12548",
       "triggerID" : "1289876774",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * cc6cc698b62de048b296b58d430245486a560476 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12548) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1283635505

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4da3b7d783a84d0dfcb7662ce224bfec2f74e018 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1284265074

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4da3b7d783a84d0dfcb7662ce224bfec2f74e018 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330) 
   * eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49 UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1284761222

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


[GitHub] [hudi] YannByron commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1284882140

   @Zouxxyy I see that `scanPartitions` and `dropPartitions` are the major logic of `RepairTableCommand`. But both of them depend on the partition directories in the FileSystem, which is not suitable for Hudi. In Hudi, even if a partition directory exists, the partition may still have been dropped before. So it is better to use the partition information of the current snapshot instead of the directories in the FileSystem. WDYT?
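
The point above can be illustrated with a minimal sketch. It uses plain sets of partition paths as stand-ins for the real sources; `SnapshotRepairSketch`, `plan`, and the sample partition values are hypothetical names for illustration and are not part of Hudi or Spark.

```scala
// Sketch only: plain sets of partition paths stand in for the real sources.
object SnapshotRepairSketch {
  // Returns (partitions to add to the catalog, partitions to drop from it),
  // given the partitions treated as the source of truth and those the
  // catalog already knows about.
  def plan(source: Set[String], catalog: Set[String]): (Set[String], Set[String]) =
    (source -- catalog, catalog -- source)

  def main(args: Array[String]): Unit = {
    // Hypothetical state: hh=10 was dropped earlier, but its directory is still on disk.
    val directoriesOnDisk = Set("dt=2022-10-06/hh=10", "dt=2022-10-06/hh=11", "dt=2022-10-06/hh=12")
    val snapshotPartitions = Set("dt=2022-10-06/hh=11", "dt=2022-10-06/hh=12")
    val catalogPartitions = Set("dt=2022-10-06/hh=11")

    val (addFromDirs, _) = plan(directoriesOnDisk, catalogPartitions)
    val (addFromSnapshot, dropFromSnapshot) = plan(snapshotPartitions, catalogPartitions)

    println(s"directory-based add: ${addFromDirs.toSeq.sorted}")
    println(s"snapshot-based add:  ${addFromSnapshot.toSeq.sorted}")
    println(s"snapshot-based drop: ${dropFromSnapshot.toSeq.sorted}")
  }
}
```

With directories as the source, the dropped `hh=10` partition would be registered again because its directory still exists; with the snapshot as the source, only `hh=12` is added.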


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1287061750

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350) 
   * a5e62636e794ec2c0eeb7ff3293d510ed0dd802e UNKNOWN
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1291241847

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc6cc698b62de048b296b58d430245486a560476",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530",
       "triggerID" : "cc6cc698b62de048b296b58d430245486a560476",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc6cc698b62de048b296b58d430245486a560476",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12548",
       "triggerID" : "1289876774",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "fdb6da0cc26ab9801b8227814866f49227fd36bc",
       "status" : "SUCCESS",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12566",
       "triggerID" : "fdb6da0cc26ab9801b8227814866f49227fd36bc",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * fdb6da0cc26ab9801b8227814866f49227fd36bc Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12566) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


[GitHub] [hudi] YannByron commented on a diff in pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron commented on code in PR #6999:
URL: https://github.com/apache/hudi/pull/6999#discussion_r1002864134


##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/TestRepairTable.scala:
##########
@@ -0,0 +1,163 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hudi
+
+import org.apache.hudi.DataSourceWriteOptions.{PARTITIONPATH_FIELD, PRECOMBINE_FIELD, RECORDKEY_FIELD}
+import org.apache.hudi.HoodieSparkUtils
+import org.apache.hudi.common.table.HoodieTableConfig.HIVE_STYLE_PARTITIONING_ENABLE
+import org.apache.hudi.config.HoodieWriteConfig.TBL_NAME
+
+import org.apache.spark.sql.SaveMode
+
+class TestRepairTable extends HoodieSparkSqlTestBase {
+
+  test("Test msck repair non-partitioned table") {
+    Seq("true", "false").foreach { hiveStylePartitionEnable =>
+      withTempDir { tmp =>
+        val tableName = generateTableName
+        val basePath = s"${tmp.getCanonicalPath}/$tableName"
+        spark.sql(
+          s"""
+             | create table $tableName (
+             |  id int,
+             |  name string,
+             |  ts long,
+             |  dt string,
+             |  hh string
+             | ) using hudi
+             | location '$basePath'
+             | tblproperties (
+             |  primaryKey = 'id',
+             |  preCombineField = 'ts',
+             |  hoodie.datasource.write.hive_style_partitioning = '$hiveStylePartitionEnable'
+             | )
+        """.stripMargin)
+
+        checkExceptionContain(s"msck repair table $tableName")(
+          s"Operation not allowed")
+      }
+    }
+  }
+
+  test("Test msck repair partitioned table") {
+    Seq("true", "false").foreach { hiveStylePartitionEnable =>
+      withTempDir { tmp =>
+        val tableName = generateTableName
+        val basePath = s"${tmp.getCanonicalPath}/$tableName"
+        spark.sql(
+          s"""
+             | create table $tableName (
+             |  id int,
+             |  name string,
+             |  ts long,
+             |  dt string,
+             |  hh string
+             | ) using hudi
+             | partitioned by (dt, hh)
+             | location '$basePath'
+             | tblproperties (
+             |  primaryKey = 'id',
+             |  preCombineField = 'ts',
+             |  hoodie.datasource.write.hive_style_partitioning = '$hiveStylePartitionEnable'
+             | )
+        """.stripMargin)
+        val table = spark.sessionState.sqlParser.parseTableIdentifier(tableName)
+
+        import spark.implicits._
+        val df = Seq((1, "a1", 1000, "2022-10-06", "11"), (2, "a2", 1001, "2022-10-06", "12"))
+          .toDF("id", "name", "ts", "dt", "hh")
+        df.write.format("hudi")
+          .option(RECORDKEY_FIELD.key, "id")
+          .option(PRECOMBINE_FIELD.key, "ts")
+          .option(PARTITIONPATH_FIELD.key, "dt, hh")
+          .option(HIVE_STYLE_PARTITIONING_ENABLE.key, hiveStylePartitionEnable)
+          .mode(SaveMode.Append)
+          .save(basePath)
+
+        assertResult(Seq())(spark.sessionState.catalog.listPartitionNames(table))
+        spark.sql(s"msck repair table $tableName")
+        assertResult(Seq("dt=2022-10-06/hh=11", "dt=2022-10-06/hh=12"))(spark.sessionState.catalog.listPartitionNames(table))

Review Comment:
   nit:  code format.



[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1289727413

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     }, {
       "hash" : "cc6cc698b62de048b296b58d430245486a560476",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530",
       "triggerID" : "cc6cc698b62de048b296b58d430245486a560476",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * cc6cc698b62de048b296b58d430245486a560476 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12530) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


[GitHub] [hudi] YannByron commented on a diff in pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron commented on code in PR #6999:
URL: https://github.com/apache/hudi/pull/6999#discussion_r1002867035


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/command/RepairHoodieTableCommand.scala:
##########
@@ -0,0 +1,221 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hudi.command
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.{FileSystem, Path, PathFilter}
+import org.apache.hadoop.mapred.{FileInputFormat, JobConf}
+
+import org.apache.hudi.common.table.HoodieTableConfig
+
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec
+import org.apache.spark.sql.catalyst.catalog._
+import org.apache.spark.sql.execution.command.PartitionStatistics
+import org.apache.spark.sql.{AnalysisException, Row, SparkSession}
+import org.apache.spark.util.{SerializableConfiguration, ThreadUtils}
+
+import java.util.concurrent.TimeUnit.MILLISECONDS
+
+import scala.util.control.NonFatal
+
+/**
+ * Command for repair hudi table's partitions.
+ * Use hoodieCatalogTable.getPartitionPaths() to get partitions instead of scanning the file system.
+ */
+case class RepairHoodieTableCommand(tableName: TableIdentifier,
+                                    enableAddPartitions: Boolean,
+                                    enableDropPartitions: Boolean,
+                                    cmd: String = "MSCK REPAIR TABLE") extends HoodieLeafRunnableCommand {
+
+  // These are list of statistics that can be collected quickly without requiring a scan of the data
+  // see https://github.com/apache/hive/blob/master/
+  //   common/src/java/org/apache/hadoop/hive/common/StatsSetupConst.java
+  val NUM_FILES = "numFiles"
+  val TOTAL_SIZE = "totalSize"
+  val DDL_TIME = "transient_lastDdlTime"
+
+  private def getPathFilter(hadoopConf: Configuration): PathFilter = {
+    // Dummy jobconf to get to the pathFilter defined in configuration
+    // It's very expensive to create a JobConf(ClassUtil.findContainingJar() is slow)
+    val jobConf = new JobConf(hadoopConf, this.getClass)
+    val pathFilter = FileInputFormat.getInputPathFilter(jobConf)
+    new PathFilter {
+      override def accept(path: Path): Boolean = {
+        val name = path.getName
+        if (name != "_SUCCESS" && name != "_temporary" && !name.startsWith(".")) {
+          pathFilter == null || pathFilter.accept(path)
+        } else {
+          false
+        }
+      }
+    }
+  }
+
+  override def run(spark: SparkSession): Seq[Row] = {
+    val catalog = spark.sessionState.catalog
+    val table = catalog.getTableMetadata(tableName)
+    val tableIdentWithDB = table.identifier.quotedString
+    if (table.partitionColumnNames.isEmpty) {
+      throw new AnalysisException(
+        s"Operation not allowed: $cmd only works on partitioned tables: $tableIdentWithDB")
+    }
+
+    if (table.storage.locationUri.isEmpty) {
+      throw new AnalysisException(s"Operation not allowed: $cmd only works on table with " +
+        s"location provided: $tableIdentWithDB")
+    }
+
+    val root = new Path(table.location)
+    logInfo(s"Recover all the partitions in $root")
+
+    val hoodieCatalogTable = HoodieCatalogTable(spark, table.identifier)
+    val isHiveStyledPartitioning = hoodieCatalogTable.catalogProperties.
+      getOrElse(HoodieTableConfig.HIVE_STYLE_PARTITIONING_ENABLE.key, "true").equals("true")

Review Comment:
   `toBoolean` can work.
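
A minimal sketch of what this suggestion could look like, assuming `catalogProperties` behaves like a `Map[String, String]`. The object name, the helper names, and the sample map are hypothetical; the key string is copied from the `tblproperties` used in the test in this thread and is assumed to match `HIVE_STYLE_PARTITIONING_ENABLE.key`.

```scala
object HiveStyleFlagSketch {
  // Assumed key, copied from the tblproperties used in the test in this thread.
  val Key = "hoodie.datasource.write.hive_style_partitioning"

  // Form used in the diff: exact, case-sensitive string comparison.
  def viaEquals(props: Map[String, String]): Boolean =
    props.getOrElse(Key, "true").equals("true")

  // Suggested form: Scala's String.toBoolean accepts "true"/"false"
  // case-insensitively and throws IllegalArgumentException otherwise.
  def viaToBoolean(props: Map[String, String]): Boolean =
    props.getOrElse(Key, "true").toBoolean

  def main(args: Array[String]): Unit = {
    val props = Map(Key -> "TRUE")
    println(s"equals:    ${viaEquals(props)}")
    println(s"toBoolean: ${viaToBoolean(props)}")
  }
}
```

The two forms are not strictly equivalent: `equals("true")` silently treats any other string as `false`, while `toBoolean` fails loudly on a value that is neither `true` nor `false`.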



[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1289166481

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


[GitHub] [hudi] YannByron commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
YannByron commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1289876774

   @hudi-bot run azure


[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1287227501

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a5e62636e794ec2c0eeb7ff3293d510ed0dd802e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1284256050

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "UNKNOWN",
       "url" : "TBD",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     } ]
   }-->
   ## CI report:
   
   * 515e22fc935dec773329391538bc21ea8a026139 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335) Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] Zouxxyy commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
Zouxxyy commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1284240123

   @hudi-bot run azure




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1283643248

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 4da3b7d783a84d0dfcb7662ce224bfec2f74e018 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1287069134

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49 Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350) 
   * a5e62636e794ec2c0eeb7ff3293d510ed0dd802e Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] Zouxxyy commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
Zouxxyy commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1287047034

   > @Zouxxyy I see that `scanPartitions` and `dropPartitions` are the main logic of `RepairTableCommand`. But both of them depend on the partition directories in the FileSystem, which is not suitable for Hudi. In Hudi, even if a partition directory exists, the partition may already have been dropped. So it is better to use the partition information of the current snapshot instead of the directories in the FileSystem. WDYT?
   
   Good idea, done
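
For illustration only, here is a rough sketch of the idea in the quoted comment: plan the repair from the partitions in the table's current snapshot rather than from directories on disk. This is not the code in this PR. The function name `plan_partition_repair`, its flags, and the example partition values are all hypothetical, and how the snapshot partitions are obtained from Hudi is deliberately left out.

```python
from typing import Iterable, Set, Tuple


def plan_partition_repair(
    snapshot_partitions: Iterable[str],
    catalog_partitions: Iterable[str],
    add_missing: bool = True,
    drop_stale: bool = False,
) -> Tuple[Set[str], Set[str]]:
    """Return (to_add, to_drop) for the catalog.

    Hypothetical helper: the source of truth is the set of partitions in the
    table's current snapshot, not the directories found on the file system.
    """
    snapshot = set(snapshot_partitions)
    catalog = set(catalog_partitions)
    # Partitions in the snapshot that the catalog does not know about yet.
    to_add = snapshot - catalog if add_missing else set()
    # Partitions the catalog still lists but the snapshot no longer contains.
    to_drop = catalog - snapshot if drop_stale else set()
    return to_add, to_drop


# Assumed example data: dt=2022-10-02 was dropped from the table,
# but its directory still exists on the file system.
directories_on_disk = {"dt=2022-10-01", "dt=2022-10-02", "dt=2022-10-03"}
snapshot = {"dt=2022-10-01", "dt=2022-10-03"}
catalog = {"dt=2022-10-01", "dt=2022-10-02"}

from_snapshot = plan_partition_repair(snapshot, catalog, drop_stale=True)
from_directories = plan_partition_repair(directories_on_disk, catalog, drop_stale=True)
print("snapshot-based plan:", from_snapshot)
print("directory-based plan:", from_directories)
```

With this example data, the directory-based plan never drops `dt=2022-10-02`, because the leftover directory makes the partition look alive. The snapshot-based plan does drop it.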




[GitHub] [hudi] Zouxxyy commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
Zouxxyy commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1288485908

   @Zouxxyy Fixed all comments




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1287600595

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "FAILURE",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * a5e62636e794ec2c0eeb7ff3293d510ed0dd802e Azure: [FAILURE](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429) 
   * 19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>




[GitHub] [hudi] hudi-bot commented on pull request #6999: [HUDI-5057] Fix msck repair hudi table

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #6999:
URL: https://github.com/apache/hudi/pull/6999#issuecomment-1288646008

   <!--
   Meta data
   {
     "version" : 1,
     "metaDataEntries" : [ {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12335",
       "triggerID" : "515e22fc935dec773329391538bc21ea8a026139",
       "triggerType" : "PUSH"
     }, {
       "hash" : "515e22fc935dec773329391538bc21ea8a026139",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12347",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "4da3b7d783a84d0dfcb7662ce224bfec2f74e018",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12330",
       "triggerID" : "1284240123",
       "triggerType" : "MANUAL"
     }, {
       "hash" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12350",
       "triggerID" : "eb58602dcc2bde6ecbc79a74a0ee1ea5a8320f49",
       "triggerType" : "PUSH"
     }, {
       "hash" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12429",
       "triggerID" : "a5e62636e794ec2c0eeb7ff3293d510ed0dd802e",
       "triggerType" : "PUSH"
     }, {
       "hash" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "status" : "DELETED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12448",
       "triggerID" : "19f2ee38cba5f3ea77fa6c0a7f14a4a4f1344ad3",
       "triggerType" : "PUSH"
     }, {
       "hash" : "5b091c27f98965150dbe479e39c948806ca02786",
       "status" : "CANCELED",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514",
       "triggerID" : "5b091c27f98965150dbe479e39c948806ca02786",
       "triggerType" : "PUSH"
     }, {
       "hash" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "status" : "PENDING",
       "url" : "https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517",
       "triggerID" : "2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735",
       "triggerType" : "PUSH"
     } ]
   }-->
   ## CI report:
   
   * 5b091c27f98965150dbe479e39c948806ca02786 Azure: [CANCELED](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12514) 
   * 2e2e3c8bad4a90c1e84642ac19fc2aa2dc20c735 Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=12517) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>

