Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/25 09:19:15 UTC

[GitHub] [hudi] xiarixiaoyao opened a new pull request, #5973: [HUDI-4296]Fix the bug that TestHoodieSparkSqlWriter.testSchemaEvolut…

xiarixiaoyao opened a new pull request, #5973:
URL: https://github.com/apache/hudi/pull/5973

   …ionForTableType is flaky
   
   ## What is the purpose of the pull request
   
   Fix the random failures of TestHoodieSparkSqlWriter.testSchemaEvolutionForTableType.
   The cause is as follows:
   1) When we use a glob path to read a Hudi table (like spark.read.format("hudi").load("/tmp/tableName///*")), Spark infers the schema from the parquet files automatically.
   2) After schema evolution, the Hudi table may contain parquet files with different schemas, and Spark picks one parquet file at random to infer the schema.
   3) If Spark picks an old parquet file, an old schema is used, which is wrong.
   
   Therefore, once the table schema has evolved, we should explicitly set the read schema to the schema of the latest commit.
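
The failure mode described above can be illustrated with a toy model in plain Python. This is only a sketch and uses no Spark or Hudi code: the file names, column names, and the helpers `infer_schema` and `resolve_read_schema` are hypothetical and do not come from the PR. It assumes, as the description states, that inference trusts the schema of one arbitrarily chosen file.

```python
import random

# Hypothetical stand-ins for parquet files in a table: file name -> column names.
# "old.parquet" was written before a column was added, "new.parquet" after.
FILES = {
    "old.parquet": ["id", "name"],
    "new.parquet": ["id", "name", "age"],
}

# Schema recorded by the latest commit (here: the evolved schema).
LATEST_COMMIT_SCHEMA = ["id", "name", "age"]


def infer_schema(files, rng):
    """Model inference that samples one arbitrary file and trusts its schema."""
    chosen = rng.choice(sorted(files))
    return files[chosen]


def resolve_read_schema(files, rng, explicit_schema=None):
    """Use the explicit schema when given; otherwise fall back to inference."""
    if explicit_schema is not None:
        return explicit_schema
    return infer_schema(files, rng)


# Without an explicit schema, the result depends on which file gets sampled.
inferred = {
    tuple(resolve_read_schema(FILES, random.Random(seed)))
    for seed in range(50)
}

# With the latest commit schema pinned, every run sees the same columns.
pinned = {
    tuple(resolve_read_schema(FILES, random.Random(seed), LATEST_COMMIT_SCHEMA))
    for seed in range(50)
}

print(len(inferred), len(pinned))
```

In this model the inferred schema varies from run to run, which is what makes a test flaky, while the pinned schema is stable regardless of which file would have been sampled.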
   
   ## Committer checklist
   
    - [ ] Has a corresponding JIRA in PR title & commit
    
    - [ ] Commit message is descriptive of the change
    
    - [ ] CI is green
   
    - [ ] Necessary doc changes done or have another open PR
          
    - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] xiarixiaoyao commented on pull request #5973: [HUDI-4296]Fix the bug that TestHoodieSparkSqlWriter.testSchemaEvolut…

Posted by GitBox <gi...@apache.org>.
xiarixiaoyao commented on PR #5973:
URL: https://github.com/apache/hudi/pull/5973#issuecomment-1166239204

   @nsivabalan @leesf @xushiyan could you please help review this PR? Thanks.
   
   This bug was fixed in 0.10.0: https://github.com/apache/hudi/blob/cc3896be2a023f8819883d745178503286ac2ab1/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala#L217
   but the current branch is missing that fix.


[GitHub] [hudi] leesf merged pull request #5973: [HUDI-4296]Fix the bug that TestHoodieSparkSqlWriter.testSchemaEvolut…

Posted by GitBox <gi...@apache.org>.
leesf merged PR #5973:
URL: https://github.com/apache/hudi/pull/5973


[GitHub] [hudi] hudi-bot commented on pull request #5973: [HUDI-4296]Fix the bug that TestHoodieSparkSqlWriter.testSchemaEvolut…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5973:
URL: https://github.com/apache/hudi/pull/5973#issuecomment-1166267062

   ## CI report:
   
   * 3c419b0ed59c591739333c8adb139bfbd47af09c Azure: [SUCCESS](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9534) 
   
   <details>
   <summary>Bot commands</summary>
     @hudi-bot supports the following commands:
   
    - `@hudi-bot run azure` re-run the last Azure build
   </details>


[GitHub] [hudi] hudi-bot commented on pull request #5973: [HUDI-4296]Fix the bug that TestHoodieSparkSqlWriter.testSchemaEvolut…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5973:
URL: https://github.com/apache/hudi/pull/5973#issuecomment-1166245786

   ## CI report:
   
   * 3c419b0ed59c591739333c8adb139bfbd47af09c Azure: [PENDING](https://dev.azure.com/apache-hudi-ci-org/785b6ef4-2f42-4a89-8f0e-5f0d7039a0cc/_build/results?buildId=9534) 
   


[GitHub] [hudi] hudi-bot commented on pull request #5973: [HUDI-4296]Fix the bug that TestHoodieSparkSqlWriter.testSchemaEvolut…

Posted by GitBox <gi...@apache.org>.
hudi-bot commented on PR #5973:
URL: https://github.com/apache/hudi/pull/5973#issuecomment-1166245024

   ## CI report:
   
   * 3c419b0ed59c591739333c8adb139bfbd47af09c UNKNOWN
   


[GitHub] [hudi] xushiyan commented on pull request #5973: [HUDI-4296]Fix the bug that TestHoodieSparkSqlWriter.testSchemaEvolut…

Posted by GitBox <gi...@apache.org>.
xushiyan commented on PR #5973:
URL: https://github.com/apache/hudi/pull/5973#issuecomment-1166262504

   This PR looks the same as https://github.com/apache/hudi/pull/5948/. Why revert and open another PR?

