Posted to commits@hudi.apache.org by "weimingdiit (via GitHub)" <gi...@apache.org> on 2023/03/24 10:54:51 UTC

[GitHub] [hudi] weimingdiit opened a new issue, #8283: [SUPPORT] In version 0.13.0, when using dynamic partition to write data, the table will be cleared first, and then the corresponding partition data will be written. Is it not as expected? Why clean the table first?

weimingdiit opened a new issue, #8283:
URL: https://github.com/apache/hudi/issues/8283



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] KnightChess commented on issue #8283: [SUPPORT] In version 0.13.0, when using dynamic partition to insert overwrite data, the table will be cleared first, and then the corresponding partition data will be written. Is it not as expected? Why clean the table first?

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on issue #8283:
URL: https://github.com/apache/hudi/issues/8283#issuecomment-1493654982

   @boundarymate the issue description has already pointed it out. It needs to be forward compatible.




[GitHub] [hudi] boundarymate commented on issue #8283: [SUPPORT] In version 0.13.0, when using dynamic partition to insert overwrite data, the table will be cleared first, and then the corresponding partition data will be written. Is it not as expected? Why clean the table first?

Posted by "boundarymate (via GitHub)" <gi...@apache.org>.
boundarymate commented on issue #8283:
URL: https://github.com/apache/hudi/issues/8283#issuecomment-1490172590

   > #7365 looks like the PR that changed the dynamic behavior. Before it, Hudi's overwrite was always dynamic, and the release notes at `https://hudi.apache.org/releases/release-0.13.0` do not mention the change. It can cause serious data problems after upgrading to 0.13.0: a user may delete all data by mistake. Maybe Hudi needs a config that makes users aware of this behavior, or that restricts overwriting the whole table.
   
   By the way, where is the bug in #7365? Can you point it out? Thanks.




[GitHub] [hudi] KnightChess commented on issue #8283: [SUPPORT] In version 0.13.0, when using dynamic partition to insert overwrite data, the table will be cleared first, and then the corresponding partition data will be written. Is it not as expected? Why clean the table first?

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on issue #8283:
URL: https://github.com/apache/hudi/issues/8283#issuecomment-1482748957

   @nsivabalan @yihua @XuQianJin-Stars @weimingdiit I think this needs a reminder in the docs or an added check in 0.13.1. What do you think?




[GitHub] [hudi] KnightChess commented on issue #8283: [SUPPORT] In version 0.13.0, when using dynamic partition to insert overwrite data, the table will be cleared first, and then the corresponding partition data will be written. Is it not as expected? Why clean the table first?

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on issue #8283:
URL: https://github.com/apache/hudi/issues/8283#issuecomment-1482745990

   #7365 looks like the PR that changed the dynamic behavior. Before it, Hudi's overwrite was always dynamic, and the release notes at `https://hudi.apache.org/releases/release-0.13.0` do not mention the change. It can cause serious data problems after upgrading to 0.13.0: a user may delete all data by mistake. Maybe Hudi needs a config that makes users aware of this behavior, or that restricts overwriting the whole table.
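
The difference between the two overwrite behaviors discussed here can be sketched with a toy model. This is only an illustration in plain Python, not Hudi or Spark code: the table is modeled as a dict from partition value to rows, and the function names and the `dt` partition column are hypothetical.

```python
from collections import defaultdict


def group_by_partition(rows, key="dt"):
    """Group incoming rows by their partition value (hypothetical 'dt' column)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return dict(grouped)


def overwrite_whole_table(table, rows, key="dt"):
    """Whole-table semantics: all existing partitions are dropped,
    and only the partitions present in the incoming rows remain."""
    return group_by_partition(rows, key)


def overwrite_dynamic(table, rows, key="dt"):
    """Dynamic semantics: only partitions that appear in the incoming
    rows are replaced; every other partition is left untouched."""
    result = dict(table)
    result.update(group_by_partition(rows, key))
    return result


# Toy table with two partitions, and one incoming row for a single partition.
table = {
    "2023-03-22": [{"id": 1, "dt": "2023-03-22"}],
    "2023-03-23": [{"id": 2, "dt": "2023-03-23"}],
}
incoming = [{"id": 3, "dt": "2023-03-23"}]

static_result = overwrite_whole_table(table, incoming)
dynamic_result = overwrite_dynamic(table, incoming)

print(sorted(static_result))   # only the written partition survives
print(sorted(dynamic_result))  # the untouched partition is still there
```

In this model, the behavior reported for 0.13.0 corresponds to `overwrite_whole_table`, while the behavior users relied on before corresponds to `overwrite_dynamic`.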




[GitHub] [hudi] KnightChess commented on issue #8283: [SUPPORT] In version 0.13.0, when using dynamic partition to insert overwrite data, the table will be cleared first, and then the corresponding partition data will be written. Is it not as expected? Why clean the table first?

Posted by "KnightChess (via GitHub)" <gi...@apache.org>.
KnightChess commented on issue #8283:
URL: https://github.com/apache/hudi/issues/8283#issuecomment-1493656600

   Created a ticket to track it: https://issues.apache.org/jira/browse/HUDI-6021. @weimingdiit maybe we can support some config, or use existing parameters, to check for this or warn about it, for example the dynamic parameter in Spark, the strict check in Hive, or a new Hudi config to control it.




[GitHub] [hudi] weimingdiit commented on issue #8283: [SUPPORT] In version 0.13.0, when using dynamic partition to insert overwrite data, the table will be cleared first, and then the corresponding partition data will be written. Is it not as expected? Why clean the table first?

Posted by "weimingdiit (via GitHub)" <gi...@apache.org>.
weimingdiit commented on issue #8283:
URL: https://github.com/apache/hudi/issues/8283#issuecomment-1493759703

   > Created a ticket to track it: https://issues.apache.org/jira/browse/HUDI-6021. @weimingdiit maybe we can support some config, or use existing parameters, to check for this or warn about it, for example the dynamic parameter in Spark, the strict check in Hive, or a new Hudi config to control it.
   
   OK, good idea. I will try to fix this bug.

