Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/06/24 01:11:52 UTC

[GitHub] [hudi] dongkelun commented on pull request #5633: [HUDI-4123] Fix the exception due to SqlSource return null checkpoint

dongkelun commented on PR #5633:
URL: https://github.com/apache/hudi/pull/5633#issuecomment-1165064752

   > sorry, I don't understand why you are setting "--checkpoint earliest" w/ your spark-submit job. You should not set any checkpoint value if I am not wrong. can you help me understand. "earliest/latest" is meant for auto reset for kafka sources.
   
   First of all, you are absolutely correct. I set a checkpoint value because SqlSource in version 0.9.0 cannot extract data when no checkpoint is set. In that case the following is logged:
   ```java
   No new data, source checkpoint has not changed. Nothing to commit. Old checkpoint=(Optional.empty). New Checkpoint=(null) 
   ```
   So I set the checkpoint to a meaningless value, which let me extract the data, but then this exception is thrown when I extract again.
   
   In the new version, the problem that data cannot be extracted has been solved by adding the parameter `--allow-commit-on-no-checkpoint-change`. However, if the user mistakenly sets a checkpoint that should not be set, this exception will still occur, so I think we should fix this and avoid the exception.
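
To make the behaviour concrete, here is a minimal sketch of the checkpoint comparison as I understand it. It is a simplified model, not the actual Hudi source: `CheckpointSketch` and `shouldSkipCommit` are hypothetical names, and the real sync logic may differ in detail. It only shows why `Old checkpoint=(Optional.empty)` together with `New Checkpoint=(null)` counts as "not changed", why setting any checkpoint value gets past the check, and how the new flag bypasses it.

```java
import java.util.Objects;
import java.util.Optional;

// Simplified model of the "has the checkpoint changed?" decision.
// Hypothetical names; this is NOT copied from the Hudi code base.
class CheckpointSketch {

    // Returns true when the sync round would be skipped as "no new data".
    static boolean shouldSkipCommit(Optional<String> oldCheckpoint,
                                    String newCheckpoint,
                                    boolean allowCommitOnNoCheckpointChange) {
        // An empty old checkpoint is compared as null, so it equals a null new checkpoint.
        boolean unchanged = Objects.equals(newCheckpoint, oldCheckpoint.orElse(null));
        return unchanged && !allowCommitOnNoCheckpointChange;
    }

    public static void main(String[] args) {
        // No checkpoint set, source returns null: treated as unchanged, nothing is committed.
        System.out.println(shouldSkipCommit(Optional.empty(), null, false));         // true
        // A meaningless checkpoint such as "earliest" differs from null, so data is extracted.
        System.out.println(shouldSkipCommit(Optional.of("earliest"), null, false));  // false
        // With the flag enabled, the commit proceeds even though the checkpoint is unchanged.
        System.out.println(shouldSkipCommit(Optional.empty(), null, true));          // false
    }
}
```

The sketch does not model the exception on the second run; it only covers the skip decision described above.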
   

