Posted to commits@hudi.apache.org by "haggy (via GitHub)" <gi...@apache.org> on 2023/02/14 18:04:42 UTC

[GitHub] [hudi] haggy commented on issue #6278: [SUPPORT] Deltastreamer fails with data and timestamp related exception after upgrading to EMR 6.5 and spark3

haggy commented on issue #6278:
URL: https://github.com/apache/hudi/issues/6278#issuecomment-1430162140

   Just as an FYI, I ran into something very similar to this and was unable to get past the error (below) with any of the `datetimeRebase*` configurations.
   
   The TL;DR is that Hudi `0.12.1` does not appear to have this issue, whereas `0.11.1` does. 
   
   The long version:
   
   We are using Hudi `0.11.1` Deltastreamer ingesting from Kafka into S3. 
   Our workaround was to "freeze" the dataset that was causing this issue: we ran a second Deltastreamer on `0.12.1`, starting from the `checkpoint.key` of the primary Deltastreamer and writing into a staging location. We then manually moved the checkpoint for the primary process ahead of the records that were causing the issue by editing the latest `commit` instant file.
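
   For anyone attempting the same checkpoint edit, here is a rough sketch of the idea in Python. It is not the exact procedure we ran. It assumes the commit file is JSON with the checkpoint stored under `extraMetadata` as `deltastreamer.checkpoint.key`, and that the Kafka checkpoint string looks like `topic,partition:offset,...`. Check both assumptions against your own commit file before relying on it. The function names are made up for illustration.

```python
import json

# Assumed key name inside the commit file's "extraMetadata" map.
CHECKPOINT_KEY = "deltastreamer.checkpoint.key"


def advance_kafka_checkpoint(checkpoint, new_offsets):
    """Return a checkpoint string with the given partitions moved forward.

    Assumes the format "topic,partition:offset,partition:offset,...".
    """
    topic, *parts = checkpoint.split(",")
    offsets = {}
    for part in parts:
        partition, offset = part.split(":")
        offsets[int(partition)] = int(offset)

    for partition, offset in new_offsets.items():
        if partition not in offsets:
            raise KeyError(f"partition {partition} is not in the checkpoint")
        if offset < offsets[partition]:
            # Moving backwards would re-ingest records, so refuse to do it.
            raise ValueError(f"refusing to rewind partition {partition}")
        offsets[partition] = offset

    pairs = [f"{p}:{o}" for p, o in sorted(offsets.items())]
    return ",".join([topic] + pairs)


def patch_commit_metadata(commit_json, new_offsets):
    """Return the commit JSON text with its checkpoint advanced."""
    metadata = json.loads(commit_json)
    extra = metadata["extraMetadata"]
    extra[CHECKPOINT_KEY] = advance_kafka_checkpoint(extra[CHECKPOINT_KEY], new_offsets)
    return json.dumps(metadata, indent=2)
```

   You would read the latest commit file, pass its contents to `patch_commit_metadata` with the offsets just past the bad records, and write the result back while the primary Deltastreamer is stopped. Keep a backup of the original file first.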


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org