Posted to commits@hudi.apache.org by "Istvan Darvas (Jira)" <ji...@apache.org> on 2022/02/01 18:06:00 UTC
[jira] [Created] (HUDI-3362) Hudi 0.8.0 cannot rollback MoR table
Istvan Darvas created HUDI-3362:
-----------------------------------
Summary: Hudi 0.8.0 cannot rollback MoR table
Key: HUDI-3362
URL: https://issues.apache.org/jira/browse/HUDI-3362
Project: Apache Hudi
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Istvan Darvas
Attachments: hoodie.zip, rollback_20220131215514.txt, rollback_log.txt, rollback_log_v2.txt
Hi Guys,
Environment: AWS EMR 6.4 / Hudi v0.8.0
Problem: I have a MoR table which is ingested by DeltaStreamer, and after a certain time DeltaStreamer stops working with a message like this:
{{diagnostics: User class threw exception: org.apache.hudi.exception.HoodieRollbackException: Found commits after time :20220131215051, please rollback greater commits first}}
The commit named in the message is usually a replace commit; I am fairly sure of this.
I have these commits in the timeline:
20220131214354<-before
20220131215051<-error message
20220131215514<-after
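To make "greater commits" concrete, here is a minimal sketch that picks out the instants later than the one named in the error message. It is only an illustration, not how Hudi checks this internally. The file names in example_files are hypothetical (only the three timestamps come from this report), and instants_after is a helper name made up for the example.

```python
from typing import Iterable, List, Tuple


def instants_after(file_names: Iterable[str], instant_time: str) -> List[Tuple[str, str]]:
    """Return (instant_time, action_and_state) pairs strictly later than instant_time."""
    later = []
    for name in file_names:
        ts, sep, rest = name.partition(".")
        # The instants in this report are 14-digit yyyyMMddHHmmss strings,
        # so plain string comparison orders them chronologically.
        if sep and len(ts) == 14 and ts.isdigit() and ts > instant_time:
            later.append((ts, rest))
    return sorted(later)


# Hypothetical listing of the .hoodie folder; only the timestamps are from the report.
example_files = [
    "20220131214354.deltacommit",
    "20220131215051.replacecommit.requested",
    "20220131215514.deltacommit",
    "hoodie.properties",
]

print(instants_after(example_files, "20220131215051"))
```

With this assumed listing, the only instant later than 20220131215051 is 20220131215514, which matches the "after" commit above.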
As I was advised, I tried to roll back with the following steps in hudi-cli:
1.) connect --path s3://scgps-datalake/iot_raw/ingress_pkg_decoded_rep / SUCCESS
2.) savepoint create --commit 20220131214354 --sparkMaster local[2] / SUCCESS
3.) savepoint rollback --savepoint 20220131214354 --sparkMaster local[2] / FAILED
4.) savepoint rollback --savepoint 20220131215514 --sparkMaster local[2] / FAILED
Long story short, if I run into a situation like this, I am not able to solve it with the known methods ;) My use case is a work in progress, but I cannot go to production with an issue like this.
My question: what would be the right steps / commands to solve an issue like this and be able to restart DeltaStreamer again?
This table does not have dimension data, so I am happy to share the whole table if someone is curious (if that is needed or would be helpful, let's talk in a private mail / Slack about the sharing). ~15GB ;) It stopped after a few runs, actually after the 1st clustering.
I use this clustering config in DeltaStreamer:
hoodie.clustering.inline=true
hoodie.clustering.inline.enabled=true
hoodie.clustering.inline.max.commits=36
hoodie.clustering.plan.strategy.sort.columns=correlation_id
hoodie.clustering.plan.strategy.daybased.lookback.partitions=7
hoodie.clustering.plan.strategy.target.file.max.bytes=268435456
hoodie.clustering.plan.strategy.small.file.limit=134217728
hoodie.clustering.plan.strategy.max.bytes.per.group=671088640
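For readability, the three byte-valued settings above can be converted to MiB. This is a small sketch of that arithmetic only; the to_mib helper is a name made up for the example, and the values are copied from the config above.

```python
MIB = 1024 * 1024  # bytes per mebibyte


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


# Values copied from the clustering config above.
sizes = {
    "hoodie.clustering.plan.strategy.target.file.max.bytes": 268435456,
    "hoodie.clustering.plan.strategy.small.file.limit": 134217728,
    "hoodie.clustering.plan.strategy.max.bytes.per.group": 671088640,
}

for key, value in sizes.items():
    print(f"{key} = {value} bytes = {to_mib(value):g} MiB")
```

That is a 256 MiB target file size, a 128 MiB small-file limit, and 640 MiB per clustering group.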
I hope there is someone who can help me tackle this, because if I am able to solve this manually, I would be confident to go to production.
So thanks in advance,
Darvi
Slack Hudi: istvan darvas / U02NTACPHPU
--
This message was sent by Atlassian Jira
(v8.20.1#820001)