Posted to commits@hudi.apache.org by "voonhous (via GitHub)" <gi...@apache.org> on 2023/04/26 10:01:05 UTC
[GitHub] [hudi] voonhous commented on pull request #8579: [MINOR] Added docs for gotchas when using PartialUpdateAvroPayload
voonhous commented on PR #8579:
URL: https://github.com/apache/hudi/pull/8579#issuecomment-1523138078
> > Results will be different if combineAndUpdateValue is invoked in order without invoking preCombine.
>
> What is exactly lost here?
# PreCombine + combineAndGetUpdateValue
```
Table schema: {id: int , name: string, price: double, _ts: int}
recordKey: id
precombineField: _ts
Table initial state:
[1 a1_0 10.0 1001]
Table performs an update with an incoming batch containing the following records (2):
(precombine + combineAndGetUpdateValue)
[
[1 a1_0 11.0 999],
[1 a1_0 null 1001]
]
End state of the table:
[1 a1_0 11.0 1001]
```
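This is because the incoming batch at (2) is first merged with `preCombine`, and only the merged record is then applied to the table with `combineAndGetUpdateValue`. The price of 11.0 from the record with `_ts` 999 fills the null price of the record with `_ts` 1001 before the table is touched.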
# combineAndGetUpdateValue ONLY
Results will be different if `combineAndGetUpdateValue` is invoked on each record in order without invoking `preCombine` first: the price of 11.0 carried by the record with `_ts` 999 is lost.
Example:
```
Table initial state:
[1 a1_0 10.0 1001]
Table performs an update:
(combineAndGetUpdateValue)
[1 a1_0 11.0 999]
Table performs an update again:
(combineAndGetUpdateValue)
[1 a1_0 null 1001]
End state of the table:
[1 a1_0 10.0 1001]
```
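The two end states can be reproduced with a small simulation. This is only a sketch of the merge rule implied by the examples, not Hudi's actual `PartialUpdateAvroPayload` code: the `merge` and `rec` helpers are hypothetical, and the rule assumed here is that the record with the strictly higher `_ts` is the base, ties go to the incoming record, and null fields of the base are filled from the other record.

```python
from functools import reduce

FIELDS = ("id", "name", "price", "_ts")


def rec(id_, name, price, ts):
    """Build a record as a dict keyed by the table schema (hypothetical helper)."""
    return dict(zip(FIELDS, (id_, name, price, ts)))


def merge(existing, incoming):
    """Assumed partial-update rule, modeled on the examples above.

    The record with the strictly higher _ts is the base (ties go to the
    incoming record); null fields in the base are filled from the other one.
    """
    if existing["_ts"] > incoming["_ts"]:
        base, other = existing, incoming
    else:
        base, other = incoming, existing
    return {f: base[f] if base[f] is not None else other[f] for f in FIELDS}


stored = rec(1, "a1_0", 10.0, 1001)
batch = [rec(1, "a1_0", 11.0, 999), rec(1, "a1_0", None, 1001)]

# Path 1: merge the batch first (preCombine), then apply it to the table.
pre_combined = reduce(merge, batch)
with_pre_combine = merge(stored, pre_combined)

# Path 2: apply each incoming record to the table in order, no preCombine.
without_pre_combine = reduce(merge, batch, stored)

print(list(with_pre_combine.values()))     # [1, 'a1_0', 11.0, 1001]
print(list(without_pre_combine.values()))  # [1, 'a1_0', 10.0, 1001]
```

In the second path the record with `_ts` 999 is merged against a stored row that is newer and already has a non-null price, so its 11.0 never makes it into the table.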