You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/06/20 15:27:06 UTC

[GitHub] [iceberg] rdblue commented on pull request #5061: Flink: upsert table join failed

rdblue commented on PR #5061:
URL: https://github.com/apache/iceberg/pull/5061#issuecomment-1160585124

   To me, it looks like the problem is that the join is producing CDC records. A join would normally not produce output records that don't match the filter, so it is strange to have `+I[2, ...]` and `-D[2, ...]` in the join output.
   
   I'm not sure how to solve this, but it looks like a problem with Flink. Flink shouldn't produce CDC rows for a straightforward join. It should just produce the regular SQL output set, which is:
   
   ```
   id | data
   ---+---------
    1 | 20220607
    3 | 20220505
   ```
   
   > We are currently using the new Sink API to fix it temporarily, FYI #4904 (add a configuration that specifies whether to enable the new version sink), that use sink instead of transform, but in this way, we can't get the CDC data.
   
   Do you mean that the new sink doesn't pass the `RowKind` and you only get in inserted rows?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org