Posted to issues@flink.apache.org by "Weike Dong (Jira)" <ji...@apache.org> on 2022/07/25 13:26:00 UTC

[jira] [Comment Edited] (FLINK-28674) EqualiserCodeGenerator generates wrong equaliser for Timestamp fields in BinaryRowData

    [ https://issues.apache.org/jira/browse/FLINK-28674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570903#comment-17570903 ] 

Weike Dong edited comment on FLINK-28674 at 7/25/22 1:25 PM:
-------------------------------------------------------------

This was actually caused by missing timestamp settings in the CDC connector options; it is not a problem in the equaliser.


was (Author: kyledong):
Actually caused by missing timestamp settings in CDC connector options.

> EqualiserCodeGenerator generates wrong equaliser for Timestamp fields in BinaryRowData
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-28674
>                 URL: https://issues.apache.org/jira/browse/FLINK-28674
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Runtime
>    Affects Versions: 1.13.6, 1.14.5, 1.15.1
>         Environment: Flink 1.13.6
>            Reporter: Weike Dong
>            Priority: Major
>         Attachments: image-2022-07-25-20-56-14-111.png, image-2022-07-25-20-59-31-933.png, image-2022-07-25-21-17-33-608.png
>
>
> Hi Devs,
> Recently I discovered that the _equaliser.equals_ call in _org.apache.flink.table.runtime.operators.sink.SinkUpsertMaterializer#removeFirst_ returns wrong comparison results when two binary rows are the same, for example:
> !image-2022-07-25-20-56-14-111.png!
> After digging through the generated code for this equaliser, I found that when the two input RowData are both instances of {_}BinaryRowData{_}, the _BinaryRowData#equals_ method is called directly to produce the comparison result.
> !image-2022-07-25-20-59-31-933.png!
> However, as the first screenshot shows, _BinaryRowData#equals_ cannot properly handle complex data types like {_}Timestamp{_}, so it returns _false_ even when the actual timestamp values are the same. As a result, _SinkUpsertMaterializer_ wrongly concludes that there is no match in the state and prints errors like "The state is cleared because of state ttl", which eventually leads to the loss of -U data in the final results.
>  
> P.S. The _equals_ method of _BinaryRowData_ actually compares the underlying MemorySegments:
> !image-2022-07-25-21-17-33-608.png!
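
For readers without access to the screenshots, here is a minimal, self-contained sketch of why a byte-wise comparison can report two rows as different even though their timestamp values match. It assumes, purely for illustration, that the two rows were serialized under different timestamp settings. The class SegmentEqualityDemo, its methods, and both byte layouts are hypothetical stand-ins and are not Flink's actual BinaryRowData or MemorySegment format.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-in for a binary row with a single timestamp field.
// The layouts below are illustrative only, not Flink's real binary format.
class SegmentEqualityDemo {

    // Low-precision encoding: epoch milliseconds only (8 bytes).
    static byte[] encodeCompact(long epochMillis) {
        return ByteBuffer.allocate(8).putLong(epochMillis).array();
    }

    // High-precision encoding: epoch milliseconds plus nanos-of-millisecond (12 bytes).
    static byte[] encodeWithNanos(long epochMillis, int nanoOfMilli) {
        return ByteBuffer.allocate(12).putLong(epochMillis).putInt(nanoOfMilli).array();
    }

    static long decodeMillis(byte[] row) {
        return ByteBuffer.wrap(row).getLong();
    }

    // Rows written with the compact encoding carry no nanos, so treat them as 0.
    static int decodeNanos(byte[] row) {
        return row.length >= 12 ? ByteBuffer.wrap(row).getInt(8) : 0;
    }

    // Analogous to comparing the underlying bytes directly.
    static boolean bytewiseEquals(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Compares the decoded timestamp values instead of the raw bytes.
    static boolean fieldwiseEquals(byte[] a, byte[] b) {
        return decodeMillis(a) == decodeMillis(b) && decodeNanos(a) == decodeNanos(b);
    }

    public static void main(String[] args) {
        long ts = 1658755560000L; // an arbitrary epoch-millisecond value
        byte[] stored = encodeCompact(ts);        // row kept in state
        byte[] incoming = encodeWithNanos(ts, 0); // same instant, different encoding

        System.out.println("bytewise:  " + bytewiseEquals(stored, incoming));  // false
        System.out.println("fieldwise: " + fieldwiseEquals(stored, incoming)); // true
    }
}
```

Under these assumptions, the byte-wise check fails only because the two sides disagree on how the timestamp is encoded, which fits the later finding that the cause was the connector's timestamp settings rather than the equaliser.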


