Posted to issues@flink.apache.org by "Robert Metzger (Jira)" <ji...@apache.org> on 2020/05/11 15:04:00 UTC
[jira] [Commented] (FLINK-17573) There is duplicate source data in ProcessWindowFunction

    [ https://issues.apache.org/jira/browse/FLINK-17573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104532#comment-17104532 ] 

Robert Metzger commented on FLINK-17573:
----------------------------------------

Thank you for opening a ticket.

Can you please post this question on the user@flink.apache.org mailing list? (Make sure to subscribe first.)
This bug report is lacking a lot of information. I will close it unless you provide more details on the issue.

> There is duplicate source data in ProcessWindowFunction
> -------------------------------------------------------
>
>                 Key: FLINK-17573
>                 URL: https://issues.apache.org/jira/browse/FLINK-17573
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>            Reporter: Tammy zhang
>            Priority: Major
>
> I consumed data from a Kafka topic, keyed the stream with keyBy, and then applied a ProcessWindowFunction to the resulting keyed stream. A strange phenomenon appeared: the sourceData passed to the process function became duplicated, like this:
> Input Data iterator:[H2update 623.0 2020-05-08 15:19:25.14, H2update 297.0 2020-05-08 15:19:28.501, H2update 832.0 2020-05-08 15:19:29.415]
>  data iterator end----------------------------------
> Input Data iterator:[H1400 59.0 2020-05-08 15:19:07.087, H1400 83.0 2020-05-08 15:19:09.521]
>  data iterator end----------------------------------
> Input Data iterator:[H2insert 455.0 2020-05-08 15:19:23.066, H2insert 910.0 2020-05-08 15:19:23.955, H2insert 614.0 2020-05-08 15:19:24.397, H2insert 556.0 2020-05-08 15:19:27.389, H2insert 922.0 2020-05-08 15:19:27.761, H2insert 165.0 2020-05-08 15:19:28.26]
>  data iterator end----------------------------------
> Input Data iterator:[H1400 59.0 2020-05-08 15:19:07.087, H1400 83.0 2020-05-08 15:19:09.521]
>  data iterator end----------------------------------
> Input Data iterator:[H2update 623.0 2020-05-08 15:19:25.14, H2update 297.0 2020-05-08 15:19:28.501, H2update 832.0 2020-05-08 15:19:29.415]
>  data iterator end----------------------------------
> Input Data iterator:[H2insert 455.0 2020-05-08 15:19:23.066, H2insert 910.0 2020-05-08 15:19:23.955, H2insert 614.0 2020-05-08 15:19:24.397, H2insert 556.0 2020-05-08 15:19:27.389, H2insert 922.0 2020-05-08 15:19:27.761, H2insert 165.0 2020-05-08 15:19:28.26]
>  data iterator end----------------------------------
> I can confirm that the Kafka data itself contains no duplicates. Could you help me find where the problem is? Thanks a lot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)