Posted to issues@flink.apache.org by "Lijie Wang (Jira)" <ji...@apache.org> on 2022/10/27 07:45:00 UTC

[jira] [Comment Edited] (FLINK-29167) Time out-of-order optimization for merging multiple data streams into one data stream

    [ https://issues.apache.org/jira/browse/FLINK-29167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17624927#comment-17624927 ] 

Lijie Wang edited comment on FLINK-29167 at 10/27/22 7:44 AM:
--------------------------------------------------------------

Is your job processing a bounded stream or an unbounded stream?


was (Author: wanglijie95):
Is your job bounded stream? Or unbounded stream

>  Time out-of-order optimization for merging multiple data streams into one data stream
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-29167
>                 URL: https://issues.apache.org/jira/browse/FLINK-29167
>             Project: Flink
>          Issue Type: Improvement
>          Components: API / DataStream
>    Affects Versions: 1.14.2
>            Reporter: zhangyang
>            Priority: Major
>             Fix For: 1.14.2
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> Problem Description: 
>      I have several scenarios that need to combine more than two data streams (DataStreams) into one. The business logic behind the stream processing depends on the events arriving in event-time order, so I use Flink's union operator to merge the streams, but the merged stream does not preserve the original event-time order.
> {code:java}
> dataStream0 = dataStream0.union(dataStreamArray);  {code}
> Design suggestion: 
>     The union implementation could merge the input streams in the order in which they appear in dataStreamArray instead of in random order.
>  
> Solution suggestion: 
>    At present, I use windowAll to sort the merged data chronologically, which makes the overall scenario work, but windowAll can only run with a parallelism of 1, which hurts the performance of the entire directed acyclic graph. In addition, two of my scenarios need sorting after a merge. I haven't thought of a good workaround, so my only idea is for union itself to emit records in order, which would save a lot of unnecessary trouble when merging event-time streams.
>  
>  
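
To illustrate the ordering requirement in the description above, here is a minimal plain-Java sketch of a k-way merge by timestamp. It is not Flink API and not how union is implemented; the class name OrderedMergeSketch and the method mergeByTimestamp are hypothetical, and the sketch assumes every input is bounded and already sorted by event time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

/** Plain-Java sketch (no Flink API): k-way merge of inputs that are each already in event-time order. */
public class OrderedMergeSketch {

    /** The current head of one input: its timestamp plus where it came from. */
    private static final class Head {
        final long timestamp;
        final int input;   // index of the input the element belongs to
        final int offset;  // position of the element inside that input

        Head(long timestamp, int input, int offset) {
            this.timestamp = timestamp;
            this.input = input;
            this.offset = offset;
        }
    }

    /** Merges the inputs into one list of timestamps in ascending order. */
    public static List<Long> mergeByTimestamp(List<List<Long>> inputs) {
        // Smallest timestamp first; ties are broken by input index to keep the result deterministic.
        PriorityQueue<Head> heads = new PriorityQueue<>((a, b) -> {
            int byTime = Long.compare(a.timestamp, b.timestamp);
            return byTime != 0 ? byTime : Integer.compare(a.input, b.input);
        });

        // Seed the queue with the first element of every non-empty input.
        for (int i = 0; i < inputs.size(); i++) {
            if (!inputs.get(i).isEmpty()) {
                heads.add(new Head(inputs.get(i).get(0), i, 0));
            }
        }

        List<Long> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            Head next = heads.poll();
            merged.add(next.timestamp);

            // Replace the emitted element with the following one from the same input, if any.
            int following = next.offset + 1;
            List<Long> source = inputs.get(next.input);
            if (following < source.size()) {
                heads.add(new Head(source.get(following), next.input, following));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<Long>> inputs = Arrays.asList(
                Arrays.asList(1L, 4L, 7L),
                Arrays.asList(2L, 5L),
                Arrays.asList(3L, 6L, 8L));
        System.out.println(mergeByTimestamp(inputs)); // prints [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```

The merge can only emit an element once it knows the head of every input, which is why the bounded/unbounded question matters: on unbounded inputs an ordered merge has to wait for each input to make progress before it can emit anything.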



--
This message was sent by Atlassian Jira
(v8.20.10#820010)