Posted to user@flink.apache.org by 刘建刚 <li...@gmail.com> on 2019/02/27 02:02:46 UTC

One source is much slower than the other side when join history data

      When consuming historical data in an event-time join, reading from
one source is much slower than reading from the other. As a result, the join
operator buffers a large amount of data from the faster source while it waits
for the slower source.
      How can I keep the consumption speeds of the two sources close to each
other?

Re: One source is much slower than the other side when join history data

Posted by liujiangang <li...@gmail.com>.
Thank you very much.




Re: One source is much slower than the other side when join history data

Posted by Konstantin Knauf <ko...@ververica.com>.
Hi,

This topic has been discussed a lot recently in the community under the name
"Event Time Alignment/Synchronization" [1,2]. These discussions should provide
a starting point.

Cheers,

Konstantin

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Sharing-state-between-subtasks-td24489.html
[2] https://issues.apache.org/jira/browse/FLINK-10886
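
For readers who want the gist of event-time alignment before reading those threads, here is a minimal sketch in plain Java. It does not use any Flink API. The class name AlignmentSketch, the method peakBuffered, the read rates (1 vs. 10 records per step), and the maxDriftMs parameter are all assumptions made up for illustration. The idea it shows: a join can only release records up to the minimum event time of its inputs, so pausing the faster source once it runs more than a fixed drift ahead of the slower one bounds how much the join has to buffer.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class AlignmentSketch {

    /**
     * Simulates a two-input event-time join and returns the peak number of
     * fast-side records buffered while waiting for the slow side.
     * All rates and names here are hypothetical, for illustration only.
     */
    public static int peakBuffered(long maxDriftMs, boolean align) {
        long fastTs = 0;  // event time reached by the fast source
        long slowTs = 0;  // event time reached by the slow source
        Deque<Long> buffer = new ArrayDeque<>();  // fast-side records held by the join
        int peak = 0;

        for (int step = 0; step < 1000; step++) {
            // Slow source: one record per step, each advancing event time by 1 ms.
            slowTs += 1;

            // Fast source: up to ten records per step.
            for (int i = 0; i < 10; i++) {
                // Alignment: pause reading once the next record would be more
                // than maxDriftMs ahead of the slow source.
                if (align && fastTs + 1 > slowTs + maxDriftMs) {
                    break;
                }
                fastTs += 1;
                buffer.addLast(fastTs);
            }

            // The join's watermark is the minimum over both inputs; records at
            // or below it can be joined and dropped from the buffer.
            long watermark = Math.min(fastTs, slowTs);
            while (!buffer.isEmpty() && buffer.peekFirst() <= watermark) {
                buffer.removeFirst();
            }
            peak = Math.max(peak, buffer.size());
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("unaligned peak buffer: " + peakBuffered(50, false));
        System.out.println("aligned peak buffer:   " + peakBuffered(50, true));
    }
}
```

In this toy model the unaligned buffer keeps growing for as long as the backlog lasts, while the aligned buffer never exceeds the allowed drift. A real job would have to make the pause decision inside the source (or its consumer thread) and share the slowest watermark between subtasks, which is what the linked discussions are about.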


