Posted to dev@flink.apache.org by jiaxl <jt...@hotmail.com> on 2018/07/24 15:15:40 UTC

is Flink's recovery speed still slow?

According to the conclusion of this paper
(https://dl.acm.org/citation.cfm?id=3132750), Flink's recovery speed is
slower than that of Spark Streaming, which would be a problem in
large-scale deployments where faults happen frequently.
I'd like to know whether this is still a problem. Any advice is
appreciated.



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/

Re: is Flink's recovery speed still slow?

Posted by Chen Qin <qi...@gmail.com>.
As far as I have learned from folks with a better understanding than mine, barrier alignment might be the only path to deterministic output.

Any state or outcome between barrier alignments requires a second thought (like UDP packets from the network). Currently, alignment is used only for heavyweight checkpointing. Whether folks decide to improve the algorithm and use it in other ways, such as auto scaling or secondary task shadowing, is still TBD.

Chen

> On Jul 24, 2018, at 18:57, vino yang <ya...@gmail.com> wrote:
> 
> Hi jiaxl,
>
> The paper you mentioned was published in 2017, so I don't think it has much
> reference value now.
> Both frameworks have kept evolving since then.
> At the end of May this year, Flink added local recovery, a major feature,
> in the 1.5 release.
> This greatly improves recovery speed.
> Work on state recovery and fault tolerance in Flink has not stopped.
> I think you can verify it yourself.
>
> Thanks, vino.

Re: is Flink's recovery speed still slow?

Posted by vino yang <ya...@gmail.com>.
Hi jiaxl,

Thanks for your verification!

Yes, Flink is growing very fast. There are really not many benchmarks or
blog posts exploring this topic; after all, the local recovery feature was
only released in version 1.5. That was not long ago, and this part is still
being improved and is not yet very mature.

Thanks, vino.

2018-07-25 19:16 GMT+08:00 jiaxl <jt...@hotmail.com>:

> Hi vino,
>
> Thanks for your quick reply.
>
> Since 2017, the Flink developers have done a great job improving
> performance. But I didn't find any papers or blog posts responding to that
> paper, so I asked the question here.
> Before asking, I was running some experiments with Flink 1.5.1. But as you
> know, it takes some time to tune the system to its best state before
> experiments can be done. So I was hoping that some experienced developers
> may have done related research they could share.
>
>
> Thanks again, jiaxl

Re: is Flink's recovery speed still slow?

Posted by jiaxl <jt...@hotmail.com>.
Hi vino,

Thanks for your quick reply.

Since 2017, the Flink developers have done a great job improving
performance. But I didn't find any papers or blog posts responding to that
paper, so I asked the question here.
Before asking, I was running some experiments with Flink 1.5.1. But as you
know, it takes some time to tune the system to its best state before
experiments can be done. So I was hoping that some experienced developers
may have done related research they could share.


Thanks again, jiaxl




Re: is Flink's recovery speed still slow?

Posted by vino yang <ya...@gmail.com>.
Hi jiaxl,

The paper you mentioned was published in 2017, so I don't think it has much
reference value now.
Both frameworks have kept evolving since then.
At the end of May this year, Flink added local recovery, a major feature,
in the 1.5 release.
This greatly improves recovery speed.
Work on state recovery and fault tolerance in Flink has not stopped.
I think you can verify it yourself.
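
For anyone who wants to try this, here is a minimal sketch of turning on
local recovery in flink-conf.yaml. It assumes the configuration keys
documented for the 1.5 line; the directory path is a made-up example, and
both settings should be checked against the documentation for the Flink
version actually in use.

```yaml
# flink-conf.yaml (sketch, not a tuned setup)

# Local recovery is disabled by default; this switches it on so that a
# restarted task can try to restore from a task-local copy of its state.
state.backend.local-recovery: true

# Optional: where the task-local state copies are kept.
# The path below is a hypothetical example, not a recommended location.
taskmanager.state.local.root-dirs: /tmp/flink-local-state
```

To compare recovery times, the same job and failure scenario would need to
be run with this setting both enabled and disabled.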

Thanks, vino.


2018-07-24 23:15 GMT+08:00 jiaxl <jt...@hotmail.com>:

> According to the conclusion of this paper
> (https://dl.acm.org/citation.cfm?id=3132750), Flink's recovery speed is
> slower than that of Spark Streaming, which would be a problem in
> large-scale deployments where faults happen frequently.
> I'd like to know whether this is still a problem. Any advice is
> appreciated.