Posted to dev@flink.apache.org by cY <32...@qq.com> on 2014/10/11 18:30:03 UTC

a question about Exactly once processing

As we know, with its acker, Storm can process data at least once. If we want it to process data exactly once, the state of the stream must be saved. Trident (Storm's DAG abstraction) leaves this problem to users. I would like to know where Flink saves the state of the stream.
                                  : )  Thanks a lot
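
To make the difference between at-least-once and exactly-once concrete, here is a minimal sketch in plain Python. It is not Storm, Trident, or Flink code. The names `NaiveCounter`, `DedupCounter`, and `deliver_at_least_once` are hypothetical, and the sketch assumes a single in-order source that tags each record with an increasing sequence number. It only shows why the saved state needs to include the stream position.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Record = Tuple[int, str]  # (sequence number, word); an assumed record layout


@dataclass
class NaiveCounter:
    """Counts every delivery, so a replayed record is counted again."""
    counts: Dict[str, int] = field(default_factory=dict)

    def process(self, record: Record) -> None:
        _, word = record
        self.counts[word] = self.counts.get(word, 0) + 1


@dataclass
class DedupCounter:
    """Keeps the last applied sequence number next to the counts."""
    counts: Dict[str, int] = field(default_factory=dict)
    last_seq: int = -1

    def process(self, record: Record) -> None:
        seq, word = record
        if seq <= self.last_seq:
            return  # this record is already reflected in the saved state
        self.counts[word] = self.counts.get(word, 0) + 1
        self.last_seq = seq


def deliver_at_least_once(records: List[Record], operator) -> None:
    """Hypothetical failure model: everything is delivered once, then the
    source replays from index 1 because an acknowledgement was lost."""
    for record in records:
        operator.process(record)
    for record in records[1:]:
        operator.process(record)


records = [(0, "a"), (1, "b"), (2, "a")]

naive = NaiveCounter()
deliver_at_least_once(records, naive)   # replayed records are double counted

dedup = DedupCounter()
deliver_at_least_once(records, dedup)   # replayed records are skipped
```

With the replay, `naive.counts` overcounts while `dedup.counts` matches a single pass over the input. The second result is only correct if `counts` and `last_seq` are saved and restored together, which is the "state of the stream" the question is about.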

Re: Re: a question about Exactly once processing

Posted by Stephan Ewen <se...@apache.org>.
Hi!

That is an exciting project! Let us know if you stumble over any questions!

Stephan


On Sun, Oct 12, 2014 at 2:26 PM, cY <32...@qq.com> wrote:

> Hello
>       I'm the core contributor of JStorm (
> https://github.com/alibaba/jstorm), which is a Java version of Storm
> with many optimizations. I wanted to build a DAG abstraction on top of Storm,
> similar to RDDs. Then I found Flink and its exciting DAG abstraction. So I have
> an idea: create a project that uses Storm to replace Flink's
> runtime. It would combine Flink and Storm so that Flink can focus on the DAG
> (DAG optimization, DAG view, etc.) and Storm can focus on the engine
> (failover, serialization, etc.). Of course it's just an idea and I'm still
> investigating the feasibility of this scheme. Besides, Flink is a wonderful
> project and I would be very glad to contribute to it.
>           Thanks  a lot  : P
>            cY

Re: Re: a question about Exactly once processing

Posted by cY <32...@qq.com>.
Hello
      I'm the core contributor of JStorm (https://github.com/alibaba/jstorm), which is a Java version of Storm with many optimizations. I wanted to build a DAG abstraction on top of Storm, similar to RDDs. Then I found Flink and its exciting DAG abstraction. So I have an idea: create a project that uses Storm to replace Flink's runtime. It would combine Flink and Storm so that Flink can focus on the DAG (DAG optimization, DAG view, etc.) and Storm can focus on the engine (failover, serialization, etc.). Of course it's just an idea and I'm still investigating the feasibility of this scheme. Besides, Flink is a wonderful project and I would be very glad to contribute to it.
          Thanks  a lot  : P
           cY
                                                                                   




------------------ Original Message ------------------
From: "Paris Carbone";<pa...@kth.se>;
Sent: Sunday, October 12, 2014, 3:56 PM
To: "dev@flink.incubator.apache.org"<de...@flink.incubator.apache.org>;

Subject: Re: Re: a question about Exactly once processing



Hello,
Generally speaking, the designed solution will most probably combine upstream backup with asynchronous state checkpointing. We are looking into ways to minimise the communication and recovery costs by exploiting properties of the stream (e.g. recomputable segments-windows). Are you also working on fault tolerance?

Paris


Re: Re: a question about Exactly once processing

Posted by Paris Carbone <pa...@kth.se>.
Hello,
Generally speaking, the designed solution will most probably combine upstream backup with asynchronous state checkpointing. We are looking into ways to minimise the communication and recovery costs by exploiting properties of the stream (e.g. recomputable segments-windows). Are you also working on fault tolerance?

Paris
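
As a rough illustration of how upstream backup and state checkpointing can work together, here is a minimal single-process sketch in Python. It is not the Flink design or its API. The classes `Upstream` and `SumOperator` and the function `run_with_failure` are hypothetical, records are assumed to carry increasing sequence numbers, and the snapshot is taken synchronously for simplicity, whereas the approach described above takes it asynchronously.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int]  # (sequence number, value); an assumed record layout


class SumOperator:
    """Stateful operator: a running sum plus the last applied sequence number."""

    def __init__(self) -> None:
        self.total = 0
        self.last_seq = -1

    def process(self, record: Record) -> None:
        seq, value = record
        if seq <= self.last_seq:
            return  # already included in the current state
        self.total += value
        self.last_seq = seq

    def snapshot(self) -> Dict[str, int]:
        # Checkpoint: the state and the stream position are saved together.
        return {"total": self.total, "last_seq": self.last_seq}

    @classmethod
    def restore(cls, snap: Dict[str, int]) -> "SumOperator":
        op = cls()
        op.total = snap["total"]
        op.last_seq = snap["last_seq"]
        return op


class Upstream:
    """Upstream backup: keeps sent records until a checkpoint covers them."""

    def __init__(self) -> None:
        self.backup: List[Record] = []

    def send(self, record: Record, downstream: SumOperator) -> None:
        self.backup.append(record)
        downstream.process(record)

    def trim(self, checkpointed_seq: int) -> None:
        # Records at or below the checkpointed position are no longer needed.
        self.backup = [r for r in self.backup if r[0] > checkpointed_seq]

    def replay(self, downstream: SumOperator) -> None:
        for record in self.backup:
            downstream.process(record)


def run_with_failure() -> int:
    upstream = Upstream()
    op = SumOperator()
    for seq in range(3):
        upstream.send((seq, 10), op)
    snap = op.snapshot()              # checkpoint after records 0..2
    upstream.trim(snap["last_seq"])   # backup for 0..2 can be dropped
    for seq in range(3, 5):
        upstream.send((seq, 10), op)
    # Simulated crash: everything processed after the checkpoint is lost.
    op = SumOperator.restore(snap)
    upstream.replay(op)               # only records 3 and 4 are replayed
    return op.total
```

In this sketch recovery only needs the records sent after the last checkpoint, so the size of the upstream backup and the recovery cost both depend on how often checkpoints are taken.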

On 11 Oct 2014, at 18:43, cY <32...@qq.com> wrote:

> Thanks for the reply.
> Is there any idea about it? This is such a challenge~
> I care about this problem a lot.  : )


Re: a question about Exactly once processing

Posted by cY <32...@qq.com>.
Thanks for the reply.
Is there any idea about it? This is such a challenge~
I care about this problem a lot.  : )




------------------ Original Message ------------------
From: "Gyula Fóra";<gy...@apache.org>;
Sent: Sunday, October 12, 2014, 12:38 AM
To: "dev"<de...@flink.incubator.apache.org>;

Subject: Re: a question about Exactly once processing



Hey,

There is currently an ongoing effort to enable stateful exactly-once
processing guarantees in Flink Streaming, but no currently available
version supports them. We expect this to be available in the
next release after 0.7.

Regards,
Gyula


Re: a question about Exactly once processing

Posted by Gyula Fóra <gy...@apache.org>.
Hey,

There is currently an ongoing effort to enable stateful exactly-once
processing guarantees in Flink Streaming, but no currently available
version supports them. We expect this to be available in the
next release after 0.7.

Regards,
Gyula

On Sat, Oct 11, 2014 at 6:31 PM, cY <32...@qq.com> wrote:

> And how can exactly-once processing be achieved? : P

Re: a question about Exactly once processing

Posted by cY <32...@qq.com>.
And how can exactly-once processing be achieved? : P


------------------ Original ------------------
From:  "cY";<32...@qq.com>;
Date:  Sun, Oct 12, 2014 00:30 AM
To:  "dev"<de...@flink.incubator.apache.org>; 

Subject:  a question about Exactly once processing



As we know, with its acker, Storm can process data at least once. If we want it to process data exactly once, the state of the stream must be saved. Trident (Storm's DAG abstraction) leaves this problem to users. I would like to know where Flink saves the state of the stream.
                                  : )  Thanks a lot