You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by mo...@gmail.com, mo...@gmail.com on 2019/02/02 16:11:17 UTC

Is the order guaranteed with Windowall

Hello,

I use Watermarks and a function to sort events at the end of my pipeline.
I've used this tutorial to sort my data: https://training.da-platform.com/exercises/carSort.html
SingleOutputStreamOperator<XXXX> sortStream = streamKeyed.process(new SortEventFunction())..

Then I want to apply a Window and use AggregateFunction to obtain a group of data. Thus when a trigger is launched, I can push all these data to my backend at the same time (with puts method for Hbase for example)
But the order here must be guaranteed.
Can I use a windowAll on that stream ?
sortStream.windowAll(...

Thanks in advance
David



Re: Is the order guaranteed with Windowall

Posted by mo...@gmail.com, mo...@gmail.com.
ok great. 
Thanks !

On 2019/02/04 18:00:16, Fabian Hueske <fh...@gmail.com> wrote: 
> Yes, I think that should work.
> 
> Best, Fabian
> 
> Am Mo., 4. Feb. 2019 um 18:35 Uhr schrieb morin.david.bzh@gmail.com <
> morin.david.bzh@gmail.com>:
> 
> > Hello Fabian,
> >
> > Thanks !
> > According to your answers on this post
> > https://stackoverflow.com/questions/50340107/order-of-events-with-chained-keyby-calls-on-same-key,
> > if I'm right I can use my sort function followed by a keyby and use a
> > Window for aggregate these events. And the order will be preserved if I use
> > the same Key and the same partionning. I'm right ?
> >
> >         SingleOutputStreamOperator<XXX> sortStream =
> > conversionStreamKeyed.process(new
> > SortEventFunction()).setParallelism(1).name("Sort events");
> >
> >        // use the same key
> >         KeyedStream<XX, String> sortStreamKeyed = sortStream.keyBy((XXX
> > event) -> event.getPartitionKey());
> >
> >
> > sortStreamKeyed.window(TumblingProcessingTimeWindow....setParallelism(1).name("Aggregate
> > events");
> >
> > Thanks
> > David
> >
> > On 2019/02/04 13:54:14, Fabian Hueske <fh...@gmail.com> wrote:
> > > Hi,
> > >
> > > A WindowAll is executed in a single task. If you sort the data before the
> > > window, the sorting must also happen in a single task, i.e., with
> > > parallelism 1.
> > > The reasons is that an operator somewhat randomly merges multiple input
> > > partitions. So even if each input partition is sorted, the merging will
> > > result in out-of-order data.
> > >
> > > Best,
> > > Fabian
> > >
> > > Am Sa., 2. Feb. 2019 um 17:11 Uhr schrieb morin.david.bzh@gmail.com <
> > > morin.david.bzh@gmail.com>:
> > >
> > > > Hello,
> > > >
> > > > I use Watermarks and a function to sort events at the end of my
> > pipeline.
> > > > I've used this tutorial to sort my data:
> > > > https://training.da-platform.com/exercises/carSort.html
> > > > SingleOutputStreamOperator<XXXX> sortStream = streamKeyed.process(new
> > > > SortEventFunction())..
> > > >
> > > > Then I want to apply a Window and use AggregateFunction to obtain a
> > group
> > > > of data. Thus when a trigger is launched, I can push all these data to
> > my
> > > > backend at the same time (with puts method for Hbase for example)
> > > > But the order here must be guaranteed.
> > > > Can I use a windowAll on that stream ?
> > > > sortStream.windowAll(...
> > > >
> > > > Thanks in advance
> > > > David
> > > >
> > > >
> > > >
> > >
> >
> 

Re: Is the order guaranteed with Windowall

Posted by Fabian Hueske <fh...@gmail.com>.
Yes, I think that should work.

Best, Fabian

Am Mo., 4. Feb. 2019 um 18:35 Uhr schrieb morin.david.bzh@gmail.com <
morin.david.bzh@gmail.com>:

> Hello Fabian,
>
> Thanks !
> According to your answers on this post
> https://stackoverflow.com/questions/50340107/order-of-events-with-chained-keyby-calls-on-same-key,
> if I'm right I can use my sort function followed by a keyby and use a
> Window for aggregate these events. And the order will be preserved if I use
> the same Key and the same partionning. I'm right ?
>
>         SingleOutputStreamOperator<XXX> sortStream =
> conversionStreamKeyed.process(new
> SortEventFunction()).setParallelism(1).name("Sort events");
>
>        // use the same key
>         KeyedStream<XX, String> sortStreamKeyed = sortStream.keyBy((XXX
> event) -> event.getPartitionKey());
>
>
> sortStreamKeyed.window(TumblingProcessingTimeWindow....setParallelism(1).name("Aggregate
> events");
>
> Thanks
> David
>
> On 2019/02/04 13:54:14, Fabian Hueske <fh...@gmail.com> wrote:
> > Hi,
> >
> > A WindowAll is executed in a single task. If you sort the data before the
> > window, the sorting must also happen in a single task, i.e., with
> > parallelism 1.
> > The reasons is that an operator somewhat randomly merges multiple input
> > partitions. So even if each input partition is sorted, the merging will
> > result in out-of-order data.
> >
> > Best,
> > Fabian
> >
> > Am Sa., 2. Feb. 2019 um 17:11 Uhr schrieb morin.david.bzh@gmail.com <
> > morin.david.bzh@gmail.com>:
> >
> > > Hello,
> > >
> > > I use Watermarks and a function to sort events at the end of my
> pipeline.
> > > I've used this tutorial to sort my data:
> > > https://training.da-platform.com/exercises/carSort.html
> > > SingleOutputStreamOperator<XXXX> sortStream = streamKeyed.process(new
> > > SortEventFunction())..
> > >
> > > Then I want to apply a Window and use AggregateFunction to obtain a
> group
> > > of data. Thus when a trigger is launched, I can push all these data to
> my
> > > backend at the same time (with puts method for Hbase for example)
> > > But the order here must be guaranteed.
> > > Can I use a windowAll on that stream ?
> > > sortStream.windowAll(...
> > >
> > > Thanks in advance
> > > David
> > >
> > >
> > >
> >
>

Re: Is the order guaranteed with Windowall

Posted by mo...@gmail.com, mo...@gmail.com.
Hello Fabian,

Thanks ! 
According to your answers on this post https://stackoverflow.com/questions/50340107/order-of-events-with-chained-keyby-calls-on-same-key, if I'm right I can use my sort function followed by a keyby and use a Window for aggregate these events. And the order will be preserved if I use the same Key and the same partionning. I'm right ?

        SingleOutputStreamOperator<XXX> sortStream = conversionStreamKeyed.process(new SortEventFunction()).setParallelism(1).name("Sort events");

       // use the same key
        KeyedStream<XX, String> sortStreamKeyed = sortStream.keyBy((XXX event) -> event.getPartitionKey());

      sortStreamKeyed.window(TumblingProcessingTimeWindow....setParallelism(1).name("Aggregate events");

Thanks
David

On 2019/02/04 13:54:14, Fabian Hueske <fh...@gmail.com> wrote: 
> Hi,
> 
> A WindowAll is executed in a single task. If you sort the data before the
> window, the sorting must also happen in a single task, i.e., with
> parallelism 1.
> The reasons is that an operator somewhat randomly merges multiple input
> partitions. So even if each input partition is sorted, the merging will
> result in out-of-order data.
> 
> Best,
> Fabian
> 
> Am Sa., 2. Feb. 2019 um 17:11 Uhr schrieb morin.david.bzh@gmail.com <
> morin.david.bzh@gmail.com>:
> 
> > Hello,
> >
> > I use Watermarks and a function to sort events at the end of my pipeline.
> > I've used this tutorial to sort my data:
> > https://training.da-platform.com/exercises/carSort.html
> > SingleOutputStreamOperator<XXXX> sortStream = streamKeyed.process(new
> > SortEventFunction())..
> >
> > Then I want to apply a Window and use AggregateFunction to obtain a group
> > of data. Thus when a trigger is launched, I can push all these data to my
> > backend at the same time (with puts method for Hbase for example)
> > But the order here must be guaranteed.
> > Can I use a windowAll on that stream ?
> > sortStream.windowAll(...
> >
> > Thanks in advance
> > David
> >
> >
> >
> 

Re: Is the order guaranteed with Windowall

Posted by Fabian Hueske <fh...@gmail.com>.
Hi,

A WindowAll is executed in a single task. If you sort the data before the
window, the sorting must also happen in a single task, i.e., with
parallelism 1.
The reasons is that an operator somewhat randomly merges multiple input
partitions. So even if each input partition is sorted, the merging will
result in out-of-order data.

Best,
Fabian

Am Sa., 2. Feb. 2019 um 17:11 Uhr schrieb morin.david.bzh@gmail.com <
morin.david.bzh@gmail.com>:

> Hello,
>
> I use Watermarks and a function to sort events at the end of my pipeline.
> I've used this tutorial to sort my data:
> https://training.da-platform.com/exercises/carSort.html
> SingleOutputStreamOperator<XXXX> sortStream = streamKeyed.process(new
> SortEventFunction())..
>
> Then I want to apply a Window and use AggregateFunction to obtain a group
> of data. Thus when a trigger is launched, I can push all these data to my
> backend at the same time (with puts method for Hbase for example)
> But the order here must be guaranteed.
> Can I use a windowAll on that stream ?
> sortStream.windowAll(...
>
> Thanks in advance
> David
>
>
>