Posted to user@flink.apache.org by Andrew Ge Wu <an...@eniro.com> on 2016/08/03 12:09:58 UTC

Parallel execution on AllWindows

Hi,

I have a task in which I want to apply a count window to a stream and process the elements batch by batch.
Processing a count window may take some time, so I want it to be executed in parallel.
I read the following note in the documentation after I noticed that the parallelism was automatically reduced to 1:

* Note: This operation can be inherently non-parallel since all elements have to pass through
* the same operator instance. (Only for special cases, such as aligned time windows is
* it possible to perform this operation in parallel).
(It looks like the Javadoc was copied from timeWindowAll.)

If I force a windowAll function to run in parallel, what will happen?
Will each time/count window be broadcast to all instances of the function, or will it be sent to one of the instances so that I can parallelize my work?


Thanks!



Andrew
-- 
Confidentiality Notice: This e-mail transmission may contain confidential 
or legally privileged information that is intended only for the individual 
or entity named in the e-mail address. If you are not the intended 
recipient, you are hereby notified that any disclosure, copying, 
distribution, or reliance upon the contents of this e-mail is strictly 
prohibited and may be unlawful. If you have received this e-mail in error, 
please notify the sender immediately by return e-mail and delete all copies 
of this message.

Re: Parallel execution on AllWindows

Posted by Andrew Ge Wu <an...@eniro.com>.
Thanks for the quick response, everything is clear!

cheers!

Andrew



Re: Parallel execution on AllWindows

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
"rebalance" simply specifies the strategy to use when sending elements
downstream to the next operator(s). There is no interaction or competition
between the parallel window operator instances. Each will do windowing
locally based on the elements that it receives from upstream.
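
As a rough illustration of the point above, here is a plain-Java sketch (not Flink code). It assumes that "rebalance" hands elements to the downstream instances in round-robin order and that each instance cuts its own input into count windows. The class and method names (RebalanceSketch, roundRobin, localWindows) are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class RebalanceSketch {

    /** Deals the elements to `parallelism` downstream inputs in round-robin order. */
    static List<List<Integer>> roundRobin(List<Integer> elements, int parallelism) {
        List<List<Integer>> inputs = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            inputs.add(new ArrayList<>());
        }
        for (int i = 0; i < elements.size(); i++) {
            inputs.get(i % parallelism).add(elements.get(i));
        }
        return inputs;
    }

    /** Cuts one instance's input into complete count windows; leftovers stay unfired. */
    static List<List<Integer>> localWindows(List<Integer> input, int windowSize) {
        List<List<Integer>> windows = new ArrayList<>();
        for (int start = 0; start + windowSize <= input.size(); start += windowSize) {
            windows.add(new ArrayList<>(input.subList(start, start + windowSize)));
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Integer> stream = new ArrayList<>();
        for (int i = 0; i < 12; i++) {
            stream.add(i);
        }
        // Three simulated window instances, count windows of size 2.
        List<List<Integer>> inputs = roundRobin(stream, 3);
        for (int p = 0; p < inputs.size(); p++) {
            System.out.println("instance " + p + " windows: " + localWindows(inputs.get(p), 2));
        }
        // instance 0 windows: [[0, 3], [6, 9]]
        // instance 1 windows: [[1, 4], [7, 10]]
        // instance 2 windows: [[2, 5], [8, 11]]
    }
}
```

Each simulated instance only ever looks at its own input list, which mirrors the "no interaction or competition" statement: the partitioning step decides who receives an element, and the windowing step is purely local.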

Cheers,
Aljoscha


Re: Parallel execution on AllWindows

Posted by an...@eniro.com.

Hi Aljoscha

Thanks for the explanation.
One other thing: when you say there is no coordination, does that mean rebalance() will not be honored, and each window operator instance will compete for the next available window?

Thanks

Andrew
From mobile




Re: Parallel execution on AllWindows

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
if you manually force a parallelism different from 1 after a *windowAll()
then you will get parallel execution of your window. For example, if you do
this:

input.countWindowAll(100).setParallelism(5)

then you will get five parallel instances of the window operator, each of
which waits for 100 elements of its own before it fires the window. There
is no global coordination between the parallel instances that would allow a
window to fire once 100 elements have been received across all instances.
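
To make the consequence concrete, here is a small plain-Java simulation (it does not use the Flink API). It assumes elements are spread evenly, round-robin, over the parallel instances, and that every instance keeps its own private counter. The names LocalCountWindows, Instance and simulate are hypothetical and exist only for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class LocalCountWindows {

    /** One simulated window operator instance that only counts its own elements. */
    static final class Instance {
        private final int windowSize;
        private int buffered = 0;
        private int firedWindows = 0;

        Instance(int windowSize) {
            this.windowSize = windowSize;
        }

        void receive() {
            buffered++;
            if (buffered == windowSize) {
                // The local window is full: fire it and start a new one.
                firedWindows++;
                buffered = 0;
            }
        }

        int firedWindows() {
            return firedWindows;
        }
    }

    /** Sends totalElements round-robin to the instances; returns how many windows fired overall. */
    static int simulate(int totalElements, int parallelism, int windowSize) {
        List<Instance> instances = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            instances.add(new Instance(windowSize));
        }
        for (int i = 0; i < totalElements; i++) {
            instances.get(i % parallelism).receive();
        }
        int fired = 0;
        for (Instance instance : instances) {
            fired += instance.firedWindows();
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(simulate(100, 1, 100)); // 1: a single instance sees all 100 elements
        System.out.println(simulate(100, 5, 100)); // 0: each instance has only 20 elements so far
        System.out.println(simulate(500, 5, 100)); // 5: each instance has reached 100 of its own
    }
}
```

Under these assumptions, with parallelism 5 nothing fires until 500 elements have arrived in total, and each fired window holds the 100 elements that one particular instance received rather than 100 consecutive elements of the stream.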

Cheers,
Aljoscha
