Posted to dev@beam.apache.org by Shen Li <cs...@gmail.com> on 2016/06/26 20:29:54 UTC

Sliding-Windowed PCollectionView as SideInput

Hi,

I am a little confused about how the runner should handle a side input that
comes from a sliding-windowed PCollection.

Say we have two PCollections, A and B. We apply
Window.into(SlidingWindows.of...) to B and create a View from it (call it
VB).

Then a ParDo takes PCollection A as the main input and VB as a side
input: A.apply(ParDo.withSideInputs(VB).of(new DoFn() {...})).

In DoFn.processElement(), when the user code calls
ProcessContext.sideInput(VB), which window's view of VB should be
returned if the event time of the current element in A falls into
multiple sliding windows of B?
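
To make the ambiguity concrete, here is a minimal plain-Java sketch (not
Beam code) that lists every sliding window containing a given event time.
It assumes windows are half-open intervals [start, start + size) whose
starts fall on multiples of the period; the class and method names are
hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: plain Java, no Beam dependency. Names are hypothetical. */
final class SlidingWindowAssignmentSketch {

    /**
     * Returns the start times (in ms) of every window [start, start + sizeMs)
     * that contains timestampMs, latest-starting window first.
     * Assumption: window starts are multiples of periodMs (offset 0).
     */
    static List<Long> windowStartsContaining(long timestampMs, long sizeMs, long periodMs) {
        List<Long> starts = new ArrayList<>();
        // Latest window start that is at or before the timestamp.
        long lastStart = Math.floorDiv(timestampMs, periodMs) * periodMs;
        // Walk backwards while the window still covers the timestamp.
        for (long start = lastStart; start > timestampMs - sizeMs; start -= periodMs) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        // 60-second windows sliding every 10 seconds, element at t = 25 s.
        System.out.println(windowStartsContaining(25_000L, 60_000L, 10_000L));
    }
}
```

With a 60-second window sliding every 10 seconds, an element at t = 25 s
falls into six windows, so a side-input lookup has six candidates.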


Thanks,

Shen

Re: Sliding-Windowed PCollectionView as SideInput

Posted by Shen Li <cs...@gmail.com>.
Hi Aljoscha,

Thanks for the explanation.

Shen

On Mon, Jun 27, 2016 at 4:38 AM, Aljoscha Krettek <al...@apache.org>
wrote:

> Hi,
> the WindowFn is responsible for mapping from main-input window to
> side-input window. Have a look at WindowFn.getSideInputWindow(). For
> SlidingWindows this takes the last possible sliding window as the
> side-input window.
>
> Cheers,
> Aljoscha

Re: Sliding-Windowed PCollectionView as SideInput

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
The WindowFn is responsible for mapping from the main-input window to the
side-input window. Have a look at WindowFn.getSideInputWindow(). For
SlidingWindows, this takes the last possible sliding window as the
side-input window.
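
As a rough illustration of this rule, here is a plain-Java sketch. It is
not Beam's implementation. It assumes that sliding windows start at
multiples of the period, that the period does not exceed the window size,
and that "last possible" means the latest-starting window containing the
main-input window's maximum timestamp. The class and method names are
hypothetical.

```java
/** Illustration only: plain Java, no Beam dependency. Names are hypothetical. */
final class SlidingSideInputSketch {

    /**
     * Start (in ms) of the latest window that begins at or before timestampMs.
     * Assumption: window starts are multiples of periodMs (offset 0).
     */
    static long lastWindowStart(long timestampMs, long periodMs) {
        // floorDiv rounds towards negative infinity, so negative timestamps work too.
        return Math.floorDiv(timestampMs, periodMs) * periodMs;
    }

    /**
     * Returns {start, end} of the side-input window (end exclusive) picked for a
     * main-input window whose maximum timestamp is mainMaxTimestampMs.
     * Assumption: periodMs <= sizeMs, so the chosen window contains the timestamp.
     */
    static long[] sideInputWindow(long mainMaxTimestampMs, long sizeMs, long periodMs) {
        long start = lastWindowStart(mainMaxTimestampMs, periodMs);
        return new long[] {start, start + sizeMs};
    }

    public static void main(String[] args) {
        // 60-second windows sliding every 10 seconds, main-input max timestamp 25 s.
        long[] w = sideInputWindow(25_000L, 60_000L, 10_000L);
        System.out.println("[" + w[0] + ", " + w[1] + ")");
    }
}
```

Under these assumptions, a main-input window ending at 25 s maps to the
single side-input window [20 s, 80 s), so each main-input element sees
exactly one view even though its timestamp falls into several sliding
windows.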

Cheers,
Aljoscha
