Posted to user@beam.apache.org by Binh Nguyen Van <bi...@gmail.com> on 2022/11/07 08:38:31 UTC
Single side input to multiple transforms
Hi,
I am writing a pipeline where I have one singleton side input that I want
to use in multiple different transforms. When I run the pipeline on Google
Dataflow, I see multiple log entries with a message like this:
    Deduplicating side input tags, found non-unique side input key
    org.apache.beam.sdk.values.PCollectionViews$SimplePCollectionView.<init>:1204#4663620f501c9270
Is this something that I should avoid? If so, how can I do that?
Thanks
-Binh
Re: Single side input to multiple transforms
Posted by Binh Nguyen Van <bi...@gmail.com>.
Hi,
No, it is a Java job.
Here is example code that produces the duplicate side input tag log entries:
    PCollectionView<MyData> sideInput = sideCollection.apply(View.asSingleton());
    inputCollection
        .apply(ParDo.of(new MyFn1()).withSideInputs(sideInput))
        .apply(ParDo.of(new MyFn2()).withSideInputs(sideInput));
But if I create two separate views like this, the duplicate side input
tag log entries do not appear:
    PCollectionView<MyData> sideInput1 = sideCollection.apply(View.asSingleton());
    PCollectionView<MyData> sideInput2 = sideCollection.apply(View.asSingleton());
    inputCollection
        .apply(ParDo.of(new MyFn1()).withSideInputs(sideInput1))
        .apply(ParDo.of(new MyFn2()).withSideInputs(sideInput2));
-Binh
On Mon, Nov 7, 2022 at 10:50 AM Reuven Lax via user <us...@beam.apache.org>
wrote:
> Is this a Python job?
Re: Single side input to multiple transforms
Posted by Reuven Lax via user <us...@beam.apache.org>.
Is this a Python job?