Posted to user@beam.apache.org by Evan Galpin <ev...@gmail.com> on 2021/09/07 20:23:18 UTC

[Python] Heterogeneous TaggedOutput Type Hints

Hi all,

What is the recommended way to write type hints for a tagged output DoFn
where the outputs to different tags have different types?

I tried using a Union to describe each of the possible output types, but
that resulted in mismatched coder errors where only the last entry in the
Union was used as the assumed type.  Is there a way to associate a type
hint to a tag or something like that?

Thanks,
Evan

Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Evan Galpin <ev...@gmail.com>.
Thanks for the update!  I also was not able to repro, so presumably
something is fixed? :-)

Thanks,
Evan

On Mon, Mar 21, 2022 at 8:40 PM Valentyn Tymofieiev <va...@google.com>
wrote:

> I came across this thread and wasn't able to reproduce the `expecting a KV
> coder, but had Strings` error, so hopefully that's fixed now. I had to
> modify the repro to add .with_outputs() at line 49 of
> https://gist.github.com/egalpin/2d6ad2210cf9f66108ff48a9c7566ebc
>
> On Mon, Sep 27, 2021 at 5:58 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> As a workaround, can you try passing the use_portable_job_submission
>> experiment?
>>
>> On Mon, Sep 27, 2021 at 2:19 PM Luke Cwik <lc...@google.com> wrote:
>> >
>> > Sorry, I forgot that you had a minimal repro for this issue, I attached
>> details to the internal bug.
>> >
>> > On Mon, Sep 27, 2021 at 2:18 PM Luke Cwik <lc...@google.com> wrote:
>> >>
>> >> There is an internal bug 195053987 that matches what you're describing,
>> but we were unable to get a minimal repro for it. It would be useful if
>> you had a minimal repro for the issue so that I could update the internal
>> bug with details; you could also reach out to GCP support with job IDs
>> and/or minimal repros.
>> >>
>> >> On Wed, Sep 22, 2021 at 6:57 AM Evan Galpin <ev...@gmail.com>
>> wrote:
>> >>>
>> >>> Thanks for the response Luke :-)
>> >>>
>> >>> I did try setting <pcoll>.element_type for each resulting PCollection
>> using "apache_beam.typehints.typehints.KV" to describe the elements, which
>> passed type checking.  I also ran the full dataset (batch job) without the
>> GBK in question but instead using a dummy DoFn in its place which asserted
>> that every element that would be going into the GBK was a 2-tuple, along
>> with using --runtime_type_check, all of which run successfully without the
>> GBK after the TaggedOutput DoFn.
>> >>>
>> >>> Adding back the GBK also runs end-to-end successfully on the
>> DirectRunner using the identical dataset.  But as soon as I add the GBK and
>> use the DataflowRunner (v2), I get errors as soon as the optimized step
>> involving the GBK is in the "running" status:
>> >>>
>> >>> - "Could not start worker docker container"
>> >>> - "Error syncing pod"
>> >>> - "Check failed: pair_coder Strings" or "Check failed: kv_coder :
>> expecting a KV coder, but had Strings"
>> >>>
>> >>> Anything further to try? I can also provide Job IDs from Dataflow if
>> helpful (and safe to share).
>> >>>
>> >>> Thanks,
>> >>> Evan
>> >>>
>> >>> On Wed, Sep 22, 2021 at 1:09 AM Luke Cwik <lc...@google.com> wrote:
>> >>>>
>> >>>> Have you tried setting the element_type[1] explicitly on each output
>> PCollection that is returned after applying the multi-output ParDo?
>> >>>> I believe you'll get a DoOutputsTuple[2] returned after applying the
>> multi-output ParDo, which allows access to the underlying PCollection objects.
>> >>>>
>> >>>> 1:
>> https://github.com/apache/beam/blob/ebf2aacf37b97fc85b167271f184f61f5b06ddc3/sdks/python/apache_beam/pvalue.py#L99
>> >>>> 2:
>> https://github.com/apache/beam/blob/ebf2aacf37b97fc85b167271f184f61f5b06ddc3/sdks/python/apache_beam/pvalue.py#L234
>> >>>>
>> >>>> On Tue, Sep 21, 2021 at 10:29 AM Evan Galpin <ev...@gmail.com>
>> wrote:
>> >>>>>
>> >>>>> This is badly plaguing a pipeline I'm currently developing, where
>> the exact same data set and code runs end-to-end on DirectRunner, but fails
>> on DataflowRunner with either "Check failed: kv_coder : expecting a KV
>> coder, but had Strings" or "Check failed: pair_coder Strings" hidden in the
>> harness logs. It seems to be consistently repeatable with any TaggedOutput
>> + GBK afterwards.
>> >>>>>
>> >>>>> Any advice on how to proceed?
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Evan
>> >>>>>
>> >>>>> On Fri, Sep 17, 2021 at 11:20 AM Evan Galpin <ev...@gmail.com>
>> wrote:
>> >>>>>>
>> >>>>>> The Dataflow error logs only showed 1 error which was:  "The job
>> failed because a work item has failed 4 times. Look in previous log entries
>> for the cause of each one of the 4 failures. For more information, see
>> https://cloud.google.com/dataflow/docs/guides/common-errors. The work
>> item was attempted on these workers: beamapp-XXXX-XXXXX-kt85-harness-8k2c
>> Root cause: The worker lost contact with the service."  In "Diagnostics"
>> there were errors stating "Error syncing pod: Could not start worker docker
>> container".  The harness logs i.e. "projects/my-project/logs/
>> dataflow.googleapis.com%2Fharness" finally contained an error that
>> looked suspect, which was "Check failed: kv_coder : expecting a KV coder,
>> but had Strings", below[1] is a link to possibly a stacktrace or extra
>> detail, but is internal to google so I don't have access.
>> >>>>>>
>> >>>>>> [1]
>> https://symbolize.corp.google.com/r/?trace=55a197abcf56,55a197abbe33,55a197abb97e,55a197abd708,55a196d4e22f,55a196d4d8d3,55a196d4da35,55a1967ec247,55a196f62b26,55a1968969b3,55a196886613,55a19696b0e6,55a196969815,55a1969693eb,55a19696916e,55a1969653bc,55a196b0150a,55a196b04e11,55a1979fc8df,7fe7736674e7,7fe7734dc22c&map=13ddc0ac8b57640c29c5016eb26ef88e:55a1956e7000-55a197bd5010,f1c96c67b57b74a4d7050f34aca016eef674f765:7fe773660000-7fe773676dac,76b955c7af655a4c1e53b8d4aaa0255f3721f95f:7fe7734a5000-7fe7736464c4
>> >>>>>>
>> >>>>>> On Thu, Sep 9, 2021 at 6:46 PM Robert Bradshaw <
>> robertwb@google.com> wrote:
>> >>>>>>>
>> >>>>>>> Huh, that's strange. Yes, the exact error on the service would be
>> helpful.
>> >>>>>>>
>> >>>>>>> On Wed, Sep 8, 2021 at 10:12 AM Evan Galpin <
>> evan.galpin@gmail.com> wrote:
>> >>>>>>> >
>> >>>>>>> > Thanks for the response. I've created a gist here to
>> demonstrate a minimal repro:
>> https://gist.github.com/egalpin/2d6ad2210cf9f66108ff48a9c7566ebc
>> >>>>>>> >
>> >>>>>>> > It seemed to run fine both on DirectRunner and PortableRunner
>> (embed mode), but Dataflow v2 runner raised an error at runtime seemingly
>> associated with the Shuffle service?  I have job IDs and trace links if
>> those are helpful as well.
>> >>>>>>> >
>> >>>>>>> > Thanks,
>> >>>>>>> > Evan
>> >>>>>>> >
>> >>>>>>> > On Tue, Sep 7, 2021 at 4:35 PM Robert Bradshaw <
>> robertwb@google.com> wrote:
>> >>>>>>> >>
>> >>>>>>> >> This is not yet supported. Using a union for now is the way to
>> go. (If
>> >>>>>>> >> only the last value of the union was used, that sounds like a
>> bug. Do
>> >>>>>>> >> you have a minimal repro?)
>> >>>>>>> >>
>> >>>>>>> >> On Tue, Sep 7, 2021 at 1:23 PM Evan Galpin <
>> evan.galpin@gmail.com> wrote:
>> >>>>>>> >> >
>> >>>>>>> >> > Hi all,
>> >>>>>>> >> >
>> >>>>>>> >> > What is the recommended way to write type hints for a tagged
>> output DoFn where the outputs to different tags have different types?
>> >>>>>>> >> >
>> >>>>>>> >> > I tried using a Union to describe each of the possible
>> output types, but that resulted in mismatched coder errors where only the
>> last entry in the Union was used as the assumed type.  Is there a way to
>> associate a type hint to a tag or something like that?
>> >>>>>>> >> >
>> >>>>>>> >> > Thanks,
>> >>>>>>> >> > Evan
>>
>

Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Valentyn Tymofieiev <va...@google.com>.
I came across this thread and wasn't able to reproduce the `expecting a KV
coder, but had Strings` error, so hopefully that's fixed now. I had to
modify the repro to add .with_outputs() at line 49 of
https://gist.github.com/egalpin/2d6ad2210cf9f66108ff48a9c7566ebc

Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Robert Bradshaw <ro...@google.com>.
As a workaround, can you try passing the use_portable_job_submission
experiment?

Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Luke Cwik <lc...@google.com>.
Sorry, I forgot that you had a minimal repro for this issue, I attached
details to the internal bug.

Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Luke Cwik <lc...@google.com>.
There is an internal bug 195053987 that matches what you're describing, but
we were unable to get a minimal repro for it. It would be useful if you had
a minimal repro for the issue so that I could update the internal bug with
details; you could also reach out to GCP support with job IDs and/or
minimal repros.

Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Evan Galpin <ev...@gmail.com>.
Thanks for the response Luke :-)

I did try setting <pcoll>.element_type on each resulting PCollection, using
"apache_beam.typehints.typehints.KV" to describe the elements, and that
passed type checking.  I also ran the full dataset (batch job) with the GBK
in question replaced by a dummy DoFn that asserted every element headed into
the GBK was a 2-tuple, along with using --runtime_type_check; all of that
runs successfully without the GBK after the TaggedOutput DoFn.

Adding back the GBK also runs end-to-end successfully on the DirectRunner
using the identical dataset.  But with the GBK in place on the
DataflowRunner (v2), I get errors as soon as the optimized step involving
the GBK enters the "running" status:

- "Could not start worker docker container"
- "Error syncing pod"
- "Check failed: pair_coder Strings" or "Check failed: kv_coder : expecting
a KV coder, but had Strings"

Anything further to try? I can also provide Job IDs from Dataflow if
helpful (and safe to share).

Thanks,
Evan


Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Luke Cwik <lc...@google.com>.
Have you tried setting the element_type[1] explicitly on each output
PCollection that is returned after applying the multi-output ParDo?
I believe you'll get a DoOutputsTuple[2] returned after applying the
multi-output ParDo, which allows access to the underlying PCollection objects.

1:
https://github.com/apache/beam/blob/ebf2aacf37b97fc85b167271f184f61f5b06ddc3/sdks/python/apache_beam/pvalue.py#L99
2:
https://github.com/apache/beam/blob/ebf2aacf37b97fc85b167271f184f61f5b06ddc3/sdks/python/apache_beam/pvalue.py#L234


Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Evan Galpin <ev...@gmail.com>.
This is badly plaguing a pipeline I'm currently developing, where the exact
same data set and code runs end-to-end on DirectRunner, but fails on
DataflowRunner with either "Check failed: kv_coder : expecting a KV coder,
but had Strings" or "Check failed: pair_coder Strings" hidden in the
harness logs. It seems to be consistently repeatable with any TaggedOutput
+ GBK afterwards.

Any advice on how to proceed?

Thanks,
Evan


Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Evan Galpin <ev...@gmail.com>.
The Dataflow error logs only showed 1 error which was:  "The job failed
because a work item has failed 4 times. Look in previous log entries for
the cause of each one of the 4 failures. For more information, see
https://cloud.google.com/dataflow/docs/guides/common-errors. The work item
was attempted on these workers: beamapp-XXXX-XXXXX-kt85-harness-8k2c Root
cause: The worker lost contact with the service."  In "Diagnostics" there
were errors stating "Error syncing pod: Could not start worker docker
container".  The harness logs i.e. "projects/my-project/logs/
dataflow.googleapis.com%2Fharness" finally contained an error that looked
suspect, which was "Check failed: kv_coder : expecting a KV coder, but had
Strings", below[1] is a link to possibly a stacktrace or extra detail, but
is internal to Google so I don't have access.

[1]
https://symbolize.corp.google.com/r/?trace=55a197abcf56,55a197abbe33,55a197abb97e,55a197abd708,55a196d4e22f,55a196d4d8d3,55a196d4da35,55a1967ec247,55a196f62b26,55a1968969b3,55a196886613,55a19696b0e6,55a196969815,55a1969693eb,55a19696916e,55a1969653bc,55a196b0150a,55a196b04e11,55a1979fc8df,7fe7736674e7,7fe7734dc22c&map=13ddc0ac8b57640c29c5016eb26ef88e:55a1956e7000-55a197bd5010,f1c96c67b57b74a4d7050f34aca016eef674f765:7fe773660000-7fe773676dac,76b955c7af655a4c1e53b8d4aaa0255f3721f95f:7fe7734a5000-7fe7736464c4


Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Robert Bradshaw <ro...@google.com>.
Huh, that's strange. Yes, the exact error on the service would be helpful.


Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Evan Galpin <ev...@gmail.com>.
Thanks for the response. I've created a gist here to demonstrate a minimal
repro: https://gist.github.com/egalpin/2d6ad2210cf9f66108ff48a9c7566ebc

It seemed to run fine both on DirectRunner and PortableRunner (embed mode),
but Dataflow v2 runner raised an error at runtime seemingly associated with
the Shuffle service?  I have job IDs and trace links if those are helpful
as well.

Thanks,
Evan


Re: [Python] Heterogeneous TaggedOutput Type Hints

Posted by Robert Bradshaw <ro...@google.com>.
This is not yet supported. Using a union for now is the way to go. (If
only the last value of the union was used, that sounds like a bug. Do
you have a minimal repro?)
