Posted to user@beam.apache.org by Brian Hulette <bh...@google.com> on 2021/03/01 23:04:42 UTC

Re: Potential bug with BEAM-11460?

I think this is really just a case of an unanticipated use-case. What you
did in https://github.com/apache/beam/pull/13616 is reasonable: you tested
the paths that you thought users would need.
This situation is just a little different because the solution was to add a
"withCoder" method, which was inadvertently mentioned in your error message
due to a copy-paste error.

Brian

On Sun, Feb 28, 2021 at 5:28 AM Anant Damle <an...@google.com> wrote:

> Thanks Tao!
> @Brian Hulette <bh...@google.com> & @Tao Li: I would be curious to
> learn how one can test these kinds of PRs, where one is only providing
> some plumbing.
>
> On Sat, Feb 27, 2021 at 1:05 AM Tao Li <ta...@zillow.com> wrote:
>
>> Thanks @Anant Damle <an...@google.com> for fixing the issue with
>> BEAM-11460 and BEAM-11527 so quickly!
>>
>>
>>
>> *From: *Anant Damle <an...@google.com>
>> *Date: *Friday, February 26, 2021 at 6:49 AM
>> *To: *Tao Li <ta...@zillow.com>
>> *Cc: *"user@beam.apache.org" <us...@beam.apache.org>, Brian Hulette <
>> bhulette@google.com>
>> *Subject: *Re: Potential bug with BEAM-11460?
>>
>>
>>
>> @Tao Li, I have added the unit test for your use-case in this commit:
>> https://github.com/apache/beam/pull/14078/commits/f5459bb3533194de48712229957a555ef79f17ef
>>
>>
>>
>> On Fri, Feb 26, 2021 at 10:13 PM Anant Damle <an...@google.com> wrote:
>>
>> Thanks Tao,
>>
>> Let me try and put this as a test-case.
>>
>> I am also looking into BEAM-11527
>> <https://issues.apache.org/jira/browse/BEAM-11527>.
>>
>>
>>
>> Thanks,
>>
>> Anant
>>
>>
>>
>> On Fri, Feb 26, 2021 at 9:30 AM Tao Li <ta...@zillow.com> wrote:
>>
>> @Brian Hulette <bh...@google.com> I think the main issue I am trying
>> to report is that I see this error message: "Specify it explicitly using
>> withCoder()." But I did not find a withCoder() API available from
>> ParquetIO. So maybe we need to add that method.
>>
>> Getting back to your ask, here is roughly the code I was running. Hope
>> this helps.
>>
>> PCollection<Row> inputDataTest =
>>     pipeline.apply(
>>         ParquetIO.parseGenericRecords(
>>                 new SerializableFunction<GenericRecord, Row>() {
>>                     public Row apply(GenericRecord record) {
>>                         return AvroUtils.toBeamRowStrict(record, null);
>>                     }
>>                 })
>>             .from(path));
>>
>> *From: *Brian Hulette <bh...@google.com>
>> *Reply-To: *"user@beam.apache.org" <us...@beam.apache.org>
>> *Date: *Thursday, February 25, 2021 at 3:11 PM
>> *To: *Anant Damle <an...@google.com>
>> *Cc: *user <us...@beam.apache.org>
>> *Subject: *Re: Potential bug with BEAM-11460?
>>
>>
>>
>> Hi Tao,
>> Thanks for reporting this! Could you share more details about your
>> use-case? Anant mentioned that he's having trouble coming up with a test
>> case where inferCoder doesn't work [1].
>>
>>
>>
>> Brian
>>
>> [1] https://github.com/apache/beam/pull/14078#issuecomment-786293576
>>
>>
>>
>> On Wed, Feb 24, 2021 at 6:49 PM Anant Damle <an...@google.com> wrote:
>>
>> Hi Brian,
>>
>> I think you are right. I created BEAM-11861
>> <https://issues.apache.org/jira/browse/BEAM-11861> and will send a PR
>> today.
>>
>> The present workaround is to call .setCoder directly on the output
>> PCollection.
>>
>>
>>
>> On Thu, Feb 25, 2021 at 5:25 AM Brian Hulette <bh...@google.com>
>> wrote:
>>
>> +Anant Damle <an...@google.com> is this an oversight in
>> https://github.com/apache/beam/pull/13616?
>> What would be the right way to fix this?
>>
>>
>>
>> On Tue, Feb 23, 2021 at 5:24 PM Tao Li <ta...@zillow.com> wrote:
>>
>> Hi Beam community,
>>
>>
>>
>> I cannot log into the Beam Jira, so I am asking this question here. I am
>> testing this new feature from Beam 2.28 and see the error below:
>>
>>
>>
>> Exception in thread "main" java.lang.IllegalArgumentException: Unable to infer coder for output of parseFn. Specify it explicitly using withCoder().
>>         at org.apache.beam.sdk.io.parquet.ParquetIO$ParseFiles.inferCoder(ParquetIO.java:554)
>>         at org.apache.beam.sdk.io.parquet.ParquetIO$ParseFiles.expand(ParquetIO.java:521)
>>         at org.apache.beam.sdk.io.parquet.ParquetIO$ParseFiles.expand(ParquetIO.java:483)
>>         at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:547)
>>
>>
>>
>> However, the ParquetIO builder does not have this withCoder() method. I
>> think this error message is mimicking the one in AvroIO:
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java#L1010
>>
>>
>>
>> Should we add this method to ParquetIO? Thanks!
>>
>>