Posted to user@beam.apache.org by Abdelhakim Bendjabeur <ab...@gorgias.com> on 2022/03/28 13:50:13 UTC

Support null values in kafkaIO

Hello,

I am trying to build a pipeline using Beam's Python SDK to run on Dataflow,
and I encountered an error when encoding a null-value message coming from
Kafka (a tombstone message):

```
Caused by: org.apache.beam.sdk.coders.CoderException: cannot encode a null byte[]
```

Null values seem to be unsupported at the moment, judging by the source here
<https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/ByteArrayCoder.java#L63>

Is there a workaround for this, either to avoid the error each time a
null-value message arrives or to skip these events entirely?

A Jira ticket that might be related is here
<https://issues.apache.org/jira/browse/BEAM-10529>
and the Slack message is here
<https://the-asf.slack.com/archives/C9H0YNP3P/p1648226359872819>

Kind regards,
Abdelhakim Bendjabeur
Data Engineer @gorgias
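
For readers following the thread: the exception above is raised by a byte-array coder that rejects null payloads. The snippet below is a minimal sketch in plain Python, not Beam's code and not a workaround for KafkaIO. It assumes a common technique for nullable values, a one-byte presence marker in front of the payload, and the names `encode_bytes`, `encode_nullable` and `decode_nullable` are hypothetical. The thread does not say whether the fix tracked in BEAM-10529 takes this shape.

```python
from typing import Optional


def encode_bytes(value: Optional[bytes]) -> bytes:
    """Strict encoder: like the coder in the stack trace, it rejects None."""
    if value is None:
        raise ValueError("cannot encode a null byte[]")
    return value


def encode_nullable(value: Optional[bytes]) -> bytes:
    """Prefix a one-byte presence marker so that None can be represented."""
    if value is None:
        return b"\x00"  # marker 0: no payload (e.g. a tombstone value)
    return b"\x01" + encode_bytes(value)  # marker 1: payload follows


def decode_nullable(data: bytes) -> Optional[bytes]:
    """Invert encode_nullable."""
    if not data:
        raise ValueError("missing presence marker")
    marker, payload = data[0], data[1:]
    if marker == 0:
        return None
    if marker == 1:
        return payload
    raise ValueError(f"unexpected presence marker: {marker}")
```

With the marker in place, a tombstone (`None`) and an empty payload (`b""`) stay distinguishable after a round trip, which a bare byte-array encoding cannot guarantee.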

Re: Support null values in kafkaIO

Posted by Alexey Romanenko <ar...@gmail.com>.
Thank you for working on this, John! Support for null keys/values seems to be in high demand.

—
Alexey

> On 28 Mar 2022, at 21:58, John Casey <jo...@google.com> wrote:
> 
> Unfortunately, there isn't a workaround at the moment. I'm nearing completion of the fix.
> 
> https://issues.apache.org/jira/projects/BEAM/issues/BEAM-10529
> 
> John
> 
> On Mon, Mar 28, 2022 at 12:21 PM Brian Hulette <bhulette@google.com> wrote:
> Hi Abdelhakim,
> 
> +John Casey is working on a fix [1] for BEAM-10529 now. I'm not aware of a workaround but maybe John knows of one.
> 
> Brian
> 
> [1] https://github.com/apache/beam/pull/16923

Re: Support null values in kafkaIO

Posted by John Casey <jo...@google.com>.
Unfortunately, there isn't a workaround at the moment. I'm nearing
completion of the fix.

https://issues.apache.org/jira/projects/BEAM/issues/BEAM-10529

John

On Mon, Mar 28, 2022 at 12:21 PM Brian Hulette <bh...@google.com> wrote:

> Hi Abdelhakim,
>
> +John Casey <jo...@google.com> is working on a fix [1] for
> BEAM-10529 now. I'm not aware of a workaround but maybe John knows of one.
>
> Brian
>
> [1] https://github.com/apache/beam/pull/16923
>

Re: Support null values in kafkaIO

Posted by Brian Hulette <bh...@google.com>.
Hi Abdelhakim,

+John Casey <jo...@google.com> is working on a fix [1] for BEAM-10529
now. I'm not aware of a workaround but maybe John knows of one.

Brian

[1] https://github.com/apache/beam/pull/16923
