Posted to dev@beam.apache.org by Piotr Szuberski <pi...@polidea.com> on 2020/06/02 09:29:37 UTC

Re: Python Cross-language wrappers for Java IOs

Sure, I'll do that

On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com> wrote: 
> Great. Thanks for working on this. Can you please add these tasks and JIRAs
> to the cross-language transforms roadmap under "Connector/transform
> support".
> https://beam.apache.org/roadmap/connectors-multi-sdk/
> 
> Happy to help if you run into any issues during this task.
> 
> Thanks,
> Cham
> 
> On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <pi...@polidea.com>
> wrote:
> 
> > I added to Jira task of creating cross-language wrappers for Java IOs. It
> > will soon be in progress.
> >
> 

Re: Python Cross-language wrappers for Java IOs

Posted by Chamikara Jayalath <ch...@google.com>.
On Tue, Jun 16, 2020 at 2:12 PM Brian Hulette <bh...@google.com> wrote:

>
>
> On Tue, Jun 16, 2020 at 1:42 PM Piotr Szuberski <
> piotr.szuberski@polidea.com> wrote:
>
>> I published a PR with Jdbc xlang write transform with python wrapper
>> (#12023, #12022) - could you suggest someone to assign for CR?
>
> I could help review these
>

I'm happy to help as well. Thanks.


>
>> Btw. For now the standard row coder in python is very limited - it can
>> only use simple types (no bytes, datetime, boolean)  but I suppose it'll be
>> developed in the near future?
>>
>
> BEAM-7996 is tracking support for the remaining primitive types in
> Python's row coder. None of these should be that difficult to add I just
> haven't had the time myself. Boolean should be especially quick since Chad
> made it a standard coder a while ago.
> DATETIME is harder. We want it to be a logical type rather than a
> primitive type for portable schemas. I was trying to do that with the
> MillisInstant PR [1], but that raised a question about logical types and
> SQL which we still haven't resolved [2].
>
> [1] https://github.com/apache/beam/pull/11456
> [2]
> https://lists.apache.org/thread.html/r2e05355b74fb5b8149af78ade1e3539ec08371a9a4b2b9e45737e6be%40%3Cdev.beam.apache.org%3E
>
> On 2020/06/15 17:08:29, Chamikara Jayalath <ch...@google.com> wrote:
>> > Thanks. +1 for using RowCoder. We should try to use standard coders [1]
>> in
>> > the x-lang SDK boundaries.
>> > If we use other coders (for example, ProtoCoder) it may or may not work
>> > depending on how various runners implement support for x-lang.
>> >
>> > This might require slightly updating existing transforms or adding
>> > additional conversion transforms to cross-language builders.
>> >
>> > [1]
>> >
>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L669
>> >
>> > Thanks,
>> > Cham
>> >
>> > On Mon, Jun 15, 2020 at 10:00 AM Piotr Szuberski <
>> > piotr.szuberski@polidea.com> wrote:
>> >
>> > > Right now I'm working on JdbcIO and I'm using Row and Schema
>> protobuffs.
>> > > I'm figuring out how to use them properly. Thanks for the article -
>> for
>> > > sure it will be helpful!
>> > >
>> > > On 2020/06/12 20:32:16, Brian Hulette <bh...@google.com> wrote:
>> > > > Thanks! I see there are jiras for SpannerIO and JdbcIO as part of
>> that.
>> > > Are
>> > > > you planning on using row coder for them?
>> > > > If so I want to make sure you're aware of
>> > > > https://s.apache.org/beam-schema-io (sent to the dev list last week
>> > > > [1]). +Scott
>> > > > Lukas <sl...@google.com> will be working on building out the ideas
>> > > there
>> > > > this summer. His work could be useful for making these IOs
>> cross-language
>> > > > (and you would get a mapping to SQL out of it without much more
>> effort).
>> > > >
>> > > > Brian
>> > > >
>> > > > [1]
>> > > >
>> > >
>> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
>> > > >
>> > > > On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
>> > > piotr.szuberski@polidea.com>
>> > > > wrote:
>> > > >
>> > > > > Sure, I'll do that
>> > > > >
>> > > > > On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
>> > > wrote:
>> > > > > > Great. Thanks for working on this. Can you please add these
>> tasks and
>> > > > > JIRAs
>> > > > > > to the cross-language transforms roadmap under
>> "Connector/transform
>> > > > > > support".
>> > > > > > https://beam.apache.org/roadmap/connectors-multi-sdk/
>> > > > > >
>> > > > > > Happy to help if you run into any issues during this task.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Cham
>> > > > > >
>> > > > > > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
>> > > > > piotr.szuberski@polidea.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > I added to Jira task of creating cross-language wrappers for
>> Java
>> > > IOs.
>> > > > > It
>> > > > > > > will soon be in progress.
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Re: Python Cross-language wrappers for Java IOs

Posted by Brian Hulette <bh...@google.com>.
On Tue, Jun 16, 2020 at 1:42 PM Piotr Szuberski <pi...@polidea.com>
wrote:

> I published a PR with Jdbc xlang write transform with python wrapper
> (#12023, #12022) - could you suggest someone to assign for CR?

I could help review these

>
> Btw. For now the standard row coder in python is very limited - it can
> only use simple types (no bytes, datetime, boolean)  but I suppose it'll be
> developed in the near future?
>

BEAM-7996 is tracking support for the remaining primitive types in
Python's row coder. None of these should be that difficult to add; I just
haven't had the time myself. Boolean should be especially quick since Chad
made it a standard coder a while ago.
DATETIME is harder. We want it to be a logical type rather than a primitive
type for portable schemas. I was trying to do that with the MillisInstant
PR [1], but that raised a question about logical types and SQL which we
still haven't resolved [2].

[1] https://github.com/apache/beam/pull/11456
[2]
https://lists.apache.org/thread.html/r2e05355b74fb5b8149af78ade1e3539ec08371a9a4b2b9e45737e6be%40%3Cdev.beam.apache.org%3E
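
To illustrate the logical-type idea above: a logical type keeps a
language-level type (here a Python datetime) on top of a primitive
representation (here an int64 count of milliseconds since the epoch).
A minimal stdlib-only sketch, assuming millisecond precision and UTC. The
function names are made up for illustration; this is not Beam's logical
type API and not the code from the MillisInstant PR.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_millis(ts: datetime) -> int:
    """Language type -> representation (milliseconds since the epoch)."""
    if ts.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # timedelta // timedelta yields an int; sub-millisecond detail is dropped.
    return (ts - EPOCH) // timedelta(milliseconds=1)


def from_millis(millis: int) -> datetime:
    """Representation -> language type (UTC datetime)."""
    return EPOCH + timedelta(milliseconds=millis)


ts = datetime(2020, 6, 16, 13, 42, tzinfo=timezone.utc)
assert from_millis(to_millis(ts)) == ts  # round trip at millisecond precision
```

Only the int64 representation would need to be understood on both sides of
the SDK boundary; the datetime conversion stays local to each language.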

On 2020/06/15 17:08:29, Chamikara Jayalath <ch...@google.com> wrote:
> > Thanks. +1 for using RowCoder. We should try to use standard coders [1]
> in
> > the x-lang SDK boundaries.
> > If we use other coders (for example, ProtoCoder) it may or may not work
> > depending on how various runners implement support for x-lang.
> >
> > This might require slightly updating existing transforms or adding
> > additional conversion transforms to cross-language builders.
> >
> > [1]
> >
> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L669
> >
> > Thanks,
> > Cham
> >
> > On Mon, Jun 15, 2020 at 10:00 AM Piotr Szuberski <
> > piotr.szuberski@polidea.com> wrote:
> >
> > > Right now I'm working on JdbcIO and I'm using Row and Schema
> protobuffs.
> > > I'm figuring out how to use them properly. Thanks for the article - for
> > > sure it will be helpful!
> > >
> > > On 2020/06/12 20:32:16, Brian Hulette <bh...@google.com> wrote:
> > > > Thanks! I see there are jiras for SpannerIO and JdbcIO as part of
> that.
> > > Are
> > > > you planning on using row coder for them?
> > > > If so I want to make sure you're aware of
> > > > https://s.apache.org/beam-schema-io (sent to the dev list last week
> > > > [1]). +Scott
> > > > Lukas <sl...@google.com> will be working on building out the ideas
> > > there
> > > > this summer. His work could be useful for making these IOs
> cross-language
> > > > (and you would get a mapping to SQL out of it without much more
> effort).
> > > >
> > > > Brian
> > > >
> > > > [1]
> > > >
> > >
> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
> > > >
> > > > On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
> > > piotr.szuberski@polidea.com>
> > > > wrote:
> > > >
> > > > > Sure, I'll do that
> > > > >
> > > > > On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
> > > wrote:
> > > > > > Great. Thanks for working on this. Can you please add these
> tasks and
> > > > > JIRAs
> > > > > > to the cross-language transforms roadmap under
> "Connector/transform
> > > > > > support".
> > > > > > https://beam.apache.org/roadmap/connectors-multi-sdk/
> > > > > >
> > > > > > Happy to help if you run into any issues during this task.
> > > > > >
> > > > > > > Thanks,
> > > > > > Cham
> > > > > >
> > > > > > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
> > > > > piotr.szuberski@polidea.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I added to Jira task of creating cross-language wrappers for
> Java
> > > IOs.
> > > > > It
> > > > > > > will soon be in progress.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: Python Cross-language wrappers for Java IOs

Posted by Piotr Szuberski <pi...@polidea.com>.
I published PRs for the JdbcIO cross-language write transform with a Python wrapper (#12023, #12022). Could you suggest someone to assign for code review?

Btw, for now the standard row coder in Python is very limited: it can only use simple types (no bytes, datetime, or boolean). I suppose it will be developed further in the near future?

On 2020/06/15 17:08:29, Chamikara Jayalath <ch...@google.com> wrote: 
> Thanks. +1 for using RowCoder. We should try to use standard coders [1] in
> the x-lang SDK boundaries.
> If we use other coders (for example, ProtoCoder) it may or may not work
> depending on how various runners implement support for x-lang.
> 
> This might require slightly updating existing transforms or adding
> additional conversion transforms to cross-language builders.
> 
> [1]
> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L669
> 
> Thanks,
> Cham
> 
> On Mon, Jun 15, 2020 at 10:00 AM Piotr Szuberski <
> piotr.szuberski@polidea.com> wrote:
> 
> > Right now I'm working on JdbcIO and I'm using Row and Schema protobuffs.
> > I'm figuring out how to use them properly. Thanks for the article - for
> > sure it will be helpful!
> >
> > On 2020/06/12 20:32:16, Brian Hulette <bh...@google.com> wrote:
> > > Thanks! I see there are jiras for SpannerIO and JdbcIO as part of that.
> > Are
> > > you planning on using row coder for them?
> > > If so I want to make sure you're aware of
> > > https://s.apache.org/beam-schema-io (sent to the dev list last week
> > > [1]). +Scott
> > > Lukas <sl...@google.com> will be working on building out the ideas
> > there
> > > this summer. His work could be useful for making these IOs cross-language
> > > (and you would get a mapping to SQL out of it without much more effort).
> > >
> > > Brian
> > >
> > > [1]
> > >
> > https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
> > >
> > > On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
> > piotr.szuberski@polidea.com>
> > > wrote:
> > >
> > > > Sure, I'll do that
> > > >
> > > > On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
> > wrote:
> > > > > Great. Thanks for working on this. Can you please add these tasks and
> > > > JIRAs
> > > > > to the cross-language transforms roadmap under "Connector/transform
> > > > > support".
> > > > > https://beam.apache.org/roadmap/connectors-multi-sdk/
> > > > >
> > > > > Happy to help if you run into any issues during this task.
> > > > >
> > > > > > Thanks,
> > > > > Cham
> > > > >
> > > > > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
> > > > piotr.szuberski@polidea.com>
> > > > > wrote:
> > > > >
> > > > > > I added to Jira task of creating cross-language wrappers for Java
> > IOs.
> > > > It
> > > > > > will soon be in progress.
> > > > > >
> > > > >
> > > >
> > >
> >
> 

Re: Python Cross-language wrappers for Java IOs

Posted by Chamikara Jayalath <ch...@google.com>.
It may or may not work. Some runners may perform additional
optimizations/rewiring, and a runner cannot tell whether two arbitrary
coders implemented in different languages are equivalent unless they
use the same URN defined in the standard coders.
It's safe to stick to standard coders. We already have RowCoder for
encoding "arbitrary" objects across SDK boundaries.
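
As a rough illustration of the URN point (a toy sketch, not runner code):
a runner can only treat a coder as known when its URN, and the URNs of all
its component coders, come from the standard coder list. The set below is a
partial list of standard coder URNs; CoderSpec, safe_across_sdks and the
"example:coder:my_proto:v1" URN are hypothetical names used only here.

```python
from typing import NamedTuple, Tuple

# A handful of URNs from Beam's standard coder list; not exhaustive.
STANDARD_CODER_URNS = frozenset({
    "beam:coder:bytes:v1",
    "beam:coder:string_utf8:v1",
    "beam:coder:varint:v1",
    "beam:coder:double:v1",
    "beam:coder:bool:v1",
    "beam:coder:kv:v1",
    "beam:coder:iterable:v1",
    "beam:coder:row:v1",
})


class CoderSpec(NamedTuple):
    """Toy stand-in for a coder: a URN plus its component coders."""
    urn: str
    components: Tuple["CoderSpec", ...] = ()


def safe_across_sdks(coder: CoderSpec) -> bool:
    """True only if the coder and all of its components use standard URNs."""
    return coder.urn in STANDARD_CODER_URNS and all(
        safe_across_sdks(component) for component in coder.components)


row = CoderSpec("beam:coder:row:v1")
kv_of_rows = CoderSpec(
    "beam:coder:kv:v1", (CoderSpec("beam:coder:string_utf8:v1"), row))
# Hypothetical non-standard coder nested inside a standard KV coder.
custom = CoderSpec(
    "beam:coder:kv:v1", (row, CoderSpec("example:coder:my_proto:v1")))

print(safe_across_sdks(kv_of_rows))  # True
print(safe_across_sdks(custom))      # False
```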

Thanks,
Cham

On Mon, Jun 15, 2020 at 10:34 AM Boyuan Zhang <bo...@google.com> wrote:

> Thanks Cham. Standard coder is a good point. Does it mean non-standard
> coder doesn't work when crossing language boundaries even if it is
> implemented in both Java and Python sdk?
>
> On Mon, Jun 15, 2020 at 10:08 AM Chamikara Jayalath <ch...@google.com>
> wrote:
>
>> Thanks. +1 for using RowCoder. We should try to use standard coders [1]
>> in the x-lang SDK boundaries.
>> If we use other coders (for example, ProtoCoder) it may or may not work
>> depending on how various runners implement support for x-lang.
>>
>> This might require slightly updating existing transforms or adding
>> additional conversion transforms to cross-language builders.
>>
>> [1]
>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L669
>>
>> Thanks,
>> Cham
>>
>> On Mon, Jun 15, 2020 at 10:00 AM Piotr Szuberski <
>> piotr.szuberski@polidea.com> wrote:
>>
>>> Right now I'm working on JdbcIO and I'm using Row and Schema protobuffs.
>>> I'm figuring out how to use them properly. Thanks for the article - for
>>> sure it will be helpful!
>>>
>>> On 2020/06/12 20:32:16, Brian Hulette <bh...@google.com> wrote:
>>> > Thanks! I see there are jiras for SpannerIO and JdbcIO as part of
>>> that. Are
>>> > you planning on using row coder for them?
>>> > If so I want to make sure you're aware of
>>> > https://s.apache.org/beam-schema-io (sent to the dev list last week
>>> > [1]). +Scott
>>> > Lukas <sl...@google.com> will be working on building out the ideas
>>> there
>>> > this summer. His work could be useful for making these IOs
>>> cross-language
>>> > (and you would get a mapping to SQL out of it without much more
>>> effort).
>>> >
>>> > Brian
>>> >
>>> > [1]
>>> >
>>> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
>>> >
>>> > On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
>>> piotr.szuberski@polidea.com>
>>> > wrote:
>>> >
>>> > > Sure, I'll do that
>>> > >
>>> > > On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
>>> wrote:
>>> > > > Great. Thanks for working on this. Can you please add these tasks
>>> and
>>> > > JIRAs
>>> > > > to the cross-language transforms roadmap under "Connector/transform
>>> > > > support".
>>> > > > https://beam.apache.org/roadmap/connectors-multi-sdk/
>>> > > >
>>> > > > Happy to help if you run into any issues during this task.
>>> > > >
>>> > > > Thanks,
>>> > > > Cham
>>> > > >
>>> > > > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
>>> > > piotr.szuberski@polidea.com>
>>> > > > wrote:
>>> > > >
>>> > > > > I added to Jira task of creating cross-language wrappers for
>>> Java IOs.
>>> > > It
>>> > > > > will soon be in progress.
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>

Re: Python Cross-language wrappers for Java IOs

Posted by Boyuan Zhang <bo...@google.com>.
Thanks, Cham. Standard coders are a good point. Does that mean a
non-standard coder doesn't work across language boundaries even if it is
implemented in both the Java and Python SDKs?

On Mon, Jun 15, 2020 at 10:08 AM Chamikara Jayalath <ch...@google.com>
wrote:

> Thanks. +1 for using RowCoder. We should try to use standard coders [1] in
> the x-lang SDK boundaries.
> If we use other coders (for example, ProtoCoder) it may or may not work
> depending on how various runners implement support for x-lang.
>
> This might require slightly updating existing transforms or adding
> additional conversion transforms to cross-language builders.
>
> [1]
> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L669
>
> Thanks,
> Cham
>
> On Mon, Jun 15, 2020 at 10:00 AM Piotr Szuberski <
> piotr.szuberski@polidea.com> wrote:
>
>> Right now I'm working on JdbcIO and I'm using Row and Schema protobuffs.
>> I'm figuring out how to use them properly. Thanks for the article - for
>> sure it will be helpful!
>>
>> On 2020/06/12 20:32:16, Brian Hulette <bh...@google.com> wrote:
>> > Thanks! I see there are jiras for SpannerIO and JdbcIO as part of that.
>> Are
>> > you planning on using row coder for them?
>> > If so I want to make sure you're aware of
>> > https://s.apache.org/beam-schema-io (sent to the dev list last week
>> > [1]). +Scott
>> > Lukas <sl...@google.com> will be working on building out the ideas
>> there
>> > this summer. His work could be useful for making these IOs
>> cross-language
>> > (and you would get a mapping to SQL out of it without much more effort).
>> >
>> > Brian
>> >
>> > [1]
>> >
>> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
>> >
>> > On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
>> piotr.szuberski@polidea.com>
>> > wrote:
>> >
>> > > Sure, I'll do that
>> > >
>> > > On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
>> wrote:
>> > > > Great. Thanks for working on this. Can you please add these tasks
>> and
>> > > JIRAs
>> > > > to the cross-language transforms roadmap under "Connector/transform
>> > > > support".
>> > > > https://beam.apache.org/roadmap/connectors-multi-sdk/
>> > > >
>> > > > Happy to help if you run into any issues during this task.
>> > > >
>> > > > Thanks,
>> > > > Cham
>> > > >
>> > > > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
>> > > piotr.szuberski@polidea.com>
>> > > > wrote:
>> > > >
>> > > > > I added to Jira task of creating cross-language wrappers for Java
>> IOs.
>> > > It
>> > > > > will soon be in progress.
>> > > > >
>> > > >
>> > >
>> >
>>
>

Re: Python Cross-language wrappers for Java IOs

Posted by Chamikara Jayalath <ch...@google.com>.
Thanks. +1 for using RowCoder. We should try to use standard coders [1] at
the x-lang SDK boundaries.
If we use other coders (for example, ProtoCoder) it may or may not work
depending on how various runners implement support for x-lang.

This might require slightly updating existing transforms or adding
additional conversion transforms to cross-language builders.

[1]
https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L669
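
A "conversion transform" in this sense could be as small as mapping a
domain object to a flat record of simple fields before the SDK boundary and
back again after it. A minimal sketch under that assumption; Money, MoneyRow,
to_row and from_row are hypothetical names, and a real wrapper would apply
such functions inside a pipeline step rather than call them directly.

```python
from typing import NamedTuple


class Money:
    """A hypothetical domain type that has no standard coder."""

    def __init__(self, currency: str, cents: int) -> None:
        self.currency = currency
        self.cents = cents

    def __eq__(self, other: object) -> bool:
        return (isinstance(other, Money)
                and (self.currency, self.cents) == (other.currency, other.cents))


class MoneyRow(NamedTuple):
    """Flat record using only simple field types (str, int)."""
    currency: str
    cents: int


def to_row(money: Money) -> MoneyRow:
    """Convert before crossing the language boundary."""
    return MoneyRow(currency=money.currency, cents=money.cents)


def from_row(row: MoneyRow) -> Money:
    """Convert back on the other side."""
    return Money(row.currency, row.cents)


price = Money("PLN", 1299)
assert from_row(to_row(price)) == price
```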

Thanks,
Cham

On Mon, Jun 15, 2020 at 10:00 AM Piotr Szuberski <
piotr.szuberski@polidea.com> wrote:

> Right now I'm working on JdbcIO and I'm using Row and Schema protobuffs.
> I'm figuring out how to use them properly. Thanks for the article - for
> sure it will be helpful!
>
> On 2020/06/12 20:32:16, Brian Hulette <bh...@google.com> wrote:
> > Thanks! I see there are jiras for SpannerIO and JdbcIO as part of that.
> Are
> > you planning on using row coder for them?
> > If so I want to make sure you're aware of
> > https://s.apache.org/beam-schema-io (sent to the dev list last week
> > [1]). +Scott
> > Lukas <sl...@google.com> will be working on building out the ideas
> there
> > this summer. His work could be useful for making these IOs cross-language
> > (and you would get a mapping to SQL out of it without much more effort).
> >
> > Brian
> >
> > [1]
> >
> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
> >
> > On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
> piotr.szuberski@polidea.com>
> > wrote:
> >
> > > Sure, I'll do that
> > >
> > > On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
> wrote:
> > > > Great. Thanks for working on this. Can you please add these tasks and
> > > JIRAs
> > > > to the cross-language transforms roadmap under "Connector/transform
> > > > support".
> > > > https://beam.apache.org/roadmap/connectors-multi-sdk/
> > > >
> > > > Happy to help if you run into any issues during this task.
> > > >
> > > > Thanks,
> > > > Cham
> > > >
> > > > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
> > > piotr.szuberski@polidea.com>
> > > > wrote:
> > > >
> > > > > I added to Jira task of creating cross-language wrappers for Java
> IOs.
> > > It
> > > > > will soon be in progress.
> > > > >
> > > >
> > >
> >
>

Re: Python Cross-language wrappers for Java IOs

Posted by Piotr Szuberski <pi...@polidea.com>.
Right now I'm working on JdbcIO and I'm using the Row and Schema protobufs. I'm figuring out how to use them properly. Thanks for the article - for sure it will be helpful!

On 2020/06/12 20:32:16, Brian Hulette <bh...@google.com> wrote: 
> Thanks! I see there are jiras for SpannerIO and JdbcIO as part of that. Are
> you planning on using row coder for them?
> If so I want to make sure you're aware of
> https://s.apache.org/beam-schema-io (sent to the dev list last week
> [1]). +Scott
> Lukas <sl...@google.com> will be working on building out the ideas there
> this summer. His work could be useful for making these IOs cross-language
> (and you would get a mapping to SQL out of it without much more effort).
> 
> Brian
> 
> [1]
> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
> 
> On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <pi...@polidea.com>
> wrote:
> 
> > Sure, I'll do that
> >
> > On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com> wrote:
> > > Great. Thanks for working on this. Can you please add these tasks and
> > JIRAs
> > > to the cross-language transforms roadmap under "Connector/transform
> > > support".
> > > https://beam.apache.org/roadmap/connectors-multi-sdk/
> > >
> > > Happy to help if you run into any issues during this task.
> > >
> > > Thanks,
> > > Cham
> > >
> > > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
> > piotr.szuberski@polidea.com>
> > > wrote:
> > >
> > > > I added to Jira task of creating cross-language wrappers for Java IOs.
> > It
> > > > will soon be in progress.
> > > >
> > >
> >
> 

Re: Python Cross-language wrappers for Java IOs

Posted by Boyuan Zhang <bo...@google.com>.
The change should be a schema change, mostly adding new fields.
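
For context, the row coder behaviour Brian describes in the quoted message
below (a leading field count, extra trailing fields ignored, missing trailing
fields filled with nulls) could be sketched roughly like this. It is a toy,
stdlib-only illustration: encode_row and decode_row are made-up names, and
the list-based "wire format" is an assumption, not Beam's actual encoding.

```python
from typing import Any, List, Optional, Sequence


def encode_row(values: Sequence[Any]) -> List[Any]:
    """Toy encoding: a leading field count followed by the values."""
    return [len(values), *values]


def decode_row(encoded: Sequence[Any],
               expected_fields: int) -> List[Optional[Any]]:
    """Decode against a schema that expects `expected_fields` fields."""
    count, values = encoded[0], list(encoded[1:])
    if len(values) != count:
        raise ValueError("field count does not match payload")
    if count >= expected_fields:
        # The writer had extra trailing fields: ignore them.
        return values[:expected_fields]
    # The writer had fewer fields: fill the missing trailing ones with None.
    return values + [None] * (expected_fields - count)


old_row = encode_row(["alice", 30])          # written with a 2-field schema
new_row = encode_row(["bob", 41, "warsaw"])  # written after a field was added

print(decode_row(old_row, expected_fields=3))  # ['alice', 30, None]
print(decode_row(new_row, expected_fields=2))  # ['bob', 41]
```

Note that this only covers fields added or removed at the end of the
schema, which matches the restriction mentioned in the quoted text.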

On Mon, Jun 15, 2020 at 11:32 AM Brian Hulette <bh...@google.com> wrote:

>
>
> On Mon, Jun 15, 2020 at 11:12 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Jun 12, 2020 at 4:12 PM Brian Hulette <bh...@google.com>
>> wrote:
>>
>>> > are unknown fields propagated through if the user only reads/modifies
>>> a row?
>>> I'm not sure I understand this question. Are you asking about handling
>>> schema changes?
>>> The wire format includes the number of fields in the schema,
>>> specifically so that we can detect when the schema changes. This is
>>> restricted to added or removed fields at the end of the schema. i.e. if we
>>> receive an element that says it has N more fields than the schema this
>>> coder was created with we assume the pipeline was updated with a schema
>>> that drops the last N fields and ignore the extra fields. Similarly if we
>>> receive an element with N fewer fields than we expect we'll just fill the
>>> last N fields with nulls.
>>> This logic is implemented in Python [1] and Java [2], but it's not
>>> exercised since no runners actually support pipeline update with schema
>>> changes.
>>>
>>> > how does it work in a pipeline update scenario (downgrade / upgrade)?
>>> It's a standard coder with a defined spec [3] and tests in
>>> standard_coders.yaml [4] (although we could certainly use more coverage
>>> there) so I think pipeline update should work fine, unless I'm missing
>>> something.
>>>
>>
>> The big question is whether the pipeline update will be rejected due to
>> the Coder having "changed."
>>
>>
>
> Do you mean changed because the schema has changed, or due to the vagaries
> of Java serialization?
>
>
>> Brian
>>>
>>> [1]
>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L177-L189
>>> [2]
>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java#L341-L356
>>> [3]
>>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L833-L864
>>> [4]
>>> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L344-L364
>>>
>>> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> +Boyuan Zhang <bo...@google.com>
>>>>
>>>> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>>>>
>>>>> What is the update / compat story around schemas?
>>>>> * are unknown fields propagated through if the user only
>>>>> reads/modifies a row?
>>>>> * how does it work in a pipeline update scenario (downgrade / upgrade)?
>>>>>
>>>>> Boyuan has been working on a Kafka via SDF source and have been trying
>>>>> to figure out which interchange format to use for the "source descriptors"
>>>>> that feed into the SDF. Some obvious choices are json, avro, proto, and
>>>>> Beam schemas all with their caveats.
>>>>>
>>>>> On Fri, Jun 12, 2020 at 1:32 PM Brian Hulette <bh...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks! I see there are jiras for SpannerIO and JdbcIO as part of
>>>>>> that. Are you planning on using row coder for them?
>>>>>> If so I want to make sure you're aware of
>>>>>> https://s.apache.org/beam-schema-io (sent to the dev list last week
>>>>>> [1]). +Scott Lukas <sl...@google.com> will be working on building
>>>>>> out the ideas there this summer. His work could be useful for making these
>>>>>> IOs cross-language (and you would get a mapping to SQL out of it without
>>>>>> much more effort).
>>>>>>
>>>>>> Brian
>>>>>>
>>>>>> [1]
>>>>>> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
>>>>>>
>>>>>> On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
>>>>>> piotr.szuberski@polidea.com> wrote:
>>>>>>
>>>>>>> Sure, I'll do that
>>>>>>>
>>>>>>> On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
>>>>>>> wrote:
>>>>>>> > Great. Thanks for working on this. Can you please add these tasks
>>>>>>> and JIRAs
>>>>>>> > to the cross-language transforms roadmap under "Connector/transform
>>>>>>> > support".
>>>>>>> > https://beam.apache.org/roadmap/connectors-multi-sdk/
>>>>>>> >
>>>>>>> > Happy to help if you run into any issues during this task.
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > Cham
>>>>>>> >
>>>>>>> > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
>>>>>>> piotr.szuberski@polidea.com>
>>>>>>> > wrote:
>>>>>>> >
>>>>>>> > > I added to Jira task of creating cross-language wrappers for
>>>>>>> Java IOs. It
>>>>>>> > > will soon be in progress.
>>>>>>> > >
>>>>>>> >
>>>>>>>
>>>>>>

Re: Python Cross-language wrappers for Java IOs

Posted by Brian Hulette <bh...@google.com>.
On Mon, Jun 15, 2020 at 11:12 AM Robert Bradshaw <ro...@google.com>
wrote:

> On Fri, Jun 12, 2020 at 4:12 PM Brian Hulette <bh...@google.com> wrote:
>
>> > are unknown fields propagated through if the user only reads/modifies a
>> row?
>> I'm not sure I understand this question. Are you asking about handling
>> schema changes?
>> The wire format includes the number of fields in the schema, specifically
>> so that we can detect when the schema changes. This is restricted to added
>> or removed fields at the end of the schema. i.e. if we receive an element
>> that says it has N more fields than the schema this coder was created with
>> we assume the pipeline was updated with a schema that drops the last N
>> fields and ignore the extra fields. Similarly if we receive an element with
>> N fewer fields than we expect we'll just fill the last N fields with nulls.
>> This logic is implemented in Python [1] and Java [2], but it's not
>> exercised since no runners actually support pipeline update with schema
>> changes.
>>
>> > how does it work in a pipeline update scenario (downgrade / upgrade)?
>> It's a standard coder with a defined spec [3] and tests in
>> standard_coders.yaml [4] (although we could certainly use more coverage
>> there) so I think pipeline update should work fine, unless I'm missing
>> something.
>>
>
> The big question is whether the pipeline update will be rejected due to
> the Coder having "changed."
>
>

Do you mean changed because the schema has changed, or due to the vagaries
of Java serialization?


> Brian
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L177-L189
>> [2]
>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java#L341-L356
>> [3]
>> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L833-L864
>> [4]
>> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L344-L364
>>
>> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>>
>>> +Boyuan Zhang <bo...@google.com>
>>>
>>> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>>>
>>>> What is the update / compat story around schemas?
>>>> * are unknown fields propagated through if the user only reads/modifies
>>>> a row?
>>>> * how does it work in a pipeline update scenario (downgrade / upgrade)?
>>>>
>>>> Boyuan has been working on a Kafka via SDF source and have been trying
>>>> to figure out which interchange format to use for the "source descriptors"
>>>> that feed into the SDF. Some obvious choices are json, avro, proto, and
>>>> Beam schemas all with their caveats.
>>>>
>>>> On Fri, Jun 12, 2020 at 1:32 PM Brian Hulette <bh...@google.com>
>>>> wrote:
>>>>
>>>>> Thanks! I see there are jiras for SpannerIO and JdbcIO as part of
>>>>> that. Are you planning on using row coder for them?
>>>>> If so I want to make sure you're aware of
>>>>> https://s.apache.org/beam-schema-io (sent to the dev list last week
>>>>> [1]). +Scott Lukas <sl...@google.com> will be working on building
>>>>> out the ideas there this summer. His work could be useful for making these
>>>>> IOs cross-language (and you would get a mapping to SQL out of it without
>>>>> much more effort).
>>>>>
>>>>> Brian
>>>>>
>>>>> [1]
>>>>> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E
>>>>>
>>>>> On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <
>>>>> piotr.szuberski@polidea.com> wrote:
>>>>>
>>>>>> Sure, I'll do that
>>>>>>
>>>>>> On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com>
>>>>>> wrote:
>>>>>> > Great. Thanks for working on this. Can you please add these tasks
>>>>>> and JIRAs
>>>>>> > to the cross-language transforms roadmap under "Connector/transform
>>>>>> > support".
>>>>>> > https://beam.apache.org/roadmap/connectors-multi-sdk/
>>>>>> >
>>>>>> > Happy to help if you run into any issues during this task.
>>>>>> >
>>>>>> > Thanks,
>>>>>> > Cham
>>>>>> >
>>>>>> > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <
>>>>>> piotr.szuberski@polidea.com>
>>>>>> > wrote:
>>>>>> >
>>>>>> > > I added to Jira task of creating cross-language wrappers for Java
>>>>>> IOs. It
>>>>>> > > will soon be in progress.
>>>>>> > >
>>>>>> >
>>>>>>
>>>>>

Re: Python Cross-language wrappers for Java IOs

Posted by Robert Bradshaw <ro...@google.com>.
On Fri, Jun 12, 2020 at 4:12 PM Brian Hulette <bh...@google.com> wrote:

> > are unknown fields propagated through if the user only reads/modifies a
> row?
> I'm not sure I understand this question. Are you asking about handling
> schema changes?
> The wire format includes the number of fields in the schema, specifically
> so that we can detect when the schema changes. This is restricted to added
> or removed fields at the end of the schema. i.e. if we receive an element
> that says it has N more fields than the schema this coder was created with
> we assume the pipeline was updated with a schema that drops the last N
> fields and ignore the extra fields. Similarly if we receive an element with
> N fewer fields than we expect we'll just fill the last N fields with nulls.
> This logic is implemented in Python [1] and Java [2], but it's not
> exercised since no runners actually support pipeline update with schema
> changes.
>
> > how does it work in a pipeline update scenario (downgrade / upgrade)?
> It's a standard coder with a defined spec [3] and tests in
> standard_coders.yaml [4] (although we could certainly use more coverage
> there) so I think pipeline update should work fine, unless I'm missing
> something.
>

The big question is whether the pipeline update will be rejected due to the
Coder having "changed."


> Brian
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L177-L189
> [2]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java#L341-L356
> [3]
> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L833-L864
> [4]
> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L344-L364
>

Re: Python Cross-language wrappers for Java IOs

Posted by Boyuan Zhang <bo...@google.com>.
Thanks Brian and Luke!

I'm curious whether Schema supports optional fields the way protobuf does. In
my use case, most of the fields will be optional, and my application only
accesses a field when its value is present. Also, it seems that if I
want to use Schema to transfer data across SDKs, I need to define a Schema
in Java and a NamedTuple in Python, right?
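
For illustration, the Python side of such a type might look like the sketch
below. The class and field names (SourceDescriptor, topic, partition,
start_offset) are hypothetical and not taken from any Beam IO. The mapping of
typing.Optional onto nullable schema fields, and the RowCoder registration
mentioned in the comment, reflect my understanding of the Python SDK and
should be checked against the current code.

```python
from typing import NamedTuple, Optional


class SourceDescriptor(NamedTuple):
    """Hypothetical row type; the field names are illustrative only."""
    topic: str
    # Optional fields are the ones expected to map to nullable schema fields.
    partition: Optional[int] = None
    start_offset: Optional[int] = None


def describe(d: SourceDescriptor) -> str:
    """Build a description, touching optional fields only when they are set."""
    parts = [d.topic]
    if d.partition is not None:
        parts.append(f"partition={d.partition}")
    if d.start_offset is not None:
        parts.append(f"start_offset={d.start_offset}")
    return " ".join(parts)


# Assumption: in a real pipeline the type would additionally be registered
# with Beam's row coder, for example:
#   beam.coders.registry.register_coder(SourceDescriptor, beam.coders.RowCoder)
# That step is left as a comment so the sketch runs without apache_beam.
```

The application code checks each optional field against None before using it,
which is the access pattern described above.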

On Fri, Jun 12, 2020 at 4:11 PM Brian Hulette <bh...@google.com> wrote:

> > are unknown fields propagated through if the user only reads/modifies a
> row?
> I'm not sure I understand this question. Are you asking about handling
> schema changes?
> The wire format includes the number of fields in the schema, specifically
> so that we can detect when the schema changes. This is restricted to added
> or removed fields at the end of the schema. i.e. if we receive an element
> that says it has N more fields than the schema this coder was created with
> we assume the pipeline was updated with a schema that drops the last N
> fields and ignore the extra fields. Similarly if we receive an element with
> N fewer fields than we expect we'll just fill the last N fields with nulls.
> This logic is implemented in Python [1] and Java [2], but it's not
> exercised since no runners actually support pipeline update with schema
> changes.
>
> > how does it work in a pipeline update scenario (downgrade / upgrade)?
> It's a standard coder with a defined spec [3] and tests in
> standard_coders.yaml [4] (although we could certainly use more coverage
> there) so I think pipeline update should work fine, unless I'm missing
> something.
>
> Brian
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L177-L189
> [2]
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java#L341-L356
> [3]
> https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L833-L864
> [4]
> https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L344-L364
>

Re: Python Cross-language wrappers for Java IOs

Posted by Brian Hulette <bh...@google.com>.
> are unknown fields propagated through if the user only reads/modifies a
> row?
I'm not sure I understand this question. Are you asking about handling
schema changes?
The wire format includes the number of fields in the schema, specifically
so that we can detect when the schema changes. This is restricted to fields
added or removed at the end of the schema. That is, if we receive an element
that says it has N more fields than the schema this coder was created with,
we assume the pipeline was updated with a schema that drops the last N
fields, and we ignore the extra fields. Similarly, if we receive an element
with N fewer fields than we expect, we just fill the last N fields with nulls.
This logic is implemented in Python [1] and Java [2], but it's not
exercised, since no runners actually support pipeline update with schema
changes.
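
A minimal sketch of that trailing-field rule, assuming the row has already
been decoded into a Python list. This is not the code at [1] or [2]; the
function name reconcile_fields is hypothetical, and the real coders work on
the encoded byte stream rather than on a list of values.

```python
from typing import Any, List, Optional


def reconcile_fields(decoded: List[Any],
                     expected_count: int) -> List[Optional[Any]]:
    """Align a decoded row with the field count this coder was created with.

    The number of fields written by the encoder is taken to be len(decoded).
    """
    encoded_count = len(decoded)
    if encoded_count > expected_count:
        # The element carries extra trailing fields: assume the schema was
        # updated to drop them, and ignore the extras.
        return decoded[:expected_count]
    # The element carries fewer fields than expected (or the same number):
    # fill the missing trailing fields with nulls.
    return decoded + [None] * (expected_count - encoded_count)
```

Only changes at the end of the schema are handled; a field added or removed
in the middle would shift every later value into the wrong position.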

> how does it work in a pipeline update scenario (downgrade / upgrade)?
It's a standard coder with a defined spec [3] and tests in
standard_coders.yaml [4] (although we could certainly use more coverage
there) so I think pipeline update should work fine, unless I'm missing
something.

Brian

[1]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/coders/row_coder.py#L177-L189
[2]
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java#L341-L356
[3]
https://github.com/apache/beam/blob/master/model/pipeline/src/main/proto/beam_runner_api.proto#L833-L864
[4]
https://github.com/apache/beam/blob/master/model/fn-execution/src/main/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml#L344-L364

On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:

> +Boyuan Zhang <bo...@google.com>
>
> On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:
>
>> What is the update / compat story around schemas?
>> * are unknown fields propagated through if the user only reads/modifies a
>> row?
>> * how does it work in a pipeline update scenario (downgrade / upgrade)?
>>
>> Boyuan has been working on a Kafka via SDF source and have been trying to
>> figure out which interchange format to use for the "source descriptors"
>> that feed into the SDF. Some obvious choices are json, avro, proto, and
>> Beam schemas all with their caveats.

Re: Python Cross-language wrappers for Java IOs

Posted by Luke Cwik <lc...@google.com>.
+Boyuan Zhang <bo...@google.com>

On Fri, Jun 12, 2020 at 3:32 PM Luke Cwik <lc...@google.com> wrote:

> What is the update / compat story around schemas?
> * are unknown fields propagated through if the user only reads/modifies a
> row?
> * how does it work in a pipeline update scenario (downgrade / upgrade)?
>
> Boyuan has been working on a Kafka via SDF source and have been trying to
> figure out which interchange format to use for the "source descriptors"
> that feed into the SDF. Some obvious choices are json, avro, proto, and
> Beam schemas all with their caveats.

Re: Python Cross-language wrappers for Java IOs

Posted by Luke Cwik <lc...@google.com>.
What is the update / compat story around schemas?
* are unknown fields propagated through if the user only reads/modifies a
row?
* how does it work in a pipeline update scenario (downgrade / upgrade)?

Boyuan has been working on a Kafka via SDF source and has been trying to
figure out which interchange format to use for the "source descriptors"
that feed into the SDF. Some obvious choices are JSON, Avro, proto, and
Beam schemas, all with their caveats.

On Fri, Jun 12, 2020 at 1:32 PM Brian Hulette <bh...@google.com> wrote:

> Thanks! I see there are jiras for SpannerIO and JdbcIO as part of that.
> Are you planning on using row coder for them?
> If so I want to make sure you're aware of
> https://s.apache.org/beam-schema-io (sent to the dev list last week [1]). +Scott
> Lukas <sl...@google.com> will be working on building out the ideas there
> this summer. His work could be useful for making these IOs cross-language
> (and you would get a mapping to SQL out of it without much more effort).
>
> Brian
>
> [1]
> https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E

Re: Python Cross-language wrappers for Java IOs

Posted by Brian Hulette <bh...@google.com>.
Thanks! I see there are JIRAs for SpannerIO and JdbcIO as part of that. Are
you planning on using row coder for them?
If so, I want to make sure you're aware of
https://s.apache.org/beam-schema-io (sent to the dev list last week [1]).
+Scott Lukas <sl...@google.com> will be working on building out the ideas
there this summer. His work could be useful for making these IOs cross-language
(and you would get a mapping to SQL out of it without much more effort).

Brian

[1]
https://lists.apache.org/thread.html/rc1695025d41c5dc38cdf7bc32bea0e7421379b1c543c2d82f69aa179%40%3Cdev.beam.apache.org%3E

On Tue, Jun 2, 2020 at 9:30 AM Piotr Szuberski <pi...@polidea.com>
wrote:

> Sure, I'll do that
>
> On 2020/05/28 17:54:49, Chamikara Jayalath <ch...@google.com> wrote:
> > Great. Thanks for working on this. Can you please add these tasks and JIRAs
> > to the cross-language transforms roadmap under "Connector/transform
> > support".
> > https://beam.apache.org/roadmap/connectors-multi-sdk/
> >
> > Happy to help if you run into any issues during this task.
> >
> > Thanks,
> > Cham
> >
> > On Thu, May 28, 2020 at 9:59 AM Piotr Szuberski <piotr.szuberski@polidea.com>
> > wrote:
> >
> > > I added to Jira task of creating cross-language wrappers for Java IOs.
> > > It will soon be in progress.
> > >
> >
>