Posted to dev@beam.apache.org by John Casey via dev <de...@beam.apache.org> on 2023/01/09 16:57:44 UTC

How to write an IO guide draft

Hi All,

I spent the last few weeks of December drafting a "How to write an IO"
guide:
https://docs.google.com/document/d/1-WxZTNu9RrLhh5O7Dl5PbnKqz3e5gm1x3gDBBhszVF8/edit#

and an associated code sample: https://github.com/apache/beam/pull/24799

My goal is to make it easier for a new IO developer to create a new IO from
scratch. This is intended to complement the various standards documents
that have been floating around. Where those are intended to prescribe the
structure of an IO, this is more focused on the mechanics of internal
design.

Please take a look and let me know what you think,

John

Re: How to write an IO guide draft

Posted by Robert Burke <ro...@frantil.com>.
It's my intent this quarter to translate the document for Go. The lack of a
document like this has been the main blocker to developing these
instructions, as I'm adamant about not replicating the initial IO stumbles
that any naive author would go through.

I'm very excited about this.

On Tue, Jan 10, 2023, 8:41 AM Sachin Agarwal via dev <de...@beam.apache.org>
wrote:

> Totally agreed with that, but it's not bad as a statement of intent for
> our vision.

Re: How to write an IO guide draft

Posted by Sachin Agarwal via dev <de...@beam.apache.org>.
Totally agreed with that, but it's not bad as a statement of intent for our
vision.

On Tue, Jan 10, 2023 at 8:34 AM Alexey Romanenko <ar...@gmail.com>
wrote:

> I doubt that it will become a de facto standard behaviour for all runners
> in the short term, as long as the cross-language functionality adds
> complexity to pipeline deployment and performance overhead.
>
> Perhaps that will change in the long term, but for now my guess is that
> most Beam pipelines still use IO connectors from the same SDK as the
> pipeline itself.
>
> —
> Alexey

Re: How to write an IO guide draft

Posted by Alexey Romanenko <ar...@gmail.com>.
I doubt that it will become a de facto standard behaviour for all runners in the short term, as long as the cross-language functionality adds complexity to pipeline deployment and performance overhead.

Perhaps that will change in the long term, but for now my guess is that most Beam pipelines still use IO connectors from the same SDK as the pipeline itself.

—
Alexey

> On 10 Jan 2023, at 16:51, Sachin Agarwal via dev <de...@beam.apache.org> wrote:
> 
> I think the idea of cross-language is that an IO is implemented in only one language and other languages can use that IO. My feeling is that the question of “what language is this IO in” becomes an implementation detail that folks won’t have to care about in the longer term. Enhancements to the expansion service are needed to make that happen, but that’s my understanding of the strategy.


Re: How to write an IO guide draft

Posted by Sachin Agarwal via dev <de...@beam.apache.org>.
I think the idea of cross-language is that an IO is implemented in only one
language and other languages can use that IO. My feeling is that the
question of “what language is this IO in” becomes an implementation detail
that folks won’t have to care about in the longer term. Enhancements to the
expansion service are needed to make that happen, but that’s my
understanding of the strategy.

On Tue, Jan 10, 2023 at 7:40 AM Austin Bennett <au...@apache.org> wrote:

> This is great, thanks for putting this together!
>
> A related question: are we as a community targeting Java as the canonical
> IO language when an IO does not currently exist? If not, then I imagine we
> hope to eventually also have good examples for implementing IOs in other
> languages [not suggesting that you, John, address that, but that we add GH
> Issues, since that might be worthwhile for others to take on]?

Re: How to write an IO guide draft

Posted by Austin Bennett <au...@apache.org>.
This is great, thanks for putting this together!

A related question: are we as a community targeting Java as the canonical
IO language when an IO does not currently exist? If not, then I imagine we
hope to eventually also have good examples for implementing IOs in other
languages [not suggesting that you, John, address that, but that we add GH
Issues, since that might be worthwhile for others to take on]?




Re: How to write an IO guide draft

Posted by Herman Mak via dev <de...@beam.apache.org>.
Thanks John!

Herman Mak |  Customer Engineer, Hong Kong, Google Cloud |
hermanmak@google.com |  +852-3923-5417




