Posted to dev@beam.apache.org by Eugene Kirpichov <ki...@google.com.INVALID> on 2017/05/03 06:54:33 UTC

Re: Let's make Beam transforms comply with PTransform Style Guide

Hey all,

The effort is complete: all transforms have been brought into accordance with
the style guide and the JIRAs are closed!

In nearly all cases the fixes introduced small but backward-incompatible
changes, but always with a simple migration path, and I believe the Beam
API surface is overall much better for it.

For example, there are no more IOs that use Coders as their primary way of
interpreting binary data; no more ugly Bound/Unbound classes; no more IOs
exposing their Source or Sink API directly (instead of packaging as
PTransform); the code is cleaner and shorter (due to AutoValue and a more
principled distinction between factory methods and builder methods) and
there are a lot more canonical examples of how to write transforms for
future authors, now that every transform shipped with the SDK is a
canonical example :)
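
A minimal sketch of the factory-method vs. builder-method distinction
mentioned above, written in plain Java so it runs without Beam or AutoValue
on the classpath. All names here (ExampleIO, Read, from, withBatchSize) are
hypothetical and only illustrate the shape of the pattern; they are not taken
from any real Beam transform.

```java
import java.util.Objects;

// Hypothetical, Beam-free illustration. In an actual SDK transform the
// immutable Read class would typically be generated with AutoValue and would
// extend PTransform; here it is hand-written to stay self-contained.
final class ExampleIO {
  private ExampleIO() {}

  // Factory method: the single entry point, returning a Read with defaults.
  static Read read() {
    return new Read(null, 100);
  }

  // Immutable configuration holder. Replaces separate Bound/Unbound classes:
  // an unconfigured and a configured Read are the same type.
  static final class Read {
    private final String path; // null until from(...) is called
    private final int batchSize;

    private Read(String path, int batchSize) {
      this.path = path;
      this.batchSize = batchSize;
    }

    // Builder method: returns a modified copy, never mutates this instance.
    Read from(String path) {
      return new Read(Objects.requireNonNull(path, "path"), batchSize);
    }

    // Builder method for an optional parameter, named with the "with" prefix.
    Read withBatchSize(int batchSize) {
      if (batchSize <= 0) {
        throw new IllegalArgumentException("batchSize must be positive");
      }
      return new Read(path, batchSize);
    }

    String path() {
      return path;
    }

    int batchSize() {
      return batchSize;
    }
  }
}
```

A caller would chain the calls, e.g.
ExampleIO.read().from("/tmp/input").withBatchSize(10), and the original
object returned by read() stays unchanged.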

The only thing remaining is adjusting the website documentation, release
notes, etc. - I'll work on this tomorrow.


On Thu, Apr 20, 2017 at 10:55 PM Jean-Baptiste Onofré <jb...@nanthrax.net>
wrote:

> No problem ;)
>
> Happy to review if needed ;)
>
> Regards
> JB
>
> On 04/21/2017 07:50 AM, Eugene Kirpichov wrote:
> > Guys, apologies, but I already have Kinesis in review, and Pubsub ready for
> > review. I'm afraid there's not much left for volunteers to take on right
> > now.
> >
> > On Thu, Apr 20, 2017 at 10:47 PM Jean-Baptiste Onofré <jb...@nanthrax.net>
> > wrote:
> >
> >> Cool, I'm going to take a look at PubSub later today (I would like to finish
> >> CassandraIO, HDFS refactoring and Spark 2 support first ;)).
> >>
> >> Regards
> >> JB
> >>
> >> On 04/21/2017 06:03 AM, tarush grover wrote:
> >>> Hi,
> >>>
> >>> I can take the Kinesis one.
> >>>
> >>> Regards,
> >>> Tarush
> >>>
> >>>
> >>> On Thu, 20 Apr 2017 at 11:18 AM, Jean-Baptiste Onofré <jb@nanthrax.net>
> >>> wrote:
> >>>
> >>>> I'm going to take a look at the pending IOs.
> >>>>
> >>>> Thanks!
> >>>> Regards
> >>>> JB
> >>>>
> >>>> On 04/19/2017 10:05 PM, Eugene Kirpichov wrote:
> >>>>> A few more knocked down
> >>>>> - I finished Map/FlatMap, XML, TFRecordIO
> >>>>> - I'm working on CountingInput; it's nontrivial.
> >>>>> - Reuven is working on Text/Avro
> >>>>> - @peay is working on removing coders from KafkaIO
> >>>>>
> >>>>> Kinesis and PubsubIO remain; of these, Kinesis is the easier one.
> >>>>>
> >>>>> Any takers?
> >>>>>
> >>>>> On Fri, Apr 7, 2017 at 10:47 PM Jean-Baptiste Onofré <jb@nanthrax.net>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Eugene,
> >>>>>>
> >>>>>> thanks for the update. I volunteer to tackle some of those IOs (and make
> >>>>>> them conform with the PTransform style guide). I'm pretty sure other
> >>>>>> people will jump in ;)
> >>>>>>
> >>>>>> Regards
> >>>>>> JB
> >>>>>>
> >>>>>> On 04/08/2017 12:20 AM, Eugene Kirpichov wrote:
> >>>>>>> Hey all,
> >>>>>>>
> >>>>>>> More progress has been made and we're nearing completion. ParDo, BigQueryIO
> >>>>>>> and Window are fixed; Map/FlatMapElements are in review.
> >>>>>>>
> >>>>>>> The remaining unclaimed ones are all IOs of some form, and here's a list.
> >>>>>>> I've marked them all as "starter" in JIRA.
> >>>>>>>
> >>>>>>> XML - https://issues.apache.org/jira/browse/BEAM-1914
> >>>>>>> TFRecordIO (Tensorflow) - https://issues.apache.org/jira/browse/BEAM-1913
> >>>>>>> KinesisIO - https://issues.apache.org/jira/browse/BEAM-1428
> >>>>>>> PubsubIO - https://issues.apache.org/jira/browse/BEAM-1415
> >>>>>>> CountingInput - https://issues.apache.org/jira/browse/BEAM-1414
> >>>>>>>
> >>>>>>> https://github.com/apache/beam/pull/2149 , which fixes BigQueryIO, is a
> >>>>>>> good model to follow when taking these on, as well as e.g.
> >>>>>>> https://github.com/apache/beam/pull/1927 (TextIO)
> >>>>>>>
> >>>>>>> These are all actually easy to fix, but need volunteers (I do not have time
> >>>>>>> to fix all of these myself, but happy to be a reviewer - @jkff).
> >>>>>>> Let's finish this up in time for the first Beam stable release, so Beam's
> >>>>>>> stable API surface is consistent and polished!
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>> --
> >>>> Jean-Baptiste Onofré
> >>>> jbonofre@apache.org
> >>>> http://blog.nanthrax.net
> >>>> Talend - http://www.talend.com
> >>>>
> >>>
> >>
> >> --
> >> Jean-Baptiste Onofré
> >> jbonofre@apache.org
> >> http://blog.nanthrax.net
> >> Talend - http://www.talend.com
> >>
> >
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Re: Let's make Beam transforms comply with PTransform Style Guide

Posted by Davor Bonaci <da...@apache.org>.
Thanks Eugene -- this is remarkable.
