Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/03 18:37:18 UTC

[GitHub] [beam] kennknowles opened a new issue, #18461: Simplify specifying coders on PCollectionTuple

kennknowles opened a new issue, #18461:
URL: https://github.com/apache/beam/issues/18461

   Currently when using a multi-output ParDo, the user usually has to do one of the following:
   
   1) Use an anonymous class: new TupleTag<Foo>() {} - in order to reify the Foo type and make coder inference work. In this case, a frequent problem is that the anonymous class captures a large enclosing class, and either doesn't serialize at all, or at least serializes to something bulky.
   2) Explicitly do tuple.get(myTag).setCoder(...)
   
   Both of these are suboptimal.
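   
   To illustrate why option 1 needs the anonymous class at all, here is a minimal, self-contained sketch of the underlying Java behavior. The `Tag` class is a hypothetical stand-in written for this example; it is not Beam's TupleTag and says nothing about how Beam implements coder inference. It only shows that a type argument survives erasure when it is recorded in an anonymous subclass's generic superclass, and is lost otherwise.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical stand-in for a tag type; NOT Beam's TupleTag.
class Tag<T> {
    // Returns the type argument when this instance is an anonymous subclass
    // such as `new Tag<String>() {}`, or null when only the erased type is known.
    Type reifiedType() {
        Type superType = getClass().getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        return null;
    }
}

class TagReificationDemo {
    // Plain instantiation: the type argument is erased at runtime.
    static Type plain() {
        return new Tag<String>().reifiedType();
    }

    // Anonymous subclass: the type argument is stored in the class metadata.
    static Type anonymous() {
        return new Tag<String>() {}.reifiedType();
    }

    public static void main(String[] args) {
        System.out.println("plain:     " + plain());
        System.out.println("anonymous: " + anonymous());
    }
}
```

   Both helpers are static on purpose: an anonymous class created inside an instance method can also hold a reference to the enclosing object, which is the serialization problem described in option 1.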
   
   Could we have e.g. a constructor for TupleTag that explicitly takes a TypeDescriptor? Or even a Coder? Or a family of factory methods for TupleTagList that take these? E.g.:
   in.apply(ParDo.of(...).withOutputTags(mainTag, TupleTagList.of(side1, FooCoder.of()).and(side2, BarCoder.of())));
   
   I would suggest both: the TupleTag constructor should optionally take a TypeDescriptor, and TupleTagList.of() and .and() should optionally take a Coder.
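   
   The shape of this proposal could be sketched roughly as follows. Every type here (`SketchTag`, `SketchTagList`, `SketchCoder`) is a hypothetical stand-in invented for this example, and `Class<T>` is used in place of a TypeDescriptor to keep the sketch free of dependencies. None of this is existing Beam API; it only shows a tag that carries its type explicitly and a tag list whose of() and and() accept a coder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// All types below are hypothetical stand-ins for illustration only;
// they are not Beam's TupleTag, TupleTagList, or Coder.
interface SketchCoder<T> {
    String name();
}

final class SketchTag<T> {
    final String id;
    final Class<T> type; // explicit type, so no anonymous subclass is needed

    SketchTag(String id, Class<T> type) {
        this.id = id;
        this.type = type;
    }
}

final class SketchTagList {
    // Remembers which coder was supplied for each tag, in insertion order.
    private final Map<SketchTag<?>, SketchCoder<?>> coders = new LinkedHashMap<>();

    static <T> SketchTagList of(SketchTag<T> tag, SketchCoder<T> coder) {
        return new SketchTagList().and(tag, coder);
    }

    <T> SketchTagList and(SketchTag<T> tag, SketchCoder<T> coder) {
        coders.put(tag, coder);
        return this;
    }

    SketchCoder<?> coderFor(SketchTag<?> tag) {
        return coders.get(tag);
    }

    int size() {
        return coders.size();
    }
}

class TagListSketchDemo {
    static final SketchTag<String> SIDE1 = new SketchTag<>("side1", String.class);
    static final SketchTag<Integer> SIDE2 = new SketchTag<>("side2", Integer.class);

    static SketchTagList build() {
        SketchCoder<String> stringCoder = () -> "StringCoder";
        SketchCoder<Integer> intCoder = () -> "IntCoder";
        // The generic signatures reject a coder whose type differs from the tag's.
        return SketchTagList.of(SIDE1, stringCoder).and(SIDE2, intCoder);
    }

    public static void main(String[] args) {
        SketchTagList tags = build();
        System.out.println(tags.coderFor(SIDE1).name());
        System.out.println(tags.coderFor(SIDE2).name());
    }
}
```

   Pairing the coder with the tag at the point where the list is built keeps the type check at compile time and avoids the later tuple.get(myTag).setCoder(...) call from option 2.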
   
   Imported from Jira [BEAM-2536](https://issues.apache.org/jira/browse/BEAM-2536). Original Jira may contain additional context.
   Reported by: jkff.

