Posted to issues@beam.apache.org by "Robert Burke (JIRA)" <ji...@apache.org> on 2018/11/17 21:00:00 UTC
[jira] [Commented] (BEAM-3612) Make it easy to generate type-specialized Go SDK reflectx.Funcs

    [ https://issues.apache.org/jira/browse/BEAM-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16690698#comment-16690698 ] 

Robert Burke commented on BEAM-3612:
------------------------------------

While the benchmark I added in [https://github.com/apache/beam/pull/6893] focused on what was available with the reflection library, moving to the "explicit" version is complicated.
The Funcx library would need to be modified to handle explicit receivers in its type analysis, which preps for PCollection binding. This would require a new wrapping function that assumes the first parameter is the receiver, since detecting whether the first parameter is a receiver would be difficult in general without adding outside information.

Subsequently, during execution, the receiver would need to be passed in on invocation. These two changes aren't so bad on the whole, since they would be very targeted and limited to narrow areas of the codebase.
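
To illustrate those two changes, here is a minimal sketch using only the standard reflect package. It is not Funcx or Beam code: explicitMethod and invokeExplicit are hypothetical helper names, and the body of ProcessElement is arbitrary. It shows that a method looked up on a type comes back with the receiver as parameter 0, and that the receiver then has to be supplied at call time.

```go
package main

import (
	"fmt"
	"reflect"
)

// Foo stands in for a structural DoFn.
type Foo struct{}

// ProcessElement has an arbitrary body so the sketch can run.
func (f *Foo) ProcessElement(v int) int { return v * 2 }

// explicitMethod (hypothetical helper) looks up a method by name on the
// dynamic type of recv and returns it in "explicit" form: a function value
// whose first parameter is the receiver.
func explicitMethod(recv interface{}, name string) (reflect.Value, bool) {
	m, ok := reflect.TypeOf(recv).MethodByName(name)
	if !ok {
		return reflect.Value{}, false
	}
	return m.Func, true
}

// invokeExplicit (hypothetical helper) calls fn, passing the receiver as
// the first argument followed by the element.
func invokeExplicit(fn reflect.Value, recv interface{}, arg int) int {
	out := fn.Call([]reflect.Value{reflect.ValueOf(recv), reflect.ValueOf(arg)})
	return int(out[0].Int())
}

func main() {
	f := &Foo{}
	fn, ok := explicitMethod(f, "ProcessElement")
	if !ok {
		panic("ProcessElement not found")
	}
	fmt.Println(fn.Type())                 // the receiver appears as the first parameter
	fmt.Println(invokeExplicit(fn, f, 21)) // 42
}
```

Note that nothing in the returned function type marks parameter 0 as a receiver, which is why the analysis would have to assume it.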

The real issue is that we'd still need to have the type assertion shims generated for those signatures, and *every* method of a structural DoFn would need its own custom shim. This would be a bit excessive, as we wouldn't be able to re-use shims between methods and functions with identical signatures, since the methods would no longer match a simple function.

e.g.
type Foo struct {}
func (f *Foo) ProcessElement(int) int

would need a shim signature of func(*Foo, int) int, instead of the current func(int) int, with the implicit wrapping.
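
The difference between the two signatures can be seen in plain Go, where a method value (f.ProcessElement) has type func(int) int and a method expression ((*Foo).ProcessElement) has type func(*Foo, int) int. The sketch below is only an illustration: callImplicit and callExplicitFoo are hypothetical stand-ins for generated shims, not the ones the code generator emits, and the method body is arbitrary.

```go
package main

import "fmt"

type Foo struct{}

// ProcessElement has an arbitrary body so the sketch can run.
func (f *Foo) ProcessElement(v int) int { return v + 1 }

// inc is a plain function with the same signature as the bound method.
func inc(v int) int { return v + 1 }

// callImplicit is a hypothetical shim for func(int) int. Any function or
// bound method with that signature can share it.
func callImplicit(fn interface{}, v int) int {
	return fn.(func(int) int)(v)
}

// callExplicitFoo is a hypothetical shim for func(*Foo, int) int. It is
// only usable for methods of *Foo, so it cannot be shared with inc or with
// methods of any other struct.
func callExplicitFoo(fn interface{}, recv *Foo, v int) int {
	return fn.(func(*Foo, int) int)(recv, v)
}

func main() {
	f := &Foo{}
	fmt.Println(callImplicit(f.ProcessElement, 1))            // method value: receiver already bound
	fmt.Println(callImplicit(inc, 1))                         // same shim, plain function
	fmt.Println(callExplicitFoo((*Foo).ProcessElement, f, 1)) // method expression: receiver explicit
}
```

All three calls print 2, but only the first shim is reusable across types.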



In aggregate, this could be problematic, and at the very least it would leave Beam packages with excessively large generated files. Imagine if the stats package were to have structural DoFns for each numeric type. There would be no re-use of shims between the Sum, Min, or Max aggregators.

However, an obvious solution presents itself in the form of the code generator. We already have one; we just need to generate more code with it to avoid the shim explosion. We could add a new reflectx registration that, for a given instance of a structural DoFn, produces closures that call each of its methods. The code generator can then generate one of these wrappers per struct, and graph/fn.go can then delegate to the wrapper call.
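
A rough sketch of what such a registration might look like follows. Every name in it (wrapperFn, registerStructWrapper, wrapFoo, methodsOf) is hypothetical and invented for illustration; this is not the reflectx API or the prototype mentioned in this comment. The point is that each closure captures the instance, so its type is again the plain func(int) int and the existing signature shims remain reusable.

```go
package main

import (
	"fmt"
	"reflect"
)

type Foo struct{}

// ProcessElement has an arbitrary body so the sketch can run.
func (f *Foo) ProcessElement(v int) int { return v + 1 }

// wrapperFn (hypothetical) builds, for one instance of a struct, a map from
// method name to a closure that calls that method on the instance.
type wrapperFn func(instance interface{}) map[string]interface{}

// wrappers is a hypothetical registry keyed by the struct's pointer type.
var wrappers = map[reflect.Type]wrapperFn{}

// registerStructWrapper (hypothetical) records the wrapper for proto's type.
func registerStructWrapper(proto interface{}, w wrapperFn) {
	wrappers[reflect.TypeOf(proto)] = w
}

// wrapFoo is the kind of function a code generator could emit per struct.
// Each closure has the receiver bound, so its type is plain func(int) int.
func wrapFoo(instance interface{}) map[string]interface{} {
	f := instance.(*Foo)
	return map[string]interface{}{
		"ProcessElement": func(v int) int { return f.ProcessElement(v) },
	}
}

// methodsOf (hypothetical) returns the closures for an instance if a
// wrapper was registered for its type.
func methodsOf(instance interface{}) (map[string]interface{}, bool) {
	w, ok := wrappers[reflect.TypeOf(instance)]
	if !ok {
		return nil, false
	}
	return w(instance), true
}

func main() {
	registerStructWrapper((*Foo)(nil), wrapFoo)
	ms, ok := methodsOf(&Foo{})
	if !ok {
		panic("no wrapper registered for *Foo")
	}
	fn := ms["ProcessElement"].(func(int) int)
	fmt.Println(fn(1)) // 2
}
```

With this shape, one generated wrapper per struct replaces one generated shim per method.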

We can put the registration code into the reflectx package along with the caller, emitter, and input registrations. Further, we could then move the iteration through methods from graph/fn.go to the reflectx package, which is a more coherent place for that code.
As before, I have a prototype and a benchmark for these, and it's presently processing.

> Make it easy to generate type-specialized Go SDK reflectx.Funcs
> ---------------------------------------------------------------
>
>                 Key: BEAM-3612
>                 URL: https://issues.apache.org/jira/browse/BEAM-3612
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-go
>            Reporter: Henning Rohde
>            Assignee: Robert Burke
>            Priority: Major
>          Time Spent: 6h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)