Posted to issues@beam.apache.org by "Robert Burke (Jira)" <ji...@apache.org> on 2021/03/31 18:38:00 UTC
[jira] [Updated] (BEAM-3293) Add lazy map side input form
[ https://issues.apache.org/jira/browse/BEAM-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Burke updated BEAM-3293:
-------------------------------
Description:
Add InputKinds LazyMap and LazyMultiMap that allow map lookup without reading everything into memory. They will be accessed through functions such as:
func(K) func(*V) bool (a keyed function that returns an iterator)
func(K) []V (a keyed function that returns a slice of values)
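As a rough illustration of the two access forms above, here is a minimal, self-contained sketch with K=string and V=int. It does not use the Beam SDK; kvStore, lazyIter, lazySlice and sumFor are hypothetical names invented for this example, and the in-memory map only stands in for a runner-backed lookup that would fetch values on demand.

```go
package main

import "fmt"

// kvStore is a hypothetical stand-in for a runner-backed side input.
// A real lazy implementation would fetch the values for a key on demand
// instead of holding everything in a Go map.
type kvStore map[string][]int

// lazyIter mirrors the func(K) func(*V) bool form: looking up a key
// returns an iterator that writes one value per call and reports
// whether a value was produced.
func lazyIter(s kvStore) func(string) func(*int) bool {
	return func(k string) func(*int) bool {
		i := 0
		return func(v *int) bool {
			vals := s[k]
			if i >= len(vals) {
				return false
			}
			*v = vals[i]
			i++
			return true
		}
	}
}

// lazySlice mirrors the func(K) []V form: looking up a key returns
// all values for that key as a slice (a copy, so callers cannot
// mutate the backing store).
func lazySlice(s kvStore) func(string) []int {
	return func(k string) []int {
		return append([]int(nil), s[k]...)
	}
}

// sumFor shows how user code might consume the iterator form.
func sumFor(lookup func(string) func(*int) bool, k string) int {
	next := lookup(k)
	var v, total int
	for next(&v) {
		total += v
	}
	return total
}

func main() {
	store := kvStore{"a": {1, 2, 3}, "b": {10}}
	fmt.Println(sumFor(lazyIter(store), "a"))
	fmt.Println(lazySlice(store)("b"))
}
```

In both forms only the requested key is touched, which is the point of the lazy variants: the caller never needs the whole map materialized.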
On the execution layer, the new forms would need to be added to exec/sideinput.go
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/sideinput.go]
The inputs layer would also need them, for the actual abstraction using reflection:
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/input.go]
The funcx package would need to be updated to detect the new parameter forms
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/funcx/fn.go]
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/funcx/sideinput.go]
as well as the DoFn graph validation code
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/graph/fn.go#L566]
They would need to be correctly translated into the pipeline protos:
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/graphx/translate.go#L315]
and finally back to the newly created handlers in the exec package.
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/translate.go#L402]
If implemented pre-generics, the code generator frontend and backend would need to be updated to detect the new forms and generate efficient map access functions with no reflection overhead.
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/util/shimx/generate.go]
[https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/util/starcgenx/starcgenx.go]
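To make the reflection-overhead point concrete, the following is a minimal sketch of the general idea only. It is not the output of the shimx or starcgenx generators; callReflect and callDirect are hypothetical names. It contrasts invoking a func(string) []int lookup through reflection with a type-asserted direct call of the kind that generated code could use for known signatures.

```go
package main

import (
	"fmt"
	"reflect"
)

// callReflect invokes a lookup of shape func(string) []int through
// reflection. This works without knowing the concrete type at compile
// time, but pays reflection overhead on every call.
func callReflect(fn interface{}, key string) []int {
	out := reflect.ValueOf(fn).Call([]reflect.Value{reflect.ValueOf(key)})
	return out[0].Interface().([]int)
}

// callDirect first tries a type assertion for the known signature and
// calls the function directly. It falls back to reflection only when
// the concrete type does not match.
func callDirect(fn interface{}, key string) []int {
	if f, ok := fn.(func(string) []int); ok {
		return f(key)
	}
	return callReflect(fn, key)
}

func main() {
	// A toy lookup: returns the key length as the only value.
	lookup := func(k string) []int { return []int{len(k)} }

	fmt.Println(callReflect(lookup, "beam"))
	fmt.Println(callDirect(lookup, "beam"))
}
```

Both paths return the same result; the difference is that the direct path avoids boxing arguments into reflect.Value on every lookup.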
Unit tests must be added throughout, and integration tests should be added to verify the functionality against portable Beam runners.
[https://github.com/apache/beam/tree/master/sdks/go/test/integration/primitives]
And of course, the user-facing GoDoc should be updated to document the new forms.
See this lengthy email response for a more in-depth guide to how side inputs operate. [https://lists.apache.org/thread.html/ra42dc7ee30842f11740eff33f0afcd63702695878e427127e1268381%40%3Cdev.beam.apache.org%3E]
was:
Add InputKinds LazyMap and LazyMultiMap that allow map lookup without reading everything into memory. They will be accessed through functions such as:
func(k string) int
func(k string) []int
On the execution layer, the new forms would need to be added to exec/sideinput.go
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/sideinput.go
The inputs layer, for the actual abstraction using reflection:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/input.go
The funcx package would need to be updated to detect the new parameter forms
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/funcx/fn.go
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/funcx/sideinput.go
as well as the DoFn graph validation code
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/graph/fn.go#L566
They would need to be correctly translated into the pipeline protos:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/graphx/translate.go#L315
and finally back to the newly created handlers in the exec package.
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/translate.go#L402
If implemented pre-generics, the code generator frontend and backend would need to be updated to detect the new forms and generate efficient map access functions with no reflection overhead.
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/util/shimx/generate.go
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/util/starcgenx/starcgenx.go
Unit tests must be added throughout, and integration tests should be added to verify the functionality against portable Beam runners.
https://github.com/apache/beam/tree/master/sdks/go/test/integration/primitives
And of course, the user GoDoc should be updated for the support.
> Add lazy map side input form
> ----------------------------
>
> Key: BEAM-3293
> URL: https://issues.apache.org/jira/browse/BEAM-3293
> Project: Beam
> Issue Type: Improvement
> Components: sdk-go
> Reporter: Henning Rohde
> Priority: P3
--
This message was sent by Atlassian Jira
(v8.3.4#803005)