Posted to issues@beam.apache.org by "Robert Burke (Jira)" <ji...@apache.org> on 2020/05/21 16:54:00 UTC
[jira] [Created] (BEAM-10056) Side Input Validation too tight, doesn't allow CoGBK
Robert Burke created BEAM-10056:
-----------------------------------
Summary: Side Input Validation too tight, doesn't allow CoGBK
Key: BEAM-10056
URL: https://issues.apache.org/jira/browse/BEAM-10056
Project: Beam
Issue Type: Bug
Components: sdk-go
Reporter: Robert Burke
Assignee: Robert Burke
The following DoFn doesn't pass validation, though it should: it is a valid signature for a ParDo accepting a PCollection<CoGBK<string, *clientHistory, *clientHistory>>:
func (fn *writer) StartBundle(ctx context.Context) error
func (fn *writer) ProcessElement(
ctx context.Context,
key string,
iter1, iter2 func(**clientHistory) bool)
func (fn *writer) FinishBundle(ctx context.Context)
It returns an error:
Missing side inputs in the StartBundle method of a DoFn. If side inputs are present in ProcessElement those side inputs must also be present in StartBundle.
Full error:
inserting ParDo in scope root:
graph.AsDoFn: for Fn named <...pii...>/userpackage.writer:
side inputs expected in method StartBundle [recovered]
panic: Missing side inputs in the StartBundle method of a DoFn. If side inputs are present in ProcessElement those side inputs must also be present in StartBundle.
Full error:
inserting ParDo in scope root:
graph.AsDoFn: for Fn named <...pii...>/userpackage.writer:
side inputs expected in method StartBundle
This happens in the input-unaware validation, which therefore needs to be loosened, with the stricter check performed elsewhere.
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/graph/fn.go#L527
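To illustrate why a check that never sees the input rejects the DoFn above, here is a minimal sketch. It is an assumption about the general shape of such a rule, not the code in fn.go: clientHistory is an empty placeholder type, the methods are reduced to plain functions, and countIterParams and strictCheck are hypothetical helpers.

```go
package main

import (
	"context"
	"fmt"
	"reflect"
)

// clientHistory is a placeholder for the reporter's type.
type clientHistory struct{}

// Stand-ins for the DoFn methods in the report, written as plain functions.
func startBundle(ctx context.Context) error { return nil }

func processElement(ctx context.Context, key string, iter1, iter2 func(**clientHistory) bool) {}

// countIterParams returns how many parameters of fn are iterator-shaped,
// that is, functions with exactly one bool result. fn must be a function.
func countIterParams(fn interface{}) int {
	t := reflect.TypeOf(fn)
	n := 0
	for i := 0; i < t.NumIn(); i++ {
		p := t.In(i)
		if p.Kind() == reflect.Func && p.NumOut() == 1 && p.Out(0).Kind() == reflect.Bool {
			n++
		}
	}
	return n
}

// strictCheck mimics an input-unaware rule (hypothetical): every iterator in
// ProcessElement is assumed to be a side input, so StartBundle must take the
// same number of iterators.
func strictCheck(start, process interface{}) error {
	s, p := countIterParams(start), countIterParams(process)
	if s != p {
		return fmt.Errorf("side inputs expected in method StartBundle: have %d, want %d", s, p)
	}
	return nil
}

func main() {
	// The two iterators are CoGBK value streams, not side inputs, but the
	// signature alone cannot show that, so the strict rule reports an error.
	fmt.Println(strictCheck(startBundle, processElement))
}
```

From the signature alone, the two iterator parameters could be CoGBK value streams or side inputs, which is why the rule has to move to a point where the input type is known.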
There are "sibling" cases for the DoFn signature:
func (fn *writer) StartBundle(ctx context.Context, side func(**clientHistory) bool) error
func (fn *writer) ProcessElement(
ctx context.Context,
key string,
iter, side func(**clientHistory) bool)
func (fn *writer) FinishBundle(ctx context.Context, side func(**clientHistory) bool)
and
func (fn *writer) StartBundle(ctx context.Context, side1, side2 func(**clientHistory) bool) error
func (fn *writer) ProcessElement(
ctx context.Context,
key string,
side1, side2 func(**clientHistory) bool)
func (fn *writer) FinishBundle(ctx context.Context, side1, side2 func(**clientHistory) bool)
These would be for <CoGBK<string, *clientHistory>> with <*clientHistory> on the side, and
<string> with <*clientHistory> and <*clientHistory> on the side, respectively.
Which case applies can only be fully determined with the input, so the check should run, and provide a clear error, when PCollection binding occurs.
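A binding-time check could be sketched roughly as follows. This is only an illustration under stated assumptions: sideInputCount and checkStartBundle are hypothetical helpers, not part of the Beam Go SDK, and the bound main input is reduced to the number of CoGBK value streams it carries (0 for a non-CoGBK input).

```go
package main

import "fmt"

// sideInputCount is a hypothetical helper, not part of the Beam SDK.
// iterParams is the number of iterator parameters in ProcessElement;
// cogbkValues is the number of value streams in the bound main input
// (0 when the main input is not a CoGBK).
func sideInputCount(iterParams, cogbkValues int) (int, error) {
	if iterParams < cogbkValues {
		return 0, fmt.Errorf(
			"ProcessElement has %d iterator parameters, but the main input needs %d",
			iterParams, cogbkValues)
	}
	return iterParams - cogbkValues, nil
}

// checkStartBundle returns an error when StartBundle does not take the number
// of side inputs that the bound input implies for ProcessElement.
func checkStartBundle(startIterParams, iterParams, cogbkValues int) error {
	want, err := sideInputCount(iterParams, cogbkValues)
	if err != nil {
		return err
	}
	if startIterParams != want {
		return fmt.Errorf(
			"StartBundle takes %d side inputs, but the bound input implies %d",
			startIterParams, want)
	}
	return nil
}

func main() {
	// The three cases from the report; ProcessElement has two iterators in each.
	fmt.Println(checkStartBundle(0, 2, 2)) // CoGBK<string, *clientHistory, *clientHistory>
	fmt.Println(checkStartBundle(1, 2, 1)) // CoGBK<string, *clientHistory> with one side input
	fmt.Println(checkStartBundle(2, 2, 0)) // string with two side inputs
}
```

The point of the sketch is that the same ProcessElement shape yields a different expected side input count for each bound input, so the comparison against StartBundle only makes sense once the input is known.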