Posted to issues@beam.apache.org by "Robert Burke (Jira)" <ji...@apache.org> on 2020/08/10 17:48:00 UTC

[jira] [Commented] (BEAM-9615) [Go SDK] Beam Schemas

    [ https://issues.apache.org/jira/browse/BEAM-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174954#comment-17174954 ] 

Robert Burke commented on BEAM-9615:
------------------------------------

I have an end-to-end working case for schemas of ordinary Go user structs. Error handling still needs work, but the errors will be easier to make fluent once we see how users get the code to fail.

An issue I didn't quite expect is that there's no way to have top-level schemas be pointer types without an extra option or similar. I've worked around it by having a hard option to extract a registered type, but that feels problematic in general. I think I'll introduce a Go-specific "nillable" option for the top-level schema. The short result is that with external transforms, one essentially always needs to use a value type rather than a pointer type when schemas are involved: the Go SDK has no good way to infer that it needs a pointer type at that level of the coder, unless the runner preserves the Go-specific option instead of eliding it.

Care needs to be taken with the option when decoding row fields, since field types can already be nillable, and we want to avoid unnecessary pointers to pointers, which would be incorrect in this case.
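
To make the pointer-to-pointer concern concrete, here is a minimal sketch using only the standard reflect package. It is not the Go SDK's implementation; applyNillable and decodeType are hypothetical names, and the boolean flag merely stands in for the proposed "nillable" option. The idea is to wrap a type in a pointer only when it is not already one.

```go
package main

import (
	"fmt"
	"reflect"
)

// applyNillable is a hypothetical helper (not part of the Beam Go SDK).
// It returns the Go type to decode into for t. When nillable is set and t is
// not already a pointer, t is wrapped in a pointer. A type that is already a
// pointer is returned unchanged, so a nillable field never becomes a pointer
// to a pointer.
func applyNillable(t reflect.Type, nillable bool) reflect.Type {
	if !nillable || t.Kind() == reflect.Pointer {
		return t
	}
	return reflect.PointerTo(t)
}

// decodeType is a small convenience wrapper that starts from a value.
func decodeType(v any, nillable bool) reflect.Type {
	return applyNillable(reflect.TypeOf(v), nillable)
}

func main() {
	fmt.Println(decodeType(0, false))       // value type, option absent
	fmt.Println(decodeType(0, true))        // value type, option present
	fmt.Println(decodeType(new(int), true)) // already a pointer, left as is
}
```

With the flag absent, the value type is used as is, which matches the "always use a value type" fallback described above for the case where the runner elides the option.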

> [Go SDK] Beam Schemas
> ---------------------
>
>                 Key: BEAM-9615
>                 URL: https://issues.apache.org/jira/browse/BEAM-9615
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-go
>            Reporter: Robert Burke
>            Assignee: Robert Burke
>            Priority: P2
>          Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> Schema support is required for advanced cross language features in Beam, and has the opportunity to replace the current default JSON encoding of elements.
> Some quick notes, though a better fleshed out doc with details will be forthcoming:
>  * All base coders should be implemented, and listed as coder capabilities. I think only stringutf8 is missing presently.
>  * Should support fairly arbitrary user types, seamlessly. That is, users should be able to rely on it "just working" if their type is compatible.
>  * Should support schema metadata tagging.
> In particular, one breaking shift in the default will be to explicitly fail pipelines if elements have unexported fields, when no other custom coder has been added. This has been a source of errors/dropped data/keys, and a simple warning at construction time won't cut it. However, we could provide a manual "use beam schemas, but ignore unexported fields" registration as a workaround.
> Edit: Doc is now at https://s.apache.org/beam-go-schemas
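
As a rough illustration of the unexported-field check described in the issue text above, the following sketch uses only the standard reflect package. It is not the SDK's code: unexportedFields and checkSchemaCompatible are hypothetical names, the ignoreUnexported flag stands in for the proposed manual registration, and only top-level fields are inspected.

```go
package main

import (
	"fmt"
	"reflect"
)

// unexportedFields lists the names of unexported top-level fields of v's
// struct type. Pointers are dereferenced first; non-struct types yield nil.
func unexportedFields(v any) []string {
	t := reflect.TypeOf(v)
	for t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return nil
	}
	var names []string
	for i := 0; i < t.NumField(); i++ {
		if f := t.Field(i); !f.IsExported() {
			names = append(names, f.Name)
		}
	}
	return names
}

// checkSchemaCompatible is a hypothetical construction-time check. It fails
// when v has unexported fields, unless the caller opts in to ignoring them.
func checkSchemaCompatible(v any, ignoreUnexported bool) error {
	names := unexportedFields(v)
	if len(names) == 0 || ignoreUnexported {
		return nil
	}
	return fmt.Errorf("type %T has unexported fields %v", v, names)
}

type exportedOnly struct {
	ID   int
	Name string
}

type withHidden struct {
	ID     int
	secret string
}

func main() {
	fmt.Println(checkSchemaCompatible(exportedOnly{}, false))
	fmt.Println(checkSchemaCompatible(withHidden{}, false))
	fmt.Println(checkSchemaCompatible(withHidden{}, true))
}
```

Returning an error rather than logging a warning mirrors the stated intent that the pipeline should fail at construction time instead of silently dropping data.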



--
This message was sent by Atlassian Jira
(v8.3.4#803005)