Posted to commits@beam.apache.org by "Kenneth Knowles (JIRA)" <ji...@apache.org> on 2017/11/16 20:13:00 UTC

[jira] [Created] (BEAM-3205) Publicly document known coder wire formats and their URNs

Kenneth Knowles created BEAM-3205:
-------------------------------------

             Summary: Publicly document known coder wire formats and their URNs
                 Key: BEAM-3205
                 URL: https://issues.apache.org/jira/browse/BEAM-3205
             Project: Beam
          Issue Type: Improvement
          Components: beam-model
            Reporter: Kenneth Knowles
            Assignee: Robert Bradshaw


Overarching issue: The Google Docs, markdown files, and email threads that sketch the Beam model as it is developed need to be gathered into a centralized place with clear information architecture and navigation. We should also draw the line that "if it isn't reachable from here in an obvious way, it isn't the spec". [1]

Specific issue: Which coders must every runner and SDK understand? Which other coders are considered standardized? What is the abstract specification of each coder's wire format?

Today we have https://github.com/apache/beam/blob/master/model/fn-execution/src/test/resources/org/apache/beam/model/fnexecution/v1/standard_coders.yaml which is the beginning of a compliance test suite for standardized coders.

This would really benefit from:

 - narrative descriptions of the formats, including _abstract_ specification (not examples) and perhaps motivation
 - specification of which are required and which are merely "well known"
 - a tie-in to BEAM-3203 on which coders are required to decode to a compatible value in every SDK
 - a reference implementation for fuzz testing: once we have an abstract spec and some examples, and one language has robust coders that pass the examples, we could turn it around and treat that implementation as the reference
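
To make the last point concrete, here is a minimal sketch of fuzz testing a candidate coder against a reference implementation. Everything in it is an assumption for illustration: the wire format is a generic unsigned base-128 varint used as a stand-in (not a statement of what the Beam spec requires), the example table is hand-written rather than taken from standard_coders.yaml, and all function names are hypothetical.

```python
import random


def encode_varint_reference(n: int) -> bytes:
    """Reference encoder (assumed format): unsigned base-128 varint,
    least significant 7-bit group first, high bit set on all but the last byte."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint_reference(data: bytes) -> int:
    """Reference decoder for the same assumed format."""
    result = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            if i != len(data) - 1:
                raise ValueError("trailing bytes after varint")
            return result
    raise ValueError("truncated varint")


# Hand-written examples (hypothetical; not copied from standard_coders.yaml).
EXAMPLES = {0: b"\x00", 1: b"\x01", 127: b"\x7f", 128: b"\x80\x01", 300: b"\xac\x02"}


def passes_examples(encode) -> bool:
    """True if the encoder reproduces every hand-written example."""
    return all(encode(value) == encoded for value, encoded in EXAMPLES.items())


def encode_varint_candidate(n: int) -> bytes:
    """Deliberately buggy candidate: emits at most two bytes, so it is only
    correct for values below 2**14. It still passes every example above."""
    if n < 0x80:
        return bytes([n])
    return bytes([(n & 0x7F) | 0x80, (n >> 7) & 0x7F])


def fuzz(candidate_encode, trials: int = 1000, seed: int = 0) -> list:
    """Return the random inputs on which the candidate disagrees with the reference."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        value = rng.getrandbits(rng.randint(1, 64))
        expected = encode_varint_reference(value)
        # Sanity check on the reference itself: it must round-trip.
        assert decode_varint_reference(expected) == value
        if candidate_encode(value) != expected:
            mismatches.append(value)
    return mismatches
```

In this sketch the candidate passes the fixed examples, yet `fuzz(encode_varint_candidate)` reports mismatches for larger values. That is the gap a reference implementation would close once the examples alone stop finding bugs.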

Any sort of fancy hacking that blends the tests with the narrative is fine, though mostly I think they'll end up covering disjoint topics.

[1] I filed BEAM-2567 and BEAM-2568 and ported https://beam.apache.org/contribute/runner-guide/, and [~herohde] put together https://beam.apache.org/contribute/portability/ and https://github.com/apache/beam/blob/master/sdks/CONTAINERS.md




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)