Posted to issues@beam.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/08/03 17:19:00 UTC

[jira] [Work logged] (BEAM-7996) Add support for remaining data types in python RowCoder

     [ https://issues.apache.org/jira/browse/BEAM-7996?focusedWorklogId=465822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465822 ]

ASF GitHub Bot logged work on BEAM-7996:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Aug/20 17:18
            Start Date: 03/Aug/20 17:18
    Worklog Time Spent: 10m 
      Work Description: robertwb commented on a change in pull request #12426:
URL: https://github.com/apache/beam/pull/12426#discussion_r464550207



##########
File path: model/pipeline/src/main/proto/beam_runner_api.proto
##########
@@ -855,10 +855,21 @@ message StandardCoders {
     //     BOOLEAN:   beam:coder:bool:v1
     //     BYTES:     beam:coder:bytes:v1
     //   ArrayType:   beam:coder:iterable:v1 (always has a known length)
-    //   MapType:     not yet a standard coder (BEAM-7996)
+    //   MapType:     not a standard coder, specification defined below.
     //   RowType:     beam:coder:row:v1
     //   LogicalType: Uses the coder for its representation.
     //
+    // The MapType is encoded by:
+    //   - An INT32 representing the size of the map (N)
+    //   - Followed by N interleaved keys and values, encoded with their
+    //     corresponding coder.
+    //
+    // Nullable types in container types (ArrayType, MapType) are encoded by:
+    //   - A one byte null indicator, 0x00 for null values, or 0x01 for present
+    //     values.
+    //   - For present values the null indicator is followed by the value
+    //     encoded with its corresponding coder.
+    //

Review comment:
       Just so I understand correctly: at the schema level, is nullability a property of the schema (rather than there being an optional type)? So there's no way to declare a map as not (potentially) having null keys/values?
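
For reference, the encoding described in the proto comment above can be sketched as follows. This is only an illustration, not the Beam implementation: it assumes the INT32 size is written as a fixed-width 4-byte big-endian integer (the comment only says "INT32"), and `encode_utf8` and `encode_int64` are hypothetical stand-ins for whatever key and value coders the schema actually specifies.

```python
import struct


def encode_nullable(value, encode_value):
    """One-byte null indicator: 0x00 for null, 0x01 followed by the value."""
    if value is None:
        return b"\x00"
    return b"\x01" + encode_value(value)


def encode_map(mapping, encode_key, encode_value):
    """Size N, then N interleaved keys and values, each with its own coder."""
    # Assumption: the size is a fixed-width 4-byte big-endian int.
    parts = [struct.pack(">i", len(mapping))]
    for key, value in mapping.items():
        parts.append(encode_key(key))
        parts.append(encode_value(value))
    return b"".join(parts)


# Hypothetical element coders, used only to make the sketch runnable.
def encode_utf8(text):
    data = text.encode("utf-8")
    return struct.pack(">i", len(data)) + data


def encode_int64(number):
    return struct.pack(">q", number)


# A map with non-null string keys and nullable int64 values.
encoded = encode_map(
    {"a": 1, "b": None},
    encode_utf8,
    lambda v: encode_nullable(v, encode_int64),
)
```

In this sketch, whether a key or value gets the null-indicator byte is decided by the coder passed in, so a map with non-nullable values would simply pass `encode_int64` directly and carry no indicator bytes.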






Issue Time Tracking
-------------------

    Worklog Id:     (was: 465822)
    Time Spent: 3h 10m  (was: 3h)

> Add support for remaining data types in python RowCoder 
> --------------------------------------------------------
>
>                 Key: BEAM-7996
>                 URL: https://issues.apache.org/jira/browse/BEAM-7996
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Brian Hulette
>            Assignee: Brian Hulette
>            Priority: P2
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> In the initial [python RowCoder implementation|https://github.com/apache/beam/pull/9188] we only added support for the data types that already had coders in the Python SDK. We should add support for the remaining data types that are not currently supported:
> * INT8 (ByteCoder in Java)
> * INT16 (BigEndianShortCoder in Java)
> * FLOAT (FloatCoder in Java) (Note: doubles are supported, this is specifically for single-precision)
> * --BOOLEAN (standard beam:coder:bool:v1, BooleanCoder in Java)--
> * --BYTES (standard beam:coder:bytes:v1, ByteArrayCoder in Java)--
> * Map (MapCoder in Java)
> We might consider making those coders standard so they can be tested independently from RowCoder in standard_coders.yaml. Or, if we don't do that we should probably add a more robust testing framework for RowCoder itself, because it will be challenging to test all of these types as part of the RowCoder tests in standard_coders.yaml.


