You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Taoufik DACHRAOUI (JIRA)" <ji...@apache.org> on 2019/04/11 16:17:00 UTC

[jira] [Updated] (SPARK-27388) expression encoder for avro objects

     [ https://issues.apache.org/jira/browse/SPARK-27388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Taoufik DACHRAOUI updated SPARK-27388:
--------------------------------------
    Summary: expression encoder for avro objects  (was: expression encoder for avro like objects)

> expression encoder for avro objects
> -----------------------------------
>
>                 Key: SPARK-27388
>                 URL: https://issues.apache.org/jira/browse/SPARK-27388
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.4.1
>            Reporter: Taoufik DACHRAOUI
>            Priority: Major
>
> *What changes were proposed in this pull request?*
> The PR adds support for bean objects, java.util.List, java.util.Map, java.nio.ByteBuffer and java enums to ScalaReflection; unlike the existing javaBean Encoder, properties can be named without the set/get prefix (this is one of the key points that allows the encoding of Avro Fixed types. I believe, the other key point is that the addition must be in ScalaReflection).
> Reminder of Avro types:
>  * primitive types: null, boolean, int, long, float, double, bytes, string
>  * complex types: Records, Enums, Arrays, Maps, Unions, Fixed
> This PR supports simple unions (having a null type and a non-null type) but not complex unions for the simple reason that the Avro compiler will generate java code with type Object for all complex unions, and fields with simple unions will be typed as the non-null type of the union.
>  
> *How was this patch tested?*
> currently only 1 encodeDecodeTest was added to ExpressionEncoderSuite; the test uses the following avro schema:
> {code:java}
> {"namespace": "org.apache.spark.sql.catalyst.encoders", "type": "record", "name": "AvroExample1",
>  "fields": [
>  {"name":"mymoney","type":["null",{"type":"record","name":"Money","namespace":"org.apache.spark.sql.catalyst.encoders","fields":[
>  {"name":"amount","type":"float","default":0},
>  {"name":"currency","type":{"type":"enum","name":"Currency","symbols":["EUR","USD","BRL"]},"default":"EUR"}]}], "default":null},
>  {"name": "myfloat", "type": "float"},
>  {"name": "mylong", "type": "long"},
>  {"name": "myint", "type": "int"},
>  {"name": "mydouble", "type": "double"},
>  {"name": "myboolean", "type": "boolean"},
>  {"name": "mystring", "type": "string"},
>  {"name": "mybytes", "type": "bytes"},
>  {"name": "myfixed", "type": {"type": "fixed", "name": "Magic", "size": 4}},
>  {"name": "myarray", "type": {"type": "array", "items": "string"}},
>  {"name": "mymap", "type": {"type": "map", "values": "int"}}
>  ] }{code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org