Posted to dev@avro.apache.org by "Victor Mota (JIRA)" <ji...@apache.org> on 2017/10/12 21:56:01 UTC

[jira] [Commented] (AVRO-1335) C++ should support field default values

    [ https://issues.apache.org/jira/browse/AVRO-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202689#comment-16202689 ] 

Victor Mota commented on AVRO-1335:
-----------------------------------

Thanks for taking a look! I finally got around to making the fixes you recommended, and also added more unit tests for the bytes and fixed types:
https://github.com/apache/avro/pull/241

I used string::replace instead of memcpy; it seemed safer, but let me know if that's a problem. Also, regarding "2 * sizeof(T)": I don't quite follow, and when I tried to make that modification, every "\u00ff" (the format in the docs [https://avro.apache.org/docs/1.8.1/spec.html#schema_record]) turned into "\u0000".
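To make the distinction concrete, here is a minimal sketch (not the code in the PR, just the general pattern under discussion) of the two ways to copy the raw bytes of a value into a std::string buffer:

{code:title=Sketch: memcpy vs. string::replace}
#include <cstring>
#include <string>

// Raw copy: the caller must size the target correctly up front;
// nothing stops a wrong length from writing past the end.
template <typename T>
std::string viaMemcpy(const T& v) {
    std::string buf(sizeof(T), '\0');
    std::memcpy(&buf[0], &v, sizeof(T));
    return buf;
}

// string::replace manages the target range itself, growing the
// string to fit the source bytes -- the "safer" property mentioned.
template <typename T>
std::string viaReplace(const T& v) {
    std::string buf;
    buf.replace(0, buf.size(), reinterpret_cast<const char*>(&v), sizeof(T));
    return buf;
}
{code}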

> C++ should support field default values
> ---------------------------------------
>
>                 Key: AVRO-1335
>                 URL: https://issues.apache.org/jira/browse/AVRO-1335
>             Project: Avro
>          Issue Type: Improvement
>          Components: c++
>    Affects Versions: 1.7.4
>            Reporter: Bin Guo
>         Attachments: AVRO-1335.patch
>
>
> We found that resolvingDecoder could not provide bidirectional compatibility between different version of schemas.
> Especially for records, for example:
> {code:title=First schema}
> {
>     "type": "record",
>     "name": "TestRecord",
>     "fields": [
>         {
>             "name": "MyData",
>             "type": {
>                 "type": "record",
>                 "name": "SubData",
>                 "fields": [
>                     {
>                         "name": "Version1",
>                         "type": "string"
>                     }
>                 ]
>             }
>         },
>         {
>             "name": "OtherData",
>             "type": "string"
>         }
>     ]
> }
> {code}
> {code:title=Second schema}
> {
>     "type": "record",
>     "name": "TestRecord",
>     "fields": [
>         {
>             "name": "MyData",
>             "type": {
>                 "type": "record",
>                 "name": "SubData",
>                 "fields": [
>                     {
>                         "name": "Version1",
>                         "type": "string"
>                     },
>                     {
>                         "name": "Version2",
>                         "type": "string"
>                     }
>                 ]
>             }
>         },
>         {
>             "name": "OtherData",
>             "type": "string"
>         }
>     ]
> }
> {code}
> Say node A knows only the first schema and node B knows the second schema, which has more fields.
> Any data generated by node B can be resolved against the first schema, because the additional field is marked as skipped.
> But data generated by node A cannot be resolved against the second schema; resolution throws the exception *"Don't know how to handle excess fields for reader."*
> This is because the data is resolved exactly according to the auto-generated codec_traits, which try to read the excess field.
> The problem is that we cannot simply skip the excess field in the record, since the data after the troublesome record still needs to be resolved; the reader needs a default value to fill the field in.
> This problem has blocked us for a very long time.
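For anyone reproducing the failure described above: on a build without default-value support, it can be triggered through the public resolving-decoder API alone. A minimal sketch, assuming the standard Avro C++ headers (avro::compileJsonSchemaFromString, avro::resolvingDecoder) and that schema resolution happens when the resolving decoder is built; the two schema strings are the ones quoted above, compacted:

{code:title=Sketch: reproducing the excess-fields exception}
#include <iostream>
#include <string>

#include <avro/Compiler.hh>
#include <avro/Decoder.hh>
#include <avro/Exception.hh>
#include <avro/ValidSchema.hh>

int main() {
    // The two schemas quoted above, in compact form.
    const std::string v1 = R"({"type":"record","name":"TestRecord","fields":[
        {"name":"MyData","type":{"type":"record","name":"SubData","fields":[
            {"name":"Version1","type":"string"}]}},
        {"name":"OtherData","type":"string"}]})";
    const std::string v2 = R"({"type":"record","name":"TestRecord","fields":[
        {"name":"MyData","type":{"type":"record","name":"SubData","fields":[
            {"name":"Version1","type":"string"},
            {"name":"Version2","type":"string"}]}},
        {"name":"OtherData","type":"string"}]})";

    avro::ValidSchema writerSchema = avro::compileJsonSchemaFromString(v1);
    avro::ValidSchema readerSchema = avro::compileJsonSchemaFromString(v2);

    try {
        // Resolving the short writer schema against the longer reader
        // schema is where the extra Version2 field, having no default,
        // becomes fatal.
        avro::DecoderPtr d = avro::resolvingDecoder(
            writerSchema, readerSchema, avro::binaryDecoder());
        (void)d;
        std::cout << "schemas resolved" << std::endl;
    } catch (const avro::Exception& e) {
        // Expected: Don't know how to handle excess fields for reader.
        std::cout << e.what() << std::endl;
    }
    return 0;
}
{code}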


