Posted to dev@avro.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2009/07/08 20:15:14 UTC

[jira] Commented: (AVRO-75) Clarify resolution for enums (and fix code)

    [ https://issues.apache.org/jira/browse/AVRO-75?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728834#action_12728834 ] 

Doug Cutting commented on AVRO-75:
----------------------------------

> This is the only place the word "unset" is used in the doc [ ... ]

That may be a bug.  We don't currently say what happens when a reader has a field that the writer didn't write and the reader's schema specifies no default value.  In this case we might allow some implementation flexibility.  An optimized implementation might have an int field that's simply set to zero in this case, with no means of distinguishing this from an actual zero value for the field, while another implementation might provide a means to tell whether the field was set or not.  I am not yet convinced that the spec should mandate a particular behavior here; it might rather define such fields as "unset", which may or may not be detectable, depending on the implementation.
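
To make the two possibilities concrete, here is a minimal sketch in plain Java.  It does not use Avro's classes; UnsetFieldSketch, CompactRecord and TrackingRecord are hypothetical names made up for illustration, not anything in the codebase.

```java
import java.util.OptionalInt;

// Hypothetical illustration only; none of these types exist in Avro.
final class UnsetFieldSketch {

    /** "Optimized" representation: a bare primitive field. */
    static final class CompactRecord {
        // Java initializes this to 0, so a field the writer never wrote
        // cannot be told apart from one that was written as an actual zero.
        int count;
    }

    /** Representation that tracks presence, so "unset" is detectable. */
    static final class TrackingRecord {
        private int count;
        private boolean countSet;

        void setCount(int value) {
            count = value;
            countSet = true;
        }

        /** Empty when the field was never set, otherwise the stored value. */
        OptionalInt getCount() {
            return countSet ? OptionalInt.of(count) : OptionalInt.empty();
        }
    }
}
```

In the first class "unset" is not detectable; in the second it is, at the cost of an extra flag per field.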

This gets back to the issue of optional/required fields.  I think the current intent is to treat all fields as optional: if a field must have a valid value, then one can specify a default value in the schema rather than require that all writers already have that field.

> I propose that we declare this case an error [ ... ]

"Unset" may make sense in this case too.  In Java's reflect and specific APIs, an Enum instance could be null.  This is somewhat analogous to a field that's been added to the writer but not yet to the reader.  A reader that requires a value might provide a default that would be used when either the writer does not provide a value or the writer provides an unknown value.
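
The options under discussion (signal an error, leave the value unset, or fall back to a reader default) could be sketched roughly as follows.  This is plain Java rather than Avro's reader code; EnumResolutionSketch, Policy and resolve are hypothetical names, and symbols are modeled as plain strings as an assumption for brevity.

```java
import java.util.List;

// Hypothetical illustration only; this is not Avro's resolution code.
final class EnumResolutionSketch {

    /** The three candidate behaviors for a symbol the reader does not know. */
    enum Policy { ERROR, UNSET, DEFAULT }

    /**
     * Resolves the writer's symbol against the reader's symbol list.
     * A null writerSymbol models "the writer did not provide a value".
     */
    static String resolve(String writerSymbol, List<String> readerSymbols,
                          Policy policy, String readerDefault) {
        if (writerSymbol != null && readerSymbols.contains(writerSymbol)) {
            return writerSymbol; // known symbol: nothing to resolve
        }
        switch (policy) {
            case ERROR:
                throw new IllegalArgumentException(
                    "Symbol not in reader's enum: " + writerSymbol);
            case DEFAULT:
                return readerDefault; // reader-supplied fallback
            case UNSET:
            default:
                return null; // "unset", represented here as null
        }
    }
}
```

For example, with reader symbols RED and GREEN, resolving BLUE yields null under UNSET, the reader's default under DEFAULT, and an exception under ERROR.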

That said, I don't have a strong feeling and would be willing to make this an error if others can explain why that should be preferred.

> GenericDatumReader should be updated to throw an error in this case.

Or, if we decide that "unset" is useful here, we could have it use null or the default in this case.  We'd then need to update ReflectDatumReader too, as it currently throws an exception in this case.  In either case, the code does not conform to the spec.


> Clarify resolution for enums (and fix code)
> -------------------------------------------
>
>                 Key: AVRO-75
>                 URL: https://issues.apache.org/jira/browse/AVRO-75
>             Project: Avro
>          Issue Type: Bug
>          Components: spec
>            Reporter: Raymie Stata
>            Assignee: Doug Cutting
>
> The current resolution rule for enums says: "if the writer's symbol is not present in the reader's enum, then the enum value is unset."  This is the only place the word "unset" is used in the doc, and it's not clear what you mean.  The code seems to be inconsistent: GenericDatumReader will happily return a symbol the reader doesn't understand; ReflectDatumReader will probably throw a class-not-found exception; ResolvingDecoder throws an error.
> I propose that we declare this case an error, i.e., rewrite the spec to "if the writer's symbol is not listed in the reader's enum, an error is signaled."  GenericDatumReader should be updated to throw an error in this case.
> If we decide to stick with the "unset" language, we need to define what "unset" means (and, if necessary, update ReflectDatumReader and ResolvingDecoder).
