Posted to user@avro.apache.org by Mark Hayes <ma...@greybird.com> on 2012/06/09 17:58:42 UTC

GenericDatumWriter.write and the Integer type

Hi, I have a question about the treatment of Integer types (defined as
'int' in the schema) when serializing with GenericDatumWriter.  The
behavior changed in the 1.6 code line.

The change was apparently to address this issue:
https://issues.apache.org/jira/browse/AVRO-249

Here is the diff:
http://svn.apache.org/viewvc/avro/trunk/lang/java/avro/src/main/java/org/apache/avro/generic/GenericDatumWriter.java?r1=1078917&r2=1178973&pathrev=1178973&diff_format=h

Instead of casting the datum to Integer in the old code:
case INT:     out.writeInt((Integer)datum);     break;

The new code casts to Number:
case INT:     out.writeInt(((Number)datum).intValue()); break;

If the datum is a Long, Float or Double, intValue() silently narrows the
value (dropping the fractional part or the high-order bits), which loses
information.  I would rather an exception be thrown, which is what happens
in the old code, so the user is aware that they've attempted to serialize
a value that can't be represented.
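
To illustrate the difference, here is a minimal JDK-only sketch. The class
and method names (IntWriteDemo, viaNumber, viaInteger) are hypothetical and
not part of Avro; they only mirror the two cast expressions quoted above.

```java
// Illustrative only: these names are hypothetical, not Avro API.
final class IntWriteDemo {

    // Mirrors the newer expression: ((Number) datum).intValue()
    static int viaNumber(Object datum) {
        return ((Number) datum).intValue();
    }

    // Mirrors the older expression: (Integer) datum
    static int viaInteger(Object datum) {
        return (Integer) datum;
    }

    public static void main(String[] args) {
        System.out.println(viaNumber(42));           // an Integer passes through unchanged
        System.out.println(viaNumber(3.99d));        // a Double loses its fractional part
        System.out.println(viaNumber(4294967297L));  // a Long keeps only its low 32 bits
        try {
            viaInteger(4294967297L);                 // a Long is not an Integer
        } catch (ClassCastException e) {
            System.out.println("rejected: " + e.getClass().getSimpleName());
        }
    }
}
```

With the Number cast, the Double 3.99 comes back as 3 and the Long
4294967297 (2^32 + 1) comes back as 1, with no error in either case. The
Integer cast rejects the same Long with a ClassCastException.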

I can override GenericDatumWriter.write to address this (essentially revert
to the old code behavior).
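
The check such an override would need can be sketched without Avro on the
classpath. StrictInt and requireInt are hypothetical names; the idea is
that the INT case of an overridden write method would pass
requireInt(datum) to out.writeInt. The exact signature of the method to
override is not shown in this thread, so it is not reproduced here.

```java
// Hypothetical helper, not part of Avro: restores the strict behavior of the
// old (Integer) cast while producing a more descriptive error message.
final class StrictInt {

    private StrictInt() {}

    // Returns the value only when datum is a java.lang.Integer;
    // any other type (or null) is rejected instead of being narrowed.
    static int requireInt(Object datum) {
        if (datum instanceof Integer) {
            return (Integer) datum;
        }
        String actual = (datum == null) ? "null" : datum.getClass().getName();
        throw new ClassCastException(
            "Schema type int requires java.lang.Integer, got " + actual);
    }
}
```

Throwing ClassCastException keeps the failure mode of the pre-1.6 code, so
existing callers that catch it would not need to change. One assumed
difference: null is also reported as a ClassCastException here, whereas
unboxing a null Integer in the old code would have raised a
NullPointerException.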

But is my reliance on the casting errors, to get cheap validation,
appropriate?  Or would the recommended approach be to use a
ValidatingEncoder instead?

Thanks,
--mark

Re: GenericDatumWriter.write and the Integer type

Posted by Doug Cutting <cu...@apache.org>.
On Sat, Jun 9, 2012 at 8:58 AM, Mark Hayes <ma...@greybird.com> wrote:
> I can override GenericDatumWriter.write to address this (essentially revert
> to the old code behavior).
>
> But is my reliance on the casting errors, to get cheap validation,
> appropriate?

Relying on a casting error for validation seems appropriate.  Sorry
this change wasn't good for you.  Your workaround seems entirely
reasonable to me.

Doug