Posted to user@avro.apache.org by Andrew Pennebaker <ap...@42six.com> on 2013/08/23 16:48:06 UTC

Avro data type for datetimes?

Could there be an Avro data type for date times? That would greatly
increase data compatibility between systems, reducing errors from string
printing and parsing, and relaxing the need for producers and consumers to
agree on specific formatters.

Re: Avro data type for datetimes?

Posted by Doug Cutting <cu...@apache.org>.
On Fri, Aug 23, 2013 at 9:39 AM, Doug Cutting <cu...@apache.org> wrote:
> Since release 1.7.4, Avro Java will serialize and deserialize
> instances of java.util.Date using the following schema:
>
>   {"type":"string", "java-class":"java.util.Date"}

Oops.  I forgot that this was reverted in AVRO-1155.

  https://issues.apache.org/jira/browse/AVRO-1155

So I think we're better off specifying a standard schema for
datetimes.  The existing Jira for this is:

  https://issues.apache.org/jira/browse/AVRO-739

I'll try to create a patch for this soon.

Doug

Re: Avro data type for datetimes?

Posted by Doug Cutting <cu...@apache.org>.
Since release 1.7.4, Avro Java will serialize and deserialize
instances of java.util.Date using the following schema:

  {"type":"string", "java-class":"java.util.Date"}

This might thus be used as a standard Date schema, with other
implementations also treating this schema specially.

Or one can devise a different standard, common schema for datetimes,
perhaps something like:

  {"type":"record", "name":"org.apache.avro.Datetime",
"fields":[{"name":"millisSinceEpoch", "type":"long"}]}
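
As a rough illustration of how an application might map between
java.util.Date and the single "millisSinceEpoch" long field of the
record above, here is a minimal sketch. The class DatetimeSketch and
its method names are hypothetical and are not part of Avro; the sketch
uses only java.util.Date and assumes the long holds milliseconds since
1970-01-01T00:00:00Z, as the field name suggests.

```java
import java.util.Date;

// Hypothetical helper, not part of Avro: converts between java.util.Date
// and the value of the "millisSinceEpoch" field in the proposed record.
final class DatetimeSketch {
    private DatetimeSketch() {}

    // Date.getTime() returns milliseconds since the Unix epoch (UTC).
    static long toMillisSinceEpoch(Date date) {
        return date.getTime();
    }

    // Rebuilds a Date from the long stored in the record field.
    static Date fromMillisSinceEpoch(long millisSinceEpoch) {
        return new Date(millisSinceEpoch);
    }

    public static void main(String[] args) {
        Date now = new Date();
        long millis = toMillisSinceEpoch(now);
        Date roundTripped = fromMillisSinceEpoch(millis);
        // The round trip involves no string formatting or parsing.
        System.out.println(millis + " round-trips: " + roundTripped.equals(now));
    }
}
```

Because the value travels as a long, producers and consumers do not
need to agree on a date format string.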

Schema parsers might then be configured to pre-define this, so that
folks can simply refer to it with "type":"org.apache.avro.Datetime".
(For example, AVRO-1188 permits one to specify directories containing
schemas to include when Maven compiles schemas, protocols and IDL.)
For backward compatibility, however, a schema parser must not be required
to include it.  When a schema is printed, e.g., in a data file or in
an RPC handshake, the full schema will still be printed, so that
parsers that do not have this pre-defined can still process the data.
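
The idea can be sketched with a toy registry, shown here only to make
the mechanism concrete. PredefinedSchemaSketch and its methods are
hypothetical; this is not Avro's parser API and it does no real JSON
parsing. It only models the two behaviours described above: a name may
be pre-defined so that schemas can refer to it, and the full definition
is what gets printed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model, not Avro's API: a table of pre-defined named schemas.
final class PredefinedSchemaSketch {
    static final String DATETIME_NAME = "org.apache.avro.Datetime";
    static final String DATETIME_SCHEMA =
        "{\"type\":\"record\", \"name\":\"org.apache.avro.Datetime\", "
        + "\"fields\":[{\"name\":\"millisSinceEpoch\", \"type\":\"long\"}]}";

    // Full name -> full schema JSON, in the order the names were added.
    private final Map<String, String> predefined = new LinkedHashMap<>();

    // Optionally pre-define a named schema; a parser need not do this.
    void predefine(String fullName, String schemaJson) {
        predefined.put(fullName, schemaJson);
    }

    // True if a reference such as "type":"org.apache.avro.Datetime" resolves.
    boolean canResolve(String typeName) {
        return predefined.containsKey(typeName);
    }

    // What would be written to a data file or handshake: the full
    // definition, never just the bare name.
    String printFull(String typeName) {
        String definition = predefined.get(typeName);
        if (definition == null) {
            throw new IllegalArgumentException("Undefined name: " + typeName);
        }
        return definition;
    }

    public static void main(String[] args) {
        PredefinedSchemaSketch writer = new PredefinedSchemaSketch();
        writer.predefine(DATETIME_NAME, DATETIME_SCHEMA);
        System.out.println(writer.printFull(DATETIME_NAME));
    }
}
```

A reader that never called predefine would still receive the complete
record definition in the printed schema, so it would not need to know
the name in advance.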

Doug

On Fri, Aug 23, 2013 at 7:48 AM, Andrew Pennebaker
<ap...@42six.com> wrote:
> Could there be an Avro data type for date times? That would greatly increase
> data compatibility between systems, reducing errors from string printing and
> parsing, and relaxing the need for producers and consumers to agree on
> specific formatters.