Posted to dev@avro.apache.org by "Jeff Hammerbacher (JIRA)" <ji...@apache.org> on 2011/01/20 02:31:43 UTC

[jira] Created: (AVRO-739) Add Date/Time data types

Add Date/Time data types
------------------------

                 Key: AVRO-739
                 URL: https://issues.apache.org/jira/browse/AVRO-739
             Project: Avro
          Issue Type: New Feature
          Components: spec
            Reporter: Jeff Hammerbacher




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Commented] (AVRO-739) Add Date/Time data types

Posted by "Russell Jurney (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169988#comment-13169988 ] 

Russell Jurney commented on AVRO-739:
-------------------------------------

PIG-1314 may be relevant.  ISO8601 datetime format seemed convenient.
                

[jira] [Commented] (AVRO-739) Add Date/Time data types

Posted by "Kenneth Baltrinic (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179475#comment-13179475 ] 

Kenneth Baltrinic commented on AVRO-739:
----------------------------------------

I concur with C Fletcher that some consideration of timezones and daylight saving time is needed.  At the very minimum, the spec would need to require that, in the absence of an explicit timezone, all times are in UTC.
                

[jira] [Commented] (AVRO-739) Add Date/Time data types

Posted by "John A. De Goes (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474263#comment-13474263 ] 

John A. De Goes commented on AVRO-739:
--------------------------------------

Adopting UTC milliseconds as the date/time format is fundamentally wrong and will render the type useless for any serious application. ISO8601 is the standard format for date/time. It preserves the critical notion of timezone and daylight savings time, and of course lets you express time in UTC as well if that's what you want. The binary encoding is only slightly bulkier than UTC milliseconds.
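
A minimal sketch of the trade-off argued here, assuming Python's standard datetime module and a hypothetical helper named to_epoch_millis. Two ISO 8601 strings that carry different UTC offsets denote the same instant, so they collapse to the same millisecond count, and the offset cannot be recovered from the long alone.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_text):
    """Convert an ISO 8601 string with a UTC offset to milliseconds since 1970 UTC.

    Hypothetical helper for illustration; it is not part of any Avro API.
    """
    moment = datetime.fromisoformat(iso_text)
    if moment.tzinfo is None:
        raise ValueError("an explicit UTC offset is required")
    # Integer arithmetic on timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


utc_text = "2011-01-20T02:31:43+00:00"
pacific_text = "2011-01-19T18:31:43-08:00"

# Same instant, different offsets: the millisecond values are equal,
# so only the ISO 8601 text still says which offset the writer used.
same_instant = to_epoch_millis(utc_text) == to_epoch_millis(pacific_text)
```

Whether losing the offset matters depends on the application; the sketch only shows what each representation keeps.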
                

[jira] Commented: (AVRO-739) Add Date/Time data types

Posted by "Ron Bodkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986730#action_12986730 ] 

Ron Bodkin commented on AVRO-739:
---------------------------------

Sorry, I forgot to paste in Doug Cutting's design:
The way that I have imagined doing this is to specify a standard schema
for dates, then implementations can optionally map this to a native date
type.

The schema could be a record containing a long, e.g.:

{"type": "record", "name": "org.apache.avro.lib.Date", "fields": [
   {"name": "time", "type": "long"}
  ]
}

Java could read this into a java.util.Date, Python to a datetime, etc.
Such conventions could be added to the Avro specification.
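
A minimal sketch of how a Python implementation might map this record to a native type. The record name comes from the proposal above; the unit (milliseconds) and origin (1970-01-01 UTC) are assumptions, since the thread has not settled them, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Record name from the proposal; unit and origin below are assumptions.
DATE_SCHEMA_NAME = "org.apache.avro.lib.Date"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_record_to_datetime(record):
    """Map a decoded {"time": <long>} record to a timezone-aware datetime."""
    return EPOCH + timedelta(milliseconds=record["time"])


def datetime_to_date_record(value):
    """Map a timezone-aware datetime back to the {"time": <long>} record."""
    if value.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous; attach a timezone first")
    # Floor division by a one-millisecond timedelta yields an int.
    return {"time": (value - EPOCH) // timedelta(milliseconds=1)}


created = date_record_to_datetime({"time": 1295490703000})
round_trip = datetime_to_date_record(created)
```

A reader or writer would apply these functions only to fields whose schema name equals DATE_SCHEMA_NAME and pass every other record through unchanged.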

Does this sound like a reasonable approach?

And also this email thread -

On 01/18/2011 09:19 AM, Jeremy Custenborder wrote:
> I agree with storing it as a long. How would you handle this in code
> generation and serialization? Would you envision hooks during code
> generation that would generate a member that is the native date time
> for the language?

Yes.  Just as "bytes" is represented in Java by java.nio.ByteBuffer,
"org.apache.avro.lib.Date" could be represented by java.util.Date.

> Does the serializer handle a date object that is
> native to the language?

Yes, serializers and deserializers would need to implement this mapping.

Does this sound like a reasonable approach?

> I really like the idea of having a standard
> datetime as a supported type of avro. It's a problem that everyone has
> to solve on their own.




[jira] Commented: (AVRO-739) Add Date/Time data types

Posted by "Jeremy Custenborder (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990370#comment-12990370 ] 

Jeremy Custenborder commented on AVRO-739:
------------------------------------------

What were you thinking? A long with the number of milliseconds since 1980 UTC? If you need more precision than that, you are most likely going to make your own type. I really like the idea of getting something that can map to the native types in most of the languages. This would be a really cool feature.


[jira] Commented: (AVRO-739) Add Date/Time data types

Posted by "Ron Bodkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12986729#action_12986729 ] 

Ron Bodkin commented on AVRO-739:
---------------------------------

From the discussion on the users list, I agree that it'd be great to start with a simple timestamp, which gets serialized as a long. Let's start with a simple feature, and future enhancements can be tracked separately.

Doug proposed this design:


I noted that it would be nice to allow some flexibility in the implementation
classes for dates, e.g., letting Java users use Joda time classes as well
as java.util.Date.

Scott said:
Absolutely.  This is a per-language feature though, so it may not require
much of the spec.  For example, in Java it could simply be a configuration
parameter passed to the DatumReader/Writers.  It doesn't make a lot of
sense to store metadata on the data that says "this is a Joda object, not
java.util.Date" -- that is a user choice and not intrinsic to describing
the data.

My input: 
I agree this shouldn't be part of the serialized format. It would be nice to
have a clean way to specify the configuration/mappings used, one that also
allows specifying the mappings for more such org.apache.avro data types. It
should also be supported for the reflection and code generation approaches.

Scott also said:
There are other questions too -- what are the timestamp units
(milliseconds? configurable?), what is the origin (1970? 2010?
configurable?) -- these decisions affect the serialization size.
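
A rough sketch of why unit and origin affect size: Avro writes a long with zig-zag, variable-length encoding, so smaller magnitudes take fewer bytes. The function names below are hypothetical, and the timestamp is simply the date this issue was created.

```python
def zigzag(value):
    """Zig-zag map a signed 64-bit integer to an unsigned one."""
    return (value << 1) ^ (value >> 63)


def encoded_size(value):
    """Bytes needed for a long under zig-zag plus 7-bits-per-byte varint coding."""
    remaining = zigzag(value)
    size = 1
    while remaining >= 0x80:
        remaining >>= 7
        size += 1
    return size


# 2011-01-20 02:31:43 UTC, as seconds since 1970-01-01 UTC.
SECONDS_SINCE_1970 = 1295490703
# 2010-01-01 00:00:00 UTC, as seconds since 1970-01-01 UTC.
ORIGIN_2010 = 1262304000

sizes = {
    "seconds since 1970": encoded_size(SECONDS_SINCE_1970),
    "milliseconds since 1970": encoded_size(SECONDS_SINCE_1970 * 1000),
    "microseconds since 1970": encoded_size(SECONDS_SINCE_1970 * 1000000),
    "milliseconds since 2010": encoded_size((SECONDS_SINCE_1970 - ORIGIN_2010) * 1000),
}
```

For this particular timestamp the sizes come out as 5, 6, 8 and 6 bytes respectively, so moving the origin to 2010 saves nothing at millisecond precision here, while going from milliseconds to microseconds costs two bytes per value.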

My input:
I would like to see a format that allows storing data at the precision of popular libraries and languages (java.util.Date, Joda time, Python datetime, etc.). Having a long representing microseconds since Jan. 1 1970 seems like a good compromise for general purpose use. It supports higher precision libraries and still allows representing a few hundred thousand years of data. Some libraries do allow nanosecond resolution - but limiting to 270 years seems like a bigger limitation than microsecond precision.
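
The range figures above can be checked with a short calculation. This sketch assumes a signed 64-bit long and a mean Gregorian year of 365.2425 days; the function name is hypothetical.

```python
INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.2425 * 86400  # mean Gregorian year


def representable_years(ticks_per_second):
    """Years a signed 64-bit tick count can span on one side of the origin."""
    return INT64_MAX / ticks_per_second / SECONDS_PER_YEAR


microsecond_range = representable_years(10**6)  # microsecond ticks
nanosecond_range = representable_years(10**9)   # nanosecond ticks
```

Under these assumptions, microseconds reach roughly 292,000 years on either side of the origin and nanoseconds roughly 292 years, the same order of magnitude as the figures quoted in the comment.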





[jira] [Commented] (AVRO-739) Add Date/Time data types

Posted by "Colin Fletcher (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128000#comment-13128000 ] 

Colin Fletcher commented on AVRO-739:
-------------------------------------

The serialization of date/times must incorporate timezone. If it does not, then I will be unable to use it for the large-scale projects I am leading.  It doesn't matter to me if the format is custom in byte mode, but in JSON it must be JSON-compliant.
                