Posted to dev@vxquery.apache.org by Eldon Carman <ec...@ucr.edu> on 2012/06/30 06:15:52 UTC

Date, DateTime and Time Data Storage Details Feedback

The pointables for Date, DateTime and Time can store and expose their
values in several different ways. Each of these items stores a
timezone attribute, if I am not mistaken. Since calculating the number
of days in a month can be complicated, I will suggest layouts that
keep both months and years separate from the rest of the components.
Basically you have two options for storing these values: lump them
into one or two values or keep each item separated.

Date (separated)
 - Year (4 digits YYYY) 2 Bytes
 - Month (2 digits MM) 1 Byte
 - Day (2 digits DD) 1 Byte
 - TimeZone Hours (2 digits HH) 1 Byte
 - TimeZone Minutes (2 digits MM) 1 Byte
 --- total 6 bytes

Date (combined)
 - Year Month (YYYY * 12 + MM) 4 Bytes
 - Day (2 digits DD) 1 Byte
 - TimeZone (2 digits HH * 60 + MM) 2 Bytes
 --- total 7 bytes
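
To make the combined Year Month field concrete, here is a minimal Java
sketch of packing and unpacking YYYY * 12 + MM. The class and method
names are hypothetical, and I am assuming months are 1-based (1..12).
With that assumption a plain division by 12 would push December into
the next year, so the decode subtracts 1 first.

```java
public class YearMonthCodec {
    /** Packs a year and a 1-based month (1..12) as YYYY * 12 + MM. */
    public static int encode(int year, int month) {
        if (month < 1 || month > 12) {
            throw new IllegalArgumentException("month out of range: " + month);
        }
        return year * 12 + month;
    }

    /** Recovers the year; floorDiv keeps negative years correct. */
    public static int year(int packed) {
        return Math.floorDiv(packed - 1, 12);
    }

    /** Recovers the 1-based month. */
    public static int month(int packed) {
        return Math.floorMod(packed - 1, 12) + 1;
    }
}
```

For example, June 2012 packs to 2012 * 12 + 6 = 24150, and December
2012 packs to 24156, which still decodes to year 2012.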

DateTime (separated)
 - Year (4 digits YYYY) 2 Bytes
 - Month (2 digits MM) 1 Byte
 - Day (2 digits DD) 1 Byte
 - Hours (2 digits HH) 1 Byte
 - Minutes (2 digits MM) 1 Byte
 - Seconds (5 digits SS.SSS) 4 Bytes
 - TimeZone Hours (2 digits HH) 1 Byte
 - TimeZone Minutes (2 digits MM) 1 Byte
 --- total 12 bytes

DateTime (combined)
 - Year Month (YYYY * 12 + MM) 4 Bytes
 - Day Hours Minutes Seconds (DD * 86,400 + HH * 3,600 + MM * 60 + SS.SSS) 8 Bytes
 - TimeZone (2 digits HH * 60 + MM) 2 Bytes
 --- total 14 bytes
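
As a rough illustration of the combined Day Hours Minutes Seconds
field, the sketch below packs the value as whole milliseconds in a
long. Storing milliseconds (rather than a floating point seconds
value) is my assumption, and the names are hypothetical. At that
precision 31 days come to 2,678,400,000 ms, which is past the signed
32-bit limit of 2,147,483,647, so a Java int would not be enough.

```java
public class DayTimeCodec {
    private static final long MILLIS_PER_SECOND = 1_000L;
    private static final long MILLIS_PER_MINUTE = 60L * MILLIS_PER_SECOND;
    private static final long MILLIS_PER_HOUR = 60L * MILLIS_PER_MINUTE;
    private static final long MILLIS_PER_DAY = 24L * MILLIS_PER_HOUR;

    /** Packs DD * 86,400 + HH * 3,600 + MM * 60 + SS.SSS, scaled to milliseconds. */
    public static long encode(int day, int hours, int minutes, int seconds, int millis) {
        return day * MILLIS_PER_DAY
                + hours * MILLIS_PER_HOUR
                + minutes * MILLIS_PER_MINUTE
                + seconds * MILLIS_PER_SECOND
                + millis;
    }

    public static int day(long packed) {
        return (int) (packed / MILLIS_PER_DAY);
    }

    public static int hours(long packed) {
        return (int) (packed % MILLIS_PER_DAY / MILLIS_PER_HOUR);
    }

    public static int minutes(long packed) {
        return (int) (packed % MILLIS_PER_HOUR / MILLIS_PER_MINUTE);
    }

    public static int seconds(long packed) {
        return (int) (packed % MILLIS_PER_MINUTE / MILLIS_PER_SECOND);
    }

    public static int millis(long packed) {
        return (int) (packed % MILLIS_PER_SECOND);
    }
}
```

Extracting any one component costs a modulo and a division here, which
is the computation the separated layout avoids.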

Time (separated)
 - Hours (2 digits HH) 1 Byte
 - Minutes (2 digits MM) 1 Byte
 - Seconds (5 digits SS.SSS) 4 Bytes
 - TimeZone Hours (2 digits HH) 1 Byte
 - TimeZone Minutes (2 digits MM) 1 Byte
 --- total 7 bytes

Time (combined)
 - Hours Minutes Seconds (HH * 3,600 + MM * 60 + SS.SSS) 4 Bytes
 - TimeZone (2 digits HH * 60 + MM) 2 Bytes
 --- total 6 bytes

I think either approach is good. Storing each value individually
saves some space for Date and DateTime (6 vs. 7 and 12 vs. 14 bytes),
though not for Time (7 vs. 6 bytes). This may be different if we did
not stick to powers of two for the storage sizes. The timezone could
be separated, but I think it would be easy to keep that as a single
storage field, and in most cases you will want the value in minutes
anyway.
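
A minimal sketch of the single timezone field, assuming the offset is
kept as a signed number of minutes in a big-endian 2-byte value. The
sign handling and the names are my assumptions; the layouts above only
give HH * 60 + MM.

```java
public class TimezoneCodec {
    /** Combines a sign (+1 or -1) with HH and MM into total minutes. */
    public static short toMinutes(int sign, int hours, int minutes) {
        return (short) (sign * (hours * 60 + minutes));
    }

    /** Writes the offset as two big-endian bytes starting at the given position. */
    public static void write(byte[] bytes, int offset, short tzMinutes) {
        bytes[offset] = (byte) (tzMinutes >> 8);
        bytes[offset + 1] = (byte) tzMinutes;
    }

    /** Reads the two bytes back as a signed number of minutes. */
    public static short read(byte[] bytes, int offset) {
        return (short) (((bytes[offset] & 0xFF) << 8) | (bytes[offset + 1] & 0xFF));
    }
}
```

With this, -05:30 is stored as -330, and adjusting a value to UTC is a
single subtraction of minutes rather than separate hour and minute
arithmetic.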

As for accessing the information, I think accessors that return both
the combined values and the individual values would benefit the
functions that use these pointables, since each function could read
whichever form needs the least computation. We could even offer two
ways to save the information, combined and separated.

I wanted to get some feedback before I went any further with
implementing these pointables. I think my approach would be to create
an interface that is independent of how the data is saved, so you
could access it via either method.
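
Roughly what I have in mind, as a sketch only. The interface and class
names are hypothetical and this is not the existing pointable API; it
just reads the two Date layouts above from a byte array. I am assuming
big-endian fields, 1-based months, and signed timezone values.

```java
import java.nio.ByteBuffer;

public class DateViews {
    /** Storage-independent accessors offering both individual and combined values. */
    public interface DateView {
        int getYear();
        int getMonth();
        int getDay();
        int getYearMonth();        // YYYY * 12 + MM
        int getTimezoneMinutes();  // HH * 60 + MM
    }

    /** 6-byte separated layout: year(2) month(1) day(1) tzHours(1) tzMinutes(1). */
    public static final class SeparatedDate implements DateView {
        private final byte[] b;
        private final int off;

        public SeparatedDate(byte[] bytes, int offset) {
            this.b = bytes;
            this.off = offset;
        }

        public int getYear() { return ByteBuffer.wrap(b, off, 2).getShort(); }
        public int getMonth() { return b[off + 2]; }
        public int getDay() { return b[off + 3]; }
        public int getYearMonth() { return getYear() * 12 + getMonth(); }
        public int getTimezoneMinutes() { return b[off + 4] * 60 + b[off + 5]; }
    }

    /** 7-byte combined layout: yearMonth(4) day(1) timezoneMinutes(2). */
    public static final class CombinedDate implements DateView {
        private final byte[] b;
        private final int off;

        public CombinedDate(byte[] bytes, int offset) {
            this.b = bytes;
            this.off = offset;
        }

        public int getYearMonth() { return ByteBuffer.wrap(b, off, 4).getInt(); }
        public int getYear() { return Math.floorDiv(getYearMonth() - 1, 12); }
        public int getMonth() { return Math.floorMod(getYearMonth() - 1, 12) + 1; }
        public int getDay() { return b[off + 4]; }
        public int getTimezoneMinutes() { return ByteBuffer.wrap(b, off + 5, 2).getShort(); }
    }
}
```

A function written against DateView would not need to know which
layout is underneath; only the cost of each accessor differs.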

Re: Date, DateTime and Time Data Storage Details Feedback

Posted by Cezar Andrei <ce...@gmail.com>.
Preston,

For date related operations you can get inspiration or copy code from
Apache XMLBeans' GDate and GDateBuilder:
http://svn.apache.org/viewvc/xmlbeans/trunk/src/xmlpublic/org/apache/xmlbeans/GDate.java?view=markup
http://svn.apache.org/viewvc/xmlbeans/trunk/src/xmlpublic/org/apache/xmlbeans/GDateBuilder.java?view=markup

Cezar

On Sun, Jul 1, 2012 at 11:49 PM, Till Westmann <ti...@westmann.org> wrote:

> Preston,
>
> thanks for going ahead with one of the choices - we were a little slow to
> respond.
> I also agree with your choice. Storing the components allows us to
> implement XQuery's comparison semantics and it allows for efficient
> extraction of the components.
>
> Cheers,
> Till

Re: Date, DateTime and Time Data Storage Details Feedback

Posted by Till Westmann <ti...@westmann.org>.
Preston,

thanks for going ahead with one of the choices - we were a little slow to respond. 
I also agree with your choice. Storing the components allows us to implement XQuery's comparison semantics and it allows for efficient extraction of the components.

Cheers,
Till
