Posted to github@arrow.apache.org by "WillAyd (via GitHub)" <gi...@apache.org> on 2023/06/27 20:25:50 UTC

[GitHub] [arrow-nanoarrow] WillAyd opened a new issue, #251: Timestamp Handling Guidance

WillAyd opened a new issue, #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251

   Are there any general guidelines for dealing with timestamps? Is the idea for users to get the raw int64 value from the underlying Arrow array, refer to the `time_unit` and `timezone` from the schema, and handle any conversions on their end? Or is nanoarrow planning to offer some type of abstraction?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-nanoarrow] WillAyd commented on issue #251: Timestamp Handling Guidance

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1615077928

   Think this can be closed for now - thanks for the guidance!




[GitHub] [arrow-nanoarrow] WillAyd commented on issue #251: Timestamp Handling Guidance

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611561895

   Makes sense. My current use case is in the ADBC driver for Postgres, where Postgres stores timestamps as an integer count of microseconds since a 2000-01-01 epoch. Will mess around with that in the driver directly and come back to this if there's anything that feels like it makes sense upstream.




[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #251: Timestamp Handling Guidance

Posted by "paleolimbot (via GitHub)" <gi...@apache.org>.
paleolimbot commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611582955

   Maybe not for write support? https://github.com/apache/arrow-adbc/blob/main/c/driver/postgresql/statement.cc#L189-L218




[GitHub] [arrow-nanoarrow] jorisvandenbossche commented on issue #251: Timestamp Handling Guidance

Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611044788

   This might depend on the implementation? I would assume that in C the idea is indeed to let users deal with int64 values combined with the unit/timezone (is there an alternative?), while e.g. the R or Python bindings could actually convert to native timestamp objects.
   
   (although I assume your question is about the C code?)




[GitHub] [arrow-nanoarrow] WillAyd commented on issue #251: Timestamp Handling Guidance

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611692588

   Yea this is concerned with the write. My initial thought was that if we could get the Arrow timestamps into a broken-down time, we could rather easily do the translation to the 2000-01-01 epoch using that struct, and that _maybe_ nanoarrow wants to provide the utilities to do that.




[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #251: Timestamp Handling Guidance

Posted by "paleolimbot (via GitHub)" <gi...@apache.org>.
paleolimbot commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611730360

   I think in that bit of code `ArrowTimeUnitGetMultiplier(bind_schema_fields[i].time_unit)` with the existing switch would be sufficient (with the complication that `ArrowTimeUnitGetMultiplier()` doesn't exist yet 🙂 ).




[GitHub] [arrow-nanoarrow] lidavidm commented on issue #251: Timestamp Handling Guidance

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611705363

   "broken down time" being individual Y/M/D/etc components or?
   
   Wouldn't you just shift to microseconds, then subtract the epoch? No need to parse the timestamp further (but you would need to handle overflow/truncation)
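
A minimal sketch of the "shift to microseconds, then subtract the epoch" idea, including the overflow and truncation handling mentioned above. Everything named `Sketch...` is hypothetical and not part of nanoarrow or the ADBC driver; a local stand-in enum is used in place of nanoarrow's `enum ArrowTimeUnit` so the snippet compiles on its own. It assumes the input is an Arrow timestamp counted from the Unix epoch and the target is microseconds since 2000-01-01.

```c
#include <stdint.h>

/* Hypothetical stand-in for nanoarrow's enum ArrowTimeUnit. */
enum SketchTimeUnit {
  SKETCH_UNIT_SECOND,
  SKETCH_UNIT_MILLI,
  SKETCH_UNIT_MICRO,
  SKETCH_UNIT_NANO
};

/* Microseconds between 1970-01-01 and 2000-01-01 (10957 days). */
static const int64_t kSketchPostgresEpochMicros = INT64_C(946684800000000);

/* Converts a Unix-epoch timestamp in the given unit to microseconds since
 * 2000-01-01. Returns 0 on success, -1 on overflow or an unknown unit. */
static int SketchToPostgresMicros(int64_t value, enum SketchTimeUnit unit,
                                  int64_t* out) {
  int64_t scale = 1;
  switch (unit) {
    case SKETCH_UNIT_SECOND:
      scale = 1000000;
      break;
    case SKETCH_UNIT_MILLI:
      scale = 1000;
      break;
    case SKETCH_UNIT_MICRO:
      break;
    case SKETCH_UNIT_NANO:
      /* Integer division truncates toward zero, so sub-microsecond
       * detail is dropped here. */
      value /= 1000;
      break;
    default:
      return -1;
  }

  /* Reject values whose scaled form would not fit in an int64_t. */
  if (value > INT64_MAX / scale || value < INT64_MIN / scale) return -1;
  value *= scale;

  /* Subtracting a positive offset can only underflow at the low end. */
  if (value < INT64_MIN + kSketchPostgresEpochMicros) return -1;
  *out = value - kSketchPostgresEpochMicros;
  return 0;
}
```

For example, a seconds value of 946684800 (2000-01-01 in Unix time) maps to 0, and a caller would be expected to surface the -1 return as an error rather than write a wrapped value.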




[GitHub] [arrow-nanoarrow] WillAyd closed issue #251: Timestamp Handling Guidance

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd closed issue #251: Timestamp Handling Guidance
URL: https://github.com/apache/arrow-nanoarrow/issues/251




[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #251: Timestamp Handling Guidance

Posted by "paleolimbot (via GitHub)" <gi...@apache.org>.
paleolimbot commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611418380

   Joris is right! You may have noticed that the `ArrowArrayView` only has a `storage_type` enum member: there's no place for extra schema information. As far as the `ArrowArrayView` is concerned, it's looking at an array of 64-bit integers and it's up to you to interpret the values using the `ArrowSchemaView::time_unit` and `ArrowSchemaView::timezone` members.
   
   None of that is set in stone if handling timestamps turns out to be awful, but that's the current intention. FWIW, the only place I've used timestamps explicitly is in the R package. There I don't do anything in C...I reinterpret the array as an int64 at the R level and use R to do the unit math.
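
To illustrate what "interpret the values using the time unit" can look like on the caller's side, here is a rough sketch that splits a raw int64 into whole seconds and a nanosecond remainder. The enum, struct, and function names are all hypothetical stand-ins (the enum plays the role of the `time_unit` read from the schema), and the timezone is deliberately ignored.

```c
#include <stdint.h>

/* Hypothetical stand-in for the time unit reported by the schema. */
enum SketchTimeUnit {
  SKETCH_UNIT_SECOND,
  SKETCH_UNIT_MILLI,
  SKETCH_UNIT_MICRO,
  SKETCH_UNIT_NANO
};

/* Seconds since the Unix epoch plus a non-negative nanosecond part. */
struct SketchInstant {
  int64_t seconds;
  int32_t nanos;
};

/* Interprets a raw storage value according to its unit. Values before the
 * epoch are floored, so nanos always lands in [0, 999999999]. */
static struct SketchInstant SketchInterpret(int64_t raw,
                                            enum SketchTimeUnit unit) {
  int64_t per_second = 1;
  switch (unit) {
    case SKETCH_UNIT_SECOND:
      per_second = 1;
      break;
    case SKETCH_UNIT_MILLI:
      per_second = 1000;
      break;
    case SKETCH_UNIT_MICRO:
      per_second = 1000000;
      break;
    case SKETCH_UNIT_NANO:
      per_second = 1000000000;
      break;
  }

  int64_t seconds = raw / per_second;
  int64_t rem = raw % per_second;
  if (rem < 0) {
    /* C division truncates toward zero; adjust to floor semantics. */
    rem += per_second;
    seconds -= 1;
  }

  struct SketchInstant result;
  result.seconds = seconds;
  result.nanos = (int32_t)(rem * (1000000000 / per_second));
  return result;
}
```

A value of -1 with millisecond units, for instance, comes back as -1 seconds plus 999000000 nanoseconds rather than a negative fractional part.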




[GitHub] [arrow-nanoarrow] WillAyd commented on issue #251: Timestamp Handling Guidance

Posted by "WillAyd (via GitHub)" <gi...@apache.org>.
WillAyd commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611716152

   Yea exactly. I think broken-down time is a POSIX term but maybe not:
   
   https://www.gnu.org/software/libc/manual/html_node/Broken_002ddown-Time.html
   
   I think you are right about the shift / epoch subtract being more straightforward though




[GitHub] [arrow-nanoarrow] lidavidm commented on issue #251: Timestamp Handling Guidance

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611728552

   Ah I see. That may come in useful eventually but I think there's no need to do that here.




[GitHub] [arrow-nanoarrow] lidavidm commented on issue #251: Timestamp Handling Guidance

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611576396

   I thought we added support for PostgreSQL timestamps in the driver already? https://github.com/apache/arrow-adbc/blob/1465aba08064509e62c1eef88f6add1b024147e3/c/driver/postgresql/postgres_copy_reader.h#L656-L668




[GitHub] [arrow-nanoarrow] paleolimbot commented on issue #251: Timestamp Handling Guidance

Posted by "paleolimbot (via GitHub)" <gi...@apache.org>.
paleolimbot commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611574885

   I'm sure we can do something to help...even something as simple as `int32_t ArrowTimeUnitGetMultiplier(enum ArrowTimeUnit unit1)` might reduce how verbose that solution has to be.
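
As a rough illustration of what such a helper might look like (the thread notes elsewhere that it did not exist at the time), here is a sketch under assumptions. The name `SketchTimeUnitPerSecond`, the local enum, and the choice to return units per second as an `int64_t` are all hypothetical; the signature floated above returns `int32_t`, which would also fit the largest value of 1000000000.

```c
#include <stdint.h>

/* Hypothetical stand-in for nanoarrow's enum ArrowTimeUnit. */
enum SketchTimeUnit {
  SKETCH_UNIT_SECOND,
  SKETCH_UNIT_MILLI,
  SKETCH_UNIT_MICRO,
  SKETCH_UNIT_NANO
};

/* Returns how many of `unit` make up one second, or 0 for an
 * unrecognized unit so callers can treat it as an error. */
static int64_t SketchTimeUnitPerSecond(enum SketchTimeUnit unit) {
  switch (unit) {
    case SKETCH_UNIT_SECOND:
      return 1;
    case SKETCH_UNIT_MILLI:
      return 1000;
    case SKETCH_UNIT_MICRO:
      return 1000000;
    case SKETCH_UNIT_NANO:
      return 1000000000;
    default:
      return 0;
  }
}
```

With a helper of this shape, converting between two units reduces to comparing the two multipliers and either multiplying (with an overflow check) or dividing by their ratio, instead of writing a nested switch per unit pair.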




[GitHub] [arrow-nanoarrow] lidavidm commented on issue #251: Timestamp Handling Guidance

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #251:
URL: https://github.com/apache/arrow-nanoarrow/issues/251#issuecomment-1611584956

   Ah, right.

