Posted to commits@arrow.apache.org by tu...@apache.org on 2024/01/22 11:21:27 UTC
(arrow-rs) branch master updated: Enhance Date64 type documentation (#5323)
This is an automated email from the ASF dual-hosted git repository.
tustvold pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git
The following commit(s) were added to refs/heads/master by this push:
new b594d9063a Enhance Date64 type documentation (#5323)
b594d9063a is described below
commit b594d9063a55c503ae67cec2809fe3d2fa472bfa
Author: Jeffrey Vo <je...@gmail.com>
AuthorDate: Mon Jan 22 22:21:21 2024 +1100
Enhance Date64 type documentation (#5323)
* Enhance Date64 type documentation
* Update arrow-schema/src/datatype.rs
Co-authored-by: Raphael Taylor-Davies <17...@users.noreply.github.com>
* Update arrow-schema/src/datatype.rs
Co-authored-by: Raphael Taylor-Davies <17...@users.noreply.github.com>
* Update arrow-schema/src/datatype.rs
Co-authored-by: Raphael Taylor-Davies <17...@users.noreply.github.com>
---------
Co-authored-by: Raphael Taylor-Davies <17...@users.noreply.github.com>
---
arrow-schema/src/datatype.rs | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
diff --git a/arrow-schema/src/datatype.rs b/arrow-schema/src/datatype.rs
index 6276a99a47..a5bd66b50c 100644
--- a/arrow-schema/src/datatype.rs
+++ b/arrow-schema/src/datatype.rs
@@ -145,10 +145,31 @@ pub enum DataType {
/// ```
Timestamp(TimeUnit, Option<Arc<str>>),
/// A signed 32-bit date representing the elapsed time since UNIX epoch (1970-01-01)
- /// in days (32 bits).
+ /// in days.
Date32,
/// A signed 64-bit date representing the elapsed time since UNIX epoch (1970-01-01)
- /// in milliseconds (64 bits). Values are evenly divisible by 86400000.
+ /// in milliseconds.
+ ///
+ /// According to the specification (see [Schema.fbs]), this should be treated as the number of
+ /// days, in milliseconds, since the UNIX epoch. Therefore, values must be evenly divisible by
+ /// `86_400_000` (the number of milliseconds in a standard day).
+ ///
+ /// This requirement exists for compatibility with other languages' native libraries,
+ /// such as Java, which historically lacked a dedicated date type
+ /// and only supported timestamps.
+ ///
+ /// In practice, this library does not validate that values of this type are evenly divisible by
+ /// `86_400_000`, for performance and usability reasons. Date64 values are treated similarly to
+ /// `Timestamp(TimeUnit::Millisecond, None)` values: they are printed showing the time of
+ /// day if the value does not represent an exact day, and arithmetic can be done at millisecond
+ /// granularity to change the time represented.
+ ///
+ /// Users should prefer using Date32 to cleanly represent the number of days, or one of the Timestamp
+ /// variants to include time as part of the representation, depending on their use case.
+ ///
+ /// For more details, see [#5288](https://github.com/apache/arrow-rs/issues/5288).
+ ///
+ /// [Schema.fbs]: https://github.com/apache/arrow/blob/main/format/Schema.fbs
Date64,
/// A signed 32-bit time representing the elapsed time since midnight in the unit of `TimeUnit`.
/// Must be either seconds or milliseconds.
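
The divisibility rule described in the new Date64 documentation can be illustrated with a minimal sketch in plain Rust. It uses only the standard library and raw `i64`/`i32` values rather than any arrow-rs API; the names `MILLISECONDS_IN_DAY`, `is_whole_day`, `date32_to_date64`, and `split_date64` are hypothetical and chosen for illustration only. Because the library does not enforce the rule, a caller that cares about spec conformance would need to perform a check along these lines itself.

```rust
/// Number of milliseconds in a standard day, the divisor named in the
/// Date64 documentation. (Hypothetical constant name.)
const MILLISECONDS_IN_DAY: i64 = 86_400_000;

/// Returns true if a raw Date64 value (milliseconds since the UNIX epoch)
/// lies exactly on a day boundary, as the specification expects.
fn is_whole_day(date64_ms: i64) -> bool {
    date64_ms % MILLISECONDS_IN_DAY == 0
}

/// Converts a raw Date32 value (days since the UNIX epoch) into a Date64
/// value that is evenly divisible by `MILLISECONDS_IN_DAY`.
fn date32_to_date64(days: i32) -> i64 {
    i64::from(days) * MILLISECONDS_IN_DAY
}

/// Splits a raw Date64 value into (whole days since epoch, milliseconds into
/// that day). Euclidean division keeps the remainder non-negative for
/// values before the epoch.
fn split_date64(date64_ms: i64) -> (i64, i64) {
    (
        date64_ms.div_euclid(MILLISECONDS_IN_DAY),
        date64_ms.rem_euclid(MILLISECONDS_IN_DAY),
    )
}
```

A non-zero second component from `split_date64` indicates a value that carries a time of day, which is the case where Date32 or a Timestamp variant would likely be the clearer choice.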