Posted to issues@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2020/10/07 12:57:00 UTC
[jira] [Created] (ARROW-10213) [C++] Temporal cast from timestamp to date rounds instead of extracting date component
David Li created ARROW-10213:
--------------------------------
Summary: [C++] Temporal cast from timestamp to date rounds instead of extracting date component
Key: ARROW-10213
URL: https://issues.apache.org/jira/browse/ARROW-10213
Project: Apache Arrow
Issue Type: Bug
Components: C++
Affects Versions: 1.0.1
Reporter: David Li
I'd expect this code to give 1950-01-01 twice (i.e. a timestamp -> date cast extracts the date component, ignoring the time component):
{code:python}
import datetime
import pyarrow as pa

arr = pa.array([
    datetime.datetime(1950, 1, 1, 0, 0, 0),
    datetime.datetime(1950, 1, 1, 12, 0, 0),
], type=pa.timestamp("ns"))

print(arr)
print(arr.cast(pa.date32(), safe=False))
{code}
However, it gives 1950-01-02 in the second case:
{noformat}
[
1950-01-01 00:00:00.000000000,
1950-01-01 12:00:00.000000000
]
[
1950-01-01,
1950-01-02
]
{noformat}
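The reason is that the temporal cast simply divides by the unit conversion factor, and integer division in C++ truncates towards 0 (note: Python's floor division rounds towards -Infinity, so it would give the right answer in this case!). For the pre-epoch (negative) timestamp at noon, this results in -7304 days instead of -7305.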
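To make the arithmetic concrete, here is a minimal pure-Python sketch of the two division behaviors. It does not call Arrow. The helpers `to_ns` and `trunc_div` and the constant `NS_PER_DAY` are hypothetical names introduced only for this illustration, and the sketch assumes the cast amounts to dividing nanoseconds since the epoch by the number of nanoseconds per day.

```python
import datetime

# Assumption: casting timestamp("ns") to date32 divides by nanoseconds per day.
NS_PER_DAY = 86_400 * 10**9
EPOCH = datetime.datetime(1970, 1, 1)


def to_ns(dt):
    """Nanoseconds since the Unix epoch for a naive datetime (hypothetical helper)."""
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1000


def trunc_div(a, b):
    """Integer division that truncates toward zero, as C and C++ do."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


noon = to_ns(datetime.datetime(1950, 1, 1, 12, 0, 0))

days_trunc = trunc_div(noon, NS_PER_DAY)  # truncation toward zero
days_floor = noon // NS_PER_DAY           # Python floor division

date_trunc = EPOCH.date() + datetime.timedelta(days=days_trunc)
date_floor = EPOCH.date() + datetime.timedelta(days=days_floor)

print(days_trunc, date_trunc)
print(days_floor, date_floor)
```

Truncation lands on the day after the one containing the timestamp, while floor division lands on the day that contains it. The two only differ for timestamps before the epoch that are not exactly at midnight.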
Depending on the intended semantics of a temporal cast, either it should be fixed to extract the date component, or the rounding behavior should be noted and a separate kernel should be implemented for extracting the date component.