Posted to issues@arrow.apache.org by "Yaser Alraddadi (Jira)" <ji...@apache.org> on 2022/09/29 13:05:00 UTC
[jira] [Created] (ARROW-17893) Wrong reading of timedelta
Yaser Alraddadi created ARROW-17893:
---------------------------------------
Summary: Wrong reading of timedelta
Key: ARROW-17893
URL: https://issues.apache.org/jira/browse/ARROW-17893
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 8.0.0
Reporter: Yaser Alraddadi
Attachments: check_timedelta.py
When a DataFrame has a timedelta column and also a column holding a list of dictionaries that themselves contain timedelta values, reading the top-level timedelta back from Feather format sometimes gives a wrong value.
In the example below, the printed results show that the top-level timedelta is sometimes read as {color:#00875a}0 days 03:40:23 (correct){color} and sometimes as {color:#de350b}153 days 01:03:20 (wrong){color}.
Here is the code; it is also attached as check_timedelta.py.
{code:java}
from datetime import datetime, timedelta

import pandas as pd
import pyarrow.feather as feather

time_1 = datetime.fromisoformat("2022-04-21T10:18:12+03:00")
time_2 = datetime.fromisoformat("2022-04-21T13:58:35+03:00")

# Nested list of dictionaries that also contain timedelta values
data = [
    {
        "waiting_time": timedelta(seconds=12, microseconds=1),
    },
    {
        "waiting_time": timedelta(seconds=1020),
    },
    {
        "waiting_time": timedelta(seconds=960),
    },
    {
        "waiting_time": timedelta(seconds=960),
    },
    {
        "waiting_time": timedelta(seconds=960),
    },
    {
        "waiting_time": timedelta(seconds=815, microseconds=1),
    },
]

# Single-row frame with two top-level timedelta columns and the nested list
df = pd.DataFrame(
    [
        {
            "time_1": time_1,
            "time_2": time_2,
            "data": data,
            "timedelta_1": time_2 - time_1,
            "timedelta_2": timedelta(hours=3, minutes=40, seconds=23),
        },
    ]
)

print("Correct timedelta_1: ", df["timedelta_1"].item())
print("Correct timedelta_2: ", df["timedelta_2"].item())

with open("records.feather.lz4", "wb") as f:
    feather.write_feather(df, f, compression="lz4")

# Read the same file back repeatedly; the values should be identical every time
for _ in range(10):
    with open("records.feather.lz4", "rb") as f:
        print("Reading timedelta_1: ", feather.read_feather(f)["timedelta_1"].item())
        print("Reading timedelta_2: ", feather.read_feather(f)["timedelta_2"].item())
{code}
Printed results:
{code:java}
Correct timedelta_1: 0 days 03:40:23
Correct timedelta_2: 0 days 03:40:23
Reading timedelta_1: 0 days 03:40:23
Reading timedelta_2: 0 days 03:40:23
Reading timedelta_1: 0 days 03:40:23
Reading timedelta_2: 0 days 03:40:23
Reading timedelta_1: 153 days 01:03:20
Reading timedelta_2: 153 days 01:03:20
Reading timedelta_1: 0 days 03:40:23
Reading timedelta_2: 0 days 03:40:23
Reading timedelta_1: 0 days 03:40:23
Reading timedelta_2: 0 days 03:40:23
Reading timedelta_1: 0 days 03:40:23
Reading timedelta_2: 153 days 01:03:20
Reading timedelta_1: 153 days 01:03:20
Reading timedelta_2: 0 days 03:40:23
Reading timedelta_1: 0 days 03:40:23
Reading timedelta_2: 153 days 01:03:20
Reading timedelta_1: 153 days 01:03:20
Reading timedelta_2: 153 days 01:03:20
Reading timedelta_1: 153 days 01:03:20
Reading timedelta_2: 153 days 01:03:20
{code}
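A quick arithmetic check on the two printed values may help narrow this down. The sketch below uses only the standard library and the values from the output above; the variable names are made up for illustration. It shows that the wrong reading is exactly 1000 times the correct one, which could point to a time-unit mismatch somewhere in the read path, though that is only a guess from the numbers and not a confirmed diagnosis.

```python
from datetime import timedelta

# Values copied from the printed results above
correct = timedelta(hours=3, minutes=40, seconds=23)
wrong = timedelta(days=153, hours=1, minutes=3, seconds=20)

# Express both durations in whole seconds
correct_seconds = int(correct.total_seconds())
wrong_seconds = int(wrong.total_seconds())

# Integer ratio and remainder between the wrong and the correct reading
ratio, remainder = divmod(wrong_seconds, correct_seconds)

print("correct seconds:", correct_seconds)
print("wrong seconds:  ", wrong_seconds)
print("ratio:", ratio, "remainder:", remainder)
```

A remainder of zero means the wrong value is an exact multiple of the correct one rather than unrelated garbage.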