Posted to jira@arrow.apache.org by "abdel alfahham (Jira)" <ji...@apache.org> on 2021/03/30 13:27:00 UTC
[jira] [Created] (ARROW-12150) [Python] Invalid data when Decimal is exported to parquet
abdel alfahham created ARROW-12150:
--------------------------------------
Summary: [Python] Invalid data when Decimal is exported to parquet
Key: ARROW-12150
URL: https://issues.apache.org/jira/browse/ARROW-12150
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 3.0.0
Environment: - macOS Big Sur 11.2.1
- python 3.8.2
Reporter: abdel alfahham
Exporting a pyarrow table that contains mixed-precision Decimals with parquet.write_table produces a parquet file that contains invalid values.
In the example below, the first value of data_decimal changes from Decimal('579.11999511718795474735088646411895751953125000000000') in the pyarrow table to Decimal('-378.68971792399258172661600550482428224218070136475136') after it is written to parquet and read back.
import pyarrow
import pyarrow.parquet # import the submodule explicitly so pyarrow.parquet resolves
from decimal import Decimal

values_floats = [579.119995117188, 6.40999984741211, 2.0] # floats
decs_from_values = [Decimal(v) for v in values_floats] # Decimal built directly from the float
decs_from_float = [Decimal.from_float(v) for v in values_floats] # Decimal using from_float
decs_str = [Decimal(str(v)) for v in values_floats] # Decimal built from the float's string form

data_dict = {"data_decimal": decs_from_values, # python Decimal
             "data_decimal_from_float": decs_from_float, # python Decimal using from_float
             "data_float": values_floats, # python floats
             "data_dec_str": decs_str} # python Decimal from str

table = pyarrow.table(data=data_dict)
print(table.to_pydict()) # before saving
pyarrow.parquet.write_table(table, "./pyarrow_decimal.parquet") # saving
print(pyarrow.parquet.read_table("./pyarrow_decimal.parquet").to_pydict()) # after saving
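
To illustrate what "mixed-precision" means for the values in the example above, here is a minimal sketch that uses only the standard library. The helper precision_and_scale is a hypothetical name introduced for illustration and is not part of pyarrow; it assumes finite Decimals. It shows that Decimals built directly from floats carry very different numbers of digits, which is the situation the report describes.

```python
from decimal import Decimal


def precision_and_scale(d):
    """Return (precision, scale) for a finite Decimal.

    Hypothetical helper for illustration only, not a pyarrow API.
    scale     = number of digits after the decimal point
    precision = total number of digits needed to hold the value
    """
    _sign, digits, exponent = d.as_tuple()
    if exponent >= 0:
        # e.g. Decimal('1E+2'): digits (1,), exponent 2 -> 3 digits, scale 0
        return len(digits) + exponent, 0
    scale = -exponent
    # e.g. Decimal('0.001'): digits (1,), exponent -3 -> precision 3, scale 3
    return max(len(digits), scale), scale


values_floats = [579.119995117188, 6.40999984741211, 2.0]

for v in values_floats:
    exact = Decimal(v)          # exact binary value of the float
    short = Decimal(str(v))     # shortest round-tripping decimal string
    print(v, precision_and_scale(exact), precision_and_scale(short))
```

Decimal(v) and Decimal.from_float(v) produce the same value for a float, so the first two columns of the original example hold identical Decimals; only the string-based column differs.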
--
This message was sent by Atlassian Jira
(v8.3.4#803005)