Posted to issues@arrow.apache.org by "Phillip Cloud (JIRA)" <ji...@apache.org> on 2018/02/15 14:10:00 UTC

[jira] [Updated] (ARROW-2162) [Python/C++] Decimal Values with too-high precision are multiplied by 100

     [ https://issues.apache.org/jira/browse/ARROW-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phillip Cloud updated ARROW-2162:
---------------------------------
    Description: 
From GitHub:

This works as expected:

{code}
>>> pyarrow.array([decimal.Decimal('1.23')], pyarrow.decimal128(10,2))[0]
Decimal('1.23')
{code}

Storing an extra digit of precision multiplies the stored value by a factor of 100:

{code}
>>> pyarrow.array([decimal.Decimal('1.234')], pyarrow.decimal128(10,2))[0]
Decimal('123.40')
{code}

Ideally I would get an exception since the value I'm trying to store doesn't fit in the declared type of the array. It would be less good, but still ok, if the stored value were 1.23 (truncating the extra digit). I didn't expect pyarrow to silently store a value that differs from the original value by a factor of 100.
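As a workaround until this is fixed, one option is to validate and quantize the inputs with the standard decimal module before building the array. This is only a sketch; the fit_or_raise helper below is mine, not part of pyarrow:

{code}
import decimal

import pyarrow


def fit_or_raise(value, precision, scale):
    # Hypothetical helper: quantize `value` to the declared scale and raise
    # if that loses information or exceeds the declared precision.
    quantized = value.quantize(decimal.Decimal(1).scaleb(-scale))
    if quantized != value:
        raise ValueError("%s does not fit scale %d" % (value, scale))
    if len(quantized.as_tuple().digits) > precision:
        raise ValueError("%s exceeds precision %d" % (value, precision))
    return quantized


values = [fit_or_raise(decimal.Decimal('1.23'), 10, 2)]
arr = pyarrow.array(values, pyarrow.decimal128(10, 2))
{code}

Truncation instead of an error would just be quantize(..., rounding=decimal.ROUND_DOWN) without the equality check.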

I originally thought that the code was incorrectly multiplying through by an extra factor of 10**scale, but that doesn't seem to be the case. If I change the scale, it always seems to be a factor of 100:

{code}
>>> pyarrow.array([decimal.Decimal('1.2345')], pyarrow.decimal128(10,3))[0]
Decimal('123.450')
{code}

I see the same behavior if I use floating point to initialize the array rather than Python's decimal type.
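For completeness, the same observation in loop form (a sketch written against pyarrow 0.8.0 as reported; on a corrected version these calls should presumably raise instead):

{code}
import decimal

import pyarrow

# Each input carries one more fractional digit than the declared scale;
# on pyarrow 0.8.0 each stored value comes back multiplied by 100,
# regardless of which scale is used.
for scale, text in ((2, '1.234'), (3, '1.2345'), (4, '1.23456')):
    arr = pyarrow.array([decimal.Decimal(text)], pyarrow.decimal128(10, scale))
    print(scale, text, arr[0])
{code}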

I searched GitHub and JIRA for open issues but didn't find anything related to this. I am using pyarrow 0.8.0 on OS X 10.12.6 with Python 2.7.14 installed via Homebrew.

> [Python/C++] Decimal Values with too-high precision are multiplied by 100
> -------------------------------------------------------------------------
>
>                 Key: ARROW-2162
>                 URL: https://issues.apache.org/jira/browse/ARROW-2162
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 0.8.0
>            Reporter: Phillip Cloud
>            Assignee: Phillip Cloud
>            Priority: Major
>             Fix For: 0.9.0
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)