Posted to user@arrow.apache.org by François Pacull <fr...@architecture-performance.fr> on 2022/07/06 12:44:46 UTC
[Python] Cast decimal to string
Dear Arrow team and users,
I have a simple question regarding the decimal data type in pyarrow. I am trying to cast a table with decimal columns to string, or to write it to a CSV file. In both cases I get the error message:
pyarrow.lib.ArrowNotImplementedError: Unsupported cast from decimal128(18, 9) to utf8 using function cast_string
I understand that this is not implemented yet, but is there by chance a way to work around it?
Thanks, François.
PS: I am using Python 3.9.13 and pyarrow 8.0.0.
Here is a code snippet:
import decimal

import pyarrow as pa
import pyarrow.compute as pc
import pyarrow.csv

PREC, SCAL = 18, 9 # decimal precision & scale

context = decimal.getcontext()
context.prec = PREC
ref_decimal = decimal.Decimal('0.123456789')

float_numbers = [0.1, 654.5, 4.65742]
decimal_numbers = [
    decimal.Decimal(str(f)).quantize(ref_decimal) for f in float_numbers
]

pa_arr_dec = pa.array(
    decimal_numbers, type=pa.decimal128(precision=PREC, scale=SCAL)
)
pa_arr_str = pc.cast(pa_arr_dec, pa.string())

Traceback (most recent call last):
  File "/home/francois/Workspace/.../scripts/pyarrow_decimal.py", line 21, in <module>
    pa_arr_str = pc.cast(pa_arr_dec, pa.string())
  File "/home/francois/miniconda3/envs/tableau2/lib/python3.9/site-packages/pyarrow/compute.py", line 376, in cast
    return call_function("cast", [arr], options)
  File "pyarrow/_compute.pyx", line 542, in pyarrow._compute.call_function
  File "pyarrow/_compute.pyx", line 341, in pyarrow._compute.Function.call
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 121, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: Unsupported cast from decimal128(18, 9) to utf8 using function cast_string
RE: [Python] Cast decimal to string
Posted by François Pacull <fr...@architecture-performance.fr>.
Thanks for your answers! I ended up using Python to convert the decimal columns to string:
schema = batch_table.schema
for i, field in enumerate(schema):
    if pa.types.is_decimal(field.type):
        column_in = batch_table.column(field.name)
        column_out = pa.array(
            [
                str(v) if v is not None else None
                for v in column_in.to_pylist()
            ],
            type=pa.string(),  # keep a string type even if every value is null
        )
        batch_table = batch_table.set_column(i, field.name, column_out)
François
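One caveat with str(v), sketched here with the standard decimal module only: Python switches to scientific notation for Decimal values whose adjusted exponent is below -6, so at scale 9 a zero or a very small value comes out as "0E-9" or "1E-9". If plain fixed-point text is wanted, format(v, "f") is one option. The helper name decimal_to_text is hypothetical, and the sample values are assumed to resemble what to_pylist() returns for a decimal128(18, 9) column.

```python
import decimal
from typing import Optional


def decimal_to_text(v: Optional[decimal.Decimal]) -> Optional[str]:
    """Hypothetical helper: fixed-point text for a Decimal; None stays None."""
    if v is None:
        return None
    return format(v, "f")


# Exponent template for scale 9 (only its exponent matters to quantize).
scale9 = decimal.Decimal("0.000000001")

# Assumed sample values, including a zero, a tiny value and a null.
values = [
    decimal.Decimal("0").quantize(scale9),
    decimal.Decimal("0.000000001"),
    decimal.Decimal("654.5").quantize(scale9),
    None,
]

plain = [str(v) if v is not None else None for v in values]
fixed = [decimal_to_text(v) for v in values]

print(plain)  # ['0E-9', '1E-9', '654.500000000', None]
print(fixed)  # ['0.000000000', '0.000000001', '654.500000000', None]
```

In the loop above, str(v) could be swapped for format(v, "f") if the CSV consumer does not accept exponent notation.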
Re: [Python] Cast decimal to string
Posted by Weston Pace <we...@gmail.com>.
I've added [1]. I agree, it should be a fairly easy fix, but requires
understanding where all the casting code lives.
arrow/compute/kernels/scalar_cast_string.cc would be a good place to
start if anyone is interested. We have decimal->string methods in
arrow/util/decimal.h which can be used.
[1] https://issues.apache.org/jira/browse/ARROW-17042
On Mon, Jul 11, 2022 at 6:58 AM Wes McKinney <we...@gmail.com> wrote:
>
> Would someone like to open a Jira issue about this? This seems like an
> easy rough edge to fix
Re: [Python] Cast decimal to string
Posted by Wes McKinney <we...@gmail.com>.
Would someone like to open a Jira issue about this? This seems like an
easy rough edge to fix.
On Wed, Jul 6, 2022 at 12:44 PM Weston Pace <we...@gmail.com> wrote:
>
> If precision is not important you can cast the column to float64 first.
>
> >>> x = pa.array([1, 2, 3], type=pa.decimal128(6, 1))
> >>> x.cast(pa.float64()).cast(pa.string())
> <pyarrow.lib.StringArray object at 0x7fd23a52cd00>
> [
> "1",
> "2",
> "3"
> ]
>
> If precision is important you could use python or pandas to do the
> conversion to string.
>
> >>> pa.array([str(v) for v in x.to_pylist()])
> <pyarrow.lib.StringArray object at 0x7fd23a52cd00>
> [
> "1.0",
> "2.0",
> "3.0"
> ]
Re: [Python] Cast decimal to string
Posted by Weston Pace <we...@gmail.com>.
If precision is not important, you can cast the column to float64 first.
>>> x = pa.array([1, 2, 3], type=pa.decimal128(6, 1))
>>> x.cast(pa.float64()).cast(pa.string())
<pyarrow.lib.StringArray object at 0x7fd23a52cd00>
[
  "1",
  "2",
  "3"
]
If precision is important, you could use Python or pandas to do the
conversion to string.
>>> pa.array([str(v) for v in x.to_pylist()])
<pyarrow.lib.StringArray object at 0x7fd23a52cd00>
[
  "1.0",
  "2.0",
  "3.0"
]
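To make the precision caveat concrete, here is a small sketch using only Python's standard decimal module (no pyarrow involved). The sample value is an assumption, chosen to fill all 18 digits of a decimal128(18, 9) column. A float64 carries roughly 15 to 17 significant decimal digits, so the detour through float cannot reproduce such a value, while str() on the Decimal keeps every digit.

```python
import decimal

# Hypothetical sample value using all 18 digits of a decimal128(18, 9) column.
exact = decimal.Decimal("123456789.123456789")

# Route 1: go through a binary float, mimicking the float64 detour in pure Python.
via_float = str(float(exact))

# Route 2: convert the Decimal directly to text.
via_str = str(exact)

print(via_float)  # fewer digits than the original
print(via_str)    # 123456789.123456789

# Parsing the float-based text back does not give the original value.
round_trip_ok = decimal.Decimal(via_float) == exact
print(round_trip_ok)  # False
```

For columns with few significant digits, such as the decimal128(6, 1) example, the float64 route loses nothing in value, only the trailing ".0" in the text.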