Posted to jira@arrow.apache.org by "Joris Van den Bossche (Jira)" <ji...@apache.org> on 2020/09/15 14:01:00 UTC

[jira] [Resolved] (ARROW-7663) [Python] from_pandas gives TypeError instead of ArrowTypeError in some cases

     [ https://issues.apache.org/jira/browse/ARROW-7663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joris Van den Bossche resolved ARROW-7663.
------------------------------------------
    Resolution: Fixed

Issue resolved by pull request 8044
[https://github.com/apache/arrow/pull/8044]

> [Python] from_pandas gives TypeError instead of ArrowTypeError in some cases
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-7663
>                 URL: https://issues.apache.org/jira/browse/ARROW-7663
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.15.1
>            Reporter: David Li
>            Assignee: Andrew Wieteska
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> For mixed-type object columns, from_pandas sometimes raises a plain TypeError with an uninformative message instead of an ArrowTypeError whose message names the failing column. Which exception is raised depends on the order of the values in the column:
> {noformat}
> >>> pa.Table.from_pandas(pd.DataFrame({"a": ['a', 1]}))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow/table.pxi", line 1177, in pyarrow.lib.Table.from_pandas
>   File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in dataframe_to_arrays
>     for c, f in zip(columns_to_convert, convert_fields)]
>   File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in <listcomp>
>     for c, f in zip(columns_to_convert, convert_fields)]
>   File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 566, in convert_column
>     raise e
>   File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 560, in convert_column
>     result = pa.array(col, type=type_, from_pandas=True, safe=safe)
>   File "pyarrow/array.pxi", line 265, in pyarrow.lib.array
>   File "pyarrow/array.pxi", line 80, in pyarrow.lib._ndarray_to_array
>   File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
> pyarrow.lib.ArrowTypeError: ("Expected a bytes object, got a 'int' object", 'Conversion failed for column a with type object')
> >>> pa.Table.from_pandas(pd.DataFrame({"a": [1, 'a']}))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow/table.pxi", line 1177, in pyarrow.lib.Table.from_pandas
>   File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in dataframe_to_arrays
>     for c, f in zip(columns_to_convert, convert_fields)]
>   File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 575, in <listcomp>
>     for c, f in zip(columns_to_convert, convert_fields)]
>   File "/Users/lidavidm/Flight/arrow/build/python/lib.macosx-10.12-x86_64-3.7/pyarrow/pandas_compat.py", line 560, in convert_column
>     result = pa.array(col, type=type_, from_pandas=True, safe=safe)
>   File "pyarrow/array.pxi", line 265, in pyarrow.lib.array
>   File "pyarrow/array.pxi", line 80, in pyarrow.lib._ndarray_to_array
> TypeError: an integer is required (got type str)
> {noformat}
> We noticed this on 0.15.1 and on master when we tried to upgrade. On 0.14.1, both cases raised ArrowTypeError.
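
The behaviour the report asks for can be illustrated with a small pure-Python sketch. It is not the code from pull request 8044 and it does not use pyarrow. SketchArrowTypeError, convert_values and convert_column are hypothetical stand-ins that only mimic the two failure modes shown in the tracebacks above. The idea, assuming a wrapper similar to the convert_column frame in the traceback, is that a bare TypeError gets the same column context as the library's own type error, so both value orderings fail in the same way.

```python
class SketchArrowTypeError(TypeError):
    """Hypothetical stand-in for pyarrow.lib.ArrowTypeError."""


def convert_values(values):
    """Hypothetical converter that mimics the two failures in the report.

    The expected type is taken from the first value. A later mismatch raises
    the library-style error (str first) or a bare TypeError (int first).
    """
    first = values[0]
    for v in values:
        if isinstance(first, str) and not isinstance(v, str):
            raise SketchArrowTypeError(
                f"Expected a bytes object, got a '{type(v).__name__}' object")
        if isinstance(first, int) and not isinstance(v, int):
            raise TypeError(
                f"an integer is required (got type {type(v).__name__})")
    return list(values)


def convert_column(name, values):
    """Convert one column and attach column context to any type error."""
    try:
        return convert_values(values)
    except TypeError as exc:
        context = f"Conversion failed for column {name} with type object"
        if isinstance(exc, SketchArrowTypeError):
            # Already the informative error type: just append the context.
            exc.args += (context,)
            raise
        # Bare TypeError: re-raise as the informative type, keeping the cause.
        raise SketchArrowTypeError(*exc.args, context) from exc


if __name__ == "__main__":
    for column in (["a", 1], [1, "a"]):
        try:
            convert_column("a", column)
        except SketchArrowTypeError as exc:
            print(type(exc).__name__, exc.args)
```

With this wrapper, both orderings raise the same exception type, and the last element of its args names the column, as in the first traceback above.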



--
This message was sent by Atlassian Jira
(v8.3.4#803005)