Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/08/02 23:12:00 UTC

[jira] [Commented] (ARROW-1309) pyarrow.lib.ArrowNotImplementedError: NotImplemented: null

    [ https://issues.apache.org/jira/browse/ARROW-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16111899#comment-16111899 ] 

Wes McKinney commented on ARROW-1309:
-------------------------------------

Thanks [~virtualluke]. Any chance you can show the input data that triggered this error? There should be a single column in the data frame that is causing the problem (it's getting passed to {{pyarrow.Array.from_pandas}}).

If it's not possible to fix this immediately, we would definitely want to make the error message more informative than "NotImplemented: null".
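
One way to narrow down the offending column is to convert each column on its own and record which ones raise. The following is only a minimal sketch: {{find_failing_columns}} and its {{convert}} parameter are hypothetical names introduced here for illustration, not part of pyarrow. By default it assumes pyarrow and pandas are installed and calls {{pyarrow.Array.from_pandas}} on each column.

```python
def find_failing_columns(frame, convert=None):
    """Convert each column separately and report the ones that raise.

    `frame` is anything with an .items() method yielding (name, column)
    pairs, such as a pandas DataFrame. `convert` is a hypothetical hook
    for swapping in another converter; it defaults to
    pyarrow.Array.from_pandas.
    """
    if convert is None:
        # Imported lazily so the helper also works with a custom converter.
        import pyarrow as pa
        convert = pa.Array.from_pandas

    failures = {}
    for name, column in frame.items():
        try:
            convert(column)
        except Exception as exc:  # deliberately broad: diagnostic use only
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

Calling {{find_failing_columns(my_dataframe)}} returns a dict mapping each failing column name to its error message, which should make it easier to post a small sample of just the problematic column.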

> pyarrow.lib.ArrowNotImplementedError: NotImplemented: null
> ----------------------------------------------------------
>
>                 Key: ARROW-1309
>                 URL: https://issues.apache.org/jira/browse/ARROW-1309
>             Project: Apache Arrow
>          Issue Type: Bug
>         Environment: centos 7.3
>            Reporter: Luke Higgins
>            Priority: Minor
>             Fix For: 0.6.0
>
>
> I have an Avro file in HDFS that I am reading in using fastavro, converting to a pandas DataFrame, and then trying to create an Arrow table from, but I get an error:
> >>> table=pyarrow.Table.from_pandas(my_dataframe)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow/table.pxi", line 746, in pyarrow.lib.Table.from_pandas (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:34089)
>   File "pyarrow/table.pxi", line 346, in pyarrow.lib._dataframe_to_arrays (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:30476)
>   File "pyarrow/array.pxi", line 182, in pyarrow.lib.Array.from_pandas (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:22110)
>   File "pyarrow/error.pxi", line 66, in pyarrow.lib.check_status (/arrow/python/build/temp.linux-x86_64-3.6/lib.cxx:7702)
> pyarrow.lib.ArrowNotImplementedError: NotImplemented: null
> The Avro schema does allow null fields.  Is this not implemented?  I am using pyarrow 0.5.0.  Also, for what I am doing I do not otherwise use pandas at all. I just read in the Avro file, which gives me a list of dicts, and I really only want to write them to disk in Parquet format. I am going through these steps to get there, which isn't optimal but may be necessary without writing more code of my own.
> thanks,
> Luke


