Posted to dev@arrow.apache.org by "Andrew Redd (Jira)" <ji...@apache.org> on 2020/05/07 14:13:00 UTC
[jira] [Created] (ARROW-8731) Error when using toPandas with PyArrow
Andrew Redd created ARROW-8731:
----------------------------------
Summary: Error when using toPandas with PyArrow
Key: ARROW-8731
URL: https://issues.apache.org/jira/browse/ARROW-8731
Project: Apache Arrow
Issue Type: Bug
Environment: Python Environment on the worker and driver
- jupyter==1.0.0
- pandas==1.0.3
- pyarrow==0.14.0
- pyspark==2.4.0
- py4j==0.10.7
Reporter: Andrew Redd
I'm getting the following error when calling toPandas on a Spark DataFrame.
* This is a blocker to our use of pyarrow on a project.
{code:java}
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-e2ed63d96b43> in <module>
----> 1 s.load_table_to_df('csn_customer.tblcustomerpro').limit(100).toPandas()
/venv/lib/python3.6/site-packages/pyspark/sql/dataframe.py in toPandas(self)
2119 _check_dataframe_localize_timestamps
2120 import pyarrow
-> 2121 batches = self._collectAsArrow()
2122 if len(batches) > 0:
2123 table = pyarrow.Table.from_batches(batches)
/venv/lib/python3.6/site-packages/pyspark/sql/dataframe.py in _collectAsArrow(self)
2177 with SCCallSiteSync(self._sc) as css:
2178 sock_info = self._jdf.collectAsArrowToPython()
-> 2179 return list(_load_from_socket(sock_info, ArrowStreamSerializer()))
2180
2181 ##########################################################################################
/venv/lib/python3.6/site-packages/pyspark/rdd.py in _load_from_socket(sock_info, serializer)
142
143 def _load_from_socket(sock_info, serializer):
--> 144 (sockfile, sock) = local_connect_and_auth(*sock_info)
145 # The RDD materialization time is unpredicable, if we set a timeout for socket reading
146 # operation, it will very possibly fail. See SPARK-18281.
TypeError: local_connect_and_auth() takes 2 positional arguments but 3 were given
{code}
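The final TypeError is the message Python raises when a three-element sequence is star-unpacked into a function that accepts two positional arguments. The sketch below reproduces only that mechanism. The `local_connect_and_auth` defined here is a hypothetical stand-in with a two-parameter signature, not the real PySpark helper, and the contents of `sock_info` are assumed for illustration. Why the real `sock_info` carried three elements is not established by this report; one possibility, stated here as an assumption, is that the JVM side and the installed pyspark package come from different Spark versions.

```python
# Hypothetical stand-in for the two-argument helper named in the traceback.
# It is NOT the real pyspark function and does not open a socket.
def local_connect_and_auth(port, auth_secret):
    return ("sockfile", "sock")


def try_connect(sock_info):
    """Star-unpack sock_info as the traceback line does and report the outcome."""
    try:
        local_connect_and_auth(*sock_info)
        return "ok"
    except TypeError as exc:
        return str(exc)


# Assumed values, for illustration only.
two_items = (12345, "secret")          # matches the stand-in's signature
three_items = (12345, "secret", None)  # one extra element, as in the report

print(try_connect(two_items))
print(try_connect(three_items))
```

If a version mismatch is suspected, comparing `pyspark.__version__` in the Python environment with `spark.version` on an active SparkSession is one way to check it.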