Posted to issues@spark.apache.org by "Hichame El Khalfi (JIRA)" <ji...@apache.org> on 2018/06/25 04:04:00 UTC

[jira] [Created] (SPARK-24644) Pyarrow exception while running pandas_udf on pyspark 2.3.1

Hichame El Khalfi created SPARK-24644:
-----------------------------------------

             Summary: Pyarrow exception while running pandas_udf on pyspark 2.3.1
                 Key: SPARK-24644
                 URL: https://issues.apache.org/jira/browse/SPARK-24644
             Project: Spark
          Issue Type: Bug
          Components: Block Manager
    Affects Versions: 2.3.1
         Environment: os: CentOS
                      pyspark 2.3.1
                      spark 2.3.1
                      pyarrow >= 0.8.0
            Reporter: Hichame El Khalfi


Hello,

When I try to run a `pandas_udf` on my Spark DataFrame, I get this error:

 
{code:java}
  File "/mnt/ephemeral3/yarn/nm/usercache/user/appcache/application_1524574803975_205774/container_e280_1524574803975_205774_01_000044/pyspark.zip/pyspark/serializers.py", line 280, in load_stream
    pdf = batch.to_pandas()
  File "pyarrow/table.pxi", line 677, in pyarrow.lib.RecordBatch.to_pandas (/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:43226)
    return Table.from_batches([self]).to_pandas(nthreads=nthreads)
  File "pyarrow/table.pxi", line 1043, in pyarrow.lib.Table.to_pandas (/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:46331)
    mgr = pdcompat.table_to_blockmanager(options, self, memory_pool,
  File "/usr/lib64/python2.7/site-packages/pyarrow/pandas_compat.py", line 528, in table_to_blockmanager
    blocks = _table_to_blocks(options, block_table, nthreads, memory_pool)
  File "/usr/lib64/python2.7/site-packages/pyarrow/pandas_compat.py", line 622, in _table_to_blocks
    return [_reconstruct_block(item) for item in result]
  File "/usr/lib64/python2.7/site-packages/pyarrow/pandas_compat.py", line 446, in _reconstruct_block
    block = _int.make_block(block_arr, placement=placement)
TypeError: make_block() takes at least 3 arguments (2 given)
{code}
 
I am more than happy to provide any additional information.
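
In case it helps triage, below is a minimal sketch of how the installed library versions could be collected. The traceback ends inside pyarrow's pandas compatibility layer, so the pandas and pyarrow versions seen by the Python workers may be relevant; that is an assumption, not a confirmed cause. The helper name `collect_versions` is hypothetical and not part of Spark or pyarrow.

```python
import importlib


def collect_versions(names=("pandas", "pyarrow", "pyspark")):
    """Return a dict mapping each module name to its version string.

    A module that cannot be imported maps to None; a module without a
    __version__ attribute maps to "unknown".
    """
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module is not installed in this Python environment.
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in sorted(collect_versions().items()):
        print("%s: %s" % (name, version if version is not None else "not installed"))
```

Running this on the driver only describes the driver's environment. The executors on the YARN nodes may use a different Python installation, so the same check would need to be run there as well.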



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)