Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2015/06/30 13:11:04 UTC

[jira] [Commented] (SPARK-8535) PySpark : Can't create DataFrame from Pandas dataframe with no explicit column name

    [ https://issues.apache.org/jira/browse/SPARK-8535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608132#comment-14608132 ] 

Apache Spark commented on SPARK-8535:
-------------------------------------

User 'x1-' has created a pull request for this issue:
https://github.com/apache/spark/pull/7124

> PySpark : Can't create DataFrame from Pandas dataframe with no explicit column name
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-8535
>                 URL: https://issues.apache.org/jira/browse/SPARK-8535
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.4.0
>            Reporter: Christophe Bourguignat
>
> Creating a Spark DataFrame from a pandas DataFrame that has no explicit column names fails:
> pandasDF = pd.DataFrame([[1, 2], [5, 6]])
> sparkDF = sqlContext.createDataFrame(pandasDF)
> ***********
> ----> 1 sparkDF = sqlContext.createDataFrame(pandasDF)
> /usr/local/Cellar/apache-spark/1.4.0/libexec/python/pyspark/sql/context.pyc in createDataFrame(self, data, schema, samplingRatio)
>     344 
>     345         jrdd = self._jvm.SerDeUtil.toJavaArray(rdd._to_java_object_rdd())
> --> 346         df = self._ssql_ctx.applySchemaToPythonRDD(jrdd.rdd(), schema.json())
>     347         return DataFrame(df, self)
>     348 
> /usr/local/Cellar/apache-spark/1.4.0/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py in __call__(self, *args)
>     536         answer = self.gateway_client.send_command(command)
>     537         return_value = get_return_value(answer, self.gateway_client,
> --> 538                 self.target_id, self.name)
>     539 
>     540         for temp_arg in temp_args:
> /usr/local/Cellar/apache-spark/1.4.0/libexec/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
>     298                 raise Py4JJavaError(
>     299                     'An error occurred while calling {0}{1}{2}.\n'.
> --> 300                     format(target_id, '.', name), value)
>     301             else:
>     302                 raise Py4JError(
> Py4JJavaError: An error occurred while calling o87.applySchemaToPythonRDD.
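
A possible workaround, sketched under an assumption the traceback above does not confirm: pandas labels unnamed columns with the integers 0, 1, ..., and the failure may come from those non-string labels being used as schema field names. The helper name string_column_names is hypothetical and is not part of PySpark or pandas; pandasDF and sqlContext are the objects from the report.

```python
# Hypothetical helper (not part of PySpark or pandas): convert arbitrary
# column labels into plain string names.
def string_column_names(labels):
    return [str(label) for label in labels]


# pandas assigns the integer labels 0, 1, ... when no column names are given,
# so range(2) stands in here for the labels of the two-column frame above.
default_labels = range(2)
print(string_column_names(default_labels))  # prints ['0', '1']

# Assumed usage with the objects from the report (needs pandas and a live
# sqlContext, so it is left commented out):
#   pandasDF.columns = string_column_names(pandasDF.columns)
#   sparkDF = sqlContext.createDataFrame(pandasDF)
```

Renaming the columns before the call keeps the change on the user side and leaves the data itself untouched; whether it avoids the error on 1.4.0 would need to be checked against a running Spark installation.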


