Posted to issues@spark.apache.org by "Winston Chen (JIRA)" <ji...@apache.org> on 2015/01/21 23:43:34 UTC
[jira] [Updated] (SPARK-5361) python tuple not supported while converting PythonRDD back to JavaRDD
[ https://issues.apache.org/jira/browse/SPARK-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Winston Chen updated SPARK-5361:
--------------------------------
Summary: python tuple not supported while converting PythonRDD back to JavaRDD (was: add in tuple handling for converting python RDD back to JavaRDD)
> python tuple not supported while converting PythonRDD back to JavaRDD
> ---------------------------------------------------------------------
>
> Key: SPARK-5361
> URL: https://issues.apache.org/jira/browse/SPARK-5361
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Reporter: Winston Chen
>
> The existing `SerDeUtil.pythonToJava` implementation does not account for tuples: Pyrolite unpickles a Python `tuple` as a Java `Object[]`.
> So with the following data:
> {noformat}
> [
> (u'2', {u'director': u'David Lean', u'genres': (u'Adventure', u'Biography', u'Drama'), u'title': u'Lawrence of Arabia', u'year': 1962}),
> (u'7', {u'director': u'Andrew Dominik', u'genres': (u'Biography', u'Crime', u'Drama'), u'title': u'The Assassination of Jesse James by the Coward Robert Ford', u'year': 2007})
> ]
> {noformat}
> An exception is thrown on the `genres` value, which is a tuple:
> {noformat}
> 15/01/16 10:28:31 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 7)
> java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to java.util.ArrayList
> at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:157)
> at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:153)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
> {noformat}
> There is already a pull request for this bug:
> https://github.com/apache/spark/pull/4146
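
For readers following along, here is a minimal sketch in plain Java of the failure and of one possible way to handle it. It is not the code from Spark or from the pull request above. The class `TupleNormalizer` and its `normalize` method are hypothetical names, and the sketch only assumes what the report states (a Python tuple arrives as `Object[]`), plus the assumption that lists and dicts arrive as `java.util.List` and `java.util.Map`. It first reproduces the bad cast on a `genres`-style value, then recursively rewrites every `Object[]` into an `ArrayList`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TupleNormalizer {

    // Hypothetical helper: recursively replace Object[] (how the report says
    // Pyrolite represents a Python tuple) with ArrayList, descending into
    // nested lists and map values. Map keys are left untouched.
    public static Object normalize(Object value) {
        if (value instanceof Object[]) {
            Object[] items = (Object[]) value;
            List<Object> out = new ArrayList<>(items.length);
            for (Object item : items) {
                out.add(normalize(item));
            }
            return out;
        }
        if (value instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(normalize(item));
            }
            return out;
        }
        if (value instanceof Map) {
            Map<Object, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                out.put(entry.getKey(), normalize(entry.getValue()));
            }
            return out;
        }
        // Strings, numbers, null, and anything else pass through unchanged.
        return value;
    }

    public static void main(String[] args) {
        // One record shaped like the data in the report: (key, dict), where
        // the dict holds a tuple under "genres".
        Map<String, Object> movie = new HashMap<>();
        movie.put("title", "Lawrence of Arabia");
        movie.put("year", 1962);
        movie.put("genres", new Object[] {"Adventure", "Biography", "Drama"});
        Object[] record = {"2", movie};

        // Casting the tuple value straight to ArrayList fails at runtime,
        // which is the ClassCastException shown in the stack trace.
        try {
            ArrayList<?> genres = (ArrayList<?>) movie.get("genres");
            System.out.println("unexpected success: " + genres);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }

        // After normalization every tuple is a List, so the cast is safe.
        List<?> fixed = (List<?>) normalize(record);
        Map<?, ?> fixedMovie = (Map<?, ?>) fixed.get(1);
        ArrayList<?> genres = (ArrayList<?>) fixedMovie.get("genres");
        System.out.println("genres after normalize: " + genres);
    }
}
```

Turning tuples into lists loses the distinction between the two Python types on the Java side. Whether that is acceptable depends on what the consumer of the JavaRDD expects, so treat this as an illustration of the type mismatch rather than a recommended fix.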