Posted to issues@spark.apache.org by "Winston Chen (JIRA)" <ji...@apache.org> on 2015/01/28 19:25:34 UTC

[jira] [Updated] (SPARK-5361) Multiple Java RDD <-> Python RDD conversions not working correctly

     [ https://issues.apache.org/jira/browse/SPARK-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Winston Chen updated SPARK-5361:
--------------------------------
    Summary: Multiple Java RDD <-> Python RDD conversions not working correctly  (was: python tuple not supported while converting PythonRDD back to JavaRDD)

> Multiple Java RDD <-> Python RDD conversions not working correctly
> ------------------------------------------------------------------
>
>                 Key: SPARK-5361
>                 URL: https://issues.apache.org/jira/browse/SPARK-5361
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.2.0
>            Reporter: Winston Chen
>
> The existing `SerDeUtil.pythonToJava` implementation does not account for the tuple case: Pyrolite unpickles a `python tuple` as a `java Object[]` rather than a `java.util.ArrayList`.
> So with the following data:
> {noformat}
> [
> (u'2', {u'director': u'David Lean', u'genres': (u'Adventure', u'Biography', u'Drama'), u'title': u'Lawrence of Arabia', u'year': 1962}), 
> (u'7', {u'director': u'Andrew Dominik', u'genres': (u'Biography', u'Crime', u'Drama'), u'title': u'The Assassination of Jesse James by the Coward Robert Ford', u'year': 2007})
> ]
> {noformat}
> The conversion fails on the `genres` tuples with the following exception:
> {noformat}
> 15/01/16 10:28:31 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 7)
> java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to java.util.ArrayList
> 	at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:157)
> 	at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:153)
> 	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> 	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
> {noformat}
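> The cast at `SerDeUtil.scala:157` assumes every unpickled batch is a `java.util.ArrayList`, but Pyrolite's `Unpickler` returns an `Object[]` when the Python side pickled a tuple. Below is a minimal, illustrative sketch of tuple-safe handling; it is not the actual patch, the helper name `pythonBatchToSeq` is made up here, and it assumes Pyrolite (`net.razorvine.pickle`) is on the classpath:
> {noformat}
> // Illustrative sketch only -- not the actual fix in the pull request below.
> // Assumes Pyrolite (net.razorvine.pickle) is available on the classpath.
> import java.util.{ArrayList => JArrayList}
> import scala.collection.JavaConverters._
> import net.razorvine.pickle.Unpickler
> 
> object TupleSafePythonToJava {
>   // Convert one pickled batch coming from the Python side into a Scala Seq,
>   // accepting both the list case (ArrayList) and the tuple case (Object[]).
>   def pythonBatchToSeq(pickled: Array[Byte]): Seq[Any] = {
>     new Unpickler().loads(pickled) match {
>       case arr: Array[_]       => arr.toSeq          // Python tuple => Object[]
>       case list: JArrayList[_] => list.asScala.toSeq // Python list  => ArrayList
>       case single              => Seq(single)        // unbatched single object
>     }
>   }
> }
> {noformat}
> Matching on the runtime type instead of casting blindly avoids the `ClassCastException` above regardless of whether the Python side sent a list or a tuple.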
> There is already a pull request that addresses this bug:
> https://github.com/apache/spark/pull/4146


