Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2015/09/16 00:13:45 UTC
[jira] [Commented] (SPARK-6513) Add zipWithUniqueId (and other RDD APIs) to RDDApi
[ https://issues.apache.org/jira/browse/SPARK-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746365#comment-14746365 ]
Josh Rosen commented on SPARK-6513:
-----------------------------------
[~marmbrus], safe to say that this is "Won't Fix" given the RDDApi removal?
> Add zipWithUniqueId (and other RDD APIs) to RDDApi
> --------------------------------------------------
>
> Key: SPARK-6513
> URL: https://issues.apache.org/jira/browse/SPARK-6513
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.3.0
> Environment: Windows 7 64bit, Scala 2.11.6, JDK 1.7.0_21 (though I don't think it's relevant)
> Reporter: Eran Medan
> Priority: Minor
>
> It would be nice if we could treat a DataFrame just like an RDD (wherever it makes sense).
> *Worked in 1.2.1*
> {code}
> val sqlContext = new HiveContext(sc)
> import sqlContext._
> val jsonRDD = sqlContext.jsonFile(jsonFilePath)
> jsonRDD.registerTempTable("jsonTable")
> val jsonResult = sql(s"select * from jsonTable")
> val foo = jsonResult.zipWithUniqueId().map {
> case (Row(...), uniqueId) => // do something useful
> ...
> }
> foo.registerTempTable("...")
> {code}
> *Stopped working in 1.3.0*
> {code}
> jsonResult.zipWithUniqueId() // fails, since RDDApi doesn't implement that method
> {code}
> *Workaround that does not work:*
> Calling {{.rdd}} does give me an {{RDD\[Row\]}} that I can zip:
> {code}
> jsonResult.rdd.zipWithUniqueId()
> {code}
> But the result is a plain RDD, which has no {{registerTempTable}} method, so this fails:
> {code}
> foo.registerTempTable("...")
> {code}
> (see related SO question: http://stackoverflow.com/questions/29243186/is-this-a-regression-bug-in-spark-1-3)
> EDIT: changed from issue to enhancement request
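
For anyone hitting the same error on 1.3.0: the description above stops at an RDD that has no registerTempTable. One possible way back to a DataFrame is to rebuild one from the zipped RDD. The following is only a minimal sketch, assuming the Spark 1.3.x SQLContext.createDataFrame(RDD[Row], StructType) overload. The helper name withUniqueId, the column name "uniqueId", and the table name in the usage comment are hypothetical and do not come from the issue.

```scala
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Hypothetical helper (not part of Spark): appends a "uniqueId" column
// produced by RDD.zipWithUniqueId and returns a DataFrame again.
def withUniqueId(sqlContext: SQLContext, df: DataFrame): DataFrame = {
  // df.rdd is an RDD[Row]; zipping it yields an RDD[(Row, Long)].
  val rowsWithId = df.rdd.zipWithUniqueId().map {
    case (row, id) => Row.fromSeq(row.toSeq :+ id)
  }
  // Extend the original schema with the new, non-nullable Long column.
  val schema = StructType(
    df.schema.fields :+ StructField("uniqueId", LongType, nullable = false))
  sqlContext.createDataFrame(rowsWithId, schema)
}

// Usage, with sqlContext and jsonResult as in the snippets above:
// val foo = withUniqueId(sqlContext, jsonResult)
// foo.registerTempTable("jsonTableWithId") // hypothetical table name
```

Because the result is a DataFrame rather than a bare RDD, registerTempTable is available on it again. The sketch keeps the original columns and adds the id as one more column; a different mapping would need its own schema.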