Posted to issues@spark.apache.org by "Josh Rosen (JIRA)" <ji...@apache.org> on 2015/05/31 23:14:17 UTC
[jira] [Resolved] (SPARK-4533) SchemaRDD API error: Can only subtract another SchemaRDD
[ https://issues.apache.org/jira/browse/SPARK-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen resolved SPARK-4533.
-------------------------------
Resolution: Invalid
AFAIK this is not a bug, since calling {{keyBy()}} on a SchemaRDD returns a non-schema RDD. Therefore, I'm marking this issue as "Invalid" and closing it.
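To illustrate the point, here is a minimal toy model in plain Python. It is a sketch only: the RDD and SchemaRDD classes below are hypothetical stand-ins written for this example, not PySpark's classes, and the list-backed storage and the isinstance check are assumptions made so the behaviour can be run without a Spark installation. It shows keyBy() and map() handing back a plain RDD, and subtract() on the schema type rejecting it with the message from this issue.

```python
# Toy model only: these classes are NOT PySpark's. They imitate the behaviour
# described in this issue so it can be run without Spark.

class RDD(object):
    """Hypothetical stand-in for a plain (non-schema) RDD backed by a list."""

    def __init__(self, rows):
        self.rows = list(rows)

    def keyBy(self, f):
        # Pairs each row with a key; the result is always a plain RDD.
        return RDD((f(row), row) for row in self.rows)

    def map(self, f):
        # Also returns a plain RDD, whatever the type of self.
        return RDD(f(row) for row in self.rows)

    def subtract(self, other):
        # A plain RDD accepts any other RDD.
        return RDD(row for row in self.rows if row not in other.rows)


class SchemaRDD(RDD):
    """Hypothetical stand-in for a SchemaRDD: rows plus column names."""

    def __init__(self, rows, schema):
        RDD.__init__(self, rows)
        self.schema = schema

    def subtract(self, other):
        # Assumed form of the validation reported in the issue.
        if not isinstance(other, SchemaRDD):
            raise ValueError("Can only subtract another SchemaRDD")
        kept = [row for row in self.rows if row not in other.rows]
        return SchemaRDD(kept, self.schema)


a = SchemaRDD([("x", 1), ("y", 2)], ["name", "value"])
b = SchemaRDD([("y", 2)], ["name", "value"])

keyed = b.keyBy(lambda row: None)          # plain RDD of (None, row) pairs
unkeyed = keyed.map(lambda pair: pair[1])  # rows again, but still a plain RDD

same_type = a.subtract(b)                  # both are SchemaRDDs: allowed

try:
    a.subtract(unkeyed)                    # schema type minus plain type
    error = None
except ValueError as exc:
    error = str(exc)
```

In this toy model the rows in unkeyed are identical to those in b, yet the subtraction fails because the schema information was dropped by keyBy(), which is the situation in the reproduce steps quoted below.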
> SchemaRDD API error: Can only subtract another SchemaRDD
> --------------------------------------------------------
>
> Key: SPARK-4533
> URL: https://issues.apache.org/jira/browse/SPARK-4533
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.1.0
> Environment: JDK6/7
> Reporter: Shawn Guo
> Priority: Minor
>
> There are two unexpected validations in the SchemaRDD APIs below.
> subtract(self, other, numPartitions=None)
> "Can only subtract another SchemaRDD"
> intersection(self, other)
> "Can only intersect with another SchemaRDD"
> "Can only subtract another SchemaRDD" is raised when a SchemaRDD subtracts any other type of RDD.
> Steps to reproduce:
> A = SchemaRDD
> B = SchemaRDD
> A_APX = A.keyBy(lambda line: None)
> B_APX = B.keyBy(lambda line: None)
> {color:red}
> CROSSED = A_APX.join(B_APX).map(lambda line: line[1]).filter(filter condition).map(lambda line: line[0])
> {color}
> C = A.subtract(CROSSED) {color:red}# ERROR: Can only subtract another SchemaRDD{color}