Posted to issues@spark.apache.org by "Tofigh (JIRA)" <ji...@apache.org> on 2016/06/30 09:24:10 UTC
[jira] [Created] (SPARK-16325) reduceByKey requires an implicit ordering which it never uses
Tofigh created SPARK-16325:
------------------------------
Summary: reduceByKey requires an implicit ordering which it never uses
Key: SPARK-16325
URL: https://issues.apache.org/jira/browse/SPARK-16325
Project: Spark
Issue Type: Bug
Reporter: Tofigh
Priority: Minor
Assume a case class as follows:

    case class UnorderedPair[A](left: A, right: A) extends Serializable {
      override def equals(obj: Any): Boolean = obj match {
        case other: UnorderedPair[_] =>
          (this.left == other.left && this.right == other.right) ||
            (this.left == other.right && this.right == other.left)
        case _ => false
      }
      override def hashCode(): Int = left.hashCode() * right.hashCode()
      def toSeq(): Seq[A] = Seq(left, right)
    }
and an RDD of (UnorderedPair, Seq[Long]) pairs:

    val rdd = sc.parallelize(Seq(
      (UnorderedPair(12, 14), Seq(123L)),
      (UnorderedPair(12, 14), Seq(123L))
    ))

Then the following code:

    rdd.reduceByKey(_ ++ _)

fails with an error that an implicit Ordering is required.
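One way to satisfy the compiler without changing the class hierarchy is to put an arbitrary implicit Ordering in the companion object. This is a sketch, not part of the original report; the Spark RDD types are omitted so the snippet stands alone, and the Ordering itself is deliberately meaningless because reduceByKey only needs it to exist:

```scala
// Sketch: an unordered key whose implicit Ordering lives in the companion
// object, so reduceByKey's implicit lookup succeeds without extending Ordered.
case class UnorderedPair[A](left: A, right: A) extends Serializable {
  override def equals(obj: Any): Boolean = obj match {
    case other: UnorderedPair[_] =>
      (this.left == other.left && this.right == other.right) ||
        (this.left == other.right && this.right == other.left)
    case _ => false
  }
  override def hashCode(): Int = left.hashCode() * right.hashCode()
}

object UnorderedPair {
  // Ordering by hashCode is arbitrary; it merely makes the implicit
  // resolvable. reduceByKey never actually compares keys with it.
  implicit def ordering[A]: Ordering[UnorderedPair[A]] =
    Ordering.by(_.hashCode())
}
```

With the implicit in scope, `implicitly[Ordering[UnorderedPair[Int]]]` resolves, and the reduceByKey call above compiles unchanged.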
A crude workaround was to rewrite the case class as follows:

    case class UnorderedPair[A](left: A, right: A)
        extends Ordered[UnorderedPair[A]] with Serializable {
      override def equals(obj: Any): Boolean = obj match {
        case other: UnorderedPair[_] =>
          (this.left == other.left && this.right == other.right) ||
            (this.left == other.right && this.right == other.left)
        case _ => false
      }
      override def hashCode(): Int = left.hashCode() * right.hashCode()
      def toSeq(): Seq[A] = Seq(left, right)
      override def compare(that: UnorderedPair[A]): Int =
        throw new UnsupportedOperationException(
          "This method should never be called. It exists only as a workaround " +
          "for a Spark bug: reduceByKey requires an Ordering it never uses.")
    }
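For reference, the result the reduceByKey call is expected to produce can be simulated on a plain Scala collection, with no Spark required. This sketch is not from the original report; it uses two keys written in opposite order to show that they collapse into one group under the unordered equality:

```scala
// Sketch: simulate rdd.reduceByKey(_ ++ _) with groupBy on a local Seq.
case class UnorderedPair[A](left: A, right: A) extends Serializable {
  override def equals(obj: Any): Boolean = obj match {
    case other: UnorderedPair[_] =>
      (this.left == other.left && this.right == other.right) ||
        (this.left == other.right && this.right == other.left)
    case _ => false
  }
  override def hashCode(): Int = left.hashCode() * right.hashCode()
}

val pairs = Seq(
  (UnorderedPair(12, 14), Seq(123L)),
  (UnorderedPair(14, 12), Seq(456L))  // same key, written in reverse
)

// groupBy uses the overridden equals/hashCode, just as a hash-based
// shuffle would, so the two rows land in one group.
val reduced = pairs
  .groupBy(_._1)
  .map { case (k, vs) => k -> vs.map(_._2).reduce(_ ++ _) }
// one entry: the two rows collapse under the unordered key
```

No Ordering is involved anywhere in this computation, which is the point of the report: the implicit that reduceByKey demands is never exercised.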
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org