Posted to issues@spark.apache.org by "Rex Xiong (Jira)" <ji...@apache.org> on 2020/05/07 17:53:00 UTC
[jira] [Created] (SPARK-31660) Dataset.joinWith supports JoinType object as input parameter
Rex Xiong created SPARK-31660:
---------------------------------
Summary: Dataset.joinWith supports JoinType object as input parameter
Key: SPARK-31660
URL: https://issues.apache.org/jira/browse/SPARK-31660
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.4.5
Reporter: Rex Xiong
The current Dataset.joinWith API accepts joinType only as a String; it does not accept a JoinType object.
I would prefer a JoinType object (like an enum) over a String: it leaves less room for typos and is more readable.
{code:scala}
def joinWith[U](other: Dataset[U], condition: Column, joinType: String): Dataset[(T, U)] = {
{code}
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
If I pass LeftOuter.sql as joinType, it throws an IllegalArgumentException, because LeftOuter.sql contains a space:
{code:scala}
case object LeftOuter extends JoinType {
  override def sql: String = "LEFT OUTER"
}
{code}
JoinType.apply, however, only removes underscores from its input; it does not handle white space:
{code:scala}
object JoinType {
  def apply(typ: String): JoinType = typ.toLowerCase(Locale.ROOT).replace("_", "") match {
    case "inner" => Inner
    case "outer" | "full" | "fullouter" => FullOuter
    case "leftouter" | "left" => LeftOuter
    case "rightouter" | "right" => RightOuter
    case "leftsemi" | "semi" => LeftSemi
    case "leftanti" | "anti" => LeftAnti
    case "cross" => Cross
    case _ =>
      val supported = Seq(
        "inner",
        "outer", "full", "fullouter", "full_outer",
        "leftouter", "left", "left_outer",
        "rightouter", "right", "right_outer",
        "leftsemi", "left_semi", "semi",
        "leftanti", "left_anti", "anti",
        "cross")
      throw new IllegalArgumentException(s"Unsupported join type '$typ'. " +
        "Supported join types include: " + supported.mkString("'", "', '", "'") + ".")
  }
}
{code}
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala
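To make the failure concrete without a Spark dependency, here is a minimal sketch that mirrors only the normalization step of JoinType.apply. The object name JoinTypeNormalizationDemo and the helpers normalize and isAccepted are hypothetical, introduced for illustration; the accepted names are copied from the match cases quoted above.

```scala
import java.util.Locale

object JoinTypeNormalizationDemo {
  // Hypothetical stand-in for the normalization in JoinType.apply:
  // lowercase the input and remove underscores, nothing else.
  def normalize(typ: String): String =
    typ.toLowerCase(Locale.ROOT).replace("_", "")

  // Names that the quoted match expression accepts after normalization.
  val accepted: Set[String] = Set(
    "inner", "outer", "full", "fullouter",
    "leftouter", "left", "rightouter", "right",
    "leftsemi", "semi", "leftanti", "anti", "cross")

  def isAccepted(typ: String): Boolean = accepted.contains(normalize(typ))

  def main(args: Array[String]): Unit = {
    println(isAccepted("left_outer")) // true: the underscore is removed
    println(isAccepted("LEFT OUTER")) // false: the space survives as "left outer"
  }
}
```

In the real method, the second input would fall through to the default case and raise the IllegalArgumentException.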
I suggest we either add a set of overloads that take a JoinType instead of a String, or change JoinType.apply to strip white space as well.
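The second option could look roughly like the sketch below. It is not the Spark implementation: WhitespaceTolerantJoinType, its parse method, and the reduced set of case objects are simplified, hypothetical stand-ins so the example runs on its own. The only behavioral change from the quoted code is that white space is removed along with underscores.

```scala
import java.util.Locale

object WhitespaceTolerantJoinType {
  // Simplified stand-ins for a subset of the catalyst JoinType objects.
  sealed trait JoinType { def sql: String }
  case object Inner extends JoinType { val sql = "INNER" }
  case object LeftOuter extends JoinType { val sql = "LEFT OUTER" }
  case object RightOuter extends JoinType { val sql = "RIGHT OUTER" }
  case object FullOuter extends JoinType { val sql = "FULL OUTER" }

  // Hypothetical variant of JoinType.apply: strips underscores AND white space
  // before matching, so the output of `sql` can be parsed back.
  def parse(typ: String): JoinType =
    typ.toLowerCase(Locale.ROOT).replaceAll("[_\\s]", "") match {
      case "inner" => Inner
      case "leftouter" | "left" => LeftOuter
      case "rightouter" | "right" => RightOuter
      case "outer" | "full" | "fullouter" => FullOuter
      case _ =>
        throw new IllegalArgumentException(s"Unsupported join type '$typ'.")
    }

  def main(args: Array[String]): Unit = {
    println(parse(LeftOuter.sql)) // LeftOuter
    println(parse("left_outer"))  // LeftOuter
  }
}
```

With the current API, a caller-side workaround would be to pass LeftOuter.sql.replace(" ", "_"): the quoted apply lowercases "LEFT_OUTER" and removes the underscore, which yields "leftouter".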