Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2020/03/16 01:58:00 UTC
[jira] [Commented] (SPARK-31123) Drop does not work after join with aliases
[ https://issues.apache.org/jira/browse/SPARK-31123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17059889#comment-17059889 ]
L. C. Hsieh commented on SPARK-31123:
-------------------------------------
I tested with the current master branch. It looks like this is resolved:
{code:java}
scala> joined.show
+---+---+---+---+
|  a|dup|  b|dup|
+---+---+---+---+
|  a|dup|  a|dup|
+---+---+---+---+
scala> joined.drop(dupCol).show
+---+---+---+
|  a|  b|dup|
+---+---+---+
|  a|  a|dup|
+---+---+---+
{code}
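To re-check this on another Spark version outside the shell, a standalone version of the reproduction might look like the sketch below. It is only a sketch under some assumptions: the object name DropAfterJoinRepro and the build helper are hypothetical, tuples stand in for the Record and Record2 case classes, df2 is assumed to hold the single row ("a", "dup"), and a local master is used. The result of drop(dupCol) depends on the Spark version, so the code prints the column lists rather than asserting them.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical object name; not part of Spark or of the original report.
object DropAfterJoinRepro {

  // Builds the source DataFrame and the aliased join described in the report.
  // Tuples replace the Record/Record2 case classes; df2's contents are assumed.
  def build(spark: SparkSession): (DataFrame, DataFrame) = {
    import spark.implicits._
    val df  = Seq(("a", "dup")).toDF("a", "dup")
    val df2 = Seq(("a", "dup")).toDF("b", "dup")
    val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
    (df, joined)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]") // assumption: local run, no cluster
      .appName("SPARK-31123-repro")
      .getOrCreate()

    val (df, joined) = build(spark)

    // Column taken from the original DataFrame: the case reported as failing on 2.4.2.
    println(joined.drop(df("dup")).columns.mkString(", "))

    // Column referenced through the join alias: the case reported as working.
    println(joined.drop(col("a.dup")).columns.mkString(", "))

    spark.stop()
  }
}
```

If the first line still lists four columns on a given version, the drop by original-DataFrame column is not taking effect there, and the alias-qualified form in the second line is the one the report describes as working.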
> Drop does not work after join with aliases
> ------------------------------------------
>
> Key: SPARK-31123
> URL: https://issues.apache.org/jira/browse/SPARK-31123
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.2
> Reporter: Mikel San Vicente
> Priority: Major
>
>
> Hi,
> I am seeing really strange behaviour in the drop method after a join with aliases. It doesn't seem to find the column when I refer to it using the dataframe("columnName") syntax, but it does work with other combinators like select.
> {code:java}
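> // Assumed import: the report calls func.col without showing where func comes from
> import org.apache.spark.sql.{functions => func}
> case class Record(a: String, dup: String)
> case class Record2(b: String, dup: String)
> val df = Seq(Record("a", "dup")).toDF
> // df2 was not defined in the report; inferred from Record2 and the output shown in the comment
> val df2 = Seq(Record2("a", "dup")).toDF
> val joined = df.alias("a").join(df2.alias("b"), df("a") === df2("b"))
> val dupCol = df("dup")
> joined.drop(dupCol) // Does not drop anything
> joined.drop(func.col("a.dup")) // It drops the column
> joined.select(dupCol) // It selects the column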
> {code}
>
>
>