Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2021/03/07 05:42:00 UTC
[jira] [Commented] (SPARK-34640) unable to access grouping column after groupBy
[ https://issues.apache.org/jira/browse/SPARK-34640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296740#comment-17296740 ]
Hyukjin Kwon commented on SPARK-34640:
--------------------------------------
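The grouped column is literally named {{s.a2}}, so the dot has to be escaped when you reference it. You can use backquotes inside {{$"..."}}: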
{code}
scala> df.groupBy($"s.a2").count.select($"`s.a2`").show()
+----+
|s.a2|
+----+
|  s3|
|  s1|
+----+
{code}
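A minimal sketch of two ways to reach the grouping column, assuming the Spark 3.1.x behaviour shown in the report, where the grouped column comes out named {{s.a2}}. The alias {{a2}}, the local session settings, and the value names {{byBackticks}} and {{byAlias}} are illustrative choices, not part of the issue:

```scala
import org.apache.spark.sql.SparkSession

// Same shape as the reporter's data: a struct column `s` with a field `a2`.
case class Sub(a2: String)
case class Top(a1: String, s: Sub)

// In spark-shell a session already exists and getOrCreate() simply returns it.
val spark = SparkSession.builder()
  .master("local[1]")            // assumption: local run, for illustration only
  .appName("SPARK-34640-sketch") // hypothetical app name
  .getOrCreate()
import spark.implicits._

val df = Seq(Top("r1", Sub("s1")), Top("r2", Sub("s3"))).toDF()

// Option 1: escape the dotted name with backquotes, as in the comment above.
// This relies on the grouped column being named "s.a2" (as reported for 3.1.1).
val byBackticks = df.groupBy($"s.a2").count().select($"`s.a2`")

// Option 2: alias the nested field while grouping, so the result column
// has a plain name and needs no escaping afterwards.
val byAlias = df.groupBy($"s.a2".as("a2")).count().select($"a2")

byBackticks.show()
byAlias.show()
```

The alias variant does not depend on how a given Spark version names the grouped column, which may make it the more portable of the two.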
> unable to access grouping column after groupBy
> ----------------------------------------------
>
> Key: SPARK-34640
> URL: https://issues.apache.org/jira/browse/SPARK-34640
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.1
> Reporter: Jiri Humpolicek
> Priority: Major
>
> When I group by a nested column, I am unable to reference it after the groupBy operation.
> Example:
> 1) Prepare a DataFrame with a nested column:
> {code:scala}
> case class Sub(a2: String)
> case class Top(a1: String, s: Sub)
> val s = Seq(
>   Top("r1", Sub("s1")),
>   Top("r2", Sub("s3"))
> )
> val df = s.toDF
> df.printSchema
> // root
> //  |-- a1: string (nullable = true)
> //  |-- s: struct (nullable = true)
> //  |    |-- a2: string (nullable = true)
> {code}
> 2) Try to access the grouping column after groupBy:
> {code:scala}
> df.groupBy($"s.a2").count.select('a2)
> // org.apache.spark.sql.AnalysisException: cannot resolve '`a2`' given input columns: [count, s.a2];
> {code}
>