Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2021/03/07 05:42:00 UTC

[jira] [Commented] (SPARK-34640) unable to access grouping column after groupBy

    [ https://issues.apache.org/jira/browse/SPARK-34640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296740#comment-17296740 ] 

Hyukjin Kwon commented on SPARK-34640:
--------------------------------------

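After the groupBy, the grouping column is literally named `s.a2` (see the "given input columns" list in the error below), so you can quote the whole name with backquotes inside `$"..."`: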

{code}
scala> df.groupBy($"s.a2").count.select($"`s.a2`").show()
+----+
|s.a2|
+----+
|  s3|
|  s1|
+----+
{code}
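
For completeness, here is a minimal, self-contained sketch of the same workaround next to one possible alternative: aliasing the grouping column so that no quoting is needed afterwards. The alias approach is not from this thread, and the local `SparkSession` setup, the object name and the app name are assumptions for illustration only (in spark-shell, `spark` and its implicits are already available).

```scala
import org.apache.spark.sql.SparkSession

// Same shape as the reporter's data: a struct column `s` with a field `a2`.
case class Sub(a2: String)
case class Top(a1: String, s: Sub)

// Hypothetical object name, used only for this sketch.
object GroupByNestedColumnSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session; adjust master/appName for a real cluster.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-34640-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(Top("r1", Sub("s1")), Top("r2", Sub("s3"))).toDF()

    // Group by the nested field, as in the report.
    val grouped = df.groupBy($"s.a2").count()
    // Print the output column names to see what the grouping column is called.
    println(grouped.columns.mkString(", "))

    // Workaround from the comment above: backquotes make Spark treat
    // "s.a2" as one column name rather than field `a2` of a column `s`.
    grouped.select($"`s.a2`").show()

    // Alternative (assumption, not from the thread): give the grouping
    // column a dot-free alias so it can be referenced without quoting.
    val aliased = df.groupBy($"s.a2".as("a2")).count()
    aliased.select($"a2").show()

    spark.stop()
  }
}
```

Whether the alias or the backquoted name reads better probably depends on how often the column is referenced downstream; the alias avoids repeating the quoting in every later `select`.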

> unable to access grouping column after groupBy
> ----------------------------------------------
>
>                 Key: SPARK-34640
>                 URL: https://issues.apache.org/jira/browse/SPARK-34640
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Jiri Humpolicek
>            Priority: Major
>
> When I group by a nested column, I am unable to reference it after the groupBy operation.
>  Example:
>  1) Preparing dataframe with nested column:
> {code:scala}
> case class Sub(a2: String)
> case class Top(a1: String, s: Sub)
> val s = Seq(
>     Top("r1", Sub("s1")),
>     Top("r2", Sub("s3"))
> )
> val df = s.toDF
> df.printSchema
> // root
> //  |-- a1: string (nullable = true)
> //  |-- s: struct (nullable = true)
> //  |    |-- a2: string (nullable = true)
> {code}
> 2) Trying to access the grouping column after groupBy:
> {code:scala}
> df.groupBy($"s.a2").count.select('a2)
> // org.apache.spark.sql.AnalysisException: cannot resolve '`a2`' given input columns: [count, s.a2];
> {code}
>  


