Posted to issues@spark.apache.org by "Nattavut Sutyanyong (JIRA)" <ji...@apache.org> on 2017/01/05 16:33:58 UTC

[jira] [Created] (SPARK-19086) Improper scoping of name resolution of columns in HAVING clause

Nattavut Sutyanyong created SPARK-19086:
-------------------------------------------

             Summary: Improper scoping of name resolution of columns in HAVING clause
                 Key: SPARK-19086
                 URL: https://issues.apache.org/jira/browse/SPARK-19086
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.1.0
            Reporter: Nattavut Sutyanyong
            Priority: Minor


There seems to be a problem with the scoping of column name resolution in a HAVING clause.

The following scenario reproduces the problem:
{code}
// A simplified version of TC 01.13 from PR-16337
Seq((1,1,1)).toDF("t1a", "t1b", "t1c").createOrReplaceTempView("t1")
Seq((1,1,1)).toDF("t2a", "t2b", "t2c").createOrReplaceTempView("t2")

// This is okay: the query fails as expected.
// Error: t2c is unresolved
sql("select t2a from t2 group by t2a having t2c = 8").show

// This is okay as t2c is resolved to the t2 on the parent side
// because t2 in the subquery does not output column t2c.
sql("select * from t2 where t2a in (select t2a from (select t2a from t2) t2 group by t2a having t2c = 8)").explain(true)

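// This is the problem: t2c is resolved to the outer t2 (see the Filter in the plan below)
// even though the t2 in the subquery has a column t2c of its own.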
sql("select * from t2 where t2a in (select t2a from t2 group by t2a having t2c = 8)").explain(true)

== Analyzed Logical Plan ==
t2a: int, t2b: int, t2c: int
Project [t2a#22, t2b#23, t2c#24]
+- Filter predicate-subquery#38 [(t2a#22 = t2a#22#49) && (t2c#24 = 8)]
   :  +- Project [t2a#22 AS t2a#22#49]
   :     +- Aggregate [t2a#22], [t2a#22]
   :        +- SubqueryAlias t2, `t2`
   :           +- Project [_1#18 AS t2a#22, _2#19 AS t2b#23, _3#20 AS t2c#24]
   :              +- LocalRelation [_1#18, _2#19, _3#20]
   +- SubqueryAlias t2, `t2`
      +- Project [_1#18 AS t2a#22, _2#19 AS t2b#23, _3#20 AS t2c#24]
         +- LocalRelation [_1#18, _2#19, _3#20]
{code}

The {{t2c}} in the subquery should not be resolved to the outer {{t2}} on the parent side. The analyzer should first try to resolve {{t2c}} against the {{t2}} in the subquery, which is its current scope, and then raise an exception, because {{t2c}} is not part of the output of the {{Aggregate}} operator below and so cannot be pulled up from it.
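
The intended rule can be sketched as a toy resolver in plain Scala. This is only an illustration of the scoping order described above, not Spark's analyzer code: {{Scope}}, {{Resolution}}, {{resolveHaving}} and the three example scopes are hypothetical names made up for this sketch, and the error messages are not the ones Spark produces.

```scala
// Toy model of scoped name resolution for a HAVING clause.
// Hypothetical names throughout; this does not use or mirror Spark's internals.

sealed trait Resolution
final case class Local(column: String) extends Resolution    // bound inside the subquery
final case class OuterRef(column: String) extends Resolution // correlated reference to the parent

final case class Scope(
    relationColumns: Set[String],          // columns output by the FROM-clause relation
    groupingColumns: Set[String],          // columns listed in GROUP BY
    outerColumns: Set[String] = Set.empty) // columns visible from the parent query, if any

object HavingScoping {
  // Resolve `name` for a HAVING predicate, innermost scope first.
  def resolveHaving(name: String, scope: Scope): Either[String, Resolution] =
    if (scope.groupingColumns.contains(name)) {
      Right(Local(name))
    } else if (scope.relationColumns.contains(name)) {
      // The name belongs to the current scope, so it must not fall through to
      // the parent; it is simply not available above the aggregate.
      Left(s"'$name' is in the current scope but is not a grouping column")
    } else if (scope.outerColumns.contains(name)) {
      Right(OuterRef(name))
    } else {
      Left(s"cannot resolve '$name'")
    }

  private val t2Columns = Set("t2a", "t2b", "t2c")

  // select t2a from t2 group by t2a having t2c = 8
  val topLevel: Scope = Scope(t2Columns, Set("t2a"))

  // ... in (select t2a from (select t2a from t2) t2 group by t2a having t2c = 8)
  val narrowedSubquery: Scope = Scope(Set("t2a"), Set("t2a"), t2Columns)

  // ... in (select t2a from t2 group by t2a having t2c = 8)
  val problemSubquery: Scope = Scope(t2Columns, Set("t2a"), t2Columns)
}
```

Under this sketch, {{resolveHaving("t2c", narrowedSubquery)}} yields an outer reference, which matches the second query in the scenario, while {{topLevel}} and {{problemSubquery}} both yield an error. The third query would therefore be rejected instead of being silently bound to the parent's {{t2}}.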


