Posted to commits@spark.apache.org by gu...@apache.org on 2019/03/26 00:26:14 UTC
[spark] branch master updated: [SPARK-27246][SQL] Add an assert on invalid Scalar subquery plan with no column

This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 0bc030c  [SPARK-27246][SQL] Add an assert on invalid Scalar subquery plan with no column
0bc030c is described below

commit 0bc030c859ddb340d34085010428261ad767f10a
Author: sandeep-katta <sa...@gmail.com>
AuthorDate: Tue Mar 26 09:25:57 2019 +0900

    [SPARK-27246][SQL] Add an assert on invalid Scalar subquery plan with no column
    
    ## What changes were proposed in this pull request?
    
    This PR adds an assert to `ScalarSubquery`'s `dataType`, because `dataType` can be called on its own before the analyzer has had a chance to throw an analysis exception.
    
    This was found while working on [SPARK-27088](https://issues.apache.org/jira/browse/SPARK-27088). That change calls `treeString` for logging purposes, and the test "scalar subquery with no column" in `AnalysisErrorSuite` failed with:
    
    ```
    Caused by: sbt.ForkMain$ForkError: java.util.NoSuchElementException: next on empty iterator
    	...
    	at scala.collection.mutable.ArrayOps$ofRef.head(ArrayOps.scala:198)
    	at org.apache.spark.sql.catalyst.expressions.ScalarSubquery.dataType(subquery.scala:251)
    	at org.apache.spark.sql.catalyst.expressions.Alias.dataType(namedExpressions.scala:163)
            ...
    	at org.apache.spark.sql.catalyst.trees.TreeNode.simpleString(TreeNode.scala:465)
            ...
    	at org.apache.spark.sql.catalyst.rules.RuleExecutor$PlanChangeLogger.logRule(RuleExecutor.scala:176)
    	at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:116)
    	...
    ```
    
    The cause is that `treeString`, invoked for logging, calls `dataType` on `ScalarSubquery`, and that test uses a plan with no columns. As a result, `NoSuchElementException` was thrown before the analysis check could run.
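    
    The difference between the two failure modes can be sketched without Spark. The snippet below is only an illustration: `Field`, `Schema`, `dataTypeUnchecked`, `dataTypeChecked` and `thrownBy` are hypothetical stand-ins, not Catalyst classes, and the only assumption is that the schema exposes its fields as an array, as `plan.schema.fields` does.
    
    ```scala
    object ScalarSubquerySketch {
      // Hypothetical, simplified stand-ins for a Catalyst schema; not Spark's real classes.
      final case class Field(name: String, dataType: String)
      final case class Schema(fields: Array[Field])

      // Pre-patch shape: `head` on an empty array throws NoSuchElementException.
      def dataTypeUnchecked(schema: Schema): String =
        schema.fields.head.dataType

      // Post-patch shape: assert first, so an empty schema fails with a descriptive message.
      def dataTypeChecked(schema: Schema): String = {
        assert(schema.fields.nonEmpty, "Scalar subquery should have only one column")
        schema.fields.head.dataType
      }

      // Runs `body` and returns the class name of whatever it throws, if anything.
      def thrownBy(body: => Any): Option[String] =
        try { body; None } catch { case t: Throwable => Some(t.getClass.getName) }

      def main(args: Array[String]): Unit = {
        val empty = Schema(Array.empty[Field])
        println(thrownBy(dataTypeUnchecked(empty)))
        println(thrownBy(dataTypeChecked(empty)))
      }
    }
    ```
    
    With an empty schema, the unchecked variant should report `java.util.NoSuchElementException` and the checked variant `java.lang.AssertionError`. Both still fail, but the assertion names the broken invariant instead of surfacing an iterator error from deep inside the collections library.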
    
    ## How was this patch tested?
    
    Manually tested.
    
    ```scala
    ScalarSubquery(LocalRelation()).treeString
    ```
    
    ```
    An exception or error caused a run to abort: assertion failed: Scala subquery should have only one column
    java.lang.AssertionError: assertion failed: Scala subquery should have only one column
    	at scala.Predef$.assert(Predef.scala:223)
    	at org.apache.spark.sql.catalyst.expressions.ScalarSubquery.dataType(subquery.scala:252)
    	at org.apache.spark.sql.catalyst.analysis.AnalysisErrorSuite.<init>(AnalysisErrorSuite.scala:116)
    	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    	at java.lang.Class.newInstance(Class.java:442)
    	at org.scalatest.tools.Runner$.genSuiteConfig(Runner.scala:1428)
    	at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$8(Runner.scala:1236)
    	at scala.collection.immutable.List.map(List.scala:286)
    	at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1235)
    ```
    
    Closes #24182 from sandeep-katta/subqueryissue.
    
    Authored-by: sandeep-katta <sa...@gmail.com>
    Signed-off-by: Hyukjin Kwon <gu...@apache.org>
---
 .../scala/org/apache/spark/sql/catalyst/expressions/subquery.scala   | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
index fc1caed..0431134 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/subquery.scala
@@ -248,7 +248,10 @@ case class ScalarSubquery(
     children: Seq[Expression] = Seq.empty,
     exprId: ExprId = NamedExpression.newExprId)
   extends SubqueryExpression(plan, children, exprId) with Unevaluable {
-  override def dataType: DataType = plan.schema.fields.head.dataType
+  override def dataType: DataType = {
+    assert(plan.schema.fields.nonEmpty, "Scalar subquery should have only one column")
+    plan.schema.fields.head.dataType
+  }
   override def nullable: Boolean = true
   override def withNewPlan(plan: LogicalPlan): ScalarSubquery = copy(plan = plan)
   override def toString: String = s"scalar-subquery#${exprId.id} $conditionString"

