Posted to reviews@spark.apache.org by hvanhovell <gi...@git.apache.org> on 2016/01/05 02:37:43 UTC

[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move Core Pars...

GitHub user hvanhovell opened a pull request:

    https://github.com/apache/spark/pull/10583

    [SPARK-12573][SPARK-12574][SQL] Move Core Parser from Hive to Catalyst [WIP]

    This PR moves a major part of the new SQL parser to catalyst. Only the SQL-specific/Hive-specific parts remain in their respective sub-projects.
    
    The current PR is a WIP, and I am submitting it to get some testing in.
    
    cc @rxin 
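
    For reference, a minimal sketch of how the relocated parser would be driven
    (illustrative query; `CatalystQl` and `createPlan` are as defined in the diff below):

        import org.apache.spark.sql.catalyst.CatalystQl
        import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

        // Parse a SQL string into an unresolved Catalyst logical plan,
        // using the default SimpleParserConf.
        val parser = new CatalystQl()
        val plan: LogicalPlan = parser.createPlan("SELECT key, count(value) FROM src GROUP BY key")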

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hvanhovell/spark SPARK-12575

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10583.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10583
    
----
commit 774de5d1f8059a4746f2e83d3c3ed8fc4f782e37
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-01-02T23:19:06Z

    Factor out Hive dependencies.
    
    Factor out Hive dependencies - 2.
    
    Factor out hard-coded UDTFs; let the Hive function registry resolve generators.
    
    Split Ql into Catalyst/Spark/Hive part; move parser to catalyst
    
    Style.

commit 01ffaf12fb1ff8dbf813a9484de5ee0430d8c0f2
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-01-05T01:01:21Z

    Changes in Rollup/Cube

commit fb3b4a4c461391866bc12a51dd1e60eadeaff916
Author: Herman van Hovell <hv...@questtec.nl>
Date:   2016-01-05T01:30:03Z

    Updated deps. Make tests work again

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168955204
  
    **[Test build #2322 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2322/consoleFull)** for PR 10583 at commit [`fb3b4a4`](https://github.com/apache/spark/commit/fb3b4a4c461391866bc12a51dd1e60eadeaff916).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169129174
  
    **[Test build #48789 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48789/consoleFull)** for PR 10583 at commit [`157d178`](https://github.com/apache/spark/commit/157d1785a8362f17700f18106ec4aba5d70dc90f).



[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169007990
  
    Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48923378
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -0,0 +1,969 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.catalyst
    +
    +import java.sql.Date
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.analysis._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.Count
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.parser._
    +import org.apache.spark.sql.types._
    +import org.apache.spark.unsafe.types.CalendarInterval
    +import org.apache.spark.util.random.RandomSampler
    +
    +/**
    + * This class translates a HQL String to a Catalyst [[LogicalPlan]] or [[Expression]].
    + */
    +private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) {
    +  object Token {
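    +    // Matching a node against Token has a deliberate side effect: it records
    +    // the node's line/column as the current origin, so plan nodes built inside
    +    // the match pick up accurate position information.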
    +    def unapply(node: ASTNode): Some[(String, List[ASTNode])] = {
    +      CurrentOrigin.setPosition(node.line, node.positionInLine)
    +      node.pattern
    +    }
    +  }
    +
    +  // TODO: improve the parse error reporting so we don't need this regex anymore.
    +  val errorRegEx = "line (\\d+):(\\d+) (.*)".r
    +
    +  /**
    +   * Returns the AST for the given SQL string.
    +   */
    +  protected def getAst(sql: String): ASTNode = ParseDriver.parse(sql, conf)
    +
    +  /** Creates LogicalPlan for a given HiveQL string. */
    +  def createPlan(sql: String): LogicalPlan = {
    +    try {
    +      createPlan(sql, ParseDriver.parse(sql, conf))
    +    } catch {
    +      case pe: ParseException =>
    +        pe.getMessage match {
    +          case errorRegEx(line, start, message) =>
    +            throw new AnalysisException(message, Some(line.toInt), Some(start.toInt))
    +          case otherMessage =>
    +            throw new AnalysisException(otherMessage)
    +        }
    +      case e: MatchError => throw e
    +      case e: Exception =>
    +        throw new AnalysisException(e.getMessage)
    +      case e: NotImplementedError =>
    +        throw new AnalysisException(
    +          s"""
    +             |Unsupported language features in query: $sql
    +             |${getAst(sql).treeString}
    +             |$e
    +             |${e.getStackTrace.head}
    +          """.stripMargin)
    +    }
    +  }
    +
    +  protected def createPlan(sql: String, tree: ASTNode): LogicalPlan = nodeToPlan(tree)
    +
    +  def parseDdl(ddl: String): Seq[Attribute] = {
    +    val tree =
    +      try {
    +        getAst(ddl)
    +      } catch {
    +        case pe: ParseException =>
    +          throw new RuntimeException(s"Failed to parse ddl: '$ddl'", pe)
    +      }
    +    assert(tree.text == "TOK_CREATETABLE", "Only CREATE TABLE supported.")
    +    val tableOps = tree.children
    +    val colList = tableOps
    +      .find(_.text == "TOK_TABCOLLIST")
    +      .getOrElse(sys.error("No columnList!"))
    +
    +    colList.children.map(nodeToAttribute)
    +  }
    +
    +  protected def getClauses(
    +      clauseNames: Seq[String],
    +      nodeList: Seq[ASTNode]): Seq[Option[ASTNode]] = {
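    +    // Pull out the first match for each clause name, in order. Any node left
    +    // over once all names are consumed is an unhandled clause and fails below.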
    +    var remainingNodes = nodeList
    +    val clauses = clauseNames.map { clauseName =>
    +      val (matches, nonMatches) = remainingNodes.partition(_.text.toUpperCase == clauseName)
    +      remainingNodes = nonMatches ++ (if (matches.nonEmpty) matches.tail else Nil)
    +      matches.headOption
    +    }
    +
    +    if (remainingNodes.nonEmpty) {
    +      sys.error(
    +        s"""Unhandled clauses: ${remainingNodes.map(_.treeString).mkString("\n")}.
    +            |You are likely trying to use an unsupported Hive feature.""".stripMargin)
    +    }
    +    clauses
    +  }
    +
    +  protected def getClause(clauseName: String, nodeList: Seq[ASTNode]): ASTNode =
    +    getClauseOption(clauseName, nodeList).getOrElse(sys.error(
    +      s"Expected clause $clauseName missing from ${nodeList.map(_.treeString).mkString("\n")}"))
    +
    +  protected def getClauseOption(clauseName: String, nodeList: Seq[ASTNode]): Option[ASTNode] = {
    +    nodeList.filter { case ast: ASTNode => ast.text == clauseName } match {
    +      case Seq(oneMatch) => Some(oneMatch)
    +      case Seq() => None
    +      case _ => sys.error(s"Found multiple instances of clause $clauseName")
    +    }
    +  }
    +
    +  protected def nodeToAttribute(node: ASTNode): Attribute = node match {
    +    case Token("TOK_TABCOL", Token(colName, Nil) :: dataType :: Nil) =>
    +      AttributeReference(colName, nodeToDataType(dataType), nullable = true)()
    +    case _ =>
    +      noParseRule("Attribute", node)
    +  }
    +
    +  protected def nodeToDataType(node: ASTNode): DataType = node match {
    +    case Token("TOK_DECIMAL", precision :: scale :: Nil) =>
    +      DecimalType(precision.text.toInt, scale.text.toInt)
    +    case Token("TOK_DECIMAL", precision :: Nil) =>
    +      DecimalType(precision.text.toInt, 0)
    +    case Token("TOK_DECIMAL", Nil) => DecimalType.USER_DEFAULT
    +    case Token("TOK_BIGINT", Nil) => LongType
    +    case Token("TOK_INT", Nil) => IntegerType
    +    case Token("TOK_TINYINT", Nil) => ByteType
    +    case Token("TOK_SMALLINT", Nil) => ShortType
    +    case Token("TOK_BOOLEAN", Nil) => BooleanType
    +    case Token("TOK_STRING", Nil) => StringType
    +    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +    case Token("TOK_FLOAT", Nil) => FloatType
    +    case Token("TOK_DOUBLE", Nil) => DoubleType
    +    case Token("TOK_DATE", Nil) => DateType
    +    case Token("TOK_TIMESTAMP", Nil) => TimestampType
    +    case Token("TOK_BINARY", Nil) => BinaryType
    +    case Token("TOK_LIST", elementType :: Nil) => ArrayType(nodeToDataType(elementType))
    +    case Token("TOK_STRUCT", Token("TOK_TABCOLLIST", fields) :: Nil) =>
    +      StructType(fields.map(nodeToStructField))
    +    case Token("TOK_MAP", keyType :: valueType :: Nil) =>
    +      MapType(nodeToDataType(keyType), nodeToDataType(valueType))
    +    case _ =>
    +      noParseRule("DataType", node)
    +  }
    +
    +  protected def nodeToStructField(node: ASTNode): StructField = node match {
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case _ =>
    +      noParseRule("StructField", node)
    +  }
    +
    +  protected def extractTableIdent(tableNameParts: ASTNode): TableIdentifier = {
    +    tableNameParts.children.map {
    +      case Token(part, Nil) => cleanIdentifier(part)
    +    } match {
    +      case Seq(tableOnly) => TableIdentifier(tableOnly)
    +      case Seq(databaseName, table) => TableIdentifier(table, Some(databaseName))
    +      case other => sys.error("Hive only supports table names like 'tableName' " +
    +        s"or 'databaseName.tableName', found '$other'")
    +    }
    +  }
    +
    +  /**
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2, k3 GROUPING SETS((k1, k2), (k2))
    +   * is equivalent to
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2 UNION SELECT MAX(value) FROM src GROUP BY k2
    +   * Check the following link for details.
    +   *
    +https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
    +   *
    +   * The bitmask denotes which grouping expressions are valid for a grouping set;
    +   * the bitmask is also called the grouping id (`GROUPING__ID`, the virtual column in Hive).
    +   * e.g. In superset (k1, k2, k3), (bit 0: k1, bit 1: k2, and bit 2: k3), the grouping id of
    +   * GROUPING SETS (k1, k2) and (k2) should be 3 and 2 respectively.
    +   */
    +  protected def extractGroupingSet(children: Seq[ASTNode]): (Seq[Expression], Seq[Int]) = {
    +    val (keyASTs, setASTs) = children.partition {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", _) => false // grouping sets
    +      case _ => true // grouping keys
    +    }
    +
    +    val keys = keyASTs.map(nodeToExpr)
    +    val keyMap = keyASTs.zipWithIndex.toMap
    +
    +    val bitmasks: Seq[Int] = setASTs.map {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", null) => 0
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", columns) =>
    +        columns.foldLeft(0)((bitmap, col) => {
    +          val keyIndex = keyMap.find(_._1.treeEquals(col)).map(_._2)
    +          bitmap | 1 << keyIndex.getOrElse(
    +            throw new AnalysisException(s"${col.treeString} doesn't show up in the GROUP BY list"))
    +        })
    +      case _ => sys.error("Expect GROUPING SETS clause")
    +    }
    +
    +    (keys, bitmasks)
    +  }
    +
    +  protected def nodeToPlan(node: ASTNode): LogicalPlan = node match {
    +    case Token("TOK_QUERY", queryArgs @ Token("TOK_CTE" | "TOK_FROM" | "TOK_INSERT", _) :: _) =>
    +      val (fromClause: Option[ASTNode], insertClauses, cteRelations) =
    +        queryArgs match {
    +          case Token("TOK_CTE", ctes) :: Token("TOK_FROM", from) :: inserts =>
    +            val cteRelations = ctes.map { node =>
    +              val relation = nodeToRelation(node).asInstanceOf[Subquery]
    +              relation.alias -> relation
    +            }
    +            (Some(from.head), inserts, Some(cteRelations.toMap))
    +          case Token("TOK_FROM", from) :: inserts =>
    +            (Some(from.head), inserts, None)
    +          case Token("TOK_INSERT", _) :: Nil =>
    +            (None, queryArgs, None)
    +        }
    +
    +      // Return one query for each insert clause.
    +      val queries = insertClauses.map {
    +        case Token("TOK_INSERT", singleInsert) =>
    +          val (
    +            intoClause ::
    +              destClause ::
    +              selectClause ::
    +              selectDistinctClause ::
    +              whereClause ::
    +              groupByClause ::
    +              rollupGroupByClause ::
    +              cubeGroupByClause ::
    +              groupingSetsClause ::
    +              orderByClause ::
    +              havingClause ::
    +              sortByClause ::
    +              clusterByClause ::
    +              distributeByClause ::
    +              limitClause ::
    +              lateralViewClause ::
    +              windowClause :: Nil) = {
    +            getClauses(
    +              Seq(
    +                "TOK_INSERT_INTO",
    +                "TOK_DESTINATION",
    +                "TOK_SELECT",
    +                "TOK_SELECTDI",
    +                "TOK_WHERE",
    +                "TOK_GROUPBY",
    +                "TOK_ROLLUP_GROUPBY",
    +                "TOK_CUBE_GROUPBY",
    +                "TOK_GROUPING_SETS",
    +                "TOK_ORDERBY",
    +                "TOK_HAVING",
    +                "TOK_SORTBY",
    +                "TOK_CLUSTERBY",
    +                "TOK_DISTRIBUTEBY",
    +                "TOK_LIMIT",
    +                "TOK_LATERAL_VIEW",
    +                "WINDOW"),
    +              singleInsert)
    +          }
    +
    +          val relations = fromClause match {
    +            case Some(f) => nodeToRelation(f)
    +            case None => OneRowRelation
    +          }
    +
    +          val withWhere = whereClause.map { whereNode =>
    +            val Seq(whereExpr) = whereNode.children
    +            Filter(nodeToExpr(whereExpr), relations)
    +          }.getOrElse(relations)
    +
    +          val select = (selectClause orElse selectDistinctClause)
    +            .getOrElse(sys.error("No select clause."))
    +
    +          val transformation = nodeToTransformation(select.children.head, withWhere)
    +
    +          val withLateralView = lateralViewClause.map { lv =>
    +            nodeToGenerate(lv.children.head, outer = false, withWhere)
    +          }.getOrElse(withWhere)
    +
    +          // The projection of the query can either be a normal projection, an aggregation
    +          // (if there is a group by) or a script transformation.
    +          val withProject: LogicalPlan = transformation.getOrElse {
    +            val selectExpressions =
    +              select.children.flatMap(selExprNodeToExpr).map(UnresolvedAlias(_))
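    +            // Each grouping clause below contributes an Option; flatten.head
    +            // picks the first defined one, falling back to a plain Project.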
    +            Seq(
    +              groupByClause.map(e => e match {
    +                case Token("TOK_GROUPBY", children) =>
    +                  // Not a transformation so must be either project or aggregation.
    +                  Aggregate(children.map(nodeToExpr), selectExpressions, withLateralView)
    +                case _ => sys.error("Expect GROUP BY")
    +              }),
    +              groupingSetsClause.map(e => e match {
    +                case Token("TOK_GROUPING_SETS", children) =>
    +                  val(groupByExprs, masks) = extractGroupingSet(children)
    +                  GroupingSets(masks, groupByExprs, withLateralView, selectExpressions)
    +                case _ => sys.error("Expect GROUPING SETS")
    +              }),
    +              rollupGroupByClause.map(e => e match {
    +                case Token("TOK_ROLLUP_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Rollup(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH ROLLUP")
    +              }),
    +              cubeGroupByClause.map(e => e match {
    +                case Token("TOK_CUBE_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Cube(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH CUBE")
    +              }),
    +              Some(Project(selectExpressions, withLateralView))).flatten.head
    +          }
    +
    +          // Handle HAVING clause.
    +          val withHaving = havingClause.map { h =>
    +            val havingExpr = h.children match { case Seq(hexpr) => nodeToExpr(hexpr) }
    +            // Note that we added a cast to boolean. If the expression itself is already boolean,
    +            // the optimizer will get rid of the unnecessary cast.
    +            Filter(Cast(havingExpr, BooleanType), withProject)
    +          }.getOrElse(withProject)
    +
    +          // Handle SELECT DISTINCT
    +          val withDistinct =
    +            if (selectDistinctClause.isDefined) Distinct(withHaving) else withHaving
    +
    +          // Handle ORDER BY, SORT BY, DISTRIBUTE BY, and CLUSTER BY clause.
    +          val withSort =
    +            (orderByClause, sortByClause, distributeByClause, clusterByClause) match {
    +              case (Some(totalOrdering), None, None, None) =>
    +                Sort(totalOrdering.children.map(nodeToSortOrder), global = true, withDistinct)
    +              case (None, Some(perPartitionOrdering), None, None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder),
    +                  global = false, withDistinct)
    +              case (None, None, Some(partitionExprs), None) =>
    +                RepartitionByExpression(
    +                  partitionExprs.children.map(nodeToExpr), withDistinct)
    +              case (None, Some(perPartitionOrdering), Some(partitionExprs), None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder), global = false,
    +                  RepartitionByExpression(
    +                    partitionExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, Some(clusterExprs)) =>
    +                Sort(
    +                  clusterExprs.children.map(nodeToExpr).map(SortOrder(_, Ascending)),
    +                  global = false,
    +                  RepartitionByExpression(
    +                    clusterExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, None) => withDistinct
    +              case _ => sys.error("Unsupported set of ordering / distribution clauses.")
    +            }
    +
    +          val withLimit =
    +            limitClause.map(l => nodeToExpr(l.children.head))
    +              .map(Limit(_, withSort))
    +              .getOrElse(withSort)
    +
    +          // Collect all window specifications defined in the WINDOW clause.
    +          val windowDefinitions = windowClause.map(_.children.collect {
    +            case Token("TOK_WINDOWDEF",
    +            Token(windowName, Nil) :: Token("TOK_WINDOWSPEC", spec) :: Nil) =>
    +              windowName -> nodesToWindowSpecification(spec)
    +          }.toMap)
    +          // Handle cases like
    +          // window w1 as (partition by p_mfgr order by p_name
    +          //               range between 2 preceding and 2 following),
    +          //        w2 as w1
    +          val resolvedCrossReference = windowDefinitions.map {
    +            windowDefMap => windowDefMap.map {
    +              case (windowName, WindowSpecReference(other)) =>
    +                (windowName, windowDefMap(other).asInstanceOf[WindowSpecDefinition])
    +              case o => o.asInstanceOf[(String, WindowSpecDefinition)]
    +            }
    +          }
    +
    +          val withWindowDefinitions =
    +            resolvedCrossReference.map(WithWindowDefinition(_, withLimit)).getOrElse(withLimit)
    +
    +          // TOK_INSERT_INTO means to add files to the table.
    +          // TOK_DESTINATION means to overwrite the table.
    +          val resultDestination =
    +            (intoClause orElse destClause).getOrElse(sys.error("No destination found."))
    +          val overwrite = intoClause.isEmpty
    +          nodeToDest(
    +            resultDestination,
    +            withWindowDefinitions,
    +            overwrite)
    +      }
    +
    +      // If there are multiple INSERTs, just UNION them together into one query.
    +      val query = queries.reduceLeft(Union)
    +
    +      // Return a With plan if there is a CTE.
    +      cteRelations.map(With(query, _)).getOrElse(query)
    +
    +    // HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT
    +    case Token("TOK_UNIONALL", left :: right :: Nil) =>
    +      Union(nodeToPlan(left), nodeToPlan(right))
    +
    +    case _ =>
    +      noParseRule("Plan", node)
    +  }
    +
    +  val allJoinTokens = "(TOK_.*JOIN)".r
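    +  // The capture group is non-empty exactly when the lateral view token carries
    +  // an OUTER suffix; nodeToRelation uses this to set the `outer` flag.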
    +  val laterViewToken = "TOK_LATERAL_VIEW(.*)".r
    +  protected def nodeToRelation(node: ASTNode): LogicalPlan = {
    +    node match {
    +      case Token("TOK_SUBQUERY", query :: Token(alias, Nil) :: Nil) =>
    +        Subquery(cleanIdentifier(alias), nodeToPlan(query))
    +
    +      case Token(laterViewToken(isOuter), selectClause :: relationClause :: Nil) =>
    +        nodeToGenerate(
    +          selectClause,
    +          outer = isOuter.nonEmpty,
    +          nodeToRelation(relationClause))
    +
    +      /* All relations, possibly with aliases or sampling clauses. */
    +      case Token("TOK_TABREF", clauses) =>
    +        // If the last clause is not a token then it's the alias of the table.
    +        val (nonAliasClauses, aliasClause) =
    +          if (clauses.last.text.startsWith("TOK")) {
    +            (clauses, None)
    +          } else {
    +            (clauses.dropRight(1), Some(clauses.last))
    +          }
    +
    +        val (Some(tableNameParts) ::
    +          splitSampleClause ::
    +          bucketSampleClause :: Nil) = {
    +          getClauses(Seq("TOK_TABNAME", "TOK_TABLESPLITSAMPLE", "TOK_TABLEBUCKETSAMPLE"),
    +            nonAliasClauses)
    +        }
    +
    +        val tableIdent = extractTableIdent(tableNameParts)
    +        val alias = aliasClause.map { case Token(a, Nil) => cleanIdentifier(a) }
    +        val relation = UnresolvedRelation(tableIdent, alias)
    +
    +        // Apply sampling if requested.
    +        (bucketSampleClause orElse splitSampleClause).map {
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_ROWCOUNT", Nil) :: Token(count, Nil) :: Nil) =>
    +            Limit(Literal(count.toInt), relation)
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_PERCENT", Nil) :: Token(fraction, Nil) :: Nil) =>
    +            // The range of fraction accepted by Sample is [0, 1]. Because Hive's block sampling
    +            // function takes X PERCENT as the input and the range of X is [0, 100], we need to
    +            // adjust the fraction.
    +            require(
    +              fraction.toDouble >= (0.0 - RandomSampler.roundingEpsilon)
    +                && fraction.toDouble <= (100.0 + RandomSampler.roundingEpsilon),
    +              s"Sampling fraction ($fraction) must be on interval [0, 100]")
    +            Sample(0.0, fraction.toDouble / 100, withReplacement = false,
    +              (math.random * 1000).toInt,
    +              relation)
    +          case Token("TOK_TABLEBUCKETSAMPLE",
    +          Token(numerator, Nil) ::
    +            Token(denominator, Nil) :: Nil) =>
    +            val fraction = numerator.toDouble / denominator.toDouble
    +            Sample(0.0, fraction, withReplacement = false, (math.random * 1000).toInt, relation)
    +          case a =>
    +            noParseRule("Sampling", a)
    +        }.getOrElse(relation)
    +
    +      case Token(allJoinTokens(joinToken), relation1 :: relation2 :: other) =>
    +        if (!(other.size <= 1)) {
    +          sys.error(s"Unsupported join operation: $other")
    +        }
    +
    +        val joinType = joinToken match {
    +          case "TOK_JOIN" => Inner
    +          case "TOK_CROSSJOIN" => Inner
    +          case "TOK_RIGHTOUTERJOIN" => RightOuter
    +          case "TOK_LEFTOUTERJOIN" => LeftOuter
    +          case "TOK_FULLOUTERJOIN" => FullOuter
    +          case "TOK_LEFTSEMIJOIN" => LeftSemi
    +          case "TOK_UNIQUEJOIN" => noParseRule("Unique Join", node)
    +          case "TOK_ANTIJOIN" => noParseRule("Anti Join", node)
    +        }
    +        Join(nodeToRelation(relation1),
    +          nodeToRelation(relation2),
    +          joinType,
    +          other.headOption.map(nodeToExpr))
    +
    +      case _ =>
    +        noParseRule("Relation", node)
    +    }
    +  }
    +
    +  protected def nodeToSortOrder(node: ASTNode): SortOrder = node match {
    +    case Token("TOK_TABSORTCOLNAMEASC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Ascending)
    +    case Token("TOK_TABSORTCOLNAMEDESC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Descending)
    +    case _ =>
    +      noParseRule("SortOrder", node)
    +  }
    +
    +  val destinationToken = "TOK_DESTINATION|TOK_INSERT_INTO".r
    +  protected def nodeToDest(
    +      node: ASTNode,
    +      query: LogicalPlan,
    +      overwrite: Boolean): LogicalPlan = node match {
    +    case Token(destinationToken(),
    +    Token("TOK_DIR",
    +    Token("TOK_TMP_FILE", Nil) :: Nil) :: Nil) =>
    +      query
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = false)
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) ::
    +      Token("TOK_IFNOTEXISTS",
    +      ifNotExists) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = true)
    +
    +    case _ =>
    +      noParseRule("Destination", node)
    +  }
    +
    +  protected def selExprNodeToExpr(node: ASTNode): Option[Expression] = node match {
    +    case Token("TOK_SELEXPR", e :: Nil) =>
    +      Some(nodeToExpr(e))
    +
    +    case Token("TOK_SELEXPR", e :: Token(alias, Nil) :: Nil) =>
    +      Some(Alias(nodeToExpr(e), cleanIdentifier(alias))())
    +
    +    case Token("TOK_SELEXPR", e :: aliasChildren) =>
    +      val aliasNames = aliasChildren.collect {
    +        case Token(name, Nil) => cleanIdentifier(name)
    +      }
    +      Some(MultiAlias(nodeToExpr(e), aliasNames))
    +
    +    /* Hints are ignored */
    +    case Token("TOK_HINTLIST", _) => None
    +
    +    case _ =>
    +      noParseRule("Select", node)
    +  }
    +
    +  protected val escapedIdentifier = "`([^`]+)`".r
    +  protected val doubleQuotedString = "\"([^\"]+)\"".r
    +  protected val singleQuotedString = "'([^']+)'".r
    +
    +  protected def unquoteString(str: String) = str match {
    +    case singleQuotedString(s) => s
    +    case doubleQuotedString(s) => s
    +    case other => other
    +  }
    +
    +  /** Strips backticks from ident if present */
    +  protected def cleanIdentifier(ident: String): String = ident match {
    +    case escapedIdentifier(i) => i
    +    case plainIdent => plainIdent
    +  }
    +
    +  val numericAstTypes = Seq(
    +    SparkSqlParser.Number,
    +    SparkSqlParser.TinyintLiteral,
    +    SparkSqlParser.SmallintLiteral,
    +    SparkSqlParser.BigintLiteral,
    +    SparkSqlParser.DecimalLiteral)
    +
    +  /* Case insensitive matches */
    +  val COUNT = "(?i)COUNT".r
    +  val SUM = "(?i)SUM".r
    +  val AND = "(?i)AND".r
    +  val OR = "(?i)OR".r
    +  val NOT = "(?i)NOT".r
    +  val TRUE = "(?i)TRUE".r
    +  val FALSE = "(?i)FALSE".r
    +  val LIKE = "(?i)LIKE".r
    +  val RLIKE = "(?i)RLIKE".r
    +  val REGEXP = "(?i)REGEXP".r
    +  val IN = "(?i)IN".r
    +  val DIV = "(?i)DIV".r
    +  val BETWEEN = "(?i)BETWEEN".r
    +  val WHEN = "(?i)WHEN".r
    +  val CASE = "(?i)CASE".r
    +
    +  protected def nodeToExpr(node: ASTNode): Expression = node match {
    +    /* Attribute References */
    +    case Token("TOK_TABLE_OR_COL", Token(name, Nil) :: Nil) =>
    +      UnresolvedAttribute.quoted(cleanIdentifier(name))
    +    case Token(".", qualifier :: Token(attr, Nil) :: Nil) =>
    +      nodeToExpr(qualifier) match {
    +        case UnresolvedAttribute(nameParts) =>
    +          UnresolvedAttribute(nameParts :+ cleanIdentifier(attr))
    +        case other => UnresolvedExtractValue(other, Literal(attr))
    +      }
    +
    +    /* Stars (*) */
    +    case Token("TOK_ALLCOLREF", Nil) => UnresolvedStar(None)
    +    // The format of dbName.tableName.* cannot be parsed by HiveParser. TOK_TABNAME will only
    +    // have a single child, which is tableName.
    +    case Token("TOK_ALLCOLREF", Token("TOK_TABNAME", Token(name, Nil) :: Nil) :: Nil) =>
    +      UnresolvedStar(Some(UnresolvedAttribute.parseAttributeName(name)))
    +
    +    /* Aggregate Functions */
    +    case Token("TOK_FUNCTIONDI", Token(COUNT(), Nil) :: args) =>
    +      Count(args.map(nodeToExpr)).toAggregateExpression(isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(COUNT(), Nil) :: Nil) =>
    +      Count(Literal(1)).toAggregateExpression()
    +
    +    /* Casts */
    +    case Token("TOK_FUNCTION", Token("TOK_STRING", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_VARCHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_CHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_INT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), IntegerType)
    +    case Token("TOK_FUNCTION", Token("TOK_BIGINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), LongType)
    +    case Token("TOK_FUNCTION", Token("TOK_FLOAT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), FloatType)
    +    case Token("TOK_FUNCTION", Token("TOK_DOUBLE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DoubleType)
    +    case Token("TOK_FUNCTION", Token("TOK_SMALLINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ShortType)
    +    case Token("TOK_FUNCTION", Token("TOK_TINYINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ByteType)
    +    case Token("TOK_FUNCTION", Token("TOK_BINARY", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BinaryType)
    +    case Token("TOK_FUNCTION", Token("TOK_BOOLEAN", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BooleanType)
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: scale :: nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, scale.text.toInt))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, 0))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType.USER_DEFAULT)
    +    case Token("TOK_FUNCTION", Token("TOK_TIMESTAMP", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), TimestampType)
    +    case Token("TOK_FUNCTION", Token("TOK_DATE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DateType)
    +
    +    /* Arithmetic */
    +    case Token("+", child :: Nil) => nodeToExpr(child)
    +    case Token("-", child :: Nil) => UnaryMinus(nodeToExpr(child))
    +    case Token("~", child :: Nil) => BitwiseNot(nodeToExpr(child))
    +    case Token("+", left :: right:: Nil) => Add(nodeToExpr(left), nodeToExpr(right))
    +    case Token("-", left :: right:: Nil) => Subtract(nodeToExpr(left), nodeToExpr(right))
    +    case Token("*", left :: right:: Nil) => Multiply(nodeToExpr(left), nodeToExpr(right))
    +    case Token("/", left :: right:: Nil) => Divide(nodeToExpr(left), nodeToExpr(right))
    +    case Token(DIV(), left :: right:: Nil) =>
    +      Cast(Divide(nodeToExpr(left), nodeToExpr(right)), LongType)
    +    case Token("%", left :: right:: Nil) => Remainder(nodeToExpr(left), nodeToExpr(right))
    +    case Token("&", left :: right:: Nil) => BitwiseAnd(nodeToExpr(left), nodeToExpr(right))
    +    case Token("|", left :: right:: Nil) => BitwiseOr(nodeToExpr(left), nodeToExpr(right))
    +    case Token("^", left :: right:: Nil) => BitwiseXor(nodeToExpr(left), nodeToExpr(right))
    +
    +    /* Comparisons */
    +    case Token("=", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("==", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=>", left :: right:: Nil) => EqualNullSafe(nodeToExpr(left), nodeToExpr(right))
    +    case Token("!=", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token("<>", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token(">", left :: right:: Nil) => GreaterThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token(">=", left :: right:: Nil) => GreaterThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<", left :: right:: Nil) => LessThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=", left :: right:: Nil) => LessThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token(LIKE(), left :: right:: Nil) => Like(nodeToExpr(left), nodeToExpr(right))
    +    case Token(RLIKE(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token(REGEXP(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNOTNULL", Nil) :: child :: Nil) =>
    +      IsNotNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNULL", Nil) :: child :: Nil) =>
    +      IsNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token(IN(), Nil) :: value :: list) =>
    +      In(nodeToExpr(value), list.map(nodeToExpr))
    +    case Token("TOK_FUNCTION",
    +    Token(BETWEEN(), Nil) ::
    +      kw ::
    +      target ::
    +      minValue ::
    +      maxValue :: Nil) =>
    +
    +      val targetExpression = nodeToExpr(target)
    +      val betweenExpr =
    +        And(
    +          GreaterThanOrEqual(targetExpression, nodeToExpr(minValue)),
    +          LessThanOrEqual(targetExpression, nodeToExpr(maxValue)))
    +      kw match {
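    +        // The keyword node encodes negation: KW_TRUE means NOT BETWEEN,
    +        // KW_FALSE means a plain BETWEEN.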
    +        case Token("KW_FALSE", Nil) => betweenExpr
    +        case Token("KW_TRUE", Nil) => Not(betweenExpr)
    +      }
    +
    +    /* Boolean Logic */
    +    case Token(AND(), left :: right:: Nil) => And(nodeToExpr(left), nodeToExpr(right))
    +    case Token(OR(), left :: right:: Nil) => Or(nodeToExpr(left), nodeToExpr(right))
    +    case Token(NOT(), child :: Nil) => Not(nodeToExpr(child))
    +    case Token("!", child :: Nil) => Not(nodeToExpr(child))
    +
    +    /* Case statements */
    +    case Token("TOK_FUNCTION", Token(WHEN(), Nil) :: branches) =>
    +      CaseWhen(branches.map(nodeToExpr))
    +    case Token("TOK_FUNCTION", Token(CASE(), Nil) :: branches) =>
    +      val keyExpr = nodeToExpr(branches.head)
    +      CaseKeyWhen(keyExpr, branches.drop(1).map(nodeToExpr))
    +
    +    /* Complex datatype manipulation */
    +    case Token("[", child :: ordinal :: Nil) =>
    +      UnresolvedExtractValue(nodeToExpr(child), nodeToExpr(ordinal))
    +
    +    /* Window Functions */
    +    case Token(text, args :+ Token("TOK_WINDOWSPEC", spec)) =>
    +      val function = nodeToExpr(node.copy(children = node.children.init))
    +      nodesToWindowSpecification(spec) match {
    +        case reference: WindowSpecReference =>
    +          UnresolvedWindowExpression(function, reference)
    +        case definition: WindowSpecDefinition =>
    +          WindowExpression(function, definition)
    +      }
    +
    +    /* UDFs - Must be last, otherwise they will preempt built-in functions. */
    +    case Token("TOK_FUNCTION", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = false)
    +    // Aggregate function with DISTINCT keyword.
    +    case Token("TOK_FUNCTIONDI", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, UnresolvedStar(None) :: Nil, isDistinct = false)
    +
    +    /* Literals */
    +    case Token("TOK_NULL", Nil) => Literal.create(null, NullType)
    +    case Token(TRUE(), Nil) => Literal.create(true, BooleanType)
    +    case Token(FALSE(), Nil) => Literal.create(false, BooleanType)
    +    case Token("TOK_STRINGLITERALSEQUENCE", strings) =>
    +      Literal(strings.map(s => ParseUtils.unescapeSQLString(s.text)).mkString)
    +
    +    // This code is adapted from
    +    // /ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java#L223
    +    case ast: ASTNode if numericAstTypes contains ast.tokenType =>
    +      var v: Literal = null
    +      try {
    +        if (ast.text.endsWith("L")) {
    +          // Literal bigint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toLong, LongType)
    +        } else if (ast.text.endsWith("S")) {
    +          // Literal smallint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toShort, ShortType)
    +        } else if (ast.text.endsWith("Y")) {
    +          // Literal tinyint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toByte, ByteType)
    +        } else if (ast.text.endsWith("BD") || ast.text.endsWith("D")) {
    +          // Literal decimal
    +          val strVal = ast.text.stripSuffix("D").stripSuffix("B")
    +          v = Literal(Decimal(strVal))
    +        } else {
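    +          // Try double, then long, then int; each success overwrites v with a
    +          // narrower literal, and the first NumberFormatException (swallowed
    +          // below) keeps the narrowest type that parsed successfully.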
    +          v = Literal.create(ast.text.toDouble, DoubleType)
    +          v = Literal.create(ast.text.toLong, LongType)
    +          v = Literal.create(ast.text.toInt, IntegerType)
    +        }
    +      } catch {
    +        case nfe: NumberFormatException => // Do nothing
    +      }
    +
    +      if (v == null) {
    +        sys.error(s"Failed to parse number '${ast.text}'.")
    +      } else {
    +        v
    +      }
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.StringLiteral =>
    +      Literal(ParseUtils.unescapeSQLString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_DATELITERAL =>
    +      Literal(Date.valueOf(ast.text.substring(1, ast.text.length - 1)))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_CHARSETLITERAL =>
    +      Literal(ParseUtils.charSetString(ast.children.head.text, ast.children(1).text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromYearMonthString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_TIME_LITERAL =>
    +      Literal(CalendarInterval.fromDayTimeString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("year", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("month", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("day", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_HOUR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("hour", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MINUTE_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("minute", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_SECOND_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("second", ast.text))
    +
    +    case _ =>
    +      noParseRule("Expression", node)
    +  }
    +
    +  /* Case insensitive matches for Window Specification */
    +  val PRECEDING = "(?i)preceding".r
    +  val FOLLOWING = "(?i)following".r
    +  val CURRENT = "(?i)current".r
    +  protected def nodesToWindowSpecification(nodes: Seq[ASTNode]): WindowSpec = nodes match {
    +    case Token(windowName, Nil) :: Nil =>
    +      // Refer to a window spec defined in the window clause.
    +      WindowSpecReference(windowName)
    +    case Nil =>
    +      // OVER()
    +      WindowSpecDefinition(
    +        partitionSpec = Nil,
    +        orderSpec = Nil,
    +        frameSpecification = UnspecifiedFrame)
    +    case spec =>
    +      val (partitionClause :: rowFrame :: rangeFrame :: Nil) =
    +        getClauses(
    +          Seq(
    +            "TOK_PARTITIONINGSPEC",
    +            "TOK_WINDOWRANGE",
    +            "TOK_WINDOWVALUES"),
    +          spec)
    +
    +      // Handle Partition By and Order By.
    +      val (partitionSpec, orderSpec) = partitionClause.map { partitionAndOrdering =>
    +        val (partitionByClause :: orderByClause :: sortByClause :: clusterByClause :: Nil) =
    +          getClauses(
    +            Seq("TOK_DISTRIBUTEBY", "TOK_ORDERBY", "TOK_SORTBY", "TOK_CLUSTERBY"),
    +            partitionAndOrdering.children)
    +
    +        (partitionByClause, orderByClause.orElse(sortByClause), clusterByClause) match {
    +          case (Some(partitionByExpr), Some(orderByExpr), None) =>
    +            (partitionByExpr.children.map(nodeToExpr),
    +              orderByExpr.children.map(nodeToSortOrder))
    +          case (Some(partitionByExpr), None, None) =>
    +            (partitionByExpr.children.map(nodeToExpr), Nil)
    +          case (None, Some(orderByExpr), None) =>
    +            (Nil, orderByExpr.children.map(nodeToSortOrder))
    +          case (None, None, Some(clusterByExpr)) =>
    +            val expressions = clusterByExpr.children.map(nodeToExpr)
    +            (expressions, expressions.map(SortOrder(_, Ascending)))
    +          case _ =>
    +            noParseRule("Partition & Ordering", partitionAndOrdering)
    +        }
    +      }.getOrElse {
    +        (Nil, Nil)
    +      }
    +
    +      // Handle Window Frame
    +      val windowFrame =
    +        if (rowFrame.isEmpty && rangeFrame.isEmpty) {
    +          UnspecifiedFrame
    +        } else {
    +          val frameType = rowFrame.map(_ => RowFrame).getOrElse(RangeFrame)
    +          def nodeToBoundary(node: ASTNode): FrameBoundary = node match {
    +            case Token(PRECEDING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedPreceding
    +              } else {
    +                ValuePreceding(count.toInt)
    +              }
    +            case Token(FOLLOWING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedFollowing
    +              } else {
    +                ValueFollowing(count.toInt)
    +              }
    +            case Token(CURRENT(), Nil) => CurrentRow
    +            case _ =>
    +              noParseRule("Window Frame Boundary", node)
    +          }
    +
    +          rowFrame.orElse(rangeFrame).map { frame =>
    +            frame.children match {
    +              case precedingNode :: followingNode :: Nil =>
    +                SpecifiedWindowFrame(
    +                  frameType,
    +                  nodeToBoundary(precedingNode),
    +                  nodeToBoundary(followingNode))
    +              case precedingNode :: Nil =>
    +                SpecifiedWindowFrame(frameType, nodeToBoundary(precedingNode), CurrentRow)
    +              case _ =>
    +                noParseRule("Window Frame", frame)
    +            }
    +          }.getOrElse(sys.error(s"If you see this, please file a bug report with your query."))
    +        }
    +
    +      WindowSpecDefinition(partitionSpec, orderSpec, windowFrame)
    +  }
    +
    +  protected def nodeToTransformation(
    +      node: ASTNode,
    +      child: LogicalPlan): Option[ScriptTransformation] = None
    +
    +  protected def nodeToGenerate(node: ASTNode, outer: Boolean, child: LogicalPlan): Generate = {
    +    val Token("TOK_SELECT", Token("TOK_SELEXPR", clauses) :: Nil) = node
    +
    +    val alias = getClause("TOK_TABALIAS", clauses).children.head.text
    +
    +    val generator = clauses.head match {
    +      case Token("TOK_FUNCTION", Token(functionName, Nil) :: children) =>
    +        UnresolvedGenerator(functionName, children.map(nodeToExpr))
    --- End diff --
    
    Is lateral view a Hive feature? Should we support it in Catalyst?



[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48988457
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -0,0 +1,961 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.catalyst
    +
    +import java.sql.Date
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.analysis._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.Count
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.parser._
    +import org.apache.spark.sql.types._
    +import org.apache.spark.unsafe.types.CalendarInterval
    +import org.apache.spark.util.random.RandomSampler
    +
    +/**
    + * This class translates a HQL String to a Catalyst [[LogicalPlan]] or [[Expression]].
    + */
    +private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) {
    +  object Token {
    +    def unapply(node: ASTNode): Some[(String, List[ASTNode])] = {
    +      CurrentOrigin.setPosition(node.line, node.positionInLine)
    +      node.pattern
    +    }
    +  }
    +
    +
    +  /**
    +   * Returns the AST for the given SQL string.
    +   */
    +  protected def getAst(sql: String): ASTNode = ParseDriver.parse(sql, conf)
    +
    +  /** Creates LogicalPlan for a given HiveQL string. */
    +  def createPlan(sql: String): LogicalPlan = {
    +    try {
    +      createPlan(sql, ParseDriver.parse(sql, conf))
    +    } catch {
    +      case e: MatchError => throw e
    +      case e: AnalysisException => throw e
    +      case e: Exception =>
    +        throw new AnalysisException(e.getMessage)
    +      case e: NotImplementedError =>
    +        throw new AnalysisException(
    +          s"""
    +             |Unsupported language features in query: $sql
    +             |${getAst(sql).treeString}
    +             |$e
    +             |${e.getStackTrace.head}
    +          """.stripMargin)
    +    }
    +  }
    +
    +  protected def createPlan(sql: String, tree: ASTNode): LogicalPlan = nodeToPlan(tree)
    +
    +  def parseDdl(ddl: String): Seq[Attribute] = {
    +    val tree = getAst(ddl)
    +    assert(tree.text == "TOK_CREATETABLE", "Only CREATE TABLE supported.")
    +    val tableOps = tree.children
    +    val colList = tableOps
    +      .find(_.text == "TOK_TABCOLLIST")
    +      .getOrElse(sys.error("No columnList!"))
    +
    +    colList.children.map(nodeToAttribute)
    +  }
    +
    +  protected def getClauses(
    +      clauseNames: Seq[String],
    +      nodeList: Seq[ASTNode]): Seq[Option[ASTNode]] = {
    +    var remainingNodes = nodeList
    +    val clauses = clauseNames.map { clauseName =>
    +      val (matches, nonMatches) = remainingNodes.partition(_.text.toUpperCase == clauseName)
    +      remainingNodes = nonMatches ++ (if (matches.nonEmpty) matches.tail else Nil)
    +      matches.headOption
    +    }
    +
    +    if (remainingNodes.nonEmpty) {
    +      sys.error(
    +        s"""Unhandled clauses: ${remainingNodes.map(_.treeString).mkString("\n")}.
    +            |You are likely trying to use an unsupported Hive feature.""".stripMargin)
    +    }
    +    clauses
    +  }
    +
    +  protected def getClause(clauseName: String, nodeList: Seq[ASTNode]): ASTNode =
    +    getClauseOption(clauseName, nodeList).getOrElse(sys.error(
    +      s"Expected clause $clauseName missing from ${nodeList.map(_.treeString).mkString("\n")}"))
    +
    +  protected def getClauseOption(clauseName: String, nodeList: Seq[ASTNode]): Option[ASTNode] = {
    +    nodeList.filter { case ast: ASTNode => ast.text == clauseName } match {
    +      case Seq(oneMatch) => Some(oneMatch)
    +      case Seq() => None
    +      case _ => sys.error(s"Found multiple instances of clause $clauseName")
    +    }
    +  }
    +
    +  protected def nodeToAttribute(node: ASTNode): Attribute = node match {
    +    case Token("TOK_TABCOL", Token(colName, Nil) :: dataType :: Nil) =>
    +      AttributeReference(colName, nodeToDataType(dataType), nullable = true)()
    +    case _ =>
    +      noParseRule("Attribute", node)
    +  }
    +
    +  protected def nodeToDataType(node: ASTNode): DataType = node match {
    +    case Token("TOK_DECIMAL", precision :: scale :: Nil) =>
    +      DecimalType(precision.text.toInt, scale.text.toInt)
    +    case Token("TOK_DECIMAL", precision :: Nil) =>
    +      DecimalType(precision.text.toInt, 0)
    +    case Token("TOK_DECIMAL", Nil) => DecimalType.USER_DEFAULT
    +    case Token("TOK_BIGINT", Nil) => LongType
    +    case Token("TOK_INT", Nil) => IntegerType
    +    case Token("TOK_TINYINT", Nil) => ByteType
    +    case Token("TOK_SMALLINT", Nil) => ShortType
    +    case Token("TOK_BOOLEAN", Nil) => BooleanType
    +    case Token("TOK_STRING", Nil) => StringType
    +    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +    case Token("TOK_FLOAT", Nil) => FloatType
    +    case Token("TOK_DOUBLE", Nil) => DoubleType
    +    case Token("TOK_DATE", Nil) => DateType
    +    case Token("TOK_TIMESTAMP", Nil) => TimestampType
    +    case Token("TOK_BINARY", Nil) => BinaryType
    +    case Token("TOK_LIST", elementType :: Nil) => ArrayType(nodeToDataType(elementType))
    +    case Token("TOK_STRUCT", Token("TOK_TABCOLLIST", fields) :: Nil) =>
    +      StructType(fields.map(nodeToStructField))
    +    case Token("TOK_MAP", keyType :: valueType :: Nil) =>
    +      MapType(nodeToDataType(keyType), nodeToDataType(valueType))
    +    case _ =>
    +      noParseRule("DataType", node)
    +  }
    +
    +  protected def nodeToStructField(node: ASTNode): StructField = node match {
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case _ =>
    +      noParseRule("StructField", node)
    +  }
    +
    +  protected def extractTableIdent(tableNameParts: ASTNode): TableIdentifier = {
    +    tableNameParts.children.map {
    +      case Token(part, Nil) => cleanIdentifier(part)
    +    } match {
    +      case Seq(tableOnly) => TableIdentifier(tableOnly)
    +      case Seq(databaseName, table) => TableIdentifier(table, Some(databaseName))
    +      case other => sys.error("Hive only supports tables names like 'tableName' " +
    +        s"or 'databaseName.tableName', found '$other'")
    +    }
    +  }
    +
    +  /**
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2, k3 GROUPING SETS((k1, k2), (k2))
    +   * is equivalent to
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2 UNION SELECT MAX(value) FROM src GROUP BY k2
    +   * Check the following link for details.
    +   *
    +https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
    +   *
    +   * The bitmask denotes the validity of each grouping expression within a grouping set;
    +   * the bitmask is also known as the grouping id (`GROUPING__ID`, the virtual column in Hive).
    +   * E.g. in the superset (k1, k2, k3) (bit 0: k1, bit 1: k2, and bit 2: k3), the grouping ids
    +   * of GROUPING SETS (k1, k2) and (k2) are 3 and 2 respectively.
    +   */
    +  protected def extractGroupingSet(children: Seq[ASTNode]): (Seq[Expression], Seq[Int]) = {
    +    val (keyASTs, setASTs) = children.partition {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", _) => false // grouping sets
    +      case _ => true // grouping keys
    +    }
    +
    +    val keys = keyASTs.map(nodeToExpr)
    +    val keyMap = keyASTs.zipWithIndex.toMap
    +
    +    val bitmasks: Seq[Int] = setASTs.map {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", null) => 0
    --- End diff --
    
    What do you mean?




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169049383
  
    **[Test build #48771 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48771/consoleFull)** for PR 10583 at commit [`43c29b7`](https://github.com/apache/spark/commit/43c29b7a2ba3598e50561a974dba1d763e90746c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169326184
  
    **[Test build #48853 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48853/consoleFull)** for PR 10583 at commit [`3680d4c`](https://github.com/apache/spark/commit/3680d4c4179d63584a42f6d33bbe3c718fa3075d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168931844
  
    **[Test build #2322 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/2322/consoleFull)** for PR 10583 at commit [`fb3b4a4`](https://github.com/apache/spark/commit/fb3b4a4c461391866bc12a51dd1e60eadeaff916).




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169126939
  
    retest this please




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168906958
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48709/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168964867
  
    **[Test build #48747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48747/consoleFull)** for PR 10583 at commit [`fb3b4a4`](https://github.com/apache/spark/commit/fb3b4a4c461391866bc12a51dd1e60eadeaff916).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48922609
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
    @@ -451,6 +452,19 @@ private[spark] object SQLConf {
         doc = "When true, we could use `datasource`.`path` as table in SQL query"
       )
     
    +  val PARSER_SUPPORT_QUOTEDID = stringConf("spark.sql.parser.supportQuotedIdentifiers",
    --- End diff --
    
    should this be a boolean conf?




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169122639
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169166339
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48789/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48995383
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -0,0 +1,961 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.catalyst
    +
    +import java.sql.Date
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.analysis._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.Count
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.parser._
    +import org.apache.spark.sql.types._
    +import org.apache.spark.unsafe.types.CalendarInterval
    +import org.apache.spark.util.random.RandomSampler
    +
    +/**
    + * This class translates an HQL string to a Catalyst [[LogicalPlan]] or [[Expression]].
    + */
    +private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) {
    +  object Token {
    +    def unapply(node: ASTNode): Some[(String, List[ASTNode])] = {
    +      CurrentOrigin.setPosition(node.line, node.positionInLine)
    +      node.pattern
    +    }
    +  }
    +
    +
    +  /**
    +   * Returns the AST for the given SQL string.
    +   */
    +  protected def getAst(sql: String): ASTNode = ParseDriver.parse(sql, conf)
    +
    +  /** Creates LogicalPlan for a given HiveQL string. */
    +  def createPlan(sql: String): LogicalPlan = {
    +    try {
    +      createPlan(sql, ParseDriver.parse(sql, conf))
    +    } catch {
    +      case e: MatchError => throw e
    +      case e: AnalysisException => throw e
    +      case e: Exception =>
    +        throw new AnalysisException(e.getMessage)
    +      case e: NotImplementedError =>
    +        throw new AnalysisException(
    +          s"""
    +             |Unsupported language features in query: $sql
    +             |${getAst(sql).treeString}
    +             |$e
    +             |${e.getStackTrace.head}
    +          """.stripMargin)
    +    }
    +  }
    +
    +  protected def createPlan(sql: String, tree: ASTNode): LogicalPlan = nodeToPlan(tree)
    +
    +  def parseDdl(ddl: String): Seq[Attribute] = {
    +    val tree = getAst(ddl)
    +    assert(tree.text == "TOK_CREATETABLE", "Only CREATE TABLE supported.")
    +    val tableOps = tree.children
    +    val colList = tableOps
    +      .find(_.text == "TOK_TABCOLLIST")
    +      .getOrElse(sys.error("No columnList!"))
    +
    +    colList.children.map(nodeToAttribute)
    +  }
    +
    +  protected def getClauses(
    +      clauseNames: Seq[String],
    +      nodeList: Seq[ASTNode]): Seq[Option[ASTNode]] = {
    +    var remainingNodes = nodeList
    +    val clauses = clauseNames.map { clauseName =>
    +      val (matches, nonMatches) = remainingNodes.partition(_.text.toUpperCase == clauseName)
    +      remainingNodes = nonMatches ++ (if (matches.nonEmpty) matches.tail else Nil)
    +      matches.headOption
    +    }
    +
    +    if (remainingNodes.nonEmpty) {
    +      sys.error(
    +        s"""Unhandled clauses: ${remainingNodes.map(_.treeString).mkString("\n")}.
    +            |You are likely trying to use an unsupported Hive feature.""".stripMargin)
    +    }
    +    clauses
    +  }
    +
    +  protected def getClause(clauseName: String, nodeList: Seq[ASTNode]): ASTNode =
    +    getClauseOption(clauseName, nodeList).getOrElse(sys.error(
    +      s"Expected clause $clauseName missing from ${nodeList.map(_.treeString).mkString("\n")}"))
    +
    +  protected def getClauseOption(clauseName: String, nodeList: Seq[ASTNode]): Option[ASTNode] = {
    +    nodeList.filter { case ast: ASTNode => ast.text == clauseName } match {
    +      case Seq(oneMatch) => Some(oneMatch)
    +      case Seq() => None
    +      case _ => sys.error(s"Found multiple instances of clause $clauseName")
    +    }
    +  }
    +
    +  protected def nodeToAttribute(node: ASTNode): Attribute = node match {
    +    case Token("TOK_TABCOL", Token(colName, Nil) :: dataType :: Nil) =>
    +      AttributeReference(colName, nodeToDataType(dataType), nullable = true)()
    +    case _ =>
    +      noParseRule("Attribute", node)
    +  }
    +
    +  protected def nodeToDataType(node: ASTNode): DataType = node match {
    +    case Token("TOK_DECIMAL", precision :: scale :: Nil) =>
    +      DecimalType(precision.text.toInt, scale.text.toInt)
    +    case Token("TOK_DECIMAL", precision :: Nil) =>
    +      DecimalType(precision.text.toInt, 0)
    +    case Token("TOK_DECIMAL", Nil) => DecimalType.USER_DEFAULT
    +    case Token("TOK_BIGINT", Nil) => LongType
    +    case Token("TOK_INT", Nil) => IntegerType
    +    case Token("TOK_TINYINT", Nil) => ByteType
    +    case Token("TOK_SMALLINT", Nil) => ShortType
    +    case Token("TOK_BOOLEAN", Nil) => BooleanType
    +    case Token("TOK_STRING", Nil) => StringType
    +    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +    case Token("TOK_FLOAT", Nil) => FloatType
    +    case Token("TOK_DOUBLE", Nil) => DoubleType
    +    case Token("TOK_DATE", Nil) => DateType
    +    case Token("TOK_TIMESTAMP", Nil) => TimestampType
    +    case Token("TOK_BINARY", Nil) => BinaryType
    +    case Token("TOK_LIST", elementType :: Nil) => ArrayType(nodeToDataType(elementType))
    +    case Token("TOK_STRUCT", Token("TOK_TABCOLLIST", fields) :: Nil) =>
    +      StructType(fields.map(nodeToStructField))
    +    case Token("TOK_MAP", keyType :: valueType :: Nil) =>
    +      MapType(nodeToDataType(keyType), nodeToDataType(valueType))
    +    case _ =>
    +      noParseRule("DataType", node)
    +  }
    +
    +  protected def nodeToStructField(node: ASTNode): StructField = node match {
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case _ =>
    +      noParseRule("StructField", node)
    +  }
    +
    +  protected def extractTableIdent(tableNameParts: ASTNode): TableIdentifier = {
    +    tableNameParts.children.map {
    +      case Token(part, Nil) => cleanIdentifier(part)
    +    } match {
    +      case Seq(tableOnly) => TableIdentifier(tableOnly)
    +      case Seq(databaseName, table) => TableIdentifier(table, Some(databaseName))
    +      case other => sys.error("Hive only supports tables names like 'tableName' " +
    +        s"or 'databaseName.tableName', found '$other'")
    +    }
    +  }
    +
    +  /**
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2, k3 GROUPING SETS((k1, k2), (k2))
    +   * is equivalent to
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2 UNION SELECT MAX(value) FROM src GROUP BY k2
    +   * Check the following link for details.
    +   *
    +https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
    +   *
    +   * The bitmask denotes the validity of each grouping expression within a grouping set;
    +   * the bitmask is also known as the grouping id (`GROUPING__ID`, the virtual column in Hive).
    +   * E.g. in the superset (k1, k2, k3) (bit 0: k1, bit 1: k2, and bit 2: k3), the grouping ids
    +   * of GROUPING SETS (k1, k2) and (k2) are 3 and 2 respectively.
    +   */
    +  protected def extractGroupingSet(children: Seq[ASTNode]): (Seq[Expression], Seq[Int]) = {
    +    val (keyASTs, setASTs) = children.partition {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", _) => false // grouping sets
    +      case _ => true // grouping keys
    +    }
    +
    +    val keys = keyASTs.map(nodeToExpr)
    +    val keyMap = keyASTs.zipWithIndex.toMap
    +
    +    val bitmasks: Seq[Int] = setASTs.map {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", null) => 0
    --- End diff --
    
    It is a fruitless test: children are never ```null```; if there aren't any, ```Nil``` is returned.
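    
    A minimal sketch of the simplified match, assuming (as described above) that
    children always come back as a ```List``` with ```Nil``` for an empty grouping
    set; this is a sketch, not the actual patch:
    
    ```scala
    val bitmasks: Seq[Int] = setASTs.map {
      // An empty grouping set contributes an empty bitmask.
      case Token("TOK_GROUPING_SETS_EXPRESSION", Nil) => 0
      // Otherwise, set one bit per grouping key referenced by the set.
      case Token("TOK_GROUPING_SETS_EXPRESSION", columns) =>
        columns.foldLeft(0) { (bitmap, col) =>
          val keyIndex = keyMap.find(_._1.treeEquals(col)).map(_._2)
          bitmap | 1 << keyIndex.getOrElse(
            throw new AnalysisException(s"${col.treeString} doesn't show up in the GROUP BY list"))
        }
      case _ => sys.error("Expect GROUPING SETS clause")
    }
    ```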




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169196055
  
    cc @cloud-fan  can you take a look at this? Thanks.





[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/10583




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48934317
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
    @@ -451,6 +452,19 @@ private[spark] object SQLConf {
         doc = "When true, we could use `datasource`.`path` as table in SQL query"
       )
     
    +  val PARSER_SUPPORT_QUOTEDID = stringConf("spark.sql.parser.supportQuotedIdentifiers",
    --- End diff --
    
    I ported this directly from Hive. It has only two options (see the sketch after this list):
    - ```none```: no quoting. I'll map this to false.
    - ```column```: quoting. I'll map this to true.
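    
    A rough sketch of the boolean version (assuming the ```booleanConf``` factory in
    SQLConf; the default value shown here is illustrative, not taken from the patch):
    
    ```scala
    val PARSER_SUPPORT_QUOTEDID = booleanConf("spark.sql.parser.supportQuotedIdentifiers",
      defaultValue = Some(true),
      doc = "When true, quoted identifiers (backticks) are supported; maps Hive's " +
        "'column' setting to true and 'none' to false.")
    ```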




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169122642
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48780/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48934251
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -0,0 +1,969 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.catalyst
    +
    +import java.sql.Date
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.analysis._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.Count
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.parser._
    +import org.apache.spark.sql.types._
    +import org.apache.spark.unsafe.types.CalendarInterval
    +import org.apache.spark.util.random.RandomSampler
    +
    +/**
    + * This class translates an HQL string to a Catalyst [[LogicalPlan]] or [[Expression]].
    + */
    +private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) {
    +  object Token {
    +    def unapply(node: ASTNode): Some[(String, List[ASTNode])] = {
    +      CurrentOrigin.setPosition(node.line, node.positionInLine)
    +      node.pattern
    +    }
    +  }
    +
    +  // TODO: improve the parse error so that we don't need this anymore.
    +  val errorRegEx = "line (\\d+):(\\d+) (.*)".r
    +
    +  /**
    +   * Returns the AST for the given SQL string.
    +   */
    +  protected def getAst(sql: String): ASTNode = ParseDriver.parse(sql, conf)
    +
    +  /** Creates LogicalPlan for a given HiveQL string. */
    +  def createPlan(sql: String): LogicalPlan = {
    +    try {
    +      createPlan(sql, ParseDriver.parse(sql, conf))
    +    } catch {
    +      case pe: ParseException =>
    +        pe.getMessage match {
    +          case errorRegEx(line, start, message) =>
    +            throw new AnalysisException(message, Some(line.toInt), Some(start.toInt))
    +          case otherMessage =>
    +            throw new AnalysisException(otherMessage)
    +        }
    +      case e: MatchError => throw e
    +      case e: Exception =>
    +        throw new AnalysisException(e.getMessage)
    +      case e: NotImplementedError =>
    +        throw new AnalysisException(
    +          s"""
    +             |Unsupported language features in query: $sql
    +             |${getAst(sql).treeString}
    +             |$e
    +             |${e.getStackTrace.head}
    +          """.stripMargin)
    +    }
    +  }
    +
    +  protected def createPlan(sql: String, tree: ASTNode): LogicalPlan = nodeToPlan(tree)
    +
    +  def parseDdl(ddl: String): Seq[Attribute] = {
    +    val tree =
    +      try {
    +        getAst(ddl)
    +      } catch {
    +        case pe: ParseException =>
    +          throw new RuntimeException(s"Failed to parse ddl: '$ddl'", pe)
    +      }
    +    assert(tree.text == "TOK_CREATETABLE", "Only CREATE TABLE supported.")
    +    val tableOps = tree.children
    +    val colList = tableOps
    +      .find(_.text == "TOK_TABCOLLIST")
    +      .getOrElse(sys.error("No columnList!"))
    +
    +    colList.children.map(nodeToAttribute)
    +  }
    +
    +  protected def getClauses(
    +      clauseNames: Seq[String],
    +      nodeList: Seq[ASTNode]): Seq[Option[ASTNode]] = {
    +    var remainingNodes = nodeList
    +    val clauses = clauseNames.map { clauseName =>
    +      val (matches, nonMatches) = remainingNodes.partition(_.text.toUpperCase == clauseName)
    +      remainingNodes = nonMatches ++ (if (matches.nonEmpty) matches.tail else Nil)
    +      matches.headOption
    +    }
    +
    +    if (remainingNodes.nonEmpty) {
    +      sys.error(
    +        s"""Unhandled clauses: ${remainingNodes.map(_.treeString).mkString("\n")}.
    +            |You are likely trying to use an unsupported Hive feature.""".stripMargin)
    +    }
    +    clauses
    +  }
    +
    +  protected def getClause(clauseName: String, nodeList: Seq[ASTNode]): ASTNode =
    +    getClauseOption(clauseName, nodeList).getOrElse(sys.error(
    +      s"Expected clause $clauseName missing from ${nodeList.map(_.treeString).mkString("\n")}"))
    +
    +  protected def getClauseOption(clauseName: String, nodeList: Seq[ASTNode]): Option[ASTNode] = {
    +    nodeList.filter { case ast: ASTNode => ast.text == clauseName } match {
    +      case Seq(oneMatch) => Some(oneMatch)
    +      case Seq() => None
    +      case _ => sys.error(s"Found multiple instances of clause $clauseName")
    +    }
    +  }
    +
    +  protected def nodeToAttribute(node: ASTNode): Attribute = node match {
    +    case Token("TOK_TABCOL", Token(colName, Nil) :: dataType :: Nil) =>
    +      AttributeReference(colName, nodeToDataType(dataType), nullable = true)()
    +    case _ =>
    +      noParseRule("Attribute", node)
    +  }
    +
    +  protected def nodeToDataType(node: ASTNode): DataType = node match {
    +    case Token("TOK_DECIMAL", precision :: scale :: Nil) =>
    +      DecimalType(precision.text.toInt, scale.text.toInt)
    +    case Token("TOK_DECIMAL", precision :: Nil) =>
    +      DecimalType(precision.text.toInt, 0)
    +    case Token("TOK_DECIMAL", Nil) => DecimalType.USER_DEFAULT
    +    case Token("TOK_BIGINT", Nil) => LongType
    +    case Token("TOK_INT", Nil) => IntegerType
    +    case Token("TOK_TINYINT", Nil) => ByteType
    +    case Token("TOK_SMALLINT", Nil) => ShortType
    +    case Token("TOK_BOOLEAN", Nil) => BooleanType
    +    case Token("TOK_STRING", Nil) => StringType
    +    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +    case Token("TOK_FLOAT", Nil) => FloatType
    +    case Token("TOK_DOUBLE", Nil) => DoubleType
    +    case Token("TOK_DATE", Nil) => DateType
    +    case Token("TOK_TIMESTAMP", Nil) => TimestampType
    +    case Token("TOK_BINARY", Nil) => BinaryType
    +    case Token("TOK_LIST", elementType :: Nil) => ArrayType(nodeToDataType(elementType))
    +    case Token("TOK_STRUCT", Token("TOK_TABCOLLIST", fields) :: Nil) =>
    +      StructType(fields.map(nodeToStructField))
    +    case Token("TOK_MAP", keyType :: valueType :: Nil) =>
    +      MapType(nodeToDataType(keyType), nodeToDataType(valueType))
    +    case _ =>
    +      noParseRule("DataType", node)
    +  }
    +
    +  protected def nodeToStructField(node: ASTNode): StructField = node match {
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case _ =>
    +      noParseRule("StructField", node)
    +  }
    +
    +  protected def extractTableIdent(tableNameParts: ASTNode): TableIdentifier = {
    +    tableNameParts.children.map {
    +      case Token(part, Nil) => cleanIdentifier(part)
    +    } match {
    +      case Seq(tableOnly) => TableIdentifier(tableOnly)
    +      case Seq(databaseName, table) => TableIdentifier(table, Some(databaseName))
    +      case other => sys.error("Hive only supports tables names like 'tableName' " +
    +        s"or 'databaseName.tableName', found '$other'")
    +    }
    +  }
    +
    +  /**
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2, k3 GROUPING SETS((k1, k2), (k2))
    +   * is equivalent to
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2 UNION SELECT MAX(value) FROM src GROUP BY k2
    +   * Check the following link for details.
    +   *
    +https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
    +   *
    +   * The bitmask denotes the validity of each grouping expression within a grouping set;
    +   * the bitmask is also known as the grouping id (`GROUPING__ID`, the virtual column in Hive).
    +   * E.g. in the superset (k1, k2, k3) (bit 0: k1, bit 1: k2, and bit 2: k3), the grouping ids
    +   * of GROUPING SETS (k1, k2) and (k2) are 3 and 2 respectively.
    +   */
    +  protected def extractGroupingSet(children: Seq[ASTNode]): (Seq[Expression], Seq[Int]) = {
    +    val (keyASTs, setASTs) = children.partition {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", _) => false // grouping sets
    +      case _ => true // grouping keys
    +    }
    +
    +    val keys = keyASTs.map(nodeToExpr)
    +    val keyMap = keyASTs.zipWithIndex.toMap
    +
    +    val bitmasks: Seq[Int] = setASTs.map {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", null) => 0
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", columns) =>
    +        columns.foldLeft(0)((bitmap, col) => {
    +          val keyIndex = keyMap.find(_._1.treeEquals(col)).map(_._2)
    +          bitmap | 1 << keyIndex.getOrElse(
    +            throw new AnalysisException(s"${col.treeString} doesn't show up in the GROUP BY list"))
    +        })
    +      case _ => sys.error("Expect GROUPING SETS clause")
    +    }
    +
    +    (keys, bitmasks)
    +  }
    +
    +  protected def nodeToPlan(node: ASTNode): LogicalPlan = node match {
    +    case Token("TOK_QUERY", queryArgs @ Token("TOK_CTE" | "TOK_FROM" | "TOK_INSERT", _) :: _) =>
    +      val (fromClause: Option[ASTNode], insertClauses, cteRelations) =
    +        queryArgs match {
    +          case Token("TOK_CTE", ctes) :: Token("TOK_FROM", from) :: inserts =>
    +            val cteRelations = ctes.map { node =>
    +              val relation = nodeToRelation(node).asInstanceOf[Subquery]
    +              relation.alias -> relation
    +            }
    +            (Some(from.head), inserts, Some(cteRelations.toMap))
    +          case Token("TOK_FROM", from) :: inserts =>
    +            (Some(from.head), inserts, None)
    +          case Token("TOK_INSERT", _) :: Nil =>
    +            (None, queryArgs, None)
    +        }
    +
    +      // Return one query for each insert clause.
    +      val queries = insertClauses.map {
    +        case Token("TOK_INSERT", singleInsert) =>
    +          val (
    +            intoClause ::
    +              destClause ::
    +              selectClause ::
    +              selectDistinctClause ::
    +              whereClause ::
    +              groupByClause ::
    +              rollupGroupByClause ::
    +              cubeGroupByClause ::
    +              groupingSetsClause ::
    +              orderByClause ::
    +              havingClause ::
    +              sortByClause ::
    +              clusterByClause ::
    +              distributeByClause ::
    +              limitClause ::
    +              lateralViewClause ::
    +              windowClause :: Nil) = {
    +            getClauses(
    +              Seq(
    +                "TOK_INSERT_INTO",
    +                "TOK_DESTINATION",
    +                "TOK_SELECT",
    +                "TOK_SELECTDI",
    +                "TOK_WHERE",
    +                "TOK_GROUPBY",
    +                "TOK_ROLLUP_GROUPBY",
    +                "TOK_CUBE_GROUPBY",
    +                "TOK_GROUPING_SETS",
    +                "TOK_ORDERBY",
    +                "TOK_HAVING",
    +                "TOK_SORTBY",
    +                "TOK_CLUSTERBY",
    +                "TOK_DISTRIBUTEBY",
    +                "TOK_LIMIT",
    +                "TOK_LATERAL_VIEW",
    +                "WINDOW"),
    +              singleInsert)
    +          }
    +
    +          val relations = fromClause match {
    +            case Some(f) => nodeToRelation(f)
    +            case None => OneRowRelation
    +          }
    +
    +          val withWhere = whereClause.map { whereNode =>
    +            val Seq(whereExpr) = whereNode.children
    +            Filter(nodeToExpr(whereExpr), relations)
    +          }.getOrElse(relations)
    +
    +          val select = (selectClause orElse selectDistinctClause)
    +            .getOrElse(sys.error("No select clause."))
    +
    +          val transformation = nodeToTransformation(select.children.head, withWhere)
    +
    +          val withLateralView = lateralViewClause.map { lv =>
    +            nodeToGenerate(lv.children.head, outer = false, withWhere)
    +          }.getOrElse(withWhere)
    +
    +          // The projection of the query can either be a normal projection, an aggregation
    +          // (if there is a group by) or a script transformation.
    +          val withProject: LogicalPlan = transformation.getOrElse {
    +            val selectExpressions =
    +              select.children.flatMap(selExprNodeToExpr).map(UnresolvedAlias(_))
    +            Seq(
    +              groupByClause.map(e => e match {
    +                case Token("TOK_GROUPBY", children) =>
    +                  // Not a transformation so must be either project or aggregation.
    +                  Aggregate(children.map(nodeToExpr), selectExpressions, withLateralView)
    +                case _ => sys.error("Expect GROUP BY")
    +              }),
    +              groupingSetsClause.map(e => e match {
    +                case Token("TOK_GROUPING_SETS", children) =>
    +                  val(groupByExprs, masks) = extractGroupingSet(children)
    +                  GroupingSets(masks, groupByExprs, withLateralView, selectExpressions)
    +                case _ => sys.error("Expect GROUPING SETS")
    +              }),
    +              rollupGroupByClause.map(e => e match {
    +                case Token("TOK_ROLLUP_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Rollup(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH ROLLUP")
    +              }),
    +              cubeGroupByClause.map(e => e match {
    +                case Token("TOK_CUBE_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Cube(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH CUBE")
    +              }),
    +              Some(Project(selectExpressions, withLateralView))).flatten.head
    +          }
    +
    +          // Handle HAVING clause.
    +          val withHaving = havingClause.map { h =>
    +            val havingExpr = h.children match { case Seq(hexpr) => nodeToExpr(hexpr) }
    +            // Note that we added a cast to boolean. If the expression itself is already boolean,
    +            // the optimizer will get rid of the unnecessary cast.
    +            Filter(Cast(havingExpr, BooleanType), withProject)
    +          }.getOrElse(withProject)
    +
    +          // Handle SELECT DISTINCT
    +          val withDistinct =
    +            if (selectDistinctClause.isDefined) Distinct(withHaving) else withHaving
    +
    +          // Handle ORDER BY, SORT BY, DISTRIBUTE BY, and CLUSTER BY clause.
    +          val withSort =
    +            (orderByClause, sortByClause, distributeByClause, clusterByClause) match {
    +              case (Some(totalOrdering), None, None, None) =>
    +                Sort(totalOrdering.children.map(nodeToSortOrder), global = true, withDistinct)
    +              case (None, Some(perPartitionOrdering), None, None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder),
    +                  global = false, withDistinct)
    +              case (None, None, Some(partitionExprs), None) =>
    +                RepartitionByExpression(
    +                  partitionExprs.children.map(nodeToExpr), withDistinct)
    +              case (None, Some(perPartitionOrdering), Some(partitionExprs), None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder), global = false,
    +                  RepartitionByExpression(
    +                    partitionExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, Some(clusterExprs)) =>
    +                Sort(
    +                  clusterExprs.children.map(nodeToExpr).map(SortOrder(_, Ascending)),
    +                  global = false,
    +                  RepartitionByExpression(
    +                    clusterExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, None) => withDistinct
    +              case _ => sys.error("Unsupported set of ordering / distribution clauses.")
    +            }
    +
    +          val withLimit =
    +            limitClause.map(l => nodeToExpr(l.children.head))
    +              .map(Limit(_, withSort))
    +              .getOrElse(withSort)
    +
    +          // Collect all window specifications defined in the WINDOW clause.
    +          val windowDefinitions = windowClause.map(_.children.collect {
    +            case Token("TOK_WINDOWDEF",
    +            Token(windowName, Nil) :: Token("TOK_WINDOWSPEC", spec) :: Nil) =>
    +              windowName -> nodesToWindowSpecification(spec)
    +          }.toMap)
    +          // Handle cases like
    +          // window w1 as (partition by p_mfgr order by p_name
    +          //               range between 2 preceding and 2 following),
    +          //        w2 as w1
    +          val resolvedCrossReference = windowDefinitions.map {
    +            windowDefMap => windowDefMap.map {
    +              case (windowName, WindowSpecReference(other)) =>
    +                (windowName, windowDefMap(other).asInstanceOf[WindowSpecDefinition])
    +              case o => o.asInstanceOf[(String, WindowSpecDefinition)]
    +            }
    +          }
    +
    +          val withWindowDefinitions =
    +            resolvedCrossReference.map(WithWindowDefinition(_, withLimit)).getOrElse(withLimit)
    +
    +          // TOK_INSERT_INTO means to add files to the table.
    +          // TOK_DESTINATION means to overwrite the table.
    +          val resultDestination =
    +            (intoClause orElse destClause).getOrElse(sys.error("No destination found."))
    +          val overwrite = intoClause.isEmpty
    +          nodeToDest(
    +            resultDestination,
    +            withWindowDefinitions,
    +            overwrite)
    +      }
    +
    +      // If there are multiple INSERT clauses, just UNION them together into one query.
    +      val query = queries.reduceLeft(Union)
    +
    +      // Return a With plan if there is a CTE.
    +      cteRelations.map(With(query, _)).getOrElse(query)
    +
    +    // HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT
    +    case Token("TOK_UNIONALL", left :: right :: Nil) =>
    +      Union(nodeToPlan(left), nodeToPlan(right))
    +
    +    case _ =>
    +      noParseRule("Plan", node)
    +  }
    +
    +  val allJoinTokens = "(TOK_.*JOIN)".r
    +  val laterViewToken = "TOK_LATERAL_VIEW(.*)".r
    +  protected def nodeToRelation(node: ASTNode): LogicalPlan = {
    +    node match {
    +      case Token("TOK_SUBQUERY", query :: Token(alias, Nil) :: Nil) =>
    +        Subquery(cleanIdentifier(alias), nodeToPlan(query))
    +
    +      case Token(laterViewToken(isOuter), selectClause :: relationClause :: Nil) =>
    +        nodeToGenerate(
    +          selectClause,
    +          outer = isOuter.nonEmpty,
    +          nodeToRelation(relationClause))
    +
    +      /* All relations, possibly with aliases or sampling clauses. */
    +      case Token("TOK_TABREF", clauses) =>
    +        // If the last clause is not a token then it's the alias of the table.
    +        val (nonAliasClauses, aliasClause) =
    +          if (clauses.last.text.startsWith("TOK")) {
    +            (clauses, None)
    +          } else {
    +            (clauses.dropRight(1), Some(clauses.last))
    +          }
    +
    +        val (Some(tableNameParts) ::
    +          splitSampleClause ::
    +          bucketSampleClause :: Nil) = {
    +          getClauses(Seq("TOK_TABNAME", "TOK_TABLESPLITSAMPLE", "TOK_TABLEBUCKETSAMPLE"),
    +            nonAliasClauses)
    +        }
    +
    +        val tableIdent = extractTableIdent(tableNameParts)
    +        val alias = aliasClause.map { case Token(a, Nil) => cleanIdentifier(a) }
    +        val relation = UnresolvedRelation(tableIdent, alias)
    +
    +        // Apply sampling if requested.
    +        (bucketSampleClause orElse splitSampleClause).map {
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_ROWCOUNT", Nil) :: Token(count, Nil) :: Nil) =>
    +            Limit(Literal(count.toInt), relation)
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_PERCENT", Nil) :: Token(fraction, Nil) :: Nil) =>
    +            // The range of fraction accepted by Sample is [0, 1]. Because Hive's block sampling
    +            // function takes X PERCENT as the input and the range of X is [0, 100], we need to
    +            // adjust the fraction.
    +            require(
    +              fraction.toDouble >= (0.0 - RandomSampler.roundingEpsilon)
    +                && fraction.toDouble <= (100.0 + RandomSampler.roundingEpsilon),
    +              s"Sampling fraction ($fraction) must be on interval [0, 100]")
    +            Sample(0.0, fraction.toDouble / 100, withReplacement = false,
    +              (math.random * 1000).toInt,
    +              relation)
    +          case Token("TOK_TABLEBUCKETSAMPLE",
    +          Token(numerator, Nil) ::
    +            Token(denominator, Nil) :: Nil) =>
    +            val fraction = numerator.toDouble / denominator.toDouble
    +            Sample(0.0, fraction, withReplacement = false, (math.random * 1000).toInt, relation)
    +          case a =>
    +            noParseRule("Sampling", a)
    +        }.getOrElse(relation)
    +
    +      case Token(allJoinTokens(joinToken), relation1 :: relation2 :: other) =>
    +        if (other.size > 1) {
    +          sys.error(s"Unsupported join operation: $other")
    +        }
    +
    +        val joinType = joinToken match {
    +          case "TOK_JOIN" => Inner
    +          case "TOK_CROSSJOIN" => Inner
    +          case "TOK_RIGHTOUTERJOIN" => RightOuter
    +          case "TOK_LEFTOUTERJOIN" => LeftOuter
    +          case "TOK_FULLOUTERJOIN" => FullOuter
    +          case "TOK_LEFTSEMIJOIN" => LeftSemi
    +          case "TOK_UNIQUEJOIN" => noParseRule("Unique Join", node)
    +          case "TOK_ANTIJOIN" => noParseRule("Anti Join", node)
    +        }
    +        Join(nodeToRelation(relation1),
    +          nodeToRelation(relation2),
    +          joinType,
    +          other.headOption.map(nodeToExpr))
    +
    +      case _ =>
    +        noParseRule("Relation", node)
    +    }
    +  }
    +
    +  protected def nodeToSortOrder(node: ASTNode): SortOrder = node match {
    +    case Token("TOK_TABSORTCOLNAMEASC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Ascending)
    +    case Token("TOK_TABSORTCOLNAMEDESC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Descending)
    +    case _ =>
    +      noParseRule("SortOrder", node)
    +  }
    +
    +  val destinationToken = "TOK_DESTINATION|TOK_INSERT_INTO".r
    +  protected def nodeToDest(
    +      node: ASTNode,
    +      query: LogicalPlan,
    +      overwrite: Boolean): LogicalPlan = node match {
    +    case Token(destinationToken(),
    +    Token("TOK_DIR",
    +    Token("TOK_TMP_FILE", Nil) :: Nil) :: Nil) =>
    +      query
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = false)
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) ::
    +      Token("TOK_IFNOTEXISTS",
    +      ifNotExists) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = true)
    +
    +    case _ =>
    +      noParseRule("Destination", node)
    +  }
    +
    +  protected def selExprNodeToExpr(node: ASTNode): Option[Expression] = node match {
    +    case Token("TOK_SELEXPR", e :: Nil) =>
    +      Some(nodeToExpr(e))
    +
    +    case Token("TOK_SELEXPR", e :: Token(alias, Nil) :: Nil) =>
    +      Some(Alias(nodeToExpr(e), cleanIdentifier(alias))())
    +
    +    case Token("TOK_SELEXPR", e :: aliasChildren) =>
    +      val aliasNames = aliasChildren.collect {
    +        case Token(name, Nil) => cleanIdentifier(name)
    +      }
    +      Some(MultiAlias(nodeToExpr(e), aliasNames))
    +
    +    /* Hints are ignored */
    +    case Token("TOK_HINTLIST", _) => None
    +
    +    case _ =>
    +      noParseRule("Select", node)
    +  }
    +
    +  protected val escapedIdentifier = "`([^`]+)`".r
    +  protected val doubleQuotedString = "\"([^\"]+)\"".r
    +  protected val singleQuotedString = "'([^']+)'".r
    +
    +  protected def unquoteString(str: String) = str match {
    +    case singleQuotedString(s) => s
    +    case doubleQuotedString(s) => s
    +    case other => other
    +  }
    +
    +  /** Strips backticks from ident if present */
    +  protected def cleanIdentifier(ident: String): String = ident match {
    +    case escapedIdentifier(i) => i
    +    case plainIdent => plainIdent
    +  }
    +
    +  val numericAstTypes = Seq(
    +    SparkSqlParser.Number,
    +    SparkSqlParser.TinyintLiteral,
    +    SparkSqlParser.SmallintLiteral,
    +    SparkSqlParser.BigintLiteral,
    +    SparkSqlParser.DecimalLiteral)
    +
    +  /* Case insensitive matches */
    +  val COUNT = "(?i)COUNT".r
    +  val SUM = "(?i)SUM".r
    +  val AND = "(?i)AND".r
    +  val OR = "(?i)OR".r
    +  val NOT = "(?i)NOT".r
    +  val TRUE = "(?i)TRUE".r
    +  val FALSE = "(?i)FALSE".r
    +  val LIKE = "(?i)LIKE".r
    +  val RLIKE = "(?i)RLIKE".r
    +  val REGEXP = "(?i)REGEXP".r
    +  val IN = "(?i)IN".r
    +  val DIV = "(?i)DIV".r
    +  val BETWEEN = "(?i)BETWEEN".r
    +  val WHEN = "(?i)WHEN".r
    +  val CASE = "(?i)CASE".r
    +
    +  protected def nodeToExpr(node: ASTNode): Expression = node match {
    +    /* Attribute References */
    +    case Token("TOK_TABLE_OR_COL", Token(name, Nil) :: Nil) =>
    +      UnresolvedAttribute.quoted(cleanIdentifier(name))
    +    case Token(".", qualifier :: Token(attr, Nil) :: Nil) =>
    +      nodeToExpr(qualifier) match {
    +        case UnresolvedAttribute(nameParts) =>
    +          UnresolvedAttribute(nameParts :+ cleanIdentifier(attr))
    +        case other => UnresolvedExtractValue(other, Literal(attr))
    +      }
    +
    +    /* Stars (*) */
    +    case Token("TOK_ALLCOLREF", Nil) => UnresolvedStar(None)
    +    // The format of dbName.tableName.* cannot be parsed by HiveParser. TOK_TABNAME will
    +    // only have a single child, which is the tableName.
    +    case Token("TOK_ALLCOLREF", Token("TOK_TABNAME", Token(name, Nil) :: Nil) :: Nil) =>
    +      UnresolvedStar(Some(UnresolvedAttribute.parseAttributeName(name)))
    +
    +    /* Aggregate Functions */
    +    case Token("TOK_FUNCTIONDI", Token(COUNT(), Nil) :: args) =>
    +      Count(args.map(nodeToExpr)).toAggregateExpression(isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(COUNT(), Nil) :: Nil) =>
    +      Count(Literal(1)).toAggregateExpression()
    +
    +    /* Casts */
    +    case Token("TOK_FUNCTION", Token("TOK_STRING", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_VARCHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_CHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_INT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), IntegerType)
    +    case Token("TOK_FUNCTION", Token("TOK_BIGINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), LongType)
    +    case Token("TOK_FUNCTION", Token("TOK_FLOAT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), FloatType)
    +    case Token("TOK_FUNCTION", Token("TOK_DOUBLE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DoubleType)
    +    case Token("TOK_FUNCTION", Token("TOK_SMALLINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ShortType)
    +    case Token("TOK_FUNCTION", Token("TOK_TINYINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ByteType)
    +    case Token("TOK_FUNCTION", Token("TOK_BINARY", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BinaryType)
    +    case Token("TOK_FUNCTION", Token("TOK_BOOLEAN", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BooleanType)
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: scale :: nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, scale.text.toInt))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, 0))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType.USER_DEFAULT)
    +    case Token("TOK_FUNCTION", Token("TOK_TIMESTAMP", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), TimestampType)
    +    case Token("TOK_FUNCTION", Token("TOK_DATE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DateType)
    +
    +    /* Arithmetic */
    +    case Token("+", child :: Nil) => nodeToExpr(child)
    +    case Token("-", child :: Nil) => UnaryMinus(nodeToExpr(child))
    +    case Token("~", child :: Nil) => BitwiseNot(nodeToExpr(child))
    +    case Token("+", left :: right:: Nil) => Add(nodeToExpr(left), nodeToExpr(right))
    +    case Token("-", left :: right:: Nil) => Subtract(nodeToExpr(left), nodeToExpr(right))
    +    case Token("*", left :: right:: Nil) => Multiply(nodeToExpr(left), nodeToExpr(right))
    +    case Token("/", left :: right:: Nil) => Divide(nodeToExpr(left), nodeToExpr(right))
    +    case Token(DIV(), left :: right:: Nil) =>
    +      Cast(Divide(nodeToExpr(left), nodeToExpr(right)), LongType)
    +    case Token("%", left :: right:: Nil) => Remainder(nodeToExpr(left), nodeToExpr(right))
    +    case Token("&", left :: right:: Nil) => BitwiseAnd(nodeToExpr(left), nodeToExpr(right))
    +    case Token("|", left :: right:: Nil) => BitwiseOr(nodeToExpr(left), nodeToExpr(right))
    +    case Token("^", left :: right:: Nil) => BitwiseXor(nodeToExpr(left), nodeToExpr(right))
    +
    +    /* Comparisons */
    +    case Token("=", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("==", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=>", left :: right:: Nil) => EqualNullSafe(nodeToExpr(left), nodeToExpr(right))
    +    case Token("!=", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token("<>", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token(">", left :: right:: Nil) => GreaterThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token(">=", left :: right:: Nil) => GreaterThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<", left :: right:: Nil) => LessThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=", left :: right:: Nil) => LessThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token(LIKE(), left :: right:: Nil) => Like(nodeToExpr(left), nodeToExpr(right))
    +    case Token(RLIKE(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token(REGEXP(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNOTNULL", Nil) :: child :: Nil) =>
    +      IsNotNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNULL", Nil) :: child :: Nil) =>
    +      IsNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token(IN(), Nil) :: value :: list) =>
    +      In(nodeToExpr(value), list.map(nodeToExpr))
    +    case Token("TOK_FUNCTION",
    +    Token(BETWEEN(), Nil) ::
    +      kw ::
    +      target ::
    +      minValue ::
    +      maxValue :: Nil) =>
    +
    +      val targetExpression = nodeToExpr(target)
    +      val betweenExpr =
    +        And(
    +          GreaterThanOrEqual(targetExpression, nodeToExpr(minValue)),
    +          LessThanOrEqual(targetExpression, nodeToExpr(maxValue)))
    +      kw match {
    +        case Token("KW_FALSE", Nil) => betweenExpr
    +        case Token("KW_TRUE", Nil) => Not(betweenExpr)
    +      }
    +
    +    /* Boolean Logic */
    +    case Token(AND(), left :: right :: Nil) => And(nodeToExpr(left), nodeToExpr(right))
    +    case Token(OR(), left :: right :: Nil) => Or(nodeToExpr(left), nodeToExpr(right))
    +    case Token(NOT(), child :: Nil) => Not(nodeToExpr(child))
    +    case Token("!", child :: Nil) => Not(nodeToExpr(child))
    +
    +    /* Case statements */
    +    case Token("TOK_FUNCTION", Token(WHEN(), Nil) :: branches) =>
    +      CaseWhen(branches.map(nodeToExpr))
    +    case Token("TOK_FUNCTION", Token(CASE(), Nil) :: branches) =>
    +      val keyExpr = nodeToExpr(branches.head)
    +      CaseKeyWhen(keyExpr, branches.drop(1).map(nodeToExpr))
    +
    +    /* Complex datatype manipulation */
    +    case Token("[", child :: ordinal :: Nil) =>
    +      UnresolvedExtractValue(nodeToExpr(child), nodeToExpr(ordinal))
    +
    +    /* Window Functions */
    +    case Token(text, args :+ Token("TOK_WINDOWSPEC", spec)) =>
    +      val function = nodeToExpr(node.copy(children = node.children.init))
    +      nodesToWindowSpecification(spec) match {
    +        case reference: WindowSpecReference =>
    +          UnresolvedWindowExpression(function, reference)
    +        case definition: WindowSpecDefinition =>
    +          WindowExpression(function, definition)
    +      }
    +
    +    /* UDFs - Must be last otherwise will preempt built in functions */
    +    case Token("TOK_FUNCTION", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = false)
    +    // Aggregate function with DISTINCT keyword.
    +    case Token("TOK_FUNCTIONDI", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, UnresolvedStar(None) :: Nil, isDistinct = false)
    +
    +    /* Literals */
    +    case Token("TOK_NULL", Nil) => Literal.create(null, NullType)
    +    case Token(TRUE(), Nil) => Literal.create(true, BooleanType)
    +    case Token(FALSE(), Nil) => Literal.create(false, BooleanType)
    +    case Token("TOK_STRINGLITERALSEQUENCE", strings) =>
    +      Literal(strings.map(s => ParseUtils.unescapeSQLString(s.text)).mkString)
    +
    +    // This code is adapted from
    +    // /ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java#L223
    +    case ast: ASTNode if numericAstTypes contains ast.tokenType =>
    +      var v: Literal = null
    +      try {
    +        if (ast.text.endsWith("L")) {
    +          // Literal bigint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toLong, LongType)
    +        } else if (ast.text.endsWith("S")) {
    +          // Literal smallint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toShort, ShortType)
    +        } else if (ast.text.endsWith("Y")) {
    +          // Literal tinyint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toByte, ByteType)
    +        } else if (ast.text.endsWith("BD") || ast.text.endsWith("D")) {
    +          // Literal decimal
    +          val strVal = ast.text.stripSuffix("D").stripSuffix("B")
    +          v = Literal(Decimal(strVal))
    +        } else {
    +          // Fall back from the widest type to the narrowest: each successive
    +          // conversion overwrites v, and the first one to throw
    +          // NumberFormatException leaves the last successful literal in place.
    +          v = Literal.create(ast.text.toDouble, DoubleType)
    +          v = Literal.create(ast.text.toLong, LongType)
    +          v = Literal.create(ast.text.toInt, IntegerType)
    +        }
    +      } catch {
    +        case nfe: NumberFormatException => // Do nothing
    +      }
    +
    +      if (v == null) {
    +        sys.error(s"Failed to parse number '${ast.text}'.")
    +      } else {
    +        v
    +      }
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.StringLiteral =>
    +      Literal(ParseUtils.unescapeSQLString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_DATELITERAL =>
    +      Literal(Date.valueOf(ast.text.substring(1, ast.text.length - 1)))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_CHARSETLITERAL =>
    +      Literal(ParseUtils.charSetString(ast.children.head.text, ast.children(1).text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromYearMonthString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_TIME_LITERAL =>
    +      Literal(CalendarInterval.fromDayTimeString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("year", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("month", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("day", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_HOUR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("hour", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MINUTE_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("minute", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_SECOND_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("second", ast.text))
    +
    +    case _ =>
    +      noParseRule("Expression", node)
    +  }
    +
    +  /* Case insensitive matches for Window Specification */
    +  val PRECEDING = "(?i)preceding".r
    +  val FOLLOWING = "(?i)following".r
    +  val CURRENT = "(?i)current".r
    +  protected def nodesToWindowSpecification(nodes: Seq[ASTNode]): WindowSpec = nodes match {
    +    case Token(windowName, Nil) :: Nil =>
    +      // Refer to a window spec defined in the window clause.
    +      WindowSpecReference(windowName)
    +    case Nil =>
    +      // OVER()
    +      WindowSpecDefinition(
    +        partitionSpec = Nil,
    +        orderSpec = Nil,
    +        frameSpecification = UnspecifiedFrame)
    +    case spec =>
    +      val (partitionClause :: rowFrame :: rangeFrame :: Nil) =
    +        getClauses(
    +          Seq(
    +            "TOK_PARTITIONINGSPEC",
    +            "TOK_WINDOWRANGE",
    +            "TOK_WINDOWVALUES"),
    +          spec)
    +
    +      // Handle Partition By and Order By.
    +      val (partitionSpec, orderSpec) = partitionClause.map { partitionAndOrdering =>
    +        val (partitionByClause :: orderByClause :: sortByClause :: clusterByClause :: Nil) =
    +          getClauses(
    +            Seq("TOK_DISTRIBUTEBY", "TOK_ORDERBY", "TOK_SORTBY", "TOK_CLUSTERBY"),
    +            partitionAndOrdering.children)
    +
    +        (partitionByClause, orderByClause.orElse(sortByClause), clusterByClause) match {
    +          case (Some(partitionByExpr), Some(orderByExpr), None) =>
    +            (partitionByExpr.children.map(nodeToExpr),
    +              orderByExpr.children.map(nodeToSortOrder))
    +          case (Some(partitionByExpr), None, None) =>
    +            (partitionByExpr.children.map(nodeToExpr), Nil)
    +          case (None, Some(orderByExpr), None) =>
    +            (Nil, orderByExpr.children.map(nodeToSortOrder))
    +          case (None, None, Some(clusterByExpr)) =>
    +            val expressions = clusterByExpr.children.map(nodeToExpr)
    +            (expressions, expressions.map(SortOrder(_, Ascending)))
    +          case _ =>
    +            noParseRule("Partition & Ordering", partitionAndOrdering)
    +        }
    +      }.getOrElse {
    +        (Nil, Nil)
    +      }
    +
    +      // Handle Window Frame
    +      val windowFrame =
    +        if (rowFrame.isEmpty && rangeFrame.isEmpty) {
    +          UnspecifiedFrame
    +        } else {
    +          val frameType = rowFrame.map(_ => RowFrame).getOrElse(RangeFrame)
    +          def nodeToBoundary(node: ASTNode): FrameBoundary = node match {
    +            case Token(PRECEDING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedPreceding
    +              } else {
    +                ValuePreceding(count.toInt)
    +              }
    +            case Token(FOLLOWING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedFollowing
    +              } else {
    +                ValueFollowing(count.toInt)
    +              }
    +            case Token(CURRENT(), Nil) => CurrentRow
    +            case _ =>
    +              noParseRule("Window Frame Boundary", node)
    +          }
    +
    +          rowFrame.orElse(rangeFrame).map { frame =>
    +            frame.children match {
    +              case precedingNode :: followingNode :: Nil =>
    +                SpecifiedWindowFrame(
    +                  frameType,
    +                  nodeToBoundary(precedingNode),
    +                  nodeToBoundary(followingNode))
    +              case precedingNode :: Nil =>
    +                SpecifiedWindowFrame(frameType, nodeToBoundary(precedingNode), CurrentRow)
    +              case _ =>
    +                noParseRule("Window Frame", frame)
    +            }
    +          }.getOrElse(sys.error("If you see this, please file a bug report with your query."))
    +        }
    +
    +      WindowSpecDefinition(partitionSpec, orderSpec, windowFrame)
    +  }
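    +  // e.g. OVER (PARTITION BY a ORDER BY b ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
    +  // maps, informally, to:
    +  //   WindowSpecDefinition(partitionSpec = [a], orderSpec = [b ASC],
    +  //     SpecifiedWindowFrame(RowFrame, ValuePreceding(1), CurrentRow))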
    +
    +  protected def nodeToTransformation(
    +      node: ASTNode,
    +      child: LogicalPlan): Option[ScriptTransformation] = None
    +
    +  protected def nodeToGenerate(node: ASTNode, outer: Boolean, child: LogicalPlan): Generate = {
    +    val Token("TOK_SELECT", Token("TOK_SELEXPR", clauses) :: Nil) = node
    +
    +    val alias = getClause("TOK_TABALIAS", clauses).children.head.text
    +
    +    val generator = clauses.head match {
    +      case Token("TOK_FUNCTION", Token(functionName, Nil) :: children) =>
    +        UnresolvedGenerator(functionName, children.map(nodeToExpr))
    --- End diff --
    
    @cloud-fan we currently also support all of Hive's UDTFs:
    - GenericUDTFInline
    - GenericUDTFParseUrlTuple
    - GenericUDTFPosExplode
    - GenericUDTFStack
    
    I didn't like the fact that HiveQl was touching Hive's FunctionRegistry (it seems like the wrong place to do it), so I came up with the ```UnresolvedGenerator``` path. As an added bonus, this let me remove the two hardcoded generators from the code.
    
    I assumed that we wanted support for ```LATERAL VIEW``` from the ground up. I am fine with leaving this a Hive-only feature for now, so I am going to revert the ```UnresolvedGenerator``` path.
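
    To make the idea concrete, here is a minimal, self-contained sketch (toy
    classes for illustration only, not the actual Catalyst API): the parser
    records just the function name and arguments in an unresolved placeholder,
    and a later analysis step resolves it against whatever function registry is
    available (Hive's, in this case).

        // Toy stand-ins for Catalyst expressions; illustrative only.
        sealed trait Expr
        case class Column(name: String) extends Expr
        case class Explode(child: Expr) extends Expr
        case class UnresolvedGenerator(function: String, args: Seq[Expr]) extends Expr

        // The parser only records the call; resolution happens later.
        def resolveGenerator(e: Expr, registry: Map[String, Seq[Expr] => Expr]): Expr =
          e match {
            case UnresolvedGenerator(fn, args) =>
              registry.getOrElse(fn.toLowerCase,
                sys.error(s"Couldn't find generator function $fn"))(args)
            case other => other
          }

        // LATERAL VIEW explode(myArray) parses to a placeholder ...
        val parsed = UnresolvedGenerator("explode", Seq(Column("myArray")))
        // ... and resolves once a registry (here a toy one) is in scope.
        val registry = Map[String, Seq[Expr] => Expr](
          "explode" -> ((args: Seq[Expr]) => Explode(args.head)))
        val resolved = resolveGenerator(parsed, registry)  // Explode(Column(myArray))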




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48951342
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
    @@ -451,6 +452,19 @@ private[spark] object SQLConf {
         doc = "When true, we could use `datasource`.`path` as table in SQL query"
       )
     
    +  val PARSER_SUPPORT_QUOTEDID = booleanConf("spark.sql.parser.supportQuotedIdentifiers",
    +    defaultValue = Some(true),
    +    isPublic = false,
    +    doc = "Whether to use quoted identifier.\n  false: default(past) behavior. Implies only" +
    +      "alphaNumeric and underscore are valid characters in identifiers.\n" +
    +      "  true: implies column names can contain any character.")
    +
    +  val PARSER_SUPPORT_SQL11_RESERVED_KEYWORDS = booleanConf(
    +    "spark.sql.parser.supportSQL11ReservedKeywords",
    +    defaultValue = Some(false),
    --- End diff --
    
    Some of our outdated Hive compatibility tests still use SQL11 reserved keywords as identifiers. We should fix those before we can set this flag to true.
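
    For context, toggling it is just a conf change; something like this on a
    1.6 SQLContext (the key names come from the diff above, the rest is
    illustrative):

        // Default here: keep accepting SQL2011 reserved words as identifiers.
        sqlContext.setConf("spark.sql.parser.supportSQL11ReservedKeywords", "false")
        // Once the compatibility tests are cleaned up, this can flip to:
        sqlContext.setConf("spark.sql.parser.supportSQL11ReservedKeywords", "true")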




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169305716
  
    **[Test build #48853 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48853/consoleFull)** for PR 10583 at commit [`3680d4c`](https://github.com/apache/spark/commit/3680d4c4179d63584a42f6d33bbe3c718fa3075d).




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48919389
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/parser/ParseUtils.java ---
    @@ -0,0 +1,163 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.parser;
    +
    +import java.io.UnsupportedEncodingException;
    +
    +/**
    + * A couple of utility methods that help with parsing ASTs.
    + *
    + * Both methods in this class were taken from the SemanticAnalyzer in Hive:
    + * ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
    + */
    +public final class ParseUtils {
    +  private ParseUtils() {
    +    super();
    +  }
    +
    +  public static String charSetString(String charSetName, String charSetString)
    --- End diff --
    
    are there unit tests from Hive that we can take?
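
    Something along these lines, for example (a sketch only; it assumes, as in
    Hive's SemanticAnalyzer, that the charset token keeps its leading underscore
    and that a quoted literal is re-decoded in the named character set):

        import org.apache.spark.sql.catalyst.parser.ParseUtils

        // _utf8'hello' should decode to the plain string "hello".
        assert(ParseUtils.charSetString("_utf8", "'hello'") == "hello")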





[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48940364
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -0,0 +1,969 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.catalyst
    +
    +import java.sql.Date
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.analysis._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.Count
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.parser._
    +import org.apache.spark.sql.types._
    +import org.apache.spark.unsafe.types.CalendarInterval
    +import org.apache.spark.util.random.RandomSampler
    +
    +/**
    + * This class translates an HQL String to a Catalyst [[LogicalPlan]] or [[Expression]].
    + */
    +private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) {
    +  object Token {
    +    def unapply(node: ASTNode): Some[(String, List[ASTNode])] = {
    +      CurrentOrigin.setPosition(node.line, node.positionInLine)
    +      node.pattern
    +    }
    +  }
    +
    +  // TODO: improve the parse error messages so we no longer need this regex.
    +  val errorRegEx = "line (\\d+):(\\d+) (.*)".r
    +
    +  /**
    +   * Returns the AST for the given SQL string.
    +   */
    +  protected def getAst(sql: String): ASTNode = ParseDriver.parse(sql, conf)
    +
    +  /** Creates a LogicalPlan for a given HiveQL string. */
    +  def createPlan(sql: String): LogicalPlan = {
    +    try {
    +      createPlan(sql, ParseDriver.parse(sql, conf))
    +    } catch {
    +      case pe: ParseException =>
    +        pe.getMessage match {
    +          case errorRegEx(line, start, message) =>
    +            throw new AnalysisException(message, Some(line.toInt), Some(start.toInt))
    +          case otherMessage =>
    +            throw new AnalysisException(otherMessage)
    +        }
    +      case e: MatchError => throw e
    +      case e: Exception =>
    +        throw new AnalysisException(e.getMessage)
    +      case e: NotImplementedError =>
    +        throw new AnalysisException(
    +          s"""
    +             |Unsupported language features in query: $sql
    +             |${getAst(sql).treeString}
    +             |$e
    +             |${e.getStackTrace.head}
    +          """.stripMargin)
    +    }
    +  }
    +
    +  protected def createPlan(sql: String, tree: ASTNode): LogicalPlan = nodeToPlan(tree)
    +
    +  def parseDdl(ddl: String): Seq[Attribute] = {
    +    val tree =
    +      try {
    +        getAst(ddl)
    +      } catch {
    +        case pe: ParseException =>
    +          throw new RuntimeException(s"Failed to parse ddl: '$ddl'", pe)
    +      }
    +    assert(tree.text == "TOK_CREATETABLE", "Only CREATE TABLE supported.")
    +    val tableOps = tree.children
    +    val colList = tableOps
    +      .find(_.text == "TOK_TABCOLLIST")
    +      .getOrElse(sys.error("No columnList!"))
    +
    +    colList.children.map(nodeToAttribute)
    +  }
    +
    +  protected def getClauses(
    +      clauseNames: Seq[String],
    +      nodeList: Seq[ASTNode]): Seq[Option[ASTNode]] = {
    +    var remainingNodes = nodeList
    +    val clauses = clauseNames.map { clauseName =>
    +      val (matches, nonMatches) = remainingNodes.partition(_.text.toUpperCase == clauseName)
    +      remainingNodes = nonMatches ++ (if (matches.nonEmpty) matches.tail else Nil)
    +      matches.headOption
    +    }
    +
    +    if (remainingNodes.nonEmpty) {
    +      sys.error(
    +        s"""Unhandled clauses: ${remainingNodes.map(_.treeString).mkString("\n")}.
    +            |You are likely trying to use an unsupported Hive feature."""".stripMargin)
    +    }
    +    clauses
    +  }
    +
    +  protected def getClause(clauseName: String, nodeList: Seq[ASTNode]): ASTNode =
    +    getClauseOption(clauseName, nodeList).getOrElse(sys.error(
    +      s"Expected clause $clauseName missing from ${nodeList.map(_.treeString).mkString("\n")}"))
    +
    +  protected def getClauseOption(clauseName: String, nodeList: Seq[ASTNode]): Option[ASTNode] = {
    +    nodeList.filter { case ast: ASTNode => ast.text == clauseName } match {
    +      case Seq(oneMatch) => Some(oneMatch)
    +      case Seq() => None
    +      case _ => sys.error(s"Found multiple instances of clause $clauseName")
    +    }
    +  }
    +
    +  protected def nodeToAttribute(node: ASTNode): Attribute = node match {
    +    case Token("TOK_TABCOL", Token(colName, Nil) :: dataType :: Nil) =>
    +      AttributeReference(colName, nodeToDataType(dataType), nullable = true)()
    +    case _ =>
    +      noParseRule("Attribute", node)
    +  }
    +
    +  protected def nodeToDataType(node: ASTNode): DataType = node match {
    +    case Token("TOK_DECIMAL", precision :: scale :: Nil) =>
    +      DecimalType(precision.text.toInt, scale.text.toInt)
    +    case Token("TOK_DECIMAL", precision :: Nil) =>
    +      DecimalType(precision.text.toInt, 0)
    +    case Token("TOK_DECIMAL", Nil) => DecimalType.USER_DEFAULT
    +    case Token("TOK_BIGINT", Nil) => LongType
    +    case Token("TOK_INT", Nil) => IntegerType
    +    case Token("TOK_TINYINT", Nil) => ByteType
    +    case Token("TOK_SMALLINT", Nil) => ShortType
    +    case Token("TOK_BOOLEAN", Nil) => BooleanType
    +    case Token("TOK_STRING", Nil) => StringType
    +    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +    case Token("TOK_FLOAT", Nil) => FloatType
    +    case Token("TOK_DOUBLE", Nil) => DoubleType
    +    case Token("TOK_DATE", Nil) => DateType
    +    case Token("TOK_TIMESTAMP", Nil) => TimestampType
    +    case Token("TOK_BINARY", Nil) => BinaryType
    +    case Token("TOK_LIST", elementType :: Nil) => ArrayType(nodeToDataType(elementType))
    +    case Token("TOK_STRUCT", Token("TOK_TABCOLLIST", fields) :: Nil) =>
    +      StructType(fields.map(nodeToStructField))
    +    case Token("TOK_MAP", keyType :: valueType :: Nil) =>
    +      MapType(nodeToDataType(keyType), nodeToDataType(valueType))
    +    case _ =>
    +      noParseRule("DataType", node)
    +  }
    +
    +  protected def nodeToStructField(node: ASTNode): StructField = node match {
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case _ =>
    +      noParseRule("StructField", node)
    +  }
    +
    +  protected def extractTableIdent(tableNameParts: ASTNode): TableIdentifier = {
    +    tableNameParts.children.map {
    +      case Token(part, Nil) => cleanIdentifier(part)
    +    } match {
    +      case Seq(tableOnly) => TableIdentifier(tableOnly)
    +      case Seq(databaseName, table) => TableIdentifier(table, Some(databaseName))
    +      case other => sys.error("Hive only supports tables names like 'tableName' " +
    +        s"or 'databaseName.tableName', found '$other'")
    +    }
    +  }
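    +  // e.g. a TOK_TABNAME with parts ["db", "tbl"] yields TableIdentifier("tbl", Some("db")),
    +  // and a single part ["tbl"] yields TableIdentifier("tbl").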
    +
    +  /**
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2, k3 GROUPING SETS((k1, k2), (k2))
    +   * is equivalent to
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2 UNION SELECT MAX(value) FROM src GROUP BY k2
    +   * Check the following link for details.
    +   *
    +https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
    +   *
    +   * The bitmask denotes which grouping expressions are valid for a grouping set;
    +   * it is also called the grouping id (`GROUPING__ID`, the virtual column in Hive).
    +   * e.g. In the superset (k1, k2, k3) (bit 0: k1, bit 1: k2, and bit 2: k3), the grouping
    +   * ids of GROUPING SETS (k1, k2) and (k2) are 3 and 2 respectively.
    +   */
    +  protected def extractGroupingSet(children: Seq[ASTNode]): (Seq[Expression], Seq[Int]) = {
    +    val (keyASTs, setASTs) = children.partition {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", _) => false // grouping sets
    +      case _ => true // grouping keys
    +    }
    +
    +    val keys = keyASTs.map(nodeToExpr)
    +    val keyMap = keyASTs.zipWithIndex.toMap
    +
    +    val bitmasks: Seq[Int] = setASTs.map {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", null) => 0
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", columns) =>
    +        columns.foldLeft(0)((bitmap, col) => {
    +          val keyIndex = keyMap.find(_._1.treeEquals(col)).map(_._2)
    +          bitmap | 1 << keyIndex.getOrElse(
    +            throw new AnalysisException(s"${col.treeString} doesn't show up in the GROUP BY list"))
    +        })
    +      case _ => sys.error("Expect GROUPING SETS clause")
    +    }
    +
    +    (keys, bitmasks)
    +  }
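    +  // Worked example for the doc above: with GROUP BY k1, k2, k3 the key indices
    +  // are k1 -> 0, k2 -> 1, k3 -> 2, so:
    +  //   GROUPING SETS ((k1, k2)) -> bitmask (1 << 0) | (1 << 1) = 3
    +  //   GROUPING SETS ((k2))     -> bitmask (1 << 1)            = 2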
    +
    +  protected def nodeToPlan(node: ASTNode): LogicalPlan = node match {
    +    case Token("TOK_QUERY", queryArgs @ Token("TOK_CTE" | "TOK_FROM" | "TOK_INSERT", _) :: _) =>
    +      val (fromClause: Option[ASTNode], insertClauses, cteRelations) =
    +        queryArgs match {
    +          case Token("TOK_CTE", ctes) :: Token("TOK_FROM", from) :: inserts =>
    +            val cteRelations = ctes.map { node =>
    +              val relation = nodeToRelation(node).asInstanceOf[Subquery]
    +              relation.alias -> relation
    +            }
    +            (Some(from.head), inserts, Some(cteRelations.toMap))
    +          case Token("TOK_FROM", from) :: inserts =>
    +            (Some(from.head), inserts, None)
    +          case Token("TOK_INSERT", _) :: Nil =>
    +            (None, queryArgs, None)
    +        }
    +
    +      // Return one query for each insert clause.
    +      val queries = insertClauses.map {
    +        case Token("TOK_INSERT", singleInsert) =>
    +          val (
    +            intoClause ::
    +              destClause ::
    +              selectClause ::
    +              selectDistinctClause ::
    +              whereClause ::
    +              groupByClause ::
    +              rollupGroupByClause ::
    +              cubeGroupByClause ::
    +              groupingSetsClause ::
    +              orderByClause ::
    +              havingClause ::
    +              sortByClause ::
    +              clusterByClause ::
    +              distributeByClause ::
    +              limitClause ::
    +              lateralViewClause ::
    +              windowClause :: Nil) = {
    +            getClauses(
    +              Seq(
    +                "TOK_INSERT_INTO",
    +                "TOK_DESTINATION",
    +                "TOK_SELECT",
    +                "TOK_SELECTDI",
    +                "TOK_WHERE",
    +                "TOK_GROUPBY",
    +                "TOK_ROLLUP_GROUPBY",
    +                "TOK_CUBE_GROUPBY",
    +                "TOK_GROUPING_SETS",
    +                "TOK_ORDERBY",
    +                "TOK_HAVING",
    +                "TOK_SORTBY",
    +                "TOK_CLUSTERBY",
    +                "TOK_DISTRIBUTEBY",
    +                "TOK_LIMIT",
    +                "TOK_LATERAL_VIEW",
    +                "WINDOW"),
    +              singleInsert)
    +          }
    +
    +          val relations = fromClause match {
    +            case Some(f) => nodeToRelation(f)
    +            case None => OneRowRelation
    +          }
    +
    +          val withWhere = whereClause.map { whereNode =>
    +            val Seq(whereExpr) = whereNode.children
    +            Filter(nodeToExpr(whereExpr), relations)
    +          }.getOrElse(relations)
    +
    +          val select = (selectClause orElse selectDistinctClause)
    +            .getOrElse(sys.error("No select clause."))
    +
    +          val transformation = nodeToTransformation(select.children.head, withWhere)
    +
    +          val withLateralView = lateralViewClause.map { lv =>
    +            nodeToGenerate(lv.children.head, outer = false, withWhere)
    +          }.getOrElse(withWhere)
    +
    +          // The projection of the query can either be a normal projection, an aggregation
    +          // (if there is a group by) or a script transformation.
    +          val withProject: LogicalPlan = transformation.getOrElse {
    +            val selectExpressions =
    +              select.children.flatMap(selExprNodeToExpr).map(UnresolvedAlias(_))
    +            Seq(
    +              groupByClause.map(e => e match {
    +                case Token("TOK_GROUPBY", children) =>
    +                  // Not a transformation so must be either project or aggregation.
    +                  Aggregate(children.map(nodeToExpr), selectExpressions, withLateralView)
    +                case _ => sys.error("Expect GROUP BY")
    +              }),
    +              groupingSetsClause.map(e => e match {
    +                case Token("TOK_GROUPING_SETS", children) =>
    +                  val (groupByExprs, masks) = extractGroupingSet(children)
    +                  GroupingSets(masks, groupByExprs, withLateralView, selectExpressions)
    +                case _ => sys.error("Expect GROUPING SETS")
    +              }),
    +              rollupGroupByClause.map(e => e match {
    +                case Token("TOK_ROLLUP_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Rollup(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH ROLLUP")
    +              }),
    +              cubeGroupByClause.map(e => e match {
    +                case Token("TOK_CUBE_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Cube(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH CUBE")
    +              }),
    +              Some(Project(selectExpressions, withLateralView))).flatten.head
    +          }
    +
    +          // Handle HAVING clause.
    +          val withHaving = havingClause.map { h =>
    +            val havingExpr = h.children match { case Seq(hexpr) => nodeToExpr(hexpr) }
    +            // Note that we added a cast to boolean. If the expression itself is already boolean,
    +            // the optimizer will get rid of the unnecessary cast.
    +            Filter(Cast(havingExpr, BooleanType), withProject)
    +          }.getOrElse(withProject)
    +
    +          // Handle SELECT DISTINCT
    +          val withDistinct =
    +            if (selectDistinctClause.isDefined) Distinct(withHaving) else withHaving
    +
    +          // Handle ORDER BY, SORT BY, DISTRIBUTE BY, and CLUSTER BY clause.
    +          val withSort =
    +            (orderByClause, sortByClause, distributeByClause, clusterByClause) match {
    +              case (Some(totalOrdering), None, None, None) =>
    +                Sort(totalOrdering.children.map(nodeToSortOrder), global = true, withDistinct)
    +              case (None, Some(perPartitionOrdering), None, None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder),
    +                  global = false, withDistinct)
    +              case (None, None, Some(partitionExprs), None) =>
    +                RepartitionByExpression(
    +                  partitionExprs.children.map(nodeToExpr), withDistinct)
    +              case (None, Some(perPartitionOrdering), Some(partitionExprs), None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder), global = false,
    +                  RepartitionByExpression(
    +                    partitionExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, Some(clusterExprs)) =>
    +                Sort(
    +                  clusterExprs.children.map(nodeToExpr).map(SortOrder(_, Ascending)),
    +                  global = false,
    +                  RepartitionByExpression(
    +                    clusterExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, None) => withDistinct
    +              case _ => sys.error("Unsupported set of ordering / distribution clauses.")
    +            }
    +
    +          val withLimit =
    +            limitClause.map(l => nodeToExpr(l.children.head))
    +              .map(Limit(_, withSort))
    +              .getOrElse(withSort)
    +
    +          // Collect all window specifications defined in the WINDOW clause.
    +          val windowDefinitions = windowClause.map(_.children.collect {
    +            case Token("TOK_WINDOWDEF",
    +            Token(windowName, Nil) :: Token("TOK_WINDOWSPEC", spec) :: Nil) =>
    +              windowName -> nodesToWindowSpecification(spec)
    +          }.toMap)
    +          // Handle cases like
    +          // window w1 as (partition by p_mfgr order by p_name
    +          //               range between 2 preceding and 2 following),
    +          //        w2 as w1
    +          val resolvedCrossReference = windowDefinitions.map {
    +            windowDefMap => windowDefMap.map {
    +              case (windowName, WindowSpecReference(other)) =>
    +                (windowName, windowDefMap(other).asInstanceOf[WindowSpecDefinition])
    +              case o => o.asInstanceOf[(String, WindowSpecDefinition)]
    +            }
    +          }
    +
    +          val withWindowDefinitions =
    +            resolvedCrossReference.map(WithWindowDefinition(_, withLimit)).getOrElse(withLimit)
    +
    +          // TOK_INSERT_INTO means to add files to the table.
    +          // TOK_DESTINATION means to overwrite the table.
    +          val resultDestination =
    +            (intoClause orElse destClause).getOrElse(sys.error("No destination found."))
    +          val overwrite = intoClause.isEmpty
    +          nodeToDest(
    +            resultDestination,
    +            withWindowDefinitions,
    +            overwrite)
    +      }
    +
    +      // If there are multiple INSERTs, just UNION them together into one query.
    +      val query = queries.reduceLeft(Union)
    +
    +      // Return a With plan if there is a CTE.
    +      cteRelations.map(With(query, _)).getOrElse(query)
    +
    +    // HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT
    +    case Token("TOK_UNIONALL", left :: right :: Nil) =>
    +      Union(nodeToPlan(left), nodeToPlan(right))
    +
    +    case _ =>
    +      noParseRule("Plan", node)
    +  }
    +
    +  val allJoinTokens = "(TOK_.*JOIN)".r
    +  val lateralViewToken = "TOK_LATERAL_VIEW(.*)".r
    +  protected def nodeToRelation(node: ASTNode): LogicalPlan = {
    +    node match {
    +      case Token("TOK_SUBQUERY", query :: Token(alias, Nil) :: Nil) =>
    +        Subquery(cleanIdentifier(alias), nodeToPlan(query))
    +
    +      case Token(lateralViewToken(isOuter), selectClause :: relationClause :: Nil) =>
    +        nodeToGenerate(
    +          selectClause,
    +          outer = isOuter.nonEmpty,
    +          nodeToRelation(relationClause))
    +
    +      /* All relations, possibly with aliases or sampling clauses. */
    +      case Token("TOK_TABREF", clauses) =>
    +        // If the last clause is not a TOK_* node then it is the table's alias.
    +        val (nonAliasClauses, aliasClause) =
    +          if (clauses.last.text.startsWith("TOK")) {
    +            (clauses, None)
    +          } else {
    +            (clauses.dropRight(1), Some(clauses.last))
    +          }
    +
    +        val (Some(tableNameParts) ::
    +          splitSampleClause ::
    +          bucketSampleClause :: Nil) = {
    +          getClauses(Seq("TOK_TABNAME", "TOK_TABLESPLITSAMPLE", "TOK_TABLEBUCKETSAMPLE"),
    +            nonAliasClauses)
    +        }
    +
    +        val tableIdent = extractTableIdent(tableNameParts)
    +        val alias = aliasClause.map { case Token(a, Nil) => cleanIdentifier(a) }
    +        val relation = UnresolvedRelation(tableIdent, alias)
    +
    +        // Apply sampling if requested.
    +        (bucketSampleClause orElse splitSampleClause).map {
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_ROWCOUNT", Nil) :: Token(count, Nil) :: Nil) =>
    +            Limit(Literal(count.toInt), relation)
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_PERCENT", Nil) :: Token(fraction, Nil) :: Nil) =>
    +            // The range of fraction accepted by Sample is [0, 1]. Because Hive's block sampling
    +            // function takes X PERCENT as the input and the range of X is [0, 100], we need to
    +            // adjust the fraction.
    +            require(
    +              fraction.toDouble >= (0.0 - RandomSampler.roundingEpsilon)
    +                && fraction.toDouble <= (100.0 + RandomSampler.roundingEpsilon),
    +              s"Sampling fraction ($fraction) must be on interval [0, 100]")
    +            Sample(0.0, fraction.toDouble / 100, withReplacement = false,
    +              (math.random * 1000).toInt,
    +              relation)
    +          case Token("TOK_TABLEBUCKETSAMPLE",
    +          Token(numerator, Nil) ::
    +            Token(denominator, Nil) :: Nil) =>
    +            val fraction = numerator.toDouble / denominator.toDouble
    +            Sample(0.0, fraction, withReplacement = false, (math.random * 1000).toInt, relation)
    +          case a =>
    +            noParseRule("Sampling", a)
    +        }.getOrElse(relation)
    +
    +      case Token(allJoinTokens(joinToken), relation1 :: relation2 :: other) =>
    +        if (other.size > 1) {
    +          sys.error(s"Unsupported join operation: $other")
    +        }
    +
    +        val joinType = joinToken match {
    +          case "TOK_JOIN" => Inner
    +          case "TOK_CROSSJOIN" => Inner
    +          case "TOK_RIGHTOUTERJOIN" => RightOuter
    +          case "TOK_LEFTOUTERJOIN" => LeftOuter
    +          case "TOK_FULLOUTERJOIN" => FullOuter
    +          case "TOK_LEFTSEMIJOIN" => LeftSemi
    +          case "TOK_UNIQUEJOIN" => noParseRule("Unique Join", node)
    +          case "TOK_ANTIJOIN" => noParseRule("Anti Join", node)
    +        }
    +        Join(nodeToRelation(relation1),
    +          nodeToRelation(relation2),
    +          joinType,
    +          other.headOption.map(nodeToExpr))
    +
    +      case _ =>
    +        noParseRule("Relation", node)
    +    }
    +  }
    +
    +  protected def nodeToSortOrder(node: ASTNode): SortOrder = node match {
    +    case Token("TOK_TABSORTCOLNAMEASC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Ascending)
    +    case Token("TOK_TABSORTCOLNAMEDESC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Descending)
    +    case _ =>
    +      noParseRule("SortOrder", node)
    +  }
    +
    +  val destinationToken = "TOK_DESTINATION|TOK_INSERT_INTO".r
    +  protected def nodeToDest(
    +      node: ASTNode,
    +      query: LogicalPlan,
    +      overwrite: Boolean): LogicalPlan = node match {
    +    case Token(destinationToken(),
    +    Token("TOK_DIR",
    +    Token("TOK_TMP_FILE", Nil) :: Nil) :: Nil) =>
    +      query
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = false)
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) ::
    +      Token("TOK_IFNOTEXISTS",
    +      ifNotExists) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = true)
    +
    +    case _ =>
    +      noParseRule("Destination", node)
    +  }
    +
    +  protected def selExprNodeToExpr(node: ASTNode): Option[Expression] = node match {
    +    case Token("TOK_SELEXPR", e :: Nil) =>
    +      Some(nodeToExpr(e))
    +
    +    case Token("TOK_SELEXPR", e :: Token(alias, Nil) :: Nil) =>
    +      Some(Alias(nodeToExpr(e), cleanIdentifier(alias))())
    +
    +    case Token("TOK_SELEXPR", e :: aliasChildren) =>
    +      val aliasNames = aliasChildren.collect {
    +        case Token(name, Nil) => cleanIdentifier(name)
    +      }
    +      Some(MultiAlias(nodeToExpr(e), aliasNames))
    +
    +    /* Hints are ignored */
    +    case Token("TOK_HINTLIST", _) => None
    +
    +    case _ =>
    +      noParseRule("Select", node)
    +  }
    +
    +  protected val escapedIdentifier = "`([^`]+)`".r
    +  protected val doubleQuotedString = "\"([^\"]+)\"".r
    +  protected val singleQuotedString = "'([^']+)'".r
    +
    +  protected def unquoteString(str: String) = str match {
    +    case singleQuotedString(s) => s
    +    case doubleQuotedString(s) => s
    +    case other => other
    +  }
    +
    +  /** Strips backticks from ident if present */
    +  protected def cleanIdentifier(ident: String): String = ident match {
    +    case escapedIdentifier(i) => i
    +    case plainIdent => plainIdent
    +  }
    +
    +  val numericAstTypes = Seq(
    +    SparkSqlParser.Number,
    +    SparkSqlParser.TinyintLiteral,
    +    SparkSqlParser.SmallintLiteral,
    +    SparkSqlParser.BigintLiteral,
    +    SparkSqlParser.DecimalLiteral)
    +
    +  /* Case insensitive matches */
    +  val COUNT = "(?i)COUNT".r
    +  val SUM = "(?i)SUM".r
    +  val AND = "(?i)AND".r
    +  val OR = "(?i)OR".r
    +  val NOT = "(?i)NOT".r
    +  val TRUE = "(?i)TRUE".r
    +  val FALSE = "(?i)FALSE".r
    +  val LIKE = "(?i)LIKE".r
    +  val RLIKE = "(?i)RLIKE".r
    +  val REGEXP = "(?i)REGEXP".r
    +  val IN = "(?i)IN".r
    +  val DIV = "(?i)DIV".r
    +  val BETWEEN = "(?i)BETWEEN".r
    +  val WHEN = "(?i)WHEN".r
    +  val CASE = "(?i)CASE".r
    +
    +  protected def nodeToExpr(node: ASTNode): Expression = node match {
    +    /* Attribute References */
    +    case Token("TOK_TABLE_OR_COL", Token(name, Nil) :: Nil) =>
    +      UnresolvedAttribute.quoted(cleanIdentifier(name))
    +    case Token(".", qualifier :: Token(attr, Nil) :: Nil) =>
    +      nodeToExpr(qualifier) match {
    +        case UnresolvedAttribute(nameParts) =>
    +          UnresolvedAttribute(nameParts :+ cleanIdentifier(attr))
    +        case other => UnresolvedExtractValue(other, Literal(attr))
    +      }
    +
    +    /* Stars (*) */
    +    case Token("TOK_ALLCOLREF", Nil) => UnresolvedStar(None)
    +    // The format of dbName.tableName.* cannot be parsed by HiveParser. TOK_TABNAME will
    +    // only have a single child, which is the tableName.
    +    case Token("TOK_ALLCOLREF", Token("TOK_TABNAME", Token(name, Nil) :: Nil) :: Nil) =>
    +      UnresolvedStar(Some(UnresolvedAttribute.parseAttributeName(name)))
    +
    +    /* Aggregate Functions */
    +    case Token("TOK_FUNCTIONDI", Token(COUNT(), Nil) :: args) =>
    +      Count(args.map(nodeToExpr)).toAggregateExpression(isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(COUNT(), Nil) :: Nil) =>
    +      Count(Literal(1)).toAggregateExpression()
    +
    +    /* Casts */
    +    case Token("TOK_FUNCTION", Token("TOK_STRING", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_VARCHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_CHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_INT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), IntegerType)
    +    case Token("TOK_FUNCTION", Token("TOK_BIGINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), LongType)
    +    case Token("TOK_FUNCTION", Token("TOK_FLOAT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), FloatType)
    +    case Token("TOK_FUNCTION", Token("TOK_DOUBLE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DoubleType)
    +    case Token("TOK_FUNCTION", Token("TOK_SMALLINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ShortType)
    +    case Token("TOK_FUNCTION", Token("TOK_TINYINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ByteType)
    +    case Token("TOK_FUNCTION", Token("TOK_BINARY", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BinaryType)
    +    case Token("TOK_FUNCTION", Token("TOK_BOOLEAN", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BooleanType)
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: scale :: nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, scale.text.toInt))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, 0))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType.USER_DEFAULT)
    +    case Token("TOK_FUNCTION", Token("TOK_TIMESTAMP", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), TimestampType)
    +    case Token("TOK_FUNCTION", Token("TOK_DATE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DateType)
    +
    +    /* Arithmetic */
    +    case Token("+", child :: Nil) => nodeToExpr(child)
    +    case Token("-", child :: Nil) => UnaryMinus(nodeToExpr(child))
    +    case Token("~", child :: Nil) => BitwiseNot(nodeToExpr(child))
    +    case Token("+", left :: right:: Nil) => Add(nodeToExpr(left), nodeToExpr(right))
    +    case Token("-", left :: right:: Nil) => Subtract(nodeToExpr(left), nodeToExpr(right))
    +    case Token("*", left :: right:: Nil) => Multiply(nodeToExpr(left), nodeToExpr(right))
    +    case Token("/", left :: right:: Nil) => Divide(nodeToExpr(left), nodeToExpr(right))
    +    case Token(DIV(), left :: right:: Nil) =>
    +      Cast(Divide(nodeToExpr(left), nodeToExpr(right)), LongType)
    +    case Token("%", left :: right:: Nil) => Remainder(nodeToExpr(left), nodeToExpr(right))
    +    case Token("&", left :: right:: Nil) => BitwiseAnd(nodeToExpr(left), nodeToExpr(right))
    +    case Token("|", left :: right:: Nil) => BitwiseOr(nodeToExpr(left), nodeToExpr(right))
    +    case Token("^", left :: right:: Nil) => BitwiseXor(nodeToExpr(left), nodeToExpr(right))
    +
    +    /* Comparisons */
    +    case Token("=", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("==", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=>", left :: right:: Nil) => EqualNullSafe(nodeToExpr(left), nodeToExpr(right))
    +    case Token("!=", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token("<>", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token(">", left :: right:: Nil) => GreaterThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token(">=", left :: right:: Nil) => GreaterThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<", left :: right:: Nil) => LessThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=", left :: right:: Nil) => LessThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token(LIKE(), left :: right:: Nil) => Like(nodeToExpr(left), nodeToExpr(right))
    +    case Token(RLIKE(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token(REGEXP(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNOTNULL", Nil) :: child :: Nil) =>
    +      IsNotNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNULL", Nil) :: child :: Nil) =>
    +      IsNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token(IN(), Nil) :: value :: list) =>
    +      In(nodeToExpr(value), list.map(nodeToExpr))
    +    case Token("TOK_FUNCTION",
    +      Token(BETWEEN(), Nil) ::
    +      kw ::
    +      target ::
    +      minValue ::
    +      maxValue :: Nil) =>
    +
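    +      // e.g. x BETWEEN 1 AND 5 becomes And(GreaterThanOrEqual(x, 1), LessThanOrEqual(x, 5));
    +      // for NOT BETWEEN the parser passes kw = KW_TRUE, which wraps the result in Not.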
    +      val targetExpression = nodeToExpr(target)
    +      val betweenExpr =
    +        And(
    +          GreaterThanOrEqual(targetExpression, nodeToExpr(minValue)),
    +          LessThanOrEqual(targetExpression, nodeToExpr(maxValue)))
    +      kw match {
    +        case Token("KW_FALSE", Nil) => betweenExpr
    +        case Token("KW_TRUE", Nil) => Not(betweenExpr)
    +      }
    +
    +    /* Boolean Logic */
    +    case Token(AND(), left :: right:: Nil) => And(nodeToExpr(left), nodeToExpr(right))
    +    case Token(OR(), left :: right:: Nil) => Or(nodeToExpr(left), nodeToExpr(right))
    +    case Token(NOT(), child :: Nil) => Not(nodeToExpr(child))
    +    case Token("!", child :: Nil) => Not(nodeToExpr(child))
    +
    +    /* Case statements */
    +    case Token("TOK_FUNCTION", Token(WHEN(), Nil) :: branches) =>
    +      CaseWhen(branches.map(nodeToExpr))
    +    case Token("TOK_FUNCTION", Token(CASE(), Nil) :: branches) =>
    +      val keyExpr = nodeToExpr(branches.head)
    +      CaseKeyWhen(keyExpr, branches.drop(1).map(nodeToExpr))
    +
    +    /* Complex datatype manipulation */
    +    case Token("[", child :: ordinal :: Nil) =>
    +      UnresolvedExtractValue(nodeToExpr(child), nodeToExpr(ordinal))
    +
    +    /* Window Functions */
    +    case Token(text, args :+ Token("TOK_WINDOWSPEC", spec)) =>
    +      val function = nodeToExpr(node.copy(children = node.children.init))
    +      nodesToWindowSpecification(spec) match {
    +        case reference: WindowSpecReference =>
    +          UnresolvedWindowExpression(function, reference)
    +        case definition: WindowSpecDefinition =>
    +          WindowExpression(function, definition)
    +      }
    +
    +    /* UDFs - Must be last otherwise will preempt built in functions */
    +    case Token("TOK_FUNCTION", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = false)
    +    // Aggregate function with DISTINCT keyword.
    +    case Token("TOK_FUNCTIONDI", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, UnresolvedStar(None) :: Nil, isDistinct = false)
    +
    +    /* Literals */
    +    case Token("TOK_NULL", Nil) => Literal.create(null, NullType)
    +    case Token(TRUE(), Nil) => Literal.create(true, BooleanType)
    +    case Token(FALSE(), Nil) => Literal.create(false, BooleanType)
    +    case Token("TOK_STRINGLITERALSEQUENCE", strings) =>
    +      Literal(strings.map(s => ParseUtils.unescapeSQLString(s.text)).mkString)
    +
    +    // This code is adapted from
    +    // /ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java#L223
    +    case ast: ASTNode if numericAstTypes contains ast.tokenType =>
    +      var v: Literal = null
    +      try {
    +        if (ast.text.endsWith("L")) {
    +          // Literal bigint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toLong, LongType)
    +        } else if (ast.text.endsWith("S")) {
    +          // Literal smallint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toShort, ShortType)
    +        } else if (ast.text.endsWith("Y")) {
    +          // Literal tinyint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toByte, ByteType)
    +        } else if (ast.text.endsWith("BD") || ast.text.endsWith("D")) {
    +          // Literal decimal
    +          val strVal = ast.text.stripSuffix("D").stripSuffix("B")
    +          v = Literal(Decimal(strVal))
    +        } else {
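    +          // Try from widest to narrowest: each narrowing conversion replaces v
    +          // only if it succeeds; a NumberFormatException thrown by toLong or
    +          // toInt is swallowed by the catch below, keeping the widest parsed literal.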
    +          v = Literal.create(ast.text.toDouble, DoubleType)
    +          v = Literal.create(ast.text.toLong, LongType)
    +          v = Literal.create(ast.text.toInt, IntegerType)
    +        }
    +      } catch {
    +        case nfe: NumberFormatException => // Do nothing
    +      }
    +
    +      if (v == null) {
    +        sys.error(s"Failed to parse number '${ast.text}'.")
    +      } else {
    +        v
    +      }
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.StringLiteral =>
    +      Literal(ParseUtils.unescapeSQLString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_DATELITERAL =>
    +      Literal(Date.valueOf(ast.text.substring(1, ast.text.length - 1)))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_CHARSETLITERAL =>
    +      Literal(ParseUtils.charSetString(ast.children.head.text, ast.children(1).text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromYearMonthString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_TIME_LITERAL =>
    +      Literal(CalendarInterval.fromDayTimeString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("year", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("month", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("day", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_HOUR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("hour", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MINUTE_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("minute", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_SECOND_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("second", ast.text))
    +
    +    case _ =>
    +      noParseRule("Expression", node)
    +  }
    +
    +  /* Case insensitive matches for Window Specification */
    +  val PRECEDING = "(?i)preceding".r
    +  val FOLLOWING = "(?i)following".r
    +  val CURRENT = "(?i)current".r
    +  protected def nodesToWindowSpecification(nodes: Seq[ASTNode]): WindowSpec = nodes match {
    +    case Token(windowName, Nil) :: Nil =>
    +      // Refer to a window spec defined in the window clause.
    +      WindowSpecReference(windowName)
    +    case Nil =>
    +      // OVER()
    +      WindowSpecDefinition(
    +        partitionSpec = Nil,
    +        orderSpec = Nil,
    +        frameSpecification = UnspecifiedFrame)
    +    case spec =>
    +      val (partitionClause :: rowFrame :: rangeFrame :: Nil) =
    +        getClauses(
    +          Seq(
    +            "TOK_PARTITIONINGSPEC",
    +            "TOK_WINDOWRANGE",
    +            "TOK_WINDOWVALUES"),
    +          spec)
    +
    +      // Handle Partition By and Order By.
    +      val (partitionSpec, orderSpec) = partitionClause.map { partitionAndOrdering =>
    +        val (partitionByClause :: orderByClause :: sortByClause :: clusterByClause :: Nil) =
    +          getClauses(
    +            Seq("TOK_DISTRIBUTEBY", "TOK_ORDERBY", "TOK_SORTBY", "TOK_CLUSTERBY"),
    +            partitionAndOrdering.children)
    +
    +        (partitionByClause, orderByClause.orElse(sortByClause), clusterByClause) match {
    +          case (Some(partitionByExpr), Some(orderByExpr), None) =>
    +            (partitionByExpr.children.map(nodeToExpr),
    +              orderByExpr.children.map(nodeToSortOrder))
    +          case (Some(partitionByExpr), None, None) =>
    +            (partitionByExpr.children.map(nodeToExpr), Nil)
    +          case (None, Some(orderByExpr), None) =>
    +            (Nil, orderByExpr.children.map(nodeToSortOrder))
    +          case (None, None, Some(clusterByExpr)) =>
    +            val expressions = clusterByExpr.children.map(nodeToExpr)
    +            (expressions, expressions.map(SortOrder(_, Ascending)))
    +          case _ =>
    +            noParseRule("Partition & Ordering", partitionAndOrdering)
    +        }
    +      }.getOrElse {
    +        (Nil, Nil)
    +      }
    +
    +      // Handle Window Frame
    +      val windowFrame =
    +        if (rowFrame.isEmpty && rangeFrame.isEmpty) {
    +          UnspecifiedFrame
    +        } else {
    +          val frameType = rowFrame.map(_ => RowFrame).getOrElse(RangeFrame)
    +          def nodeToBoundary(node: ASTNode): FrameBoundary = node match {
    +            case Token(PRECEDING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedPreceding
    +              } else {
    +                ValuePreceding(count.toInt)
    +              }
    +            case Token(FOLLOWING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedFollowing
    +              } else {
    +                ValueFollowing(count.toInt)
    +              }
    +            case Token(CURRENT(), Nil) => CurrentRow
    +            case _ =>
    +              noParseRule("Window Frame Boundary", node)
    +          }
    +
    +          rowFrame.orElse(rangeFrame).map { frame =>
    +            frame.children match {
    +              case precedingNode :: followingNode :: Nil =>
    +                SpecifiedWindowFrame(
    +                  frameType,
    +                  nodeToBoundary(precedingNode),
    +                  nodeToBoundary(followingNode))
    +              case precedingNode :: Nil =>
    +                SpecifiedWindowFrame(frameType, nodeToBoundary(precedingNode), CurrentRow)
    +              case _ =>
    +                noParseRule("Window Frame", frame)
    +            }
    +          }.getOrElse(sys.error(s"If you see this, please file a bug report with your query."))
    +        }
    +
    +      WindowSpecDefinition(partitionSpec, orderSpec, windowFrame)
    +  }
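    +
    +  // A worked example (a sketch): the specification
    +  //   OVER (PARTITION BY k ORDER BY v ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
    +  // is translated into
    +  //   WindowSpecDefinition(Seq(k), Seq(SortOrder(v, Ascending)),
    +  //     SpecifiedWindowFrame(RowFrame, ValuePreceding(1), CurrentRow))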
    +
    +  protected def nodeToTransformation(
    +      node: ASTNode,
    +      child: LogicalPlan): Option[ScriptTransformation] = None
    +
    +  protected def nodeToGenerate(node: ASTNode, outer: Boolean, child: LogicalPlan): Generate = {
    +    val Token("TOK_SELECT", Token("TOK_SELEXPR", clauses) :: Nil) = node
    +
    +    val alias = getClause("TOK_TABALIAS", clauses).children.head.text
    +
    +    val generator = clauses.head match {
    +      case Token("TOK_FUNCTION", Token(functionName, Nil) :: children) =>
    +        UnresolvedGenerator(functionName, children.map(nodeToExpr))
    --- End diff --
    
    I went for the intermediate solution. ```LATERAL VIEW``` is supported in Catalyst for ```explode``` and ```json_tuple``` (these are hardcoded). HiveQl will also support Hive UDTFs.
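
    For illustration, a minimal sketch of how such a query might be parsed
    (assuming the `CatalystQl` entry point added in this PR and ignoring its
    `private[sql]` modifier; the table and column names are hypothetical):

        import org.apache.spark.sql.catalyst.CatalystQl

        val parser = new CatalystQl()
        // explode is one of the generators hardcoded into Catalyst's LATERAL VIEW support.
        val plan = parser.createPlan(
          "SELECT adid FROM pageAds LATERAL VIEW explode(adid_list) adTable AS adid")
        println(plan.treeString)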




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169049668
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169208476
  
    BTW, given the size of the pull request, I think we can also merge it provided that it has no structural problems, and then address review feedback in follow-up PRs.





[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169404802
  
    Regarding the questions about `LATERAL VIEW`.  Ideally I think we will support one query language that has a superset of the features that were previously present in HiveQL and the SparkSQLParser.  Where this is not possible I'd probably defer to HiveQL or the SQL standard.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168866570
  
    **[Test build #48709 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48709/consoleFull)** for PR 10583 at commit [`fb3b4a4`](https://github.com/apache/spark/commit/fb3b4a4c461391866bc12a51dd1e60eadeaff916).




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48951642
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -0,0 +1,961 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.catalyst
    +
    +import java.sql.Date
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.analysis._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.Count
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.parser._
    +import org.apache.spark.sql.types._
    +import org.apache.spark.unsafe.types.CalendarInterval
    +import org.apache.spark.util.random.RandomSampler
    +
    +/**
    + * This class translates an HQL String to a Catalyst [[LogicalPlan]] or [[Expression]].
    + */
    +private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) {
    +  object Token {
    +    def unapply(node: ASTNode): Some[(String, List[ASTNode])] = {
    +      CurrentOrigin.setPosition(node.line, node.positionInLine)
    +      node.pattern
    +    }
    +  }
    +
    +
    +  /**
    +   * Returns the AST for the given SQL string.
    +   */
    +  protected def getAst(sql: String): ASTNode = ParseDriver.parse(sql, conf)
    +
    +  /** Creates LogicalPlan for a given HiveQL string. */
    +  def createPlan(sql: String): LogicalPlan = {
    +    try {
    +      createPlan(sql, ParseDriver.parse(sql, conf))
    +    } catch {
    +      case e: MatchError => throw e
    +      case e: AnalysisException => throw e
    +      case e: Exception =>
    +        throw new AnalysisException(e.getMessage)
    +      case e: NotImplementedError =>
    +        throw new AnalysisException(
    +          s"""
    +             |Unsupported language features in query: $sql
    +             |${getAst(sql).treeString}
    +             |$e
    +             |${e.getStackTrace.head}
    +          """.stripMargin)
    +    }
    +  }
    +
    +  protected def createPlan(sql: String, tree: ASTNode): LogicalPlan = nodeToPlan(tree)
    +
    +  def parseDdl(ddl: String): Seq[Attribute] = {
    +    val tree = getAst(ddl)
    +    assert(tree.text == "TOK_CREATETABLE", "Only CREATE TABLE supported.")
    +    val tableOps = tree.children
    +    val colList = tableOps
    +      .find(_.text == "TOK_TABCOLLIST")
    +      .getOrElse(sys.error("No columnList!"))
    +
    +    colList.children.map(nodeToAttribute)
    +  }
    +
    +  protected def getClauses(
    +      clauseNames: Seq[String],
    +      nodeList: Seq[ASTNode]): Seq[Option[ASTNode]] = {
    +    var remainingNodes = nodeList
    +    val clauses = clauseNames.map { clauseName =>
    +      val (matches, nonMatches) = remainingNodes.partition(_.text.toUpperCase == clauseName)
    +      remainingNodes = nonMatches ++ (if (matches.nonEmpty) matches.tail else Nil)
    +      matches.headOption
    +    }
    +
    +    if (remainingNodes.nonEmpty) {
    +      sys.error(
    +        s"""Unhandled clauses: ${remainingNodes.map(_.treeString).mkString("\n")}.
    +            |You are likely trying to use an unsupported Hive feature.""".stripMargin)
    +    }
    +    clauses
    +  }
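    +
    +  // For example (a sketch; the clause names are illustrative):
    +  //   getClauses(Seq("TOK_FROM", "TOK_WHERE"), queryArgs)
    +  // returns one Option per requested clause, in the requested order, and errors
    +  // out if any node in queryArgs was not requested.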
    +
    +  protected def getClause(clauseName: String, nodeList: Seq[ASTNode]): ASTNode =
    +    getClauseOption(clauseName, nodeList).getOrElse(sys.error(
    +      s"Expected clause $clauseName missing from ${nodeList.map(_.treeString).mkString("\n")}"))
    +
    +  protected def getClauseOption(clauseName: String, nodeList: Seq[ASTNode]): Option[ASTNode] = {
    +    nodeList.filter { case ast: ASTNode => ast.text == clauseName } match {
    +      case Seq(oneMatch) => Some(oneMatch)
    +      case Seq() => None
    +      case _ => sys.error(s"Found multiple instances of clause $clauseName")
    +    }
    +  }
    +
    +  protected def nodeToAttribute(node: ASTNode): Attribute = node match {
    +    case Token("TOK_TABCOL", Token(colName, Nil) :: dataType :: Nil) =>
    +      AttributeReference(colName, nodeToDataType(dataType), nullable = true)()
    +    case _ =>
    +      noParseRule("Attribute", node)
    +  }
    +
    +  protected def nodeToDataType(node: ASTNode): DataType = node match {
    +    case Token("TOK_DECIMAL", precision :: scale :: Nil) =>
    +      DecimalType(precision.text.toInt, scale.text.toInt)
    +    case Token("TOK_DECIMAL", precision :: Nil) =>
    +      DecimalType(precision.text.toInt, 0)
    +    case Token("TOK_DECIMAL", Nil) => DecimalType.USER_DEFAULT
    +    case Token("TOK_BIGINT", Nil) => LongType
    +    case Token("TOK_INT", Nil) => IntegerType
    +    case Token("TOK_TINYINT", Nil) => ByteType
    +    case Token("TOK_SMALLINT", Nil) => ShortType
    +    case Token("TOK_BOOLEAN", Nil) => BooleanType
    +    case Token("TOK_STRING", Nil) => StringType
    +    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +    case Token("TOK_FLOAT", Nil) => FloatType
    +    case Token("TOK_DOUBLE", Nil) => DoubleType
    +    case Token("TOK_DATE", Nil) => DateType
    +    case Token("TOK_TIMESTAMP", Nil) => TimestampType
    +    case Token("TOK_BINARY", Nil) => BinaryType
    +    case Token("TOK_LIST", elementType :: Nil) => ArrayType(nodeToDataType(elementType))
    +    case Token("TOK_STRUCT", Token("TOK_TABCOLLIST", fields) :: Nil) =>
    +      StructType(fields.map(nodeToStructField))
    +    case Token("TOK_MAP", keyType :: valueType :: Nil) =>
    +      MapType(nodeToDataType(keyType), nodeToDataType(valueType))
    +    case _ =>
    +      noParseRule("DataType", node)
    +  }
    +
    +  protected def nodeToStructField(node: ASTNode): StructField = node match {
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case _ =>
    +      noParseRule("StructField", node)
    +  }
    +
    +  protected def extractTableIdent(tableNameParts: ASTNode): TableIdentifier = {
    +    tableNameParts.children.map {
    +      case Token(part, Nil) => cleanIdentifier(part)
    +    } match {
    +      case Seq(tableOnly) => TableIdentifier(tableOnly)
    +      case Seq(databaseName, table) => TableIdentifier(table, Some(databaseName))
    +      case other => sys.error("Hive only supports tables names like 'tableName' " +
    +        s"or 'databaseName.tableName', found '$other'")
    +    }
    +  }
    +
    +  /**
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2, k3 GROUPING SETS((k1, k2), (k2))
    +   * is equivalent to
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2 UNION SELECT MAX(value) FROM src GROUP BY k2
    +   * Check the following link for details.
    +   *
    +https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
    +   *
    +   * The bitmask denotes which grouping expressions are valid for a grouping set;
    +   * the bitmask is also known as the grouping id (`GROUPING__ID`, the virtual column in Hive).
    +   * e.g. In superset (k1, k2, k3), (bit 0: k1, bit 1: k2, and bit 2: k3), the grouping id of
    +   * GROUPING SETS (k1, k2) and (k2) should be 3 and 2 respectively.
    +   */
    +  protected def extractGroupingSet(children: Seq[ASTNode]): (Seq[Expression], Seq[Int]) = {
    +    val (keyASTs, setASTs) = children.partition {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", _) => false // grouping sets
    +      case _ => true // grouping keys
    +    }
    +
    +    val keys = keyASTs.map(nodeToExpr)
    +    val keyMap = keyASTs.zipWithIndex.toMap
    +
    +    val bitmasks: Seq[Int] = setASTs.map {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", null) => 0
    --- End diff --
    
    Impossible




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169007832
  
    **[Test build #48766 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48766/consoleFull)** for PR 10583 at commit [`e397370`](https://github.com/apache/spark/commit/e39737023920c3916ad8ed6e4d4b46072bfe4f7a).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168929936
  
    retest this please




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169425486
  
    Due to the size of the patch, I'm going to merge this in now. @hvanhovell can address remaining comments in follow-up PRs.





[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169326444
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48853/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169171058
  
    Done.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169165991
  
    **[Test build #48789 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48789/consoleFull)** for PR 10583 at commit [`157d178`](https://github.com/apache/spark/commit/157d1785a8362f17700f18106ec4aba5d70dc90f).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48921238
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -587,6 +586,13 @@ class Analyzer(
                     case other => other
                   }
                 }
    +          case u @ UnresolvedGenerator(name, children) =>
    --- End diff --
    
    Do we need to add `UnresolvedGenerator`?




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48940817
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/parser/ParseUtils.java ---
    @@ -0,0 +1,163 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.parser;
    +
    +import java.io.UnsupportedEncodingException;
    +
    +/**
    + * A couple of utility methods that help with parsing ASTs.
    + *
    + * Both methods in this class were taken from the SemanticAnalyzer in Hive:
    + * ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
    + */
    +public final class ParseUtils {
    +  private ParseUtils() {
    +    super();
    +  }
    +
    +  public static String charSetString(String charSetName, String charSetString)
    --- End diff --
    
    I just checked. There aren't any? Bit strange...




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169169980
  
    Can you update the pull request description? It still says WIP.





[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169094038
  
    **[Test build #48780 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48780/consoleFull)** for PR 10583 at commit [`157d178`](https://github.com/apache/spark/commit/157d1785a8362f17700f18106ec4aba5d70dc90f).




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168984129
  
    **[Test build #48766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48766/consoleFull)** for PR 10583 at commit [`e397370`](https://github.com/apache/spark/commit/e39737023920c3916ad8ed6e4d4b46072bfe4f7a).




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48919450
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -587,6 +586,13 @@ class Analyzer(
                     case other => other
                   }
                 }
    +          case u @ UnresolvedGenerator(name, children) =>
    --- End diff --
    
    cc @yhuai for this change




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169122427
  
    **[Test build #48780 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48780/consoleFull)** for PR 10583 at commit [`157d178`](https://github.com/apache/spark/commit/157d1785a8362f17700f18106ec4aba5d70dc90f).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48919353
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/parser/ParseUtils.java ---
    @@ -0,0 +1,163 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.parser;
    +
    +import java.io.UnsupportedEncodingException;
    +
    +/**
    + * A couple of utility methods that help with parsing ASTs.
    + *
    + * Both methods in this class were taken from the SemanticAnalyzer in Hive:
    + * ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
    + */
    +public final class ParseUtils {
    +  private ParseUtils() {
    +    super();
    +  }
    +
    +  public static String charSetString(String charSetName, String charSetString)
    +      throws UnsupportedEncodingException {
    +      // The character set name starts with a _, so strip that
    --- End diff --
    
    looks like the indentation is off here?




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169166337
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169049675
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48771/
    Test PASSed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168906957
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by marmbrus <gi...@git.apache.org>.
Github user marmbrus commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48805084
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/parser/ParseUtils.java ---
    @@ -0,0 +1,163 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.parser;
    --- End diff --
    
    Should we put this in `catalyst`?  Just since we tend to hide non-public APIs there.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169204037
  
    Does case-sensitivity config still work with the new parser?




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168965051
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48747/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169326441
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168932606
  
    **[Test build #48747 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48747/consoleFull)** for PR 10583 at commit [`fb3b4a4`](https://github.com/apache/spark/commit/fb3b4a4c461391866bc12a51dd1e60eadeaff916).




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168906918
  
    **[Test build #48709 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48709/consoleFull)** for PR 10583 at commit [`fb3b4a4`](https://github.com/apache/spark/commit/fb3b4a4c461391866bc12a51dd1e60eadeaff916).
     * This patch **fails from timeout after a configured wait of `250m`**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48919495
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
    @@ -129,23 +129,39 @@ object UnresolvedAttribute {
         nameParts.toSeq
       }
     }
    +trait UnresolvedFunctionLike extends Expression {
    --- End diff --
    
    I'd remove this trait since it is only used twice. The overhead of having an abstraction doesn't seem worth it.





[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48923332
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystQl.scala ---
    @@ -0,0 +1,969 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.spark.sql.catalyst
    +
    +import java.sql.Date
    +
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.analysis._
    +import org.apache.spark.sql.catalyst.expressions._
    +import org.apache.spark.sql.catalyst.expressions.aggregate.Count
    +import org.apache.spark.sql.catalyst.plans._
    +import org.apache.spark.sql.catalyst.plans.logical._
    +import org.apache.spark.sql.catalyst.trees.CurrentOrigin
    +import org.apache.spark.sql.catalyst.parser._
    +import org.apache.spark.sql.types._
    +import org.apache.spark.unsafe.types.CalendarInterval
    +import org.apache.spark.util.random.RandomSampler
    +
    +/**
    + * This class translates an HQL String to a Catalyst [[LogicalPlan]] or [[Expression]].
    + */
    +private[sql] class CatalystQl(val conf: ParserConf = SimpleParserConf()) {
    +  object Token {
    +    def unapply(node: ASTNode): Some[(String, List[ASTNode])] = {
    +      CurrentOrigin.setPosition(node.line, node.positionInLine)
    +      node.pattern
    +    }
    +  }
    +
    +  // TODO: improve the parse error messages so we don't need this regex anymore.
    +  val errorRegEx = "line (\\d+):(\\d+) (.*)".r
    +
    +  /**
    +   * Returns the AST for the given SQL string.
    +   */
    +  protected def getAst(sql: String): ASTNode = ParseDriver.parse(sql, conf)
    +
    +  /** Creates LogicalPlan for a given HiveQL string. */
    +  def createPlan(sql: String): LogicalPlan = {
    +    try {
    +      createPlan(sql, ParseDriver.parse(sql, conf))
    +    } catch {
    +      case pe: ParseException =>
    +        pe.getMessage match {
    +          case errorRegEx(line, start, message) =>
    +            throw new AnalysisException(message, Some(line.toInt), Some(start.toInt))
    +          case otherMessage =>
    +            throw new AnalysisException(otherMessage)
    +        }
    +      case e: MatchError => throw e
    +      case e: Exception =>
    +        throw new AnalysisException(e.getMessage)
    +      case e: NotImplementedError =>
    +        throw new AnalysisException(
    +          s"""
    +             |Unsupported language features in query: $sql
    +             |${getAst(sql).treeString}
    +             |$e
    +             |${e.getStackTrace.head}
    +          """.stripMargin)
    +    }
    +  }
    +
    +  protected def createPlan(sql: String, tree: ASTNode): LogicalPlan = nodeToPlan(tree)
    +
    +  def parseDdl(ddl: String): Seq[Attribute] = {
    +    val tree =
    +      try {
    +        getAst(ddl)
    +      } catch {
    +        case pe: ParseException =>
    +          throw new RuntimeException(s"Failed to parse ddl: '$ddl'", pe)
    +      }
    +    assert(tree.text == "TOK_CREATETABLE", "Only CREATE TABLE supported.")
    +    val tableOps = tree.children
    +    val colList = tableOps
    +      .find(_.text == "TOK_TABCOLLIST")
    +      .getOrElse(sys.error("No columnList!"))
    +
    +    colList.children.map(nodeToAttribute)
    +  }
    +
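    +  /**
    +   * Returns, in order, the first node matching each clause name (None when absent),
    +   * and fails if any nodes in `nodeList` remain unmatched afterwards.
    +   */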
    +  protected def getClauses(
    +      clauseNames: Seq[String],
    +      nodeList: Seq[ASTNode]): Seq[Option[ASTNode]] = {
    +    var remainingNodes = nodeList
    +    val clauses = clauseNames.map { clauseName =>
    +      val (matches, nonMatches) = remainingNodes.partition(_.text.toUpperCase == clauseName)
    +      remainingNodes = nonMatches ++ (if (matches.nonEmpty) matches.tail else Nil)
    +      matches.headOption
    +    }
    +
    +    if (remainingNodes.nonEmpty) {
    +      sys.error(
    +        s"""Unhandled clauses: ${remainingNodes.map(_.treeString).mkString("\n")}.
    +            |You are likely trying to use an unsupported Hive feature.""".stripMargin)
    +    }
    +    clauses
    +  }
    +
    +  protected def getClause(clauseName: String, nodeList: Seq[ASTNode]): ASTNode =
    +    getClauseOption(clauseName, nodeList).getOrElse(sys.error(
    +      s"Expected clause $clauseName missing from ${nodeList.map(_.treeString).mkString("\n")}"))
    +
    +  protected def getClauseOption(clauseName: String, nodeList: Seq[ASTNode]): Option[ASTNode] = {
    +    nodeList.filter { case ast: ASTNode => ast.text == clauseName } match {
    +      case Seq(oneMatch) => Some(oneMatch)
    +      case Seq() => None
    +      case _ => sys.error(s"Found multiple instances of clause $clauseName")
    +    }
    +  }
    +
    +  protected def nodeToAttribute(node: ASTNode): Attribute = node match {
    +    case Token("TOK_TABCOL", Token(colName, Nil) :: dataType :: Nil) =>
    +      AttributeReference(colName, nodeToDataType(dataType), nullable = true)()
    +    case _ =>
    +      noParseRule("Attribute", node)
    +  }
    +
    +  protected def nodeToDataType(node: ASTNode): DataType = node match {
    +    case Token("TOK_DECIMAL", precision :: scale :: Nil) =>
    +      DecimalType(precision.text.toInt, scale.text.toInt)
    +    case Token("TOK_DECIMAL", precision :: Nil) =>
    +      DecimalType(precision.text.toInt, 0)
    +    case Token("TOK_DECIMAL", Nil) => DecimalType.USER_DEFAULT
    +    case Token("TOK_BIGINT", Nil) => LongType
    +    case Token("TOK_INT", Nil) => IntegerType
    +    case Token("TOK_TINYINT", Nil) => ByteType
    +    case Token("TOK_SMALLINT", Nil) => ShortType
    +    case Token("TOK_BOOLEAN", Nil) => BooleanType
    +    case Token("TOK_STRING", Nil) => StringType
    +    case Token("TOK_VARCHAR", Token(_, Nil) :: Nil) => StringType
    +    case Token("TOK_FLOAT", Nil) => FloatType
    +    case Token("TOK_DOUBLE", Nil) => DoubleType
    +    case Token("TOK_DATE", Nil) => DateType
    +    case Token("TOK_TIMESTAMP", Nil) => TimestampType
    +    case Token("TOK_BINARY", Nil) => BinaryType
    +    case Token("TOK_LIST", elementType :: Nil) => ArrayType(nodeToDataType(elementType))
    +    case Token("TOK_STRUCT", Token("TOK_TABCOLLIST", fields) :: Nil) =>
    +      StructType(fields.map(nodeToStructField))
    +    case Token("TOK_MAP", keyType :: valueType :: Nil) =>
    +      MapType(nodeToDataType(keyType), nodeToDataType(valueType))
    +    case _ =>
    +      noParseRule("DataType", node)
    +  }
    +
    +  protected def nodeToStructField(node: ASTNode): StructField = node match {
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case Token("TOK_TABCOL", Token(fieldName, Nil) :: dataType :: _ /* comment */:: Nil) =>
    +      StructField(fieldName, nodeToDataType(dataType), nullable = true)
    +    case _ =>
    +      noParseRule("StructField", node)
    +  }
    +
    +  protected def extractTableIdent(tableNameParts: ASTNode): TableIdentifier = {
    +    tableNameParts.children.map {
    +      case Token(part, Nil) => cleanIdentifier(part)
    +    } match {
    +      case Seq(tableOnly) => TableIdentifier(tableOnly)
    +      case Seq(databaseName, table) => TableIdentifier(table, Some(databaseName))
    +      case other => sys.error("Hive only supports table names like 'tableName' " +
    +        s"or 'databaseName.tableName', found '$other'")
    +    }
    +  }
    +
    +  /**
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2, k3 GROUPING SETS((k1, k2), (k2))
    +   * is equivalent to
    +   * SELECT MAX(value) FROM src GROUP BY k1, k2 UNION SELECT MAX(value) FROM src GROUP BY k2
    +   * Check the following link for details.
    +   *
    +https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation%2C+Cube%2C+Grouping+and+Rollup
    +   *
    +   * The bitmask denotes which grouping expressions are valid for a grouping set; the
    +   * bitmask is also called the grouping id (`GROUPING__ID`, the virtual column in Hive).
    +   * E.g. in the superset (k1, k2, k3), with bit 0: k1, bit 1: k2, and bit 2: k3, the
    +   * grouping ids of GROUPING SETS (k1, k2) and (k2) are 3 and 2 respectively.
    +   */
    +  protected def extractGroupingSet(children: Seq[ASTNode]): (Seq[Expression], Seq[Int]) = {
    +    val (keyASTs, setASTs) = children.partition {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", _) => false // grouping sets
    +      case _ => true // grouping keys
    +    }
    +
    +    val keys = keyASTs.map(nodeToExpr)
    +    val keyMap = keyASTs.zipWithIndex.toMap
    +
    +    val bitmasks: Seq[Int] = setASTs.map {
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", null) => 0
    +      case Token("TOK_GROUPING_SETS_EXPRESSION", columns) =>
    +        columns.foldLeft(0)((bitmap, col) => {
    +          val keyIndex = keyMap.find(_._1.treeEquals(col)).map(_._2)
    +          bitmap | 1 << keyIndex.getOrElse(
    +            throw new AnalysisException(s"${col.treeString} doesn't show up in the GROUP BY list"))
    +        })
    +      case _ => sys.error("Expect GROUPING SETS clause")
    +    }
    +
    +    (keys, bitmasks)
    +  }
    +
    +  protected def nodeToPlan(node: ASTNode): LogicalPlan = node match {
    +    case Token("TOK_QUERY", queryArgs @ Token("TOK_CTE" | "TOK_FROM" | "TOK_INSERT", _) :: _) =>
    +      val (fromClause: Option[ASTNode], insertClauses, cteRelations) =
    +        queryArgs match {
    +          case Token("TOK_CTE", ctes) :: Token("TOK_FROM", from) :: inserts =>
    +            val cteRelations = ctes.map { node =>
    +              val relation = nodeToRelation(node).asInstanceOf[Subquery]
    +              relation.alias -> relation
    +            }
    +            (Some(from.head), inserts, Some(cteRelations.toMap))
    +          case Token("TOK_FROM", from) :: inserts =>
    +            (Some(from.head), inserts, None)
    +          case Token("TOK_INSERT", _) :: Nil =>
    +            (None, queryArgs, None)
    +        }
    +
    +      // Return one query for each insert clause.
    +      val queries = insertClauses.map {
    +        case Token("TOK_INSERT", singleInsert) =>
    +          val (
    +            intoClause ::
    +              destClause ::
    +              selectClause ::
    +              selectDistinctClause ::
    +              whereClause ::
    +              groupByClause ::
    +              rollupGroupByClause ::
    +              cubeGroupByClause ::
    +              groupingSetsClause ::
    +              orderByClause ::
    +              havingClause ::
    +              sortByClause ::
    +              clusterByClause ::
    +              distributeByClause ::
    +              limitClause ::
    +              lateralViewClause ::
    +              windowClause :: Nil) = {
    +            getClauses(
    +              Seq(
    +                "TOK_INSERT_INTO",
    +                "TOK_DESTINATION",
    +                "TOK_SELECT",
    +                "TOK_SELECTDI",
    +                "TOK_WHERE",
    +                "TOK_GROUPBY",
    +                "TOK_ROLLUP_GROUPBY",
    +                "TOK_CUBE_GROUPBY",
    +                "TOK_GROUPING_SETS",
    +                "TOK_ORDERBY",
    +                "TOK_HAVING",
    +                "TOK_SORTBY",
    +                "TOK_CLUSTERBY",
    +                "TOK_DISTRIBUTEBY",
    +                "TOK_LIMIT",
    +                "TOK_LATERAL_VIEW",
    +                "WINDOW"),
    +              singleInsert)
    +          }
    +
    +          val relations = fromClause match {
    +            case Some(f) => nodeToRelation(f)
    +            case None => OneRowRelation
    +          }
    +
    +          val withWhere = whereClause.map { whereNode =>
    +            val Seq(whereExpr) = whereNode.children
    +            Filter(nodeToExpr(whereExpr), relations)
    +          }.getOrElse(relations)
    +
    +          val select = (selectClause orElse selectDistinctClause)
    +            .getOrElse(sys.error("No select clause."))
    +
    +          val transformation = nodeToTransformation(select.children.head, withWhere)
    +
    +          val withLateralView = lateralViewClause.map { lv =>
    +            nodeToGenerate(lv.children.head, outer = false, withWhere)
    +          }.getOrElse(withWhere)
    +
    +          // The projection of the query can either be a normal projection, an aggregation
    +          // (if there is a group by) or a script transformation.
    +          val withProject: LogicalPlan = transformation.getOrElse {
    +            val selectExpressions =
    +              select.children.flatMap(selExprNodeToExpr).map(UnresolvedAlias(_))
    +            Seq(
    +              groupByClause.map(e => e match {
    +                case Token("TOK_GROUPBY", children) =>
    +                  // Not a transformation so must be either project or aggregation.
    +                  Aggregate(children.map(nodeToExpr), selectExpressions, withLateralView)
    +                case _ => sys.error("Expect GROUP BY")
    +              }),
    +              groupingSetsClause.map(e => e match {
    +                case Token("TOK_GROUPING_SETS", children) =>
    +                  val(groupByExprs, masks) = extractGroupingSet(children)
    +                  GroupingSets(masks, groupByExprs, withLateralView, selectExpressions)
    +                case _ => sys.error("Expect GROUPING SETS")
    +              }),
    +              rollupGroupByClause.map(e => e match {
    +                case Token("TOK_ROLLUP_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Rollup(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH ROLLUP")
    +              }),
    +              cubeGroupByClause.map(e => e match {
    +                case Token("TOK_CUBE_GROUPBY", children) =>
    +                  Aggregate(
    +                    Seq(Cube(children.map(nodeToExpr))),
    +                    selectExpressions,
    +                    withLateralView)
    +                case _ => sys.error("Expect WITH CUBE")
    +              }),
    +              Some(Project(selectExpressions, withLateralView))).flatten.head
    +          }
    +
    +          // Handle HAVING clause.
    +          val withHaving = havingClause.map { h =>
    +            val havingExpr = h.children match { case Seq(hexpr) => nodeToExpr(hexpr) }
    +            // Note that we added a cast to boolean. If the expression itself is already boolean,
    +            // the optimizer will get rid of the unnecessary cast.
    +            Filter(Cast(havingExpr, BooleanType), withProject)
    +          }.getOrElse(withProject)
    +
    +          // Handle SELECT DISTINCT
    +          val withDistinct =
    +            if (selectDistinctClause.isDefined) Distinct(withHaving) else withHaving
    +
    +          // Handle ORDER BY, SORT BY, DISTRIBUTE BY, and CLUSTER BY clause.
    +          val withSort =
    +            (orderByClause, sortByClause, distributeByClause, clusterByClause) match {
    +              case (Some(totalOrdering), None, None, None) =>
    +                Sort(totalOrdering.children.map(nodeToSortOrder), global = true, withDistinct)
    +              case (None, Some(perPartitionOrdering), None, None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder),
    +                  global = false, withDistinct)
    +              case (None, None, Some(partitionExprs), None) =>
    +                RepartitionByExpression(
    +                  partitionExprs.children.map(nodeToExpr), withDistinct)
    +              case (None, Some(perPartitionOrdering), Some(partitionExprs), None) =>
    +                Sort(
    +                  perPartitionOrdering.children.map(nodeToSortOrder), global = false,
    +                  RepartitionByExpression(
    +                    partitionExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, Some(clusterExprs)) =>
    +                Sort(
    +                  clusterExprs.children.map(nodeToExpr).map(SortOrder(_, Ascending)),
    +                  global = false,
    +                  RepartitionByExpression(
    +                    clusterExprs.children.map(nodeToExpr),
    +                    withDistinct))
    +              case (None, None, None, None) => withDistinct
    +              case _ => sys.error("Unsupported set of ordering / distribution clauses.")
    +            }
    +
    +          val withLimit =
    +            limitClause.map(l => nodeToExpr(l.children.head))
    +              .map(Limit(_, withSort))
    +              .getOrElse(withSort)
    +
    +          // Collect all window specifications defined in the WINDOW clause.
    +          val windowDefinitions = windowClause.map(_.children.collect {
    +            case Token("TOK_WINDOWDEF",
    +            Token(windowName, Nil) :: Token("TOK_WINDOWSPEC", spec) :: Nil) =>
    +              windowName -> nodesToWindowSpecification(spec)
    +          }.toMap)
    +          // Handle cases like
    +          // window w1 as (partition by p_mfgr order by p_name
    +          //               range between 2 preceding and 2 following),
    +          //        w2 as w1
    +          val resolvedCrossReference = windowDefinitions.map {
    +            windowDefMap => windowDefMap.map {
    +              case (windowName, WindowSpecReference(other)) =>
    +                (windowName, windowDefMap(other).asInstanceOf[WindowSpecDefinition])
    +              case o => o.asInstanceOf[(String, WindowSpecDefinition)]
    +            }
    +          }
    +
    +          val withWindowDefinitions =
    +            resolvedCrossReference.map(WithWindowDefinition(_, withLimit)).getOrElse(withLimit)
    +
    +          // TOK_INSERT_INTO means to add files to the table.
    +          // TOK_DESTINATION means to overwrite the table.
    +          val resultDestination =
    +            (intoClause orElse destClause).getOrElse(sys.error("No destination found."))
    +          val overwrite = intoClause.isEmpty
    +          nodeToDest(
    +            resultDestination,
    +            withWindowDefinitions,
    +            overwrite)
    +      }
    +
    +      // If there are multiple INSERTs, just UNION them together into one query.
    +      val query = queries.reduceLeft(Union)
    +
    +      // Return a With plan if there is a CTE.
    +      cteRelations.map(With(query, _)).getOrElse(query)
    +
    +    // HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT
    +    case Token("TOK_UNIONALL", left :: right :: Nil) =>
    +      Union(nodeToPlan(left), nodeToPlan(right))
    +
    +    case _ =>
    +      noParseRule("Plan", node)
    +  }
    +
    +  val allJoinTokens = "(TOK_.*JOIN)".r
    +  val laterViewToken = "TOK_LATERAL_VIEW(.*)".r
    +  protected def nodeToRelation(node: ASTNode): LogicalPlan = {
    +    node match {
    +      case Token("TOK_SUBQUERY", query :: Token(alias, Nil) :: Nil) =>
    +        Subquery(cleanIdentifier(alias), nodeToPlan(query))
    +
    +      case Token(laterViewToken(isOuter), selectClause :: relationClause :: Nil) =>
    +        nodeToGenerate(
    +          selectClause,
    +          outer = isOuter.nonEmpty,
    +          nodeToRelation(relationClause))
    +
    +      /* All relations, possibly with aliases or sampling clauses. */
    +      case Token("TOK_TABREF", clauses) =>
    +        // If the last clause does not start with TOK, it is the alias of the table.
    +        val (nonAliasClauses, aliasClause) =
    +          if (clauses.last.text.startsWith("TOK")) {
    +            (clauses, None)
    +          } else {
    +            (clauses.dropRight(1), Some(clauses.last))
    +          }
    +
    +        val (Some(tableNameParts) ::
    +          splitSampleClause ::
    +          bucketSampleClause :: Nil) = {
    +          getClauses(Seq("TOK_TABNAME", "TOK_TABLESPLITSAMPLE", "TOK_TABLEBUCKETSAMPLE"),
    +            nonAliasClauses)
    +        }
    +
    +        val tableIdent = extractTableIdent(tableNameParts)
    +        val alias = aliasClause.map { case Token(a, Nil) => cleanIdentifier(a) }
    +        val relation = UnresolvedRelation(tableIdent, alias)
    +
    +        // Apply sampling if requested.
    +        (bucketSampleClause orElse splitSampleClause).map {
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_ROWCOUNT", Nil) :: Token(count, Nil) :: Nil) =>
    +            Limit(Literal(count.toInt), relation)
    +          case Token("TOK_TABLESPLITSAMPLE",
    +          Token("TOK_PERCENT", Nil) :: Token(fraction, Nil) :: Nil) =>
    +            // The range of fraction accepted by Sample is [0, 1]. Because Hive's block sampling
    +            // function takes X PERCENT as the input and the range of X is [0, 100], we need to
    +            // adjust the fraction.
    +            require(
    +              fraction.toDouble >= (0.0 - RandomSampler.roundingEpsilon)
    +                && fraction.toDouble <= (100.0 + RandomSampler.roundingEpsilon),
    +              s"Sampling fraction ($fraction) must be on interval [0, 100]")
    +            Sample(0.0, fraction.toDouble / 100, withReplacement = false,
    +              (math.random * 1000).toInt,
    +              relation)
    +          case Token("TOK_TABLEBUCKETSAMPLE",
    +          Token(numerator, Nil) ::
    +            Token(denominator, Nil) :: Nil) =>
    +            val fraction = numerator.toDouble / denominator.toDouble
    +            Sample(0.0, fraction, withReplacement = false, (math.random * 1000).toInt, relation)
    +          case a =>
    +            noParseRule("Sampling", a)
    +        }.getOrElse(relation)
    +
    +      case Token(allJoinTokens(joinToken), relation1 :: relation2 :: other) =>
    +        if (other.size > 1) {
    +          sys.error(s"Unsupported join operation: $other")
    +        }
    +
    +        val joinType = joinToken match {
    +          case "TOK_JOIN" => Inner
    +          case "TOK_CROSSJOIN" => Inner
    +          case "TOK_RIGHTOUTERJOIN" => RightOuter
    +          case "TOK_LEFTOUTERJOIN" => LeftOuter
    +          case "TOK_FULLOUTERJOIN" => FullOuter
    +          case "TOK_LEFTSEMIJOIN" => LeftSemi
    +          case "TOK_UNIQUEJOIN" => noParseRule("Unique Join", node)
    +          case "TOK_ANTIJOIN" => noParseRule("Anti Join", node)
    +        }
    +        Join(nodeToRelation(relation1),
    +          nodeToRelation(relation2),
    +          joinType,
    +          other.headOption.map(nodeToExpr))
    +
    +      case _ =>
    +        noParseRule("Relation", node)
    +    }
    +  }
    +
    +  protected def nodeToSortOrder(node: ASTNode): SortOrder = node match {
    +    case Token("TOK_TABSORTCOLNAMEASC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Ascending)
    +    case Token("TOK_TABSORTCOLNAMEDESC", sortExpr :: Nil) =>
    +      SortOrder(nodeToExpr(sortExpr), Descending)
    +    case _ =>
    +      noParseRule("SortOrder", node)
    +  }
    +
    +  val destinationToken = "TOK_DESTINATION|TOK_INSERT_INTO".r
    +  protected def nodeToDest(
    +      node: ASTNode,
    +      query: LogicalPlan,
    +      overwrite: Boolean): LogicalPlan = node match {
    +    case Token(destinationToken(),
    +    Token("TOK_DIR",
    +    Token("TOK_TMP_FILE", Nil) :: Nil) :: Nil) =>
    +      query
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = false)
    +
    +    case Token(destinationToken(),
    +    Token("TOK_TAB",
    +    tableArgs) ::
    +      Token("TOK_IFNOTEXISTS",
    +      ifNotExists) :: Nil) =>
    +      val Some(tableNameParts) :: partitionClause :: Nil =
    +        getClauses(Seq("TOK_TABNAME", "TOK_PARTSPEC"), tableArgs)
    +
    +      val tableIdent = extractTableIdent(tableNameParts)
    +
    +      val partitionKeys = partitionClause.map(_.children.map {
    +        // Parse partitions. We also make keys case insensitive.
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Token(value, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> Some(unquoteString(value))
    +        case Token("TOK_PARTVAL", Token(key, Nil) :: Nil) =>
    +          cleanIdentifier(key.toLowerCase) -> None
    +      }.toMap).getOrElse(Map.empty)
    +
    +      InsertIntoTable(
    +        UnresolvedRelation(tableIdent, None), partitionKeys, query, overwrite, ifNotExists = true)
    +
    +    case _ =>
    +      noParseRule("Destination", node)
    +  }
    +
    +  protected def selExprNodeToExpr(node: ASTNode): Option[Expression] = node match {
    +    case Token("TOK_SELEXPR", e :: Nil) =>
    +      Some(nodeToExpr(e))
    +
    +    case Token("TOK_SELEXPR", e :: Token(alias, Nil) :: Nil) =>
    +      Some(Alias(nodeToExpr(e), cleanIdentifier(alias))())
    +
    +    case Token("TOK_SELEXPR", e :: aliasChildren) =>
    +      val aliasNames = aliasChildren.collect {
    +        case Token(name, Nil) => cleanIdentifier(name)
    +      }
    +      Some(MultiAlias(nodeToExpr(e), aliasNames))
    +
    +    /* Hints are ignored */
    +    case Token("TOK_HINTLIST", _) => None
    +
    +    case _ =>
    +      noParseRule("Select", node)
    +  }
    +
    +  protected val escapedIdentifier = "`([^`]+)`".r
    +  protected val doubleQuotedString = "\"([^\"]+)\"".r
    +  protected val singleQuotedString = "'([^']+)'".r
    +
    +  protected def unquoteString(str: String) = str match {
    +    case singleQuotedString(s) => s
    +    case doubleQuotedString(s) => s
    +    case other => other
    +  }
    +
    +  /** Strips backticks from ident if present */
    +  protected def cleanIdentifier(ident: String): String = ident match {
    +    case escapedIdentifier(i) => i
    +    case plainIdent => plainIdent
    +  }
    +
    +  val numericAstTypes = Seq(
    +    SparkSqlParser.Number,
    +    SparkSqlParser.TinyintLiteral,
    +    SparkSqlParser.SmallintLiteral,
    +    SparkSqlParser.BigintLiteral,
    +    SparkSqlParser.DecimalLiteral)
    +
    +  /* Case insensitive matches */
    +  val COUNT = "(?i)COUNT".r
    +  val SUM = "(?i)SUM".r
    +  val AND = "(?i)AND".r
    +  val OR = "(?i)OR".r
    +  val NOT = "(?i)NOT".r
    +  val TRUE = "(?i)TRUE".r
    +  val FALSE = "(?i)FALSE".r
    +  val LIKE = "(?i)LIKE".r
    +  val RLIKE = "(?i)RLIKE".r
    +  val REGEXP = "(?i)REGEXP".r
    +  val IN = "(?i)IN".r
    +  val DIV = "(?i)DIV".r
    +  val BETWEEN = "(?i)BETWEEN".r
    +  val WHEN = "(?i)WHEN".r
    +  val CASE = "(?i)CASE".r
    +
    +  protected def nodeToExpr(node: ASTNode): Expression = node match {
    +    /* Attribute References */
    +    case Token("TOK_TABLE_OR_COL", Token(name, Nil) :: Nil) =>
    +      UnresolvedAttribute.quoted(cleanIdentifier(name))
    +    case Token(".", qualifier :: Token(attr, Nil) :: Nil) =>
    +      nodeToExpr(qualifier) match {
    +        case UnresolvedAttribute(nameParts) =>
    +          UnresolvedAttribute(nameParts :+ cleanIdentifier(attr))
    +        case other => UnresolvedExtractValue(other, Literal(attr))
    +      }
    +
    +    /* Stars (*) */
    +    case Token("TOK_ALLCOLREF", Nil) => UnresolvedStar(None)
    +    // The format of dbName.tableName.* cannot be parsed by HiveParser. TOK_TABNAME will only
    +    // have a single child, which is tableName.
    +    case Token("TOK_ALLCOLREF", Token("TOK_TABNAME", Token(name, Nil) :: Nil) :: Nil) =>
    +      UnresolvedStar(Some(UnresolvedAttribute.parseAttributeName(name)))
    +
    +    /* Aggregate Functions */
    +    case Token("TOK_FUNCTIONDI", Token(COUNT(), Nil) :: args) =>
    +      Count(args.map(nodeToExpr)).toAggregateExpression(isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(COUNT(), Nil) :: Nil) =>
    +      Count(Literal(1)).toAggregateExpression()
    +
    +    /* Casts */
    +    case Token("TOK_FUNCTION", Token("TOK_STRING", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_VARCHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_CHAR", _) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), StringType)
    +    case Token("TOK_FUNCTION", Token("TOK_INT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), IntegerType)
    +    case Token("TOK_FUNCTION", Token("TOK_BIGINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), LongType)
    +    case Token("TOK_FUNCTION", Token("TOK_FLOAT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), FloatType)
    +    case Token("TOK_FUNCTION", Token("TOK_DOUBLE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DoubleType)
    +    case Token("TOK_FUNCTION", Token("TOK_SMALLINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ShortType)
    +    case Token("TOK_FUNCTION", Token("TOK_TINYINT", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), ByteType)
    +    case Token("TOK_FUNCTION", Token("TOK_BINARY", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BinaryType)
    +    case Token("TOK_FUNCTION", Token("TOK_BOOLEAN", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), BooleanType)
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: scale :: nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, scale.text.toInt))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", precision :: Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType(precision.text.toInt, 0))
    +    case Token("TOK_FUNCTION", Token("TOK_DECIMAL", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DecimalType.USER_DEFAULT)
    +    case Token("TOK_FUNCTION", Token("TOK_TIMESTAMP", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), TimestampType)
    +    case Token("TOK_FUNCTION", Token("TOK_DATE", Nil) :: arg :: Nil) =>
    +      Cast(nodeToExpr(arg), DateType)
    +
    +    /* Arithmetic */
    +    case Token("+", child :: Nil) => nodeToExpr(child)
    +    case Token("-", child :: Nil) => UnaryMinus(nodeToExpr(child))
    +    case Token("~", child :: Nil) => BitwiseNot(nodeToExpr(child))
    +    case Token("+", left :: right:: Nil) => Add(nodeToExpr(left), nodeToExpr(right))
    +    case Token("-", left :: right:: Nil) => Subtract(nodeToExpr(left), nodeToExpr(right))
    +    case Token("*", left :: right:: Nil) => Multiply(nodeToExpr(left), nodeToExpr(right))
    +    case Token("/", left :: right:: Nil) => Divide(nodeToExpr(left), nodeToExpr(right))
    +    case Token(DIV(), left :: right:: Nil) =>
    +      Cast(Divide(nodeToExpr(left), nodeToExpr(right)), LongType)
    +    case Token("%", left :: right:: Nil) => Remainder(nodeToExpr(left), nodeToExpr(right))
    +    case Token("&", left :: right:: Nil) => BitwiseAnd(nodeToExpr(left), nodeToExpr(right))
    +    case Token("|", left :: right:: Nil) => BitwiseOr(nodeToExpr(left), nodeToExpr(right))
    +    case Token("^", left :: right:: Nil) => BitwiseXor(nodeToExpr(left), nodeToExpr(right))
    +
    +    /* Comparisons */
    +    case Token("=", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("==", left :: right:: Nil) => EqualTo(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=>", left :: right:: Nil) => EqualNullSafe(nodeToExpr(left), nodeToExpr(right))
    +    case Token("!=", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token("<>", left :: right:: Nil) => Not(EqualTo(nodeToExpr(left), nodeToExpr(right)))
    +    case Token(">", left :: right:: Nil) => GreaterThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token(">=", left :: right:: Nil) => GreaterThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<", left :: right:: Nil) => LessThan(nodeToExpr(left), nodeToExpr(right))
    +    case Token("<=", left :: right:: Nil) => LessThanOrEqual(nodeToExpr(left), nodeToExpr(right))
    +    case Token(LIKE(), left :: right:: Nil) => Like(nodeToExpr(left), nodeToExpr(right))
    +    case Token(RLIKE(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token(REGEXP(), left :: right:: Nil) => RLike(nodeToExpr(left), nodeToExpr(right))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNOTNULL", Nil) :: child :: Nil) =>
    +      IsNotNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token("TOK_ISNULL", Nil) :: child :: Nil) =>
    +      IsNull(nodeToExpr(child))
    +    case Token("TOK_FUNCTION", Token(IN(), Nil) :: value :: list) =>
    +      In(nodeToExpr(value), list.map(nodeToExpr))
    +    case Token("TOK_FUNCTION",
    +    Token(BETWEEN(), Nil) ::
    +      kw ::
    +      target ::
    +      minValue ::
    +      maxValue :: Nil) =>
    +
    +      val targetExpression = nodeToExpr(target)
    +      val betweenExpr =
    +        And(
    +          GreaterThanOrEqual(targetExpression, nodeToExpr(minValue)),
    +          LessThanOrEqual(targetExpression, nodeToExpr(maxValue)))
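    +      // The kw token encodes negation: KW_TRUE corresponds to NOT BETWEEN,
    +      // KW_FALSE to a plain BETWEEN.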
    +      kw match {
    +        case Token("KW_FALSE", Nil) => betweenExpr
    +        case Token("KW_TRUE", Nil) => Not(betweenExpr)
    +      }
    +
    +    /* Boolean Logic */
    +    case Token(AND(), left :: right:: Nil) => And(nodeToExpr(left), nodeToExpr(right))
    +    case Token(OR(), left :: right:: Nil) => Or(nodeToExpr(left), nodeToExpr(right))
    +    case Token(NOT(), child :: Nil) => Not(nodeToExpr(child))
    +    case Token("!", child :: Nil) => Not(nodeToExpr(child))
    +
    +    /* Case statements */
    +    case Token("TOK_FUNCTION", Token(WHEN(), Nil) :: branches) =>
    +      CaseWhen(branches.map(nodeToExpr))
    +    case Token("TOK_FUNCTION", Token(CASE(), Nil) :: branches) =>
    +      val keyExpr = nodeToExpr(branches.head)
    +      CaseKeyWhen(keyExpr, branches.drop(1).map(nodeToExpr))
    +
    +    /* Complex datatype manipulation */
    +    case Token("[", child :: ordinal :: Nil) =>
    +      UnresolvedExtractValue(nodeToExpr(child), nodeToExpr(ordinal))
    +
    +    /* Window Functions */
    +    case Token(text, args :+ Token("TOK_WINDOWSPEC", spec)) =>
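    +      // Drop the trailing window spec and parse the remaining node as the function call.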
    +      val function = nodeToExpr(node.copy(children = node.children.init))
    +      nodesToWindowSpecification(spec) match {
    +        case reference: WindowSpecReference =>
    +          UnresolvedWindowExpression(function, reference)
    +        case definition: WindowSpecDefinition =>
    +          WindowExpression(function, definition)
    +      }
    +
    +    /* UDFs - Must be last, otherwise they will preempt built-in functions */
    +    case Token("TOK_FUNCTION", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = false)
    +    // Aggregate function with DISTINCT keyword.
    +    case Token("TOK_FUNCTIONDI", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, args.map(nodeToExpr), isDistinct = true)
    +    case Token("TOK_FUNCTIONSTAR", Token(name, Nil) :: args) =>
    +      UnresolvedFunction(name, UnresolvedStar(None) :: Nil, isDistinct = false)
    +
    +    /* Literals */
    +    case Token("TOK_NULL", Nil) => Literal.create(null, NullType)
    +    case Token(TRUE(), Nil) => Literal.create(true, BooleanType)
    +    case Token(FALSE(), Nil) => Literal.create(false, BooleanType)
    +    case Token("TOK_STRINGLITERALSEQUENCE", strings) =>
    +      Literal(strings.map(s => ParseUtils.unescapeSQLString(s.text)).mkString)
    +
    +    // This code is adapted from
    +    // /ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java#L223
    +    case ast: ASTNode if numericAstTypes contains ast.tokenType =>
    +      var v: Literal = null
    +      try {
    +        if (ast.text.endsWith("L")) {
    +          // Literal bigint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toLong, LongType)
    +        } else if (ast.text.endsWith("S")) {
    +          // Literal smallint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toShort, ShortType)
    +        } else if (ast.text.endsWith("Y")) {
    +          // Literal tinyint.
    +          v = Literal.create(ast.text.substring(0, ast.text.length() - 1).toByte, ByteType)
    +        } else if (ast.text.endsWith("BD") || ast.text.endsWith("D")) {
    +          // Literal decimal
    +          val strVal = ast.text.stripSuffix("D").stripSuffix("B")
    +          v = Literal(Decimal(strVal))
    +        } else {
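    +          // Try progressively narrower types; a failed conversion throws a
    +          // NumberFormatException (caught below), leaving v at the narrowest type
    +          // that succeeded so far.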
    +          v = Literal.create(ast.text.toDouble, DoubleType)
    +          v = Literal.create(ast.text.toLong, LongType)
    +          v = Literal.create(ast.text.toInt, IntegerType)
    +        }
    +      } catch {
    +        case nfe: NumberFormatException => // Do nothing
    +      }
    +
    +      if (v == null) {
    +        sys.error(s"Failed to parse number '${ast.text}'.")
    +      } else {
    +        v
    +      }
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.StringLiteral =>
    +      Literal(ParseUtils.unescapeSQLString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_DATELITERAL =>
    +      Literal(Date.valueOf(ast.text.substring(1, ast.text.length - 1)))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_CHARSETLITERAL =>
    +      Literal(ParseUtils.charSetString(ast.children.head.text, ast.children(1).text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromYearMonthString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_TIME_LITERAL =>
    +      Literal(CalendarInterval.fromDayTimeString(ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_YEAR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("year", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MONTH_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("month", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_DAY_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("day", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_HOUR_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("hour", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_MINUTE_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("minute", ast.text))
    +
    +    case ast: ASTNode if ast.tokenType == SparkSqlParser.TOK_INTERVAL_SECOND_LITERAL =>
    +      Literal(CalendarInterval.fromSingleUnitString("second", ast.text))
    +
    +    case _ =>
    +      noParseRule("Expression", node)
    +  }
    +
    +  /* Case insensitive matches for Window Specification */
    +  val PRECEDING = "(?i)preceding".r
    +  val FOLLOWING = "(?i)following".r
    +  val CURRENT = "(?i)current".r
    +  protected def nodesToWindowSpecification(nodes: Seq[ASTNode]): WindowSpec = nodes match {
    +    case Token(windowName, Nil) :: Nil =>
    +      // Refer to a window spec defined in the window clause.
    +      WindowSpecReference(windowName)
    +    case Nil =>
    +      // OVER()
    +      WindowSpecDefinition(
    +        partitionSpec = Nil,
    +        orderSpec = Nil,
    +        frameSpecification = UnspecifiedFrame)
    +    case spec =>
    +      val (partitionClause :: rowFrame :: rangeFrame :: Nil) =
    +        getClauses(
    +          Seq(
    +            "TOK_PARTITIONINGSPEC",
    +            "TOK_WINDOWRANGE",
    +            "TOK_WINDOWVALUES"),
    +          spec)
    +
    +      // Handle Partition By and Order By.
    +      val (partitionSpec, orderSpec) = partitionClause.map { partitionAndOrdering =>
    +        val (partitionByClause :: orderByClause :: sortByClause :: clusterByClause :: Nil) =
    +          getClauses(
    +            Seq("TOK_DISTRIBUTEBY", "TOK_ORDERBY", "TOK_SORTBY", "TOK_CLUSTERBY"),
    +            partitionAndOrdering.children)
    +
    +        (partitionByClause, orderByClause.orElse(sortByClause), clusterByClause) match {
    +          case (Some(partitionByExpr), Some(orderByExpr), None) =>
    +            (partitionByExpr.children.map(nodeToExpr),
    +              orderByExpr.children.map(nodeToSortOrder))
    +          case (Some(partitionByExpr), None, None) =>
    +            (partitionByExpr.children.map(nodeToExpr), Nil)
    +          case (None, Some(orderByExpr), None) =>
    +            (Nil, orderByExpr.children.map(nodeToSortOrder))
    +          case (None, None, Some(clusterByExpr)) =>
    +            val expressions = clusterByExpr.children.map(nodeToExpr)
    +            (expressions, expressions.map(SortOrder(_, Ascending)))
    +          case _ =>
    +            noParseRule("Partition & Ordering", partitionAndOrdering)
    +        }
    +      }.getOrElse {
    +        (Nil, Nil)
    +      }
    +
    +      // Handle Window Frame
    +      val windowFrame =
    +        if (rowFrame.isEmpty && rangeFrame.isEmpty) {
    +          UnspecifiedFrame
    +        } else {
    +          val frameType = rowFrame.map(_ => RowFrame).getOrElse(RangeFrame)
    +          def nodeToBoundary(node: ASTNode): FrameBoundary = node match {
    +            case Token(PRECEDING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedPreceding
    +              } else {
    +                ValuePreceding(count.toInt)
    +              }
    +            case Token(FOLLOWING(), Token(count, Nil) :: Nil) =>
    +              if (count.toLowerCase() == "unbounded") {
    +                UnboundedFollowing
    +              } else {
    +                ValueFollowing(count.toInt)
    +              }
    +            case Token(CURRENT(), Nil) => CurrentRow
    +            case _ =>
    +              noParseRule("Window Frame Boundary", node)
    +          }
    +
    +          rowFrame.orElse(rangeFrame).map { frame =>
    +            frame.children match {
    +              case precedingNode :: followingNode :: Nil =>
    +                SpecifiedWindowFrame(
    +                  frameType,
    +                  nodeToBoundary(precedingNode),
    +                  nodeToBoundary(followingNode))
    +              case precedingNode :: Nil =>
    +                SpecifiedWindowFrame(frameType, nodeToBoundary(precedingNode), CurrentRow)
    +              case _ =>
    +                noParseRule("Window Frame", frame)
    +            }
    +          }.getOrElse(sys.error("If you see this, please file a bug report with your query."))
    +        }
    +
    +      WindowSpecDefinition(partitionSpec, orderSpec, windowFrame)
    +  }
    +
    +  protected def nodeToTransformation(
    +      node: ASTNode,
    +      child: LogicalPlan): Option[ScriptTransformation] = None
    +
    +  protected def nodeToGenerate(node: ASTNode, outer: Boolean, child: LogicalPlan): Generate = {
    +    val Token("TOK_SELECT", Token("TOK_SELEXPR", clauses) :: Nil) = node
    +
    +    val alias = getClause("TOK_TABALIAS", clauses).children.head.text
    +
    +    val generator = clauses.head match {
    +      case Token("TOK_FUNCTION", Token(functionName, Nil) :: children) =>
    +        UnresolvedGenerator(functionName, children.map(nodeToExpr))
    --- End diff --
    
    Can we follow [HiveQl](https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala#L1823-L1838) here?
    
    Currently there are only two generators we need to support in LATERAL VIEW: `explode` and `json_tuple`, so we could hard-code them here instead of creating an `UnresolvedGenerator`; see the sketch below.
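
    For reference, a minimal sketch of that hard-coded dispatch, assuming the `Explode` and
    `JsonTuple` expressions in `org.apache.spark.sql.catalyst.expressions` (the shapes are
    approximate, not the actual HiveQl code):
    
        import org.apache.spark.sql.catalyst.expressions.{Explode, JsonTuple}
    
        /* Case insensitive matches for the two generators supported in LATERAL VIEW. */
        val explode = "(?i)explode".r
        val jsonTuple = "(?i)json_tuple".r
    
        val generator = clauses.head match {
          case Token("TOK_FUNCTION", Token(explode(), Nil) :: child :: Nil) =>
            // EXPLODE takes exactly one argument.
            Explode(nodeToExpr(child))
          case Token("TOK_FUNCTION", Token(jsonTuple(), Nil) :: children) =>
            // JSON_TUPLE takes the JSON column followed by one or more field names.
            JsonTuple(children.map(nodeToExpr))
          case other =>
            noParseRule("Generator", other)
        }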




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169007994
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/48766/
    Test FAILed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-169012694
  
    **[Test build #48771 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/48771/consoleFull)** for PR 10583 at commit [`43c29b7`](https://github.com/apache/spark/commit/43c29b7a2ba3598e50561a974dba1d763e90746c).




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48951594
  
    --- Diff: dev/deps/spark-deps-hadoop-2.2 ---
    @@ -5,8 +5,7 @@ activation-1.1.jar
     akka-actor_2.10-2.3.11.jar
     akka-remote_2.10-2.3.11.jar
     akka-slf4j_2.10-2.3.11.jar
    -antlr-2.7.7.jar
    -antlr-runtime-3.4.jar
    +antlr-runtime-3.5.2.jar
    --- End diff --
    
    I think we need to update LICENSE and NOTICE because of this.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10583#issuecomment-168965050
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12573][SPARK-12574][SQL] Move SQL Parse...

Posted by hvanhovell <gi...@git.apache.org>.
Github user hvanhovell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/10583#discussion_r48805244
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/parser/ParseUtils.java ---
    @@ -0,0 +1,163 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.parser;
    --- End diff --
    
    Yeah (reynold just suggested the same thing), I'll add it to catalyst.parser.
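
    For illustration, the relocated file would then presumably start with (hypothetical sketch, per the package name above):
    
        package org.apache.spark.sql.catalyst.parser;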

