Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/04/11 23:07:25 UTC
[jira] [Commented] (SPARK-4226) SparkSQL - Add support for subqueries in predicates
[ https://issues.apache.org/jira/browse/SPARK-4226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235976#comment-15235976 ]
Apache Spark commented on SPARK-4226:
-------------------------------------
User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/12306
> SparkSQL - Add support for subqueries in predicates
> ---------------------------------------------------
>
> Key: SPARK-4226
> URL: https://issues.apache.org/jira/browse/SPARK-4226
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.2.0
> Environment: Spark 1.2 snapshot
> Reporter: Terry Siu
>
> I have a test table defined in Hive as follows:
> {code:sql}
> CREATE TABLE sparkbug (
>   id INT,
>   event STRING
> ) STORED AS PARQUET;
> {code}
> and insert some sample data with ids 1, 2, 3.
> In a Spark shell, I then create a HiveContext and then execute the following HQL to test out subquery predicates:
> {code}
> val hc = new HiveContext(sc)
> hc.hql("select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))")
> {code}
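For comparison, here is the same repro expressed against SQLite, which does support IN-subquery predicates; this is purely an illustrative sketch of the expected result, not Spark code. The column is named `id` to match the DDL above (the reported query uses `customerid`, apparently carried over from another table).

```python
import sqlite3

# Build the sample table from the report: ids 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sparkbug (id INTEGER, event TEXT)")
conn.executemany("INSERT INTO sparkbug VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])

# The nested IN-subquery predicate that SparkSQL rejects.
rows = conn.execute(
    "SELECT id FROM sparkbug "
    "WHERE id IN (SELECT id FROM sparkbug WHERE id IN (2, 3))"
).fetchall()
print(sorted(r[0] for r in rows))  # [2, 3]
```

A conforming SQL engine returns rows 2 and 3 here; SparkSQL 1.2 instead fails at parse time, as shown below.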
> I get the following error:
> {noformat}
> java.lang.RuntimeException: Unsupported language features in query: select customerid from sparkbug where customerid in (select customerid from sparkbug where customerid in (2,3))
> TOK_QUERY
>   TOK_FROM
>     TOK_TABREF
>       TOK_TABNAME
>         sparkbug
>   TOK_INSERT
>     TOK_DESTINATION
>       TOK_DIR
>         TOK_TMP_FILE
>     TOK_SELECT
>       TOK_SELEXPR
>         TOK_TABLE_OR_COL
>           customerid
>     TOK_WHERE
>       TOK_SUBQUERY_EXPR
>         TOK_SUBQUERY_OP
>           in
>         TOK_QUERY
>           TOK_FROM
>             TOK_TABREF
>               TOK_TABNAME
>                 sparkbug
>           TOK_INSERT
>             TOK_DESTINATION
>               TOK_DIR
>                 TOK_TMP_FILE
>             TOK_SELECT
>               TOK_SELEXPR
>                 TOK_TABLE_OR_COL
>                   customerid
>             TOK_WHERE
>               TOK_FUNCTION
>                 in
>                 TOK_TABLE_OR_COL
>                   customerid
>                 2
>                 3
>         TOK_TABLE_OR_COL
>           customerid
> scala.NotImplementedError: No parse rules for ASTNode type: 817, text: TOK_SUBQUERY_EXPR :
> TOK_SUBQUERY_EXPR
>   TOK_SUBQUERY_OP
>     in
>   TOK_QUERY
>     TOK_FROM
>       TOK_TABREF
>         TOK_TABNAME
>           sparkbug
>     TOK_INSERT
>       TOK_DESTINATION
>         TOK_DIR
>           TOK_TMP_FILE
>       TOK_SELECT
>         TOK_SELEXPR
>           TOK_TABLE_OR_COL
>             customerid
>       TOK_WHERE
>         TOK_FUNCTION
>           in
>           TOK_TABLE_OR_COL
>             customerid
>           2
>           3
>   TOK_TABLE_OR_COL
>     customerid
> " +
>
> org.apache.spark.sql.hive.HiveQl$.nodeToExpr(HiveQl.scala:1098)
>
> at scala.sys.package$.error(package.scala:27)
> at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:252)
> at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:50)
> at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:49)
> at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
> {noformat}
> [This thread|http://apache-spark-user-list.1001560.n3.nabble.com/Subquery-in-having-clause-Spark-1-1-0-td17401.html] also brings up the lack of subquery support in SparkSQL. It would be nice to have subquery predicate support in a near-future release (1.3, maybe?).
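Until such support lands, the usual workaround is to rewrite the IN-subquery predicate as a join (in HiveQL, a LEFT SEMI JOIN, which Spark's Hive dialect does parse). A minimal sketch of the equivalence, again using SQLite only to demonstrate that the two queries return the same rows:

```python
import sqlite3

# Same sample table as in the report: ids 1, 2, 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sparkbug (id INTEGER, event TEXT)")
conn.executemany("INSERT INTO sparkbug VALUES (?, ?)",
                 [(1, "a"), (2, "b"), (3, "c")])

# Join rewrite of:
#   SELECT id FROM sparkbug
#   WHERE id IN (SELECT id FROM sparkbug WHERE id IN (2, 3))
# DISTINCT stands in for the deduplication a semi join gives for free.
rows = conn.execute(
    "SELECT DISTINCT s.id "
    "FROM sparkbug AS s "
    "JOIN (SELECT id FROM sparkbug WHERE id IN (2, 3)) AS q "
    "  ON s.id = q.id"
).fetchall()
print(sorted(r[0] for r in rows))  # [2, 3]
```

The join form sidesteps TOK_SUBQUERY_EXPR entirely, which is why it worked on 1.2 while the predicate form did not.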
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org