Posted to issues@spark.apache.org by "Suresh Thalamati (JIRA)" <ji...@apache.org> on 2016/03/15 23:53:33 UTC
[jira] [Commented] (SPARK-13820) TPC-DS Query 10 fails to compile

    [ https://issues.apache.org/jira/browse/SPARK-13820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196419#comment-15196419 ] 

Suresh Thalamati commented on SPARK-13820:
------------------------------------------

This query contains a correlated subquery, which is not yet supported in Spark SQL.

[~davies] I saw your PR https://github.com/apache/spark/pull/10706 for this kind of query. Are you planning to merge it for 2.0?
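
Until correlated subqueries are supported, one possible workaround is to rewrite the EXISTS predicate as a semi-join, the same way the query already handles store_sales through the ss_wh1 LEFT SEMI JOIN. The sketch below is only an illustration of why the two forms return the same customers. It runs on SQLite rather than Spark, uses made-up toy rows, omits the date_dim filters, and emulates the semi-join with an inner join on DISTINCT keys because SQLite has no LEFT SEMI JOIN. None of this has been checked against Spark 1.6.1.

```python
import sqlite3

# Toy stand-ins for three TPC-DS tables; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (c_customer_sk INTEGER);
    CREATE TABLE web_sales (ws_bill_customer_sk INTEGER);
    CREATE TABLE catalog_sales (cs_ship_customer_sk INTEGER);
    INSERT INTO customer VALUES (1), (2), (3), (4);
    INSERT INTO web_sales VALUES (1), (1), (3);
    INSERT INTO catalog_sales VALUES (3), (5);
""")

# Same shape as the `tmp` derived table in Query 10 (date_dim filters omitted).
TMP = """
    SELECT ws_bill_customer_sk AS customer_sk FROM web_sales
    UNION ALL
    SELECT cs_ship_customer_sk AS customer_sk FROM catalog_sales
"""

# Form 1: correlated EXISTS, the form reported as failing in this issue.
exists_sql = f"""
    SELECT c.c_customer_sk
    FROM customer c
    WHERE EXISTS (SELECT tmp.customer_sk
                  FROM ({TMP}) tmp
                  WHERE c.c_customer_sk = tmp.customer_sk)
    ORDER BY c.c_customer_sk
"""

# Form 2: decorrelated semi-join. DISTINCT keeps each customer at most once,
# which is what a semi-join guarantees.
semi_join_sql = f"""
    SELECT c.c_customer_sk
    FROM customer c
    JOIN (SELECT DISTINCT customer_sk FROM ({TMP}) u) tmp
      ON c.c_customer_sk = tmp.customer_sk
    ORDER BY c.c_customer_sk
"""

exists_rows = conn.execute(exists_sql).fetchall()
semi_join_rows = conn.execute(semi_join_sql).fetchall()
print(exists_rows, semi_join_rows)
```

In the HiveQL dialect used by the query, the second form would presumably be written as LEFT SEMI JOIN (...) tmp ON c.c_customer_sk = tmp.customer_sk in the FROM clause, with the EXISTS predicate dropped from the WHERE clause.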

> TPC-DS Query 10 fails to compile
> --------------------------------
>
>                 Key: SPARK-13820
>                 URL: https://issues.apache.org/jira/browse/SPARK-13820
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.1
>         Environment: Red Hat Enterprise Linux Server release 7.1 (Maipo)
> Linux bigaperf116.svl.ibm.com 3.10.0-229.el7.x86_64 #1 SMP Thu Jan 29 18:37:38 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Roy Cecil
>
> TPC-DS Query 10 fails to compile with the following error.
> Parsing error: KW_SELECT )=> ( KW_EXISTS subQueryExpression ) -> ^( TOK_SUBQUERY_EXPR ^( TOK_SUBQUERY_OP KW_EXISTS ) subQueryExpression ) );])
>         at org.antlr.runtime.DFA.noViableAlt(DFA.java:158)
>         at org.antlr.runtime.DFA.predict(DFA.java:144)
>         at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpression(HiveParser_IdentifiersParser.java:8155)
>         at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceNotExpression(HiveParser_IdentifiersParser.java:9177)
> The query is pasted here for easy reproduction:
>  select
>   cd_gender,
>   cd_marital_status,
>   cd_education_status,
>   count(*) cnt1,
>   cd_purchase_estimate,
>   count(*) cnt2,
>   cd_credit_rating,
>   count(*) cnt3,
>   cd_dep_count,
>   count(*) cnt4,
>   cd_dep_employed_count,
>   count(*) cnt5,
>   cd_dep_college_count,
>   count(*) cnt6
>  from
>   customer c
>   JOIN customer_address ca ON c.c_current_addr_sk = ca.ca_address_sk
>   JOIN customer_demographics ON cd_demo_sk = c.c_current_cdemo_sk
>   LEFT SEMI JOIN (select ss_customer_sk
>                   from store_sales
>                        JOIN date_dim ON ss_sold_date_sk = d_date_sk
>                   where
>                         d_year = 2002 and
>                         d_moy between 1 and 1+3) ss_wh1 ON c.c_customer_sk = ss_wh1.ss_customer_sk
>  where
>   ca_county in ('Rush County','Toole County','Jefferson County','Dona Ana County','La Porte County') and
>    exists (
>             select tmp.customer_sk from (
>             select ws_bill_customer_sk as customer_sk
>             from web_sales,date_dim
>             where
>                   web_sales.ws_sold_date_sk = date_dim.d_date_sk and
>                   d_year = 2002 and
>                   d_moy between 1 and 1+3
>             UNION ALL
>             select cs_ship_customer_sk as customer_sk
>             from catalog_sales,date_dim
>             where
>                   catalog_sales.cs_sold_date_sk = date_dim.d_date_sk and
>                   d_year = 2002 and
>                   d_moy between 1 and 1+3
>           ) tmp where c.c_customer_sk = tmp.customer_sk
>     )
>  group by cd_gender,
>           cd_marital_status,
>           cd_education_status,
>           cd_purchase_estimate,
>           cd_credit_rating,
>           cd_dep_count,
>           cd_dep_employed_count,
>           cd_dep_college_count
>  order by cd_gender,
>           cd_marital_status,
>           cd_education_status,
>           cd_purchase_estimate,
>           cd_credit_rating,
>           cd_dep_count,
>           cd_dep_employed_count,
>           cd_dep_college_count
>   limit 100;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
