You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Gaurav Tiwari (JIRA)" <ji...@apache.org> on 2016/03/23 18:56:25 UTC

[jira] [Created] (SPARK-14097) Spark SQL Optimization is not consistent

Gaurav Tiwari created SPARK-14097:
-------------------------------------

             Summary: Spark SQL Optimization is not consistent
                 Key: SPARK-14097
                 URL: https://issues.apache.org/jira/browse/SPARK-14097
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.6.1, 1.5.2
            Reporter: Gaurav Tiwari
            Priority: Minor


I am trying to execute a simple query with join on 3 tables. When I look at the execution plan , it varies with position of table in the "from" clause of the query. Execution plan looks more optimized when the position of table with predicates is specified before any other table. 

Original query : 

select distinct pge.portfolio_code 
from table1 pge join table2 p 
on p.perm_group = pge.anc_port_group 
join table3 uge 
on p.user_group=uge.anc_user_group 
where uge.user_name = 'user' and p.perm_type = 'TEST' 

Execution Plan for original query:

== Physical Plan == 
TungstenAggregate(key=[portfolio_code#14119], functions=[], output=[portfolio_code#14119]) 
 TungstenExchange hashpartitioning(portfolio_code#14119) 
  TungstenAggregate(key=[portfolio_code#14119], functions=[], output=[portfolio_code#14119]) 
   TungstenProject [portfolio_code#14119] 
    BroadcastHashJoin [user_group#13665], [anc_user_group#13658], BuildRight 
     TungstenProject [portfolio_code#14119,user_group#13665] 
      BroadcastHashJoin [anc_port_group#14117], [perm_group#13667], BuildRight 
       ConvertToUnsafe 
        Scan ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table1.parquet][portfolio_code#14119,anc_port_group#14117] 
       ConvertToUnsafe 
        Project [user_group#13665,perm_group#13667] 
         Filter (perm_type#13666 = TEST) 
          Scan ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table2.parquet][user_group#13665,perm_group#13667,perm_type#13666] 
     ConvertToUnsafe 
      Project [anc_user_group#13658] 
       Filter (user_name#13659 = user) 
        Scan ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table3.parquet][anc_user_group#13658,user_name#13659] 

Optimized query (table with predicates is moved ahead): 
select distinct pge.portfolio_code 
from table1 uge, table2 p, table3 pge 
where uge.user_name = 'user' and p.perm_type = 'TEST' 
and p.perm_group = pge.anc_port_group 
and p.user_group=uge.anc_user_group 

Execution Plan for  optimized query:
== Physical Plan == 
TungstenAggregate(key=[portfolio_code#14119], functions=[], output=[portfolio_code#14119]) 
 TungstenExchange hashpartitioning(portfolio_code#14119) 
  TungstenAggregate(key=[portfolio_code#14119], functions=[], output=[portfolio_code#14119]) 
   TungstenProject [portfolio_code#14119] 
    BroadcastHashJoin [perm_group#13667], [anc_port_group#14117], BuildRight 
     TungstenProject [perm_group#13667] 
      BroadcastHashJoin [anc_user_group#13658], [user_group#13665], BuildRight 
       ConvertToUnsafe 
        Project [anc_user_group#13658] 
         Filter (user_name#13659 = user) 
          Scan ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table3.parquet][anc_user_group#13658,user_name#13659] 
       ConvertToUnsafe 
        Project [perm_group#13667,user_group#13665] 
         Filter (perm_type#13666 = TEST) 
          Scan ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table2.parquet][perm_group#13667,user_group#13665,perm_type#13666] 
     ConvertToUnsafe 
      Scan ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table1.parquet][portfolio_code#14119,anc_port_group#14117] 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org