Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2017/02/16 07:03:41 UTC
[jira] [Commented] (HIVE-15933) Improve plans for correlated subquery with join and predicate

    [ https://issues.apache.org/jira/browse/HIVE-15933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869371#comment-15869371 ] 

Hive QA commented on HIVE-15933:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12852935/HIVE-15933.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 10229 tests executed
*Failed tests:*
{noformat}
TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=235)
TestSSL - did not produce a TEST-*.xml file (likely timed out) (batchId=214)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_auto_join1] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join31] (batchId=81)
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_join_with_different_encryption_keys] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[multiMapJoin2] (batchId=152)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=223)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=223)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket2] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[groupby8_map_skew] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[identity_project_remove_skip] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join31] (batchId=133)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[mapjoin_subquery] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[nullgroup2] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[smb_mapjoin_1] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[stats5] (batchId=116)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[union_remove_8] (batchId=116)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/3586/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/3586/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-3586/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12852935 - PreCommit-HIVE-Build

> Improve plans for correlated subquery with join and predicate
> -------------------------------------------------------------
>
>                 Key: HIVE-15933
>                 URL: https://issues.apache.org/jira/browse/HIVE-15933
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Planning
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>         Attachments: HIVE-15933.1.patch
>
>
> This is a continuation of HIVE-15905, for queries such as:
> {code:SQL}
> explain select  
>   cd_gender,
>   cd_marital_status,
>   cd_education_status,
>   count(*) cnt1,
>   cd_purchase_estimate,
>   count(*) cnt2,
>   cd_credit_rating,
>   count(*) cnt3,
>   cd_dep_count,
>   count(*) cnt4,
>   cd_dep_employed_count,
>   count(*) cnt5,
>   cd_dep_college_count,
>   count(*) cnt6
>  from
>   customer c,customer_address ca,customer_demographics
>  where
>   c.c_current_addr_sk = ca.ca_address_sk and
>   ca_county in ('Walker County','Richland County','Gaines County','Douglas County','Dona Ana County') and
>   cd_demo_sk = c.c_current_cdemo_sk and 
>   exists (select *
>           from store_sales,date_dim
>           where c.c_customer_sk = ss_customer_sk and
>                 ss_sold_date_sk = d_date_sk and
>                 d_year = 2002 and
>                 d_moy between 4 and 4+3)
>  group by cd_gender,
>           cd_marital_status,
>           cd_education_status,
>           cd_purchase_estimate,
>           cd_credit_rating,
>           cd_dep_count,
>           cd_dep_employed_count,
>           cd_dep_college_count
>  order by cd_gender,
>           cd_marital_status,
>           cd_education_status,
>           cd_purchase_estimate,
>           cd_credit_rating,
>           cd_dep_count,
>           cd_dep_employed_count,
>           cd_dep_college_count
> limit 100;
> {code}
> Hive generates unnecessary joins to produce values for the correlated columns.
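
As a rough illustration only (this is not taken from HIVE-15933.1.patch): the EXISTS predicate in the quoted query is correlated solely through the equality c.c_customer_sk = ss_customer_sk, so it is logically equivalent to a semi-join of customer against the store_sales/date_dim subquery. The sketch below is reduced to the customer table and uses Hive's LEFT SEMI JOIN syntax; the alias s and the reduced select list are assumptions introduced for the example.

```sql
-- Hypothetical, simplified rewrite of the EXISTS predicate as a semi-join.
-- Only the correlation c.c_customer_sk = ss_customer_sk comes from the
-- original query; the alias "s" and the narrowed select list are illustrative.
select c.c_customer_sk
from customer c
left semi join (
  -- Uncorrelated inner query: sales in the requested date window.
  select ss_customer_sk
  from store_sales
  join date_dim on ss_sold_date_sk = d_date_sk
  where d_year = 2002
    and d_moy between 4 and 4+3
) s
on c.c_customer_sk = s.ss_customer_sk;
```

In this form the inner query needs no extra join back to the outer tables to obtain values for the correlated column, which appears to be the kind of plan the issue is asking for.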



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)