Posted to dev@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2014/04/23 14:28:20 UTC

[jira] [Commented] (HIVE-6955) ExprNodeColDesc isSame doesn't account for tabAlias: this affects trait Propagation in Joins

    [ https://issues.apache.org/jira/browse/HIVE-6955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978133#comment-13978133 ] 

Hive QA commented on HIVE-6955:
-------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12641301/HIVE-6955.1.patch

{color:red}ERROR:{color} -1 due to 41 failed/errored test(s), 5417 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join32
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nullgroup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_select_dummy_source
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_partscan_1_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_symlink_text_input_format
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_current_database
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_23
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_9
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_dynamic_partitions_with_whitelist
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_partialscan_autogether
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/11/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/11/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 41 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12641301

> ExprNodeColDesc isSame doesn't account for tabAlias: this affects trait Propagation in Joins
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6955
>                 URL: https://issues.apache.org/jira/browse/HIVE-6955
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Harish Butani
>            Assignee: Harish Butani
>         Attachments: HIVE-6955.1.patch
>
>
> For tpcds Q15:
> {code}
> explain
> select ca_zip, sum(cs_sales_price)
> from catalog_sales, customer, customer_address, date_dim
> where catalog_sales.cs_bill_customer_sk = customer.c_customer_sk
>   and customer.c_current_addr_sk = customer_address.ca_address_sk
>   and (substr(ca_zip,1,5) in ('85669', '86197','88274','83405','86475',
>                               '85392', '85460', '80348', '81792')
>        or ca_state in ('CA','WA','GA')
>        or cs_sales_price > 500)
>   and catalog_sales.cs_sold_date_sk = date_dim.d_date_sk
>   and d_qoy = 2 and d_year = 2001
> group by ca_zip
> order by ca_zip
> limit 100;
> {code}
> The traits set up for the operators are:
> {code}
> FIL[23]: bucketCols=[[]],numBuckets=-1
> RS[11]: bucketCols=[[VALUE._col0]],numBuckets=-1
> JOIN[12]: bucketCols=[[_col71], [_col71]],numBuckets=-1
> FIL[13]: bucketCols=[[_col71], [_col71]],numBuckets=-1
> SEL[14]: bucketCols=[[_col71], [_col71]],numBuckets=-1
> GBY[15]: bucketCols=[[_col0]],numBuckets=-1
> RS[16]: bucketCols=[[KEY._col0]],numBuckets=-1
> GBY[17]: bucketCols=[[_col0]],numBuckets=-1
> SEL[18]: bucketCols=[[_col0]],numBuckets=-1
> LIM[21]: bucketCols=[[_col0]],numBuckets=-1
> FS[22]: bucketCols=[[_col0]],numBuckets=-1
> TS[3]: bucketCols=[[]],numBuckets=-1
> RS[5]: bucketCols=[[VALUE._col0]],numBuckets=-1
> JOIN[6]: bucketCols=[[_col3], [_col36]],numBuckets=-1
> RS[7]: bucketCols=[[VALUE._col40]],numBuckets=-1
> JOIN[9]: bucketCols=[[_col40], [_col0]],numBuckets=-1
> RS[10]: bucketCols=[[VALUE._col0]],numBuckets=-1
> TS[1]: bucketCols=[[]],numBuckets=-1
> RS[8]: bucketCols=[[VALUE._col0]],numBuckets=-1
> TS[0]: bucketCols=[[]],numBuckets=-1
> RS[4]: bucketCols=[[VALUE._col3]],numBuckets=-1
> {code}
> This is incorrect:
> JOIN[9] joins ca (customer_address) with the result of cs (catalog_sales) join cust (customer). In this case both sides of the join have a '_col0' column. The reverse mapping in trait propagation relies on ExprNodeColumnDesc.isSame; because isSame doesn't account for the tabAlias, JOIN[9] ends up being bucketed on cs_sold_date_sk. JOIN[12] has the same issue, which compounds the error.
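
To make the collision described above concrete, here is a minimal sketch of a name-only comparison next to an alias-aware one. It is not Hive's code: ColRef, isSameIgnoringAlias and isSameWithAlias are hypothetical stand-ins, the aliases "ca" and "cs" are chosen only for illustration, and the alias-aware variant is an assumption about the direction of a fix rather than the content of HIVE-6955.1.patch.

```java
import java.util.Objects;

// Hypothetical, simplified stand-in for a column reference.
// This is NOT Hive's ExprNodeColumnDesc; it only models two fields.
final class ColRef {
    final String tabAlias; // table alias the column comes from (illustrative)
    final String column;   // internal column name such as "_col0"

    ColRef(String tabAlias, String column) {
        this.tabAlias = tabAlias;
        this.column = column;
    }

    // Name-only comparison: models the behaviour the report describes,
    // where the table alias plays no part in the equality check.
    boolean isSameIgnoringAlias(ColRef other) {
        return other != null && column.equals(other.column);
    }

    // Alias-aware comparison: an assumed fix direction, not the actual patch.
    boolean isSameWithAlias(ColRef other) {
        return other != null
            && column.equals(other.column)
            && Objects.equals(tabAlias, other.tabAlias);
    }
}

class IsSameDemo {
    public static void main(String[] args) {
        // Both join inputs expose a column named "_col0".
        ColRef left = new ColRef("ca", "_col0");
        ColRef right = new ColRef("cs", "_col0");

        // Name-only check treats the two distinct columns as the same.
        System.out.println(left.isSameIgnoringAlias(right)); // prints true

        // Alias-aware check keeps them apart.
        System.out.println(left.isSameWithAlias(right));     // prints false
    }
}
```

With a name-only check, a reverse lookup for '_col0' can resolve to the column from the wrong join input, which is consistent with the wrong bucketing column reported for JOIN[9].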



--
This message was sent by Atlassian JIRA
(v6.2#6252)