You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2017/06/02 17:15:04 UTC

[jira] [Commented] (HIVE-16811) Estimate statistics in absence of stats

    [ https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035066#comment-16035066 ] 

Hive QA commented on HIVE-16811:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12870882/HIVE-16811.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 134 failed/errored test(s), 10814 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[materialized_view_create_rewrite] (batchId=237)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_part] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_reordering_values] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats2] (batchId=82)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join_stats] (batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_12] (batchId=31)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[correlationoptimizer5] (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[explain_rearrange] (batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_join_breaktask] (batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join19] (batchId=29)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join42] (batchId=23)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join43] (batchId=5)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_cond_pushdown_unqual1] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_cond_pushdown_unqual2] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_cond_pushdown_unqual3] (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_cond_pushdown_unqual4] (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_hive_626] (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join_star] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[merge_join_1] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mergejoin] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[mergejoins_mixed] (batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_outer_join5] (batchId=41)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_repeated_alias] (batchId=10)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_only_null] (batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_partial_size] (batchId=17)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[stats_ppr_all] (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_outer_join6] (batchId=39)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_context] (batchId=31)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[dynamic_partition_pruning_2] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[global_limit] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[reduce_deduplicate] (batchId=142)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=141)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_merge_2_orc] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[alter_merge_orc] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_smb_mapjoin_14] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_11] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_16] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[auto_sortmerge_join_9] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketmapjoin1] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketpruning1] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_7] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_stats] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[columnStatsUpdateForStatsOptimizer_1] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[column_access_stats] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[column_table_stats] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_all_partitioned] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_where_partitioned] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[delete_whole_partition] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynamic_partition_pruning] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[filter_join_breaktask] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[infer_bucket_sort_bucketed_table] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_into_with_schema] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_dynamic_partitioned] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_partitioned] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_partitioned] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadata_only_queries] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[orc_create] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[partition_multilevels] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[partition_shared_scan] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppd_union_view] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[ppr_pushdown] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sample10] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acid_part_update] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_part] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_acidvec_part_update] (batchId=146)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_complex] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_nonvec_part_all_primitive] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_complex] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_primitive] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part] (batchId=147)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_complex] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_nonvec_part_all_primitive] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_complex] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_primitive] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_complex] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_18] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[smb_mapjoin_19] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[special_character_in_tabnames_1] (batchId=153)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[stats_only_null] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[table_access_keys_stats] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_dml] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_schema_evolution] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_empty] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_group_by] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_all_partitioned] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[update_where_partitioned] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_auto_smb_mapjoin_14] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_count_distinct] (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partition_diff_num_cols] (batchId=155)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_partitioned_date_time] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_context] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vectorized_dynamic_partition_pruning] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=98)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=98)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vector_join_part_col_char] (batchId=98)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vector_non_string_partition] (batchId=98)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join_stats2] (batchId=137)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_join_stats] (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_smb_mapjoin_14] (batchId=125)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[auto_sortmerge_join_9] (batchId=128)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[bucket_map_join_tez1] (batchId=136)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[cbo_stats] (batchId=106)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[column_access_stats] (batchId=124)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[filter_join_breaktask] (batchId=132)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join19] (batchId=113)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_cond_pushdown_unqual1] (batchId=119)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_cond_pushdown_unqual2] (batchId=107)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_cond_pushdown_unqual3] (batchId=120)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_cond_pushdown_unqual4] (batchId=101)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_hive_626] (batchId=135)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[join_star] (batchId=113)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[mergejoins_mixed] (batchId=130)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=115)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_outer_join5] (batchId=118)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[stats_only_null] (batchId=112)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[table_access_keys_stats] (batchId=130)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5519/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5519/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5519/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 134 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12870882 - PreCommit-HIVE-Build

> Estimate statistics in absence of stats
> ---------------------------------------
>
>                 Key: HIVE-16811
>                 URL: https://issues.apache.org/jira/browse/HIVE-16811
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>         Attachments: HIVE-16811.1.patch
>
>
> Currently Join ordering completely bails out in absence of statistics and this could lead to bad joins such as cross joins.
> e.g. following select query will produce cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, S_NATIONKEY INT, 
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING)
> CREATE TABLE lineitem (L_ORDERKEY      INT,
>                                 L_PARTKEY       INT,
>                                 L_SUPPKEY       INT,
>                                 L_LINENUMBER    INT,
>                                 L_QUANTITY      DOUBLE,
>                                 L_EXTENDEDPRICE DOUBLE,
>                                 L_DISCOUNT      DOUBLE,
>                                 L_TAX           DOUBLE,
>                                 L_RETURNFLAG    STRING,
>                                 L_LINESTATUS    STRING,
>                                 l_shipdate      STRING,
>                                 L_COMMITDATE    STRING,
>                                 L_RECEIPTDATE   STRING,
>                                 L_SHIPINSTRUCT  STRING,
>                                 L_SHIPMODE      STRING,
>                                 L_COMMENT       STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
>     p_partkey INT,
>     p_name STRING,
>     p_mfgr STRING,
>     p_brand STRING,
>     p_type STRING,
>     p_size INT,
>     p_container STRING,
>     p_retailprice DOUBLE,
>     p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will prevent join ordering algorithm to bail out and come up with join at least better than cross join 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)