You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2015/06/30 02:08:04 UTC

[jira] [Commented] (HIVE-11141) Improve RuleRegExp when the Expression node stack gets huge

    [ https://issues.apache.org/jira/browse/HIVE-11141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606699#comment-14606699 ] 

Hive QA commented on HIVE-11141:
--------------------------------



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12742590/HIVE-11141.2.patch

{color:red}ERROR:{color} -1 due to 116 failed/errored test(s), 9034 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join28
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_all_non_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_orig_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_delete_whole_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby2_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby8_map_skew
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_rollup1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_update_delete
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join40
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_partition_metadataonly
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_filter_on_outerjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multi_insert_gby3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonblock_op_deduplicate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partInit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_date2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ptf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reduce_deduplicate_extended
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_reducesink_dedup
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union27
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_all_non_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_all_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_update_two_cols
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mr_diff_schema_alias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_12
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket_map_join_tez1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_orig_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_delete_whole_partition
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_groupby2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_insert_update_delete
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ptf
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union_multiinsert
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_unionDistinct_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_all_non_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_all_partitioned
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_update_two_cols
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join_nulls
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mr_diff_schema_alias
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_limit
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_groupby2
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_reduce_deduplicate
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join28
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_nulls
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cross_product_check_2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby8_map_skew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_cube1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby_rollup1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join20
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join40
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_partition_metadataonly
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_mapjoin_filter_on_outerjoin
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_multi_insert_gby3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_ptf
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_sample8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union27
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hadoop.hive.ql.TestMTQueries.testMTQueries1
org.apache.hadoop.hive.ql.TestTxnCommands2.testDeleteIn
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4434/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4434/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4434/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 116 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12742590 - PreCommit-HIVE-TRUNK-Build

> Improve RuleRegExp when the Expression node stack gets huge
> -----------------------------------------------------------
>
>                 Key: HIVE-11141
>                 URL: https://issues.apache.org/jira/browse/HIVE-11141
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-11141.1.patch, HIVE-11141.2.patch, SQLQuery10.sql.mssql, createtable.rtf
>
>
> Hive occasionally gets bottlenecked on generating plans for large queries, the majority of the cases time is spent in fetching metadata, partitions and other optimizer transformation related rules
> I have attached the query for the test case which needs to be tested after we setup database as shown below.
> {code}
> create database dataset_3;
> use database dataset_3;
> {code}
> createtable.rtf - create table command
> SQLQuery10.sql.mssql - explain query
> It seems that the most problematic part of the code as the stack gets arbitrary long, in RuleRegExp.java
> {code}
>   @Override
>   public int cost(Stack<Node> stack) throws SemanticException {
>     int numElems = (stack != null ? stack.size() : 0);
>     String name = "";
>     for (int pos = numElems - 1; pos >= 0; pos--) {
>       name = stack.get(pos).getName() + "%" + name;
>       Matcher m = pattern.matcher(name);
>       if (m.matches()) {
>         return m.group().length();
>       }
>     }
>     return -1;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)