Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/15 23:05:00 UTC

[jira] [Commented] (DRILL-6374) Transitive Closure leads to TPCH Queries regressions and OOM when run concurrency test

    [ https://issues.apache.org/jira/browse/DRILL-6374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476587#comment-16476587 ] 

ASF GitHub Bot commented on DRILL-6374:
---------------------------------------

vdiravka opened a new pull request #1262: DRILL-6374: Transitive Closure leads to TPCH Queries regressions and OOM when run concurrency test
URL: https://github.com/apache/drill/pull/1262
 
 
   **The issue**: Using DRILL_FILTER_ON_JOIN in an early planning stage makes it impossible to remove redundant Projects in the main LOGICAL stage.
   
   **Solution**: The main idea is to apply the TRANSITIVE_CLOSURE rules after the main LOGICAL stage of rules.
   
   * A new STRICT_EQUAL_IS_DISTINCT_FROM predicate for FilterJoinRules is added to pull up redundant filters from the Join condition into the filter above it (similar to DrillJoinRule). Redundant conditions in Joins can lead to further errors in planning.
   * The LogicalOptimizerRules are now applied in the PARTITION_PRUNING stage instead of the LOGICAL stage.
   The LOGICAL stage shouldn't involve pruning:
   https://github.com/vdiravka/drill/blob/87beac7e6f64ce5f6dedfa2b22cc5c9099caeaec/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java#L166
   Hive logical optimizer rules involve pruning:
   https://github.com/vdiravka/drill/blob/87beac7e6f64ce5f6dedfa2b22cc5c9099caeaec/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveStoragePlugin.java#L171
   
   * DrillPushFilterPastProjectRule.java is refactored; this change isn't related to the root cause of the issue.
   * One unit test is enabled, since optimizations for aggregations are done before TC.
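
For readers unfamiliar with the term, the following toy sketch shows what transitive closure of filter predicates across an equi-join means: given `l.l_orderkey = o.o_orderkey` and a filter `l.l_orderkey = 5`, the planner may also infer `o.o_orderkey = 5`. This is only an illustration under simplifying assumptions (constant equality filters, one-directional inference); the class name `TransitiveClosureSketch`, the method `inferFilters`, and the column names are hypothetical and are not taken from Drill's or Calcite's rule implementations.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: hypothetical names, not Drill's or Calcite's planner code.
class TransitiveClosureSketch {

    /**
     * joinKeys maps a left-side column to the right-side column it is
     * equi-joined with; leftFilters maps a left-side column to the constant
     * it must equal. Returns the filters implied on the right side.
     */
    static Map<String, Integer> inferFilters(Map<String, String> joinKeys,
                                             Map<String, Integer> leftFilters) {
        Map<String, Integer> inferred = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> filter : leftFilters.entrySet()) {
            String rightColumn = joinKeys.get(filter.getKey());
            // Only filters on join-key columns carry over to the other side.
            if (rightColumn != null) {
                inferred.put(rightColumn, filter.getValue());
            }
        }
        return inferred;
    }

    public static void main(String[] args) {
        Map<String, String> joinKeys = new LinkedHashMap<>();
        joinKeys.put("l.l_orderkey", "o.o_orderkey");

        Map<String, Integer> leftFilters = new LinkedHashMap<>();
        leftFilters.put("l.l_orderkey", 5);   // on a join key: can be propagated
        leftFilters.put("l.l_quantity", 10);  // not a join key: stays on the left only

        System.out.println(inferFilters(joinKeys, leftFilters));
    }
}
```

Inferred filters of this kind are what the description above refers to as redundant conditions: once they are introduced, the order in which the planner applies later rules determines whether the extra Filters and Projects can still be cleaned up.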
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Transitive Closure leads to TPCH Queries regressions and OOM when run concurrency test
> --------------------------------------------------------------------------------------
>
>                 Key: DRILL-6374
>                 URL: https://issues.apache.org/jira/browse/DRILL-6374
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.14.0
>         Environment: RHEL 7
>            Reporter: Dechang Gu
>            Assignee: Vitalii Diravka
>            Priority: Critical
>             Fix For: 1.14.0
>
>         Attachments: TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json, TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json
>
>
> In the TPCH regression test run on Apache Drill 1.14.0 master commit 6fcaf4268eddcb09010b5d9c5dfb3b3be5c3f903 (DRILL-6173), most of the queries regressed.
> In particular, TPC-H Query 9 takes about 4x as long (36 sec vs 8.6 sec) as when run against the parent commit (9173308710c3decf8ff745493ad3e85ccdaf7c37).
> Furthermore, in the concurrency test for the commit, with 48 clients each running 16 TPCH queries (768 queries in total) with planner.width.max_per_node=5, some queries hit OOM, causing 273 queries to fail, while for the parent commit all 768 queries completed successfully.
>  
> Profiles for TPCH_09 in the regression tests are uploaded:
>  * The failing commit file name: [^TPCH_09_2_id_2517381b-1a61-3db5-40c3-4463bd421365.json]
>  * The parent commit file name: [^TPCH_09_2_id_2517497b-d4da-dab6-6124-abde5804a25f.json]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)