Posted to issues@drill.apache.org by "Arina Ielchiieva (Jira)" <ji...@apache.org> on 2019/11/04 12:49:00 UTC

[jira] [Updated] (DRILL-6839) Failed to plan (aggregate + Hash or NL join) when slice target is low

     [ https://issues.apache.org/jira/browse/DRILL-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-6839:
------------------------------------
    Fix Version/s:     (was: 1.17.0)

> Failed to plan (aggregate + Hash or NL join) when slice target is low 
> ----------------------------------------------------------------------
>
>                 Key: DRILL-6839
>                 URL: https://issues.apache.org/jira/browse/DRILL-6839
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Igor Guzenko
>            Assignee: Igor Guzenko
>            Priority: Major
>
> *Case 1.* When a nested loop join is about to be used:
>  - Option "_planner.enable_nljoin_for_scalar_only_" is set to false
>  - Option "_planner.slice_target_" is set to a low value to imitate large input tables
>  
> {code:java}
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testCrossJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       // Allow nested loop join for more than scalar-only inputs.
>       client.alterSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName(), false);
>       // A low slice target imitates large input tables.
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           ", cp.`tpch/region.parquet` r")
>         .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName());
>     }
>   }
> }
> {code}
>  
> *Case 2.* When a hash join is about to be used:
>  - Option "planner.enable_mergejoin" is set to false, so a hash join is used instead
>  - Option "planner.slice_target" is set to a low value to imitate large input tables
>  - The line ruleList.add(HashJoinPrule.DIST_INSTANCE); is commented out in the PlannerPhase.getPhysicalRules method
> {code:java}
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testInnerJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       // Disable merge join so that hash join is used instead.
>       client.alterSession(PlannerSettings.MERGEJOIN.getOptionName(), false);
>       // A low slice target imitates large input tables.
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           "INNER JOIN cp.`tpch/region.parquet` r " +
>           "ON r.nation_id = l.nation_id")
>         .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.MERGEJOIN.getOptionName());
>     }
>   }
> }
> {code}
>  
> *Workaround:* To avoid the exception, set the option "_planner.enable_multiphase_agg_" to false. This skips the unsuccessful attempts to create a two-phase aggregation plan in StreamAggPrule and guarantees that the logical aggregate is converted to a physical one.
>  
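
The workaround above can also be applied from any SQL client with Drill's ALTER SESSION statement. The following is a minimal sketch that only builds the statement text. The class and method names are hypothetical and are not part of Drill or of the issue; the option name is the one given in the workaround.

```java
// Hypothetical helper (not part of Drill): builds the SQL text for the
// workaround so it can be sent through any client, such as sqlline or JDBC.
public class MultiphaseAggWorkaround {
    // Option name taken from the workaround described in the issue.
    static final String MULTIPHASE_AGG = "planner.enable_multiphase_agg";

    // Produces: ALTER SESSION SET `<option>` = <value>
    static String setOption(String option, boolean value) {
        return "ALTER SESSION SET `" + option + "` = " + value;
    }

    // Produces: ALTER SESSION RESET `<option>`
    static String resetOption(String option) {
        return "ALTER SESSION RESET `" + option + "`";
    }

    public static void main(String[] args) {
        // Disable multiphase aggregation before running the failing query.
        System.out.println(setOption(MULTIPHASE_AGG, false));
        // Restore the default afterwards, mirroring the finally blocks in the tests.
        System.out.println(resetOption(MULTIPHASE_AGG));
    }
}
```

Resetting the option after the query keeps the workaround scoped to the affected statement, in the same way the test cases reset their session options.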



--
This message was sent by Atlassian Jira
(v8.3.4#803005)