Posted to issues@drill.apache.org by "Arina Ielchiieva (Jira)" <ji...@apache.org> on 2019/11/04 12:49:00 UTC
[jira] [Updated] (DRILL-6839) Failed to plan (aggregate + Hash or NL join) when slice target is low
[ https://issues.apache.org/jira/browse/DRILL-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arina Ielchiieva updated DRILL-6839:
------------------------------------
Fix Version/s: (was: 1.17.0)
> Failed to plan (aggregate + Hash or NL join) when slice target is low
> ----------------------------------------------------------------------
>
> Key: DRILL-6839
> URL: https://issues.apache.org/jira/browse/DRILL-6839
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Igor Guzenko
> Assignee: Igor Guzenko
> Priority: Major
>
> *Case 1.* When a nested loop join is about to be used:
> - Option "planner.enable_nljoin_for_scalar_only" is set to false
> - Option "planner.slice_target" is set to a low value to imitate large input tables
>
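> {code:java}
> import org.apache.drill.categories.SqlTest;
> import org.apache.drill.exec.ExecConstants;
> import org.apache.drill.exec.planner.physical.PlannerSettings;
> import org.apache.drill.test.ClusterFixture;
> import org.apache.drill.test.ClusterTest;
> import org.junit.BeforeClass;
> import org.junit.Test;
> import org.junit.experimental.categories.Category;
>
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testCrossJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       // Allow nested loop join for non-scalar inputs
>       client.alterSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName(), false);
>       // Low slice target imitates large input tables
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           ", cp.`tpch/region.parquet` r")
>           .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName());
>     }
>   }
> }
> {code}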
>
> *Case 2.* When a hash join is about to be used:
> - Option "planner.enable_mergejoin" is set to false, so a hash join is used instead
> - Option "planner.slice_target" is set to a low value to imitate large input tables
> - The line ruleList.add(HashJoinPrule.DIST_INSTANCE); is commented out in the PlannerPhase.getPhysicalRules method
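> {code:java}
> import org.apache.drill.categories.SqlTest;
> import org.apache.drill.exec.ExecConstants;
> import org.apache.drill.exec.planner.physical.PlannerSettings;
> import org.apache.drill.test.ClusterFixture;
> import org.apache.drill.test.ClusterTest;
> import org.junit.BeforeClass;
> import org.junit.Test;
> import org.junit.experimental.categories.Category;
>
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testInnerJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       // With merge join disabled, hash join is used instead
>       client.alterSession(PlannerSettings.MERGEJOIN.getOptionName(), false);
>       // Low slice target imitates large input tables
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           "INNER JOIN cp.`tpch/region.parquet` r " +
>           "ON r.nation_id = l.nation_id")
>           .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.MERGEJOIN.getOptionName());
>     }
>   }
> }
> {code}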
>
> *Workaround:* To avoid the exception, set the option "planner.enable_multiphase_agg" to false. This prevents the unsuccessful attempts to create a two-phase aggregation plan in StreamAggPrule and guarantees that the logical aggregate is converted to a physical one.
>
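The workaround above could also be applied per session from SQL rather than through the test client. The following is only a minimal sketch: it assumes Drill's ALTER SESSION SET / RESET statements, and the class SessionOptionSql and its methods are hypothetical helpers that are not part of Drill. It only builds the statement strings and does not connect to a cluster.

```java
public class SessionOptionSql {

    // Option named in the workaround above.
    public static final String MULTIPHASE_AGG = "planner.enable_multiphase_agg";

    // Builds a statement of the form: ALTER SESSION SET `option` = value
    public static String set(String option, boolean value) {
        return "ALTER SESSION SET `" + option + "` = " + value;
    }

    // Builds a statement of the form: ALTER SESSION RESET `option`
    public static String reset(String option) {
        return "ALTER SESSION RESET `" + option + "`";
    }

    public static void main(String[] args) {
        // Statement to run before the failing query (disables two-phase aggregation).
        System.out.println(set(MULTIPHASE_AGG, false));
        // Statement to run afterwards to restore the default.
        System.out.println(reset(MULTIPHASE_AGG));
    }
}
```

Resetting the option after the query mirrors the try/finally pattern used in the test cases, so the session does not keep the workaround longer than needed.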
--
This message was sent by Atlassian Jira
(v8.3.4#803005)