Posted to dev@drill.apache.org by "Sean Hsuan-Yi Chu (JIRA)" <ji...@apache.org> on 2015/05/16 06:21:00 UTC

[jira] [Created] (DRILL-3117) Wrong Join-Order when In-List is materialized as a table

Sean Hsuan-Yi Chu created DRILL-3117:
----------------------------------------

             Summary: Wrong Join-Order when In-List is materialized as a table
                 Key: DRILL-3117
                 URL: https://issues.apache.org/jira/browse/DRILL-3117
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
            Reporter: Sean Hsuan-Yi Chu
            Assignee: Sean Hsuan-Yi Chu


Once the number of elements in an In-List exceeds a threshold (set to 20 by DRILL-3009), Drill materializes the In-List as a table instead of performing many comparisons for each row. For instance, assume `c.json` has many rows (> 100,000) and we run this query:

select a.col from `c.json` a, `c.json` b
where a.col = b.col and a.id in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21);

Currently, Calcite generates a plan that joins tables a and b first. This is extremely expensive, because the In-List has not yet reduced the number of rows taken from a. Instead, Drill should first join the materialized In-List table with table a.
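
To make the cost difference concrete, here is a rough Python sketch that counts the rows each join order produces before the final result. It is not Drill or Calcite code. The data layout (100,000 rows, unique `id`, 1,000 distinct `col` values) and the function names are assumptions made up for illustration only.

```python
from collections import Counter

# Hypothetical stand-in for `c.json`: unique id, 1,000 distinct col values.
N = 100_000
rows = [{"id": i, "col": i % 1000} for i in range(N)]
in_list = set(range(1, 22))  # the 21 literals from the query

# Number of rows in b for each col value (b is the same table as a).
col_counts = Counter(r["col"] for r in rows)

def rows_if_a_joins_b_first():
    # Every row of a is matched against b on col before the In-List applies.
    return sum(col_counts[r["col"]] for r in rows)

def rows_if_in_list_joins_a_first():
    # The materialized In-List first reduces a; only survivors are joined with b.
    filtered_a = [r for r in rows if r["id"] in in_list]
    return len(filtered_a), sum(col_counts[r["col"]] for r in filtered_a)

join_first = rows_if_a_joins_b_first()
kept, in_list_first = rows_if_in_list_joins_a_first()
print(join_first, kept, in_list_first)
```

Under these assumed numbers, joining a and b first produces 10,000,000 intermediate rows, while joining the In-List with a first keeps 21 rows of a and feeds only 2,100 rows through the join with b. Both orders return the same final result. Real row counts depend on the actual data distribution.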

This issue is also the root cause of the slow query reported in DRILL-2929.


