Posted to issues@drill.apache.org by "Chun Chang (JIRA)" <ji...@apache.org> on 2017/02/08 23:34:41 UTC

[jira] [Closed] (DRILL-2290) Very slow performance for a query involving nested map

     [ https://issues.apache.org/jira/browse/DRILL-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chun Chang closed DRILL-2290.
-----------------------------
    Resolution: Resolved

Tested with 1.8.0 and the performance hit is gone: the query now returns in about 142 seconds, down from about 1020 seconds in the original report.

{noformat}
0: jdbc:drill:schema=dfs.md1314> select b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc from dfs.`/drill/testdata/complex/json/complex.json` a inner join dfs.`/drill/testdata/complex/json/complex.json` b on a.ooa[1].fl.f1=b.ooa[1].fl.f1 order by b.id limit 20;
+-----+----------+-----------------------------+----------+
| id  |  EXPR$1  |            oooi             |  EXPR$3  |
+-----+----------+-----------------------------+----------+
| 1   | 1.6789   | {"oa":{"oab":{}}}           | null     |
| 3   | 3.6789   | {"oa":{"oab":{}}}           | 3.5678   |
| 4   | 4.6789   | {"oa":{"oab":{}}}           | 4.5678   |
| 5   | 5.6789   | {"oa":{"oab":{}}}           | 5.5678   |
| 7   | 7.6789   | {"oa":{"oab":{}}}           | null     |
| 9   | 9.6789   | {"oa":{"oab":{}}}           | null     |
| 11  | 11.6789  | {"oa":{"oab":{}}}           | 11.5678  |
| 12  | 12.6789  | {"oa":{"oab":{}}}           | null     |
| 13  | 13.6789  | {"oa":{"oab":{}}}           | null     |
| 17  | 17.6789  | {"oa":{"oab":{}}}           | 17.5678  |
| 18  | 18.6789  | {"oa":{"oab":{}}}           | null     |
| 20  | 20.6789  | {"oa":{"oab":{}}}           | null     |
| 21  | 21.6789  | {"oa":{"oab":{"oabc":21}}}  | null     |
| 22  | 22.6789  | {"oa":{"oab":{"oabc":22}}}  | 22.5678  |
| 23  | 23.6789  | {"oa":{"oab":{}}}           | 23.5678  |
| 27  | 27.6789  | {"oa":{"oab":{}}}           | null     |
| 30  | 30.6789  | {"oa":{"oab":{}}}           | 30.5678  |
| 32  | 32.6789  | {"oa":{"oab":{"oabc":32}}}  | null     |
| 34  | 34.6789  | {"oa":{"oab":{"oabc":34}}}  | 34.5678  |
| 36  | 36.6789  | {"oa":{"oab":{}}}           | 36.5678  |
+-----+----------+-----------------------------+----------+
20 rows selected (142.316 seconds)
{noformat}

> Very slow performance for a query involving nested map
> ------------------------------------------------------
>
>                 Key: DRILL-2290
>                 URL: https://issues.apache.org/jira/browse/DRILL-2290
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 0.8.0
>            Reporter: Chun Chang
>             Fix For: Future
>
>
> #Thu Feb 19 18:40:10 EST 2015
> git.commit.id.abbrev=1ceddff
> This query took 17 minutes to complete, which is too long. I think the slowdown started after the fix dealing with nested maps.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner join `complex.json` b on a.ooa[1].fl.f1=b.ooa[1].fl.f1 order by b.id limit 20;
> +-----+----------+-----------------------------+----------+
> | id  |  EXPR$1  |            oooi             |  EXPR$3  |
> +-----+----------+-----------------------------+----------+
> | 1   | 1.6789   | {"oa":{"oab":{"oabc":1}}}   | 1.5678   |
> | 3   | 3.6789   | {"oa":{"oab":{"oabc":3}}}   | 3.5678   |
> | 4   | 4.6789   | {"oa":{"oab":{"oabc":4}}}   | 4.5678   |
> | 5   | 5.6789   | {"oa":{"oab":{"oabc":5}}}   | 5.5678   |
> | 7   | 7.6789   | {"oa":{"oab":{"oabc":7}}}   | 7.5678   |
> | 9   | 9.6789   | {"oa":{"oab":{"oabc":9}}}   | 9.5678   |
> | 10  | 10.6789  | {"oa":{"oab":{"oabc":10}}}  | 10.5678  |
> | 11  | 11.6789  | {"oa":{"oab":{"oabc":11}}}  | 11.5678  |
> | 13  | 13.6789  | {"oa":{"oab":{"oabc":13}}}  | 13.5678  |
> | 14  | 14.6789  | {"oa":{"oab":{"oabc":14}}}  | 14.5678  |
> | 15  | 15.6789  | {"oa":{"oab":{"oabc":15}}}  | 15.5678  |
> | 16  | 16.6789  | {"oa":{"oab":{"oabc":16}}}  | 16.5678  |
> | 17  | 17.6789  | {"oa":{"oab":{"oabc":17}}}  | 17.5678  |
> | 18  | 18.6789  | {"oa":{"oab":{"oabc":18}}}  | 18.5678  |
> | 19  | 19.6789  | {"oa":{"oab":{"oabc":19}}}  | 19.5678  |
> | 20  | 20.6789  | {"oa":{"oab":{"oabc":20}}}  | 20.5678  |
> | 21  | 21.6789  | {"oa":{"oab":{"oabc":21}}}  | 21.5678  |
> | 22  | 22.6789  | {"oa":{"oab":{"oabc":22}}}  | 22.5678  |
> | 24  | 24.6789  | {"oa":{"oab":{"oabc":24}}}  | 24.5678  |
> | 25  | 25.6789  | {"oa":{"oab":{"oabc":25}}}  | 25.5678  |
> +-----+----------+-----------------------------+----------+
> 20 rows selected (1020.036 seconds)
> {code}
> The query deals with just under 1 million records, so it should not be that slow.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(*) from (select b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner join `complex.json` b on a.ooa[1].fl.f1=b.ooa[1].fl.f1) c;
> +------------+
> |   EXPR$0   |
> +------------+
> | 900190     |
> +------------+
> 1 row selected (700.516 seconds)
> {code}
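
To put the reported numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only restates the row count and wall-clock timings from the sqlline output in this issue; the function and constant names are made up for illustration. The 1.8.0 retest read the file from a different path and returned different rows, so the runs may not have used identical data or hardware, and the speedup should be read as indicative only.

```python
# Row count and timings copied from the sqlline output quoted in this issue.
ROWS_JOINED = 900_190        # count(*) over the self-join (0.8.0 report)
COUNT_SECONDS = 700.516      # wall-clock time of that count(*) query
ORIGINAL_SECONDS = 1020.036  # order by ... limit 20 query in the 0.8.0 report
RETEST_SECONDS = 142.316     # same query shape retested on 1.8.0


def rows_per_second(rows: int, seconds: float) -> float:
    """Average number of joined rows produced per second of wall-clock time."""
    return rows / seconds


def speedup(before: float, after: float) -> float:
    """Ratio of the earlier run time to the later run time."""
    return before / after


if __name__ == "__main__":
    rate = rows_per_second(ROWS_JOINED, COUNT_SECONDS)
    ratio = speedup(ORIGINAL_SECONDS, RETEST_SECONDS)
    print(f"0.8.0 count(*) throughput: {rate:.0f} rows/s")
    print(f"0.8.0 -> 1.8.0 run-time ratio: {ratio:.1f}x")
```

A throughput on the order of a thousand joined rows per second is consistent with the reporter's remark that a join over fewer than a million records should not take this long.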



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)