Posted to issues@drill.apache.org by "Suresh Ollala (JIRA)" <ji...@apache.org> on 2016/07/08 18:42:11 UTC
[jira] [Updated] (DRILL-4707) Conflicting columns names under
case-insensitive policy lead to either memory leak or incorrect result
[ https://issues.apache.org/jira/browse/DRILL-4707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Suresh Ollala updated DRILL-4707:
---------------------------------
Reviewer: Chun Chang
> Conflicting columns names under case-insensitive policy lead to either memory leak or incorrect result
> ------------------------------------------------------------------------------------------------------
>
> Key: DRILL-4707
> URL: https://issues.apache.org/jira/browse/DRILL-4707
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Jinfeng Ni
> Assignee: Jinfeng Ni
> Priority: Critical
> Fix For: 1.8.0
>
>
> On latest master branch:
> {code}
> select version, commit_id, commit_message from sys.version;
> +-----------------+-------------------------------------------+---------------------------------------------------------------------------------+
> | version | commit_id | commit_message |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------------------+
> | 1.7.0-SNAPSHOT | 3186217e5abe3c6c2c7e504cdb695567ff577e4c | DRILL-4607: Add a split function that allows to separate string by a delimiter |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------------------+
> {code}
> If a query has two column names that conflict under the case-insensitive policy, Drill either hits a memory leak or returns an incorrect result.
> Q1.
> {code}
> select r_regionkey as XYZ, r_name as xyz FROM cp.`tpch/region.parquet`;
> Error: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (131072)
> Allocator(op:0:0:1:Project) 1000000/131072/2490368/10000000000 (res/actual/peak/limit)
> Fragment 0:0
> {code}
> Q2: returns only one column in the result.
> {code}
> select n_nationkey as XYZ, n_regionkey as xyz FROM cp.`tpch/nation.parquet`;
> +------+
> | XYZ |
> +------+
> | 0 |
> | 1 |
> | 1 |
> | 1 |
> | 4 |
> | 0 |
> | 3 |
> {code}
> The cause of the problem seems to be that the Project operator treats the two incoming columns as identical, since Drill compares column names case-insensitively during execution.
> The planner should make sure that the conflicting columns are resolved, since execution is name-based.
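As a rough illustration of what "resolving" the conflicting columns could look like, here is a minimal sketch that renames later duplicates so that no two output names are equal under case-insensitive comparison. The class `CaseInsensitiveNames`, the method `uniquify`, and the numeric-suffix scheme are all hypothetical and chosen for illustration only; this is not Drill's planner code and not necessarily how the fix was implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Drill.
class CaseInsensitiveNames {

  // Returns the names in their original order, appending a numeric suffix to
  // any name that collides (ignoring case) with a name already emitted.
  static List<String> uniquify(List<String> names) {
    Set<String> seen = new HashSet<>();   // lower-cased names emitted so far
    List<String> result = new ArrayList<>();
    for (String name : names) {
      String candidate = name;
      int suffix = 0;
      // Set.add returns false when the lower-cased candidate is already taken.
      while (!seen.add(candidate.toLowerCase(Locale.ROOT))) {
        candidate = name + suffix;
        suffix++;
      }
      result.add(candidate);
    }
    return result;
  }

  public static void main(String[] args) {
    // The aliases from Q1 and Q2 above.
    System.out.println(uniquify(Arrays.asList("XYZ", "xyz")));
  }
}
```

With the aliases from Q1 and Q2, this sketch keeps `XYZ` and renames the second column to `xyz0`, so a name-based Project would see two distinct columns. Whether the planner should rename silently or reject the query is a separate design question that the sketch does not settle.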
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)