Posted to issues@drill.apache.org by "Deneche A. Hakim (JIRA)" <ji...@apache.org> on 2015/04/29 02:55:07 UTC

[jira] [Updated] (DRILL-2893) ScanBatch throws a NullPointerException instead of returning OUT_OF_MEMORY

     [ https://issues.apache.org/jira/browse/DRILL-2893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Deneche A. Hakim updated DRILL-2893:
------------------------------------
    Description: 
- set _drill.exec.memory.top.max_ in _drill-override.conf_ to some low value (I used _75000000_)
- disable hash aggregate (set _planner.enable_hashagg_ to false)
- disable exchanges (set _planner.disable_exchanges_ to true); a sketch after the error output below shows all three settings together
- run the following query
{noformat}
select count(*) from (select * from dfs.data.`tpch1/lineitem.parquet` order by l_orderkey);
{noformat}
and you should get the following error message:
{noformat}
Query failed: SYSTEM ERROR: null

Fragment 0:0

[e05ff3c2-e130-449e-b721-b3442796e29b on 172.30.1.1:31010]
{noformat}
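
For reference, here is how the three settings can be applied (a sketch only: the memory cap goes in _drill-override.conf_, the planner options are session options; the exact config syntax may vary by Drill version):
{noformat}
# drill-override.conf (HOCON) -- the low memory cap used in this repro
drill.exec.memory.top.max: 75000000

-- session options, set from sqlline or any SQL client
ALTER SESSION SET `planner.enable_hashagg` = false;
ALTER SESSION SET `planner.disable_exchanges` = true;
{noformat}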

We have 2 problems here:

1st problem:
- ScanBatch detects that it can't allocate its field value vectors and, right before returning _OUT_OF_MEMORY_ downstream, calls _clear()_ on the field vectors
- one of those vectors had already thrown a NullPointerException in its _allocateNew()_ method: it had cleared its buffer and couldn't allocate a new one, leaving the buffer null
- when ScanBatch tries to clear that vector, _clear()_ throws a NullPointerException, which prevents ScanBatch from returning _OUT_OF_MEMORY_ and cancels the query instead (see the sketch after this list)
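
To make the sequence concrete, here is a minimal, self-contained Java sketch of the failure mode; the names (_VectorSketch_, _allocateNew_, _clear_) only loosely mirror Drill's value-vector API, and this is not the actual Drill source:
{noformat}
import java.nio.ByteBuffer;

public class ScanBatchNpeSketch {
  // Stand-in for a value vector whose backing buffer can be left null.
  static class VectorSketch {
    private ByteBuffer data = ByteBuffer.allocate(8);

    // Mimics allocateNew(): the old buffer is released before the new
    // allocation, so a failed allocation leaves 'data' null.
    void allocateNew(boolean oom) {
      data = null;                        // old buffer already cleared
      if (oom) {
        throw new OutOfMemoryError("simulated allocation failure");
      }
      data = ByteBuffer.allocate(8);
    }

    // Mimics clear(): assumes 'data' is non-null.
    void clear() {
      data.clear();                       // NPE when allocateNew() failed
      data = null;
    }
  }

  public static void main(String[] args) {
    VectorSketch v = new VectorSketch();
    try {
      v.allocateNew(true);                // simulated OOM
    } catch (OutOfMemoryError e) {
      v.clear();   // throws NullPointerException here, so the caller never
    }              // gets the chance to return OUT_OF_MEMORY downstream
  }
}
{noformat}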

2nd problem:
- once the query has been canceled, _ScanBatch.cleanup()_ throws another _NullPointerException_ while clearing the field vectors, which prevents the cleanup of the remaining resources and causes a memory leak (sketched below)
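
The same null buffer explains the leak: a cleanup loop that calls _clear()_ on every vector aborts at the first NullPointerException, so nothing after the broken vector gets released. Continuing the hypothetical sketch above (and showing only one possible shape for a fix, a per-vector try/catch):
{noformat}
// Aborts at the first broken vector; everything after it leaks.
static void cleanup(java.util.List<VectorSketch> vectors) {
  for (VectorSketch v : vectors) {
    v.clear();                     // NPE here stops the whole loop
  }
}

// Letting cleanup survive one bad vector releases the rest.
static void safeCleanup(java.util.List<VectorSketch> vectors) {
  for (VectorSketch v : vectors) {
    try {
      v.clear();
    } catch (RuntimeException e) {
      // log and keep going so the remaining vectors are still released
    }
  }
}
{noformat}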

  was:
- set _drill.exec.memory.top.max_ in _drill-override.conf_ to some low value (I used _1000000000_)
- disable hash aggregate (set _planner.enable_hashagg_ to false)
- disable exchanges (set _planner.disable_exchanges_ to true)
- run the following query
{noformat}
select count(*) from (select * from dfs.data.`tpch1/lineitem.parquet` order by l_orderkey);
{noformat}
and you should get the following error message:
{noformat}
Query failed: SYSTEM ERROR: null

Fragment 0:0

[e05ff3c2-e130-449e-b721-b3442796e29b on 172.30.1.1:31010]
{noformat}

We have 2 problems here:

1st problem:
- ScanBatch detects that it can't allocate its field value vectors and, right before returning _OUT_OF_MEMORY_ downstream, calls _clear()_ on the field vectors
- one of those vectors had already thrown a NullPointerException in its _allocateNew()_ method: it had cleared its buffer and couldn't allocate a new one, leaving the buffer null
- when ScanBatch tries to clear that vector, _clear()_ throws a NullPointerException, which prevents ScanBatch from returning _OUT_OF_MEMORY_ and cancels the query instead

2nd problem:
- once the query has been canceled, _ScanBatch.cleanup()_ throws another _NullPointerException_ while clearing the field vectors, which prevents the cleanup of the remaining resources and causes a memory leak


> ScanBatch throws a NullPointerException instead of returning OUT_OF_MEMORY
> --------------------------------------------------------------------------
>
>                 Key: DRILL-2893
>                 URL: https://issues.apache.org/jira/browse/DRILL-2893
>             Project: Apache Drill
>          Issue Type: Sub-task
>          Components: Execution - Relational Operators
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>             Fix For: 1.0.0
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)