Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/05/11 23:24:00 UTC

[jira] [Commented] (DRILL-6410) Memory leak in Parquet Reader during cancellation

    [ https://issues.apache.org/jira/browse/DRILL-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472793#comment-16472793 ] 

ASF GitHub Bot commented on DRILL-6410:
---------------------------------------

sachouche opened a new pull request #1257: DRILL-6410: Fixed memory leak in flat Parquet reader
URL: https://github.com/apache/drill/pull/1257
 
 
   **Problem Description**
   - Occasionally, a memory leak is observed within the flat Parquet reader when a query is cancelled
   - A previous attempt to address this issue did not fully resolve it; the leak still occurs
   - Thus far, only QA has been able to reproduce this issue (and only occasionally)
   
   **Analysis**
   - There was a recent breakthrough which gives me hope for addressing this issue 
   - The leak report logged two pieces of information: the leak size and the state of the child allocator
   - The state of the child allocator indicated no leak (all allocated bytes released)
   - After examining the code, it occurred to me that this happens because the Asynchronous Page Reader task is still releasing its Drill buffer while the scan thread is closing the allocator
   - The code attempts to cancel the asynchronous tasks and then release the allocated buffers, but there is one big issue: Java's FutureTask.cancel(true) doesn't block until the task has stopped; this method **merely interrupts the asynchronous task** and returns
   - This means that if the asynchronous thread was context-switched out or busy computing (not blocked waiting), the fragment cleanup logic can close the allocator before all resources have been released
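
To make the race described above concrete, here is a minimal, self-contained sketch. It is not code from this pull request or from Drill: the class name `CancelDoesNotBlockDemo`, the method `observe()`, and the `bufferReleased` flag (a stand-in for releasing a Drill buffer) are hypothetical. It only relies on the standard `java.util.concurrent` behavior that `cancel(true)` interrupts the worker and returns without waiting for it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelDoesNotBlockDemo {

  // Returns { isDone() right after cancel,
  //           "buffer released" right after cancel,
  //           "buffer released" once the worker has really finished }.
  static boolean[] observe() {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    AtomicBoolean mayFinish = new AtomicBoolean(false);
    AtomicBoolean bufferReleased = new AtomicBoolean(false); // hypothetical stand-in

    try {
      Future<?> pageRead = pool.submit(() -> {
        started.countDown();
        // Simulates computation that never checks the interrupt flag.
        while (!mayFinish.get()) {
          Thread.yield();
        }
        bufferReleased.set(true); // the task releases its buffer only here
      });

      started.await();                  // make sure the task is running
      pageRead.cancel(true);            // interrupts the worker and returns at once
      boolean doneAfterCancel = pageRead.isDone();
      boolean releasedAfterCancel = bufferReleased.get();

      // Only now let the worker finish, then wait for it for real.
      mayFinish.set(true);
      pool.shutdown();
      pool.awaitTermination(10, TimeUnit.SECONDS);
      return new boolean[] {doneAfterCancel, releasedAfterCancel, bufferReleased.get()};
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    boolean[] r = observe();
    System.out.println("isDone() after cancel:          " + r[0]);
    System.out.println("buffer released after cancel:   " + r[1]);
    System.out.println("buffer released after shutdown: " + r[2]);
  }
}
```

In this sketch the future reports itself as done as soon as `cancel(true)` returns, while the worker has not yet released its buffer. A caller that closes the allocator at that point would hit the window described above.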
   
   **Fix**
   - The Java ThreadPoolExecutor and FutureTask have a few extension points that can be used to enhance the task termination process
   - Created a new utility class which can create an ExecutorService with the ability to block during future cancellation
   - Blocking happens only when the cancel method is allowed to interrupt the asynchronous task
   - Note that there shouldn't be any performance degradation, as synchronization code was added only on the cancel path
   - Also added a new test suite to verify the correctness of this new utility
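
One way the approach above could look is sketched below. This is an illustration under stated assumptions, not the utility class added by this pull request: `BlockingCancelSketch`, `BlockingFutureTask`, `newBlockingCancelPool`, and `releasedWhenCancelReturns` are hypothetical names. The sketch uses two standard extension points, `AbstractExecutorService.newTaskFor` and subclassing `FutureTask`, and it synchronizes with a latch rather than whatever mechanism the real utility uses.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RunnableFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class BlockingCancelSketch {

  // A FutureTask whose cancel(true) waits until run() has returned.
  static final class BlockingFutureTask<V> extends FutureTask<V> {
    private final CountDownLatch finished = new CountDownLatch(1);
    private volatile boolean started;

    BlockingFutureTask(Callable<V> callable) { super(callable); }
    BlockingFutureTask(Runnable runnable, V result) { super(runnable, result); }

    @Override
    public void run() {
      started = true;
      try {
        super.run();
      } finally {
        finished.countDown();
      }
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
      boolean cancelled = super.cancel(mayInterruptIfRunning);
      // Block only on the interrupting path, and only if a worker picked the task up.
      // Caveat: a task must not cancel its own future this way, or it would wait on itself.
      if (mayInterruptIfRunning && started) {
        boolean interrupted = false;
        while (true) {
          try {
            finished.await();
            break;
          } catch (InterruptedException e) {
            interrupted = true; // keep waiting, restore the flag afterwards
          }
        }
        if (interrupted) {
          Thread.currentThread().interrupt();
        }
      }
      return cancelled;
    }
  }

  // Fixed-size pool whose submit() hands back BlockingFutureTask instances.
  static ExecutorService newBlockingCancelPool(int threads) {
    return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<Runnable>()) {
      @Override
      protected <T> RunnableFuture<T> newTaskFor(Callable<T> callable) {
        return new BlockingFutureTask<T>(callable);
      }

      @Override
      protected <T> RunnableFuture<T> newTaskFor(Runnable runnable, T value) {
        return new BlockingFutureTask<T>(runnable, value);
      }
    };
  }

  // Demo: the task ignores the interrupt for about 50 ms, then "releases its buffer".
  static boolean releasedWhenCancelReturns() {
    ExecutorService pool = newBlockingCancelPool(1);
    CountDownLatch taskStarted = new CountDownLatch(1);
    AtomicBoolean bufferReleased = new AtomicBoolean(false); // hypothetical stand-in
    try {
      Future<?> pageRead = pool.submit(() -> {
        taskStarted.countDown();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(50);
        while (System.nanoTime() < deadline) {
          Thread.yield(); // busy work that never checks the interrupt flag
        }
        bufferReleased.set(true);
      });
      taskStarted.await();
      pageRead.cancel(true);       // returns only after the task body has finished
      return bufferReleased.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
  }
}
```

With this sketch, `releasedWhenCancelReturns()` yields `true`, because `cancel(true)` does not return until the worker has left `run()`. The normal execution path only pays for one volatile write and one latch count-down per task, which is in line with the note above that the added synchronization matters only on the cancel path.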
   
   
   



> Memory leak in Parquet Reader during cancellation
> -------------------------------------------------
>
>                 Key: DRILL-6410
>                 URL: https://issues.apache.org/jira/browse/DRILL-6410
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Major
>
> Occasionally, a memory leak is observed within the flat Parquet reader when query cancellation is invoked.


