Posted to jira@arrow.apache.org by "Weston Pace (Jira)" <ji...@apache.org> on 2022/07/25 15:41:00 UTC

[jira] [Commented] (ARROW-17198) [C++] Potential memory leak at shutdown if an exec plan with a scanner fails or is aborted immediately before shutdown

    [ https://issues.apache.org/jira/browse/ARROW-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17570962#comment-17570962 ] 

Weston Pace commented on ARROW-17198:
-------------------------------------

I'm able to reproduce this by compiling with ASAN, using stress to make the CPU busy, and running the following command:

{noformat}
while taskset -c 0,1 ./debug/arrow-dataset-scanner-test --gtest_filter=TestScannerThreading/TestScanner.FromReader/3Threaded2d16b1024r; do sleep 0.1; done
{noformat}

> [C++] Potential memory leak at shutdown if an exec plan with a scanner fails or is aborted immediately before shutdown
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17198
>                 URL: https://issues.apache.org/jira/browse/ARROW-17198
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Minor
>              Labels: query-engine
>
> I'm primarily creating this so we can remember to write a test for it.  This problem should be solved as part of ARROW-16072.  When the scanner fails, it simply discards its references to the various scanner AsyncGenerators.  However, some I/O tasks may still hold references to these generators, so parts of the scanner survive after the plan itself is marked complete.  If shutdown follows immediately, those parts are not properly disposed of, even though the plan is marked complete, and they show up as a memory leak.
> Example:
> https://pipelines.actions.githubusercontent.com/serviceHosts/8bb0d999-3387-4c48-9fa6-c66c718a46e2/_apis/pipelines/1/runs/359690/signedlogcontent/4?urlExpires=2022-07-25T14%3A43%3A01.2797488Z&urlSigningMethod=HMACV1&urlSignature=GS3lS09Q9sTRweN%2B8UEu2GwUGc%2FbO9eyH27FRKumbrg%3D



--
This message was sent by Atlassian Jira
(v8.20.10#820010)