Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2021/05/26 07:58:00 UTC

[jira] [Commented] (ARROW-12879) [C++] Thread pool leaks memory when forking (and could maybe deadlock) if threads exist at the time of fork

    [ https://issues.apache.org/jira/browse/ARROW-12879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351610#comment-17351610 ] 

Antoine Pitrou commented on ARROW-12879:
----------------------------------------

Leaking memory when forking a process with threads is an unavoidable fact of life (the dead threads will still hold on to unreleased memory, for example through shared_ptrs held in local frames of execution). I'm not sure there's any point in trying to solve this. If you fork a process with threads, the only reasonable thing you can do in the child is spawn another executable (using e.g. exec()).
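
For illustration, here is a minimal sketch of the fork-then-exec pattern described above. The helper name run_shell_command and the choice of /bin/sh are assumptions made for the example; nothing here is an Arrow API. The child does nothing between fork() and execv(), so it never touches locks or memory that threads in the parent may have held at the time of the fork.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not an Arrow API): run `command` through /bin/sh in a
// forked child and return the child's exit status, or -1 on failure.
int run_shell_command(const char* command) {
  // Build argv before forking so the child does no work of its own.
  char* const argv[] = {const_cast<char*>("/bin/sh"), const_cast<char*>("-c"),
                        const_cast<char*>(command), nullptr};

  pid_t pid = fork();
  if (pid < 0) return -1;  // fork failed

  if (pid == 0) {
    // Child: replace the process image immediately.
    execv(argv[0], argv);
    _exit(127);  // only reached if execv failed
  }

  // Parent: wait for the child and report how it exited.
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_shell_command("exit 7") should return the child's exit status, since the shell simply exits with the code it is given.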

> [C++] Thread pool leaks memory when forking (and could maybe deadlock) if threads exist at the time of fork
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-12879
>                 URL: https://issues.apache.org/jira/browse/ARROW-12879
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 4.0.0
>            Reporter: Weston Pace
>            Priority: Major
>
> While working on ARROW-12878 I have made the leak more obvious.  When we fork we cannot delete any remaining std::thread.  In addition, we cannot safely use any mutexes that might have been claimed by child threads.
>  
> The existing implementation works around this by creating a new ThreadPool::State instance.  However, shared_ptrs to the old instance are still held by (now defunct) std::thread instances and so the state object will never be deleted (valgrind confirms this).
>  
> Furthermore, if the fork were to happen while a thread task was running and had captured some mutex (e.g. any of the ones used in the datasets API) then that mutex will never be released.
>  
> A more correct workaround would be to hook into pthread_atfork to shut down all threads before forking (without having to wait for all jobs to complete), then restart all the threads in BOTH the child and the parent (today we restart only in the child and leave the parent's threads running).
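
The pthread_atfork approach proposed in the issue could be sketched roughly as follows. This is a toy illustration, not Arrow's ThreadPool: TinyPool, InstallForkHandlers, CountWorkersInForkedChild and the global g_pool are hypothetical names, the workers run no real tasks, and the sketch assumes the pool's threads are the only extra threads in the process. The prepare handler joins every worker so that no pool thread (and no pool mutex holder) exists at the moment of the fork; the parent and child handlers then start fresh workers on both sides.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical toy pool: workers just sleep until asked to stop.
class TinyPool {
 public:
  explicit TinyPool(int num_threads) : num_threads_(num_threads) { Start(); }
  ~TinyPool() { Stop(); }

  void Start() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = false;
    }
    for (int i = 0; i < num_threads_; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  // Join every worker so no std::thread (or lock holder) survives this call.
  void Stop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    for (std::thread& t : workers_) t.join();
    workers_.clear();
  }

  std::size_t size() const { return workers_.size(); }

 private:
  void WorkerLoop() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return stop_; });
  }

  const int num_threads_;
  std::mutex mutex_;
  std::condition_variable cv_;
  bool stop_ = false;
  std::vector<std::thread> workers_;
};

TinyPool* g_pool = nullptr;  // hypothetical global; must outlive any fork()

void BeforeFork() { if (g_pool != nullptr) g_pool->Stop(); }
void AfterFork() { if (g_pool != nullptr) g_pool->Start(); }

// Stop workers before fork(); restart them in both parent and child.
void InstallForkHandlers(TinyPool* pool) {
  g_pool = pool;
  pthread_atfork(BeforeFork, AfterFork, AfterFork);
}

// Fork, let the child report its worker count as its exit status.
int CountWorkersInForkedChild(const TinyPool& pool) {
  pid_t pid = fork();  // the handlers above run around this call
  if (pid < 0) return -1;
  if (pid == 0) _exit(static_cast<int>(pool.size()));
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the worker list is empty at the time of the fork, the child inherits no defunct std::thread objects and no references to old pool state. A real pool would additionally have to decide what to do with queued or in-flight tasks, which this sketch does not address.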


