Posted to jira@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2021/04/09 21:20:00 UTC

[jira] [Resolved] (ARROW-12208) [C++] Add the ability to run async tasks without using the CPU thread pool

     [ https://issues.apache.org/jira/browse/ARROW-12208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neal Richardson resolved ARROW-12208.
-------------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 9892
[https://github.com/apache/arrow/pull/9892]

> [C++] Add the ability to run async tasks without using the CPU thread pool
> --------------------------------------------------------------------------
>
>                 Key: ARROW-12208
>                 URL: https://issues.apache.org/jira/browse/ARROW-12208
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>          Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> This is required to allow Windows 32-bit RTools 3.5 builds to avoid crashing, as they don't seem to properly implement the `<threading>` header.
>  
> However, it could be generally useful to anyone that wants to avoid thread creation.
>  
> Currently, the asynchronous approaches make threading unavoidable.  For example, even a simple call to check whether the CSVFileFormat supports a file requires peeking at the file and reading the first block.  These I/O operations happen on the I/O pool and are then transferred to the CPU thread pool (which is NOT the same thing as the calling thread); meanwhile, the calling thread is blocked waiting for results.
>  
> This can be avoided by treating the calling thread as a single-threaded thread pool and using that as the CPU thread pool.  All CPU work is then done on the calling thread alone.  In the future, this could also allow us to remove duplicate code paths (e.g. code paths that exist only to keep functions serial, such as the serial CSV reader).
>  
> This capability could be extended to include the I/O thread pool as well at some point in the future.
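
The idea described above, treating the calling thread as a single-threaded pool, might be sketched roughly as follows. This is an illustration only, not the code from the pull request: `CallingThreadExecutor`, `Spawn`, `RunUntilIdle` and `InspectOnCallingThread` are hypothetical names, and the sketch assumes every task is queued from the calling thread itself. A real implementation would need synchronization so that I/O threads can hand continuations back to the caller.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical executor: tasks are only queued here, and they run on
// whichever thread calls RunUntilIdle(). No thread is ever created.
class CallingThreadExecutor {
 public:
  // Queue a task for later execution.
  void Spawn(std::function<void()> task) { tasks_.push_back(std::move(task)); }

  // Drain the queue on the calling thread. A running task may queue
  // further tasks (continuations); those run in the same loop.
  void RunUntilIdle() {
    while (!tasks_.empty()) {
      // Move the task out before running it, so that a Spawn() call made
      // from inside the task cannot disturb the function being executed.
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

  bool Empty() const { return tasks_.empty(); }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Simulates "read the first block, then check the format" as two chained
// tasks, and records the order in which the steps actually ran.
std::vector<std::string> InspectOnCallingThread() {
  CallingThreadExecutor executor;
  std::vector<std::string> log;

  executor.Spawn([&executor, &log] {
    log.push_back("read first block");
    // The CPU-side continuation is queued, not handed to another thread.
    executor.Spawn([&log] { log.push_back("check format"); });
  });

  log.push_back("caller starts draining");
  executor.RunUntilIdle();  // all CPU work happens right here
  assert(executor.Empty());
  log.push_back("caller has result");
  return log;
}
```

Calling `InspectOnCallingThread()` runs both steps inside `RunUntilIdle()` on the caller's own stack, so the caller does the work itself instead of blocking while another thread does it.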



--
This message was sent by Atlassian Jira
(v8.3.4#803005)