You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@arrow.apache.org by "Weston Pace (Jira)" <ji...@apache.org> on 2021/04/27 19:40:00 UTC

[jira] [Commented] (ARROW-12560) [C++] Investigate utilizing aggressive thread creation when adding callback to finished future.

    [ https://issues.apache.org/jira/browse/ARROW-12560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333511#comment-17333511 ] 

Weston Pace commented on ARROW-12560:
-------------------------------------

The problem is that we are using the CPU thread pool.  So, imagine for a moment that we are able to parse/decode CSV blocks in parallel (this may indeed be possible someday so hopefully it isn't too hard to imagine).

Also imagine that we have some fixed block readahead of 10.

Then we have something like the following (this is just rough pseudocode, we can assume ParseAndDecode is dropping the result in a vector somewhere and we aren't checking errors)...
{code:java}
for (int i = 0; i < 10; i++)
{
    source().Then(block => ParseAndDecode(block));
}
{code}
We want `source()` to be a background generator (reading along on its own I/O thread).  We want `ParseAndDecode` to run on the CPU thread.

Everything works fine if the I/O is slower than `ParseAndDecode`.  Every call to `source()` returns an unfinished future and, when it is finished, we will transfer to the CPU pool (creating a new thread task) and run the thread task.  So there will be 10 thread tasks for 10 blocks and some of those thread tasks may run in parallel (which is what we are after with parallel readahead here).

On the other hand, if I/O is faster than `ParseAndDecode` then calls to `source()` will start returning finished futures (the call to `Transfer` would have done nothing because the future was already finished).  There is no need to transfer, we are already on the CPU thread pool.  However, no thread task is created.  The entire operation runs serially as a single thread task.  In this case we want to create a new thread task to do our CPU work.  The cost of `ParseAndDecode` is expensive so we know we aren't creating too many thread tasks.  The main thread can then carry on and issue the next call to `source()`.

> [C++] Investigate utilizing aggressive thread creation when adding callback to finished future.
> -----------------------------------------------------------------------------------------------
>
>                 Key: ARROW-12560
>                 URL: https://issues.apache.org/jira/browse/ARROW-12560
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Assignee: Weston Pace
>            Priority: Major
>              Labels: async-util
>
> Imagine there is a slow map function (that could run in parallel) and a vector generator given a long vector of tasks.  If we apply map to the generator and then readahead we won't actually get any parallelism because the vector generator returns everything synchronously and so no thread task will ever be submitted.
> This hypothetical situation is a reality in some situations in the scanner.  For example, if scanning CSV files and the CPU threads fall behind the I/O threads then all callbacks will be synchronous (since the futures will already have been completed by the I/O threads).
> In such a situation we might benefit from creating a new thread task even though we wouldn't normally create one.  For example, if we have an idle core.  You can think of this as an analogue of work stealing.
> On the other hand, creating new thread tasks at any random callback might not be the most efficient. We could mitigate this by marking a callback as "potentially long" as some kind of hint when we add the callback to indicate it as eligible for eager thread creation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)