Posted to jira@arrow.apache.org by "Weston Pace (Jira)" <ji...@apache.org> on 2022/04/21 07:31:00 UTC

[jira] [Updated] (ARROW-16246) [C++] Add backpressure to hash-join node

     [ https://issues.apache.org/jira/browse/ARROW-16246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Weston Pace updated ARROW-16246:
--------------------------------
    Description: 
There are three possible elements to this:

First, if the sink is slow, the hash-join node needs to pause emitting batches (after the hash table has been built; only applicable if we are doing spillover).

Second, if the left input to the hash-join node is slow, we may want to pause reading from the right input while we wait for the left input to catch up (I may have this backwards). This is useful even if we don't have spillover.

Finally, if we are busy spilling to disk, we may need to ask the source to pause (not clear if this will be needed or not).

  was:
There are two elements to this:

First, if the sink is slow, the hash-join node needs to pause emitting batches (after the hash table has been built; only applicable if we are doing spillover).

Second, if the left input to the hash-join node is slow, we may want to pause reading from the right input while we wait for the left input to catch up (I may have this backwards). This is useful even if we don't have spillover.


> [C++] Add backpressure to hash-join node
> ----------------------------------------
>
>                 Key: ARROW-16246
>                 URL: https://issues.apache.org/jira/browse/ARROW-16246
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Weston Pace
>            Priority: Major
>
> There are three possible elements to this:
> First, if the sink is slow, the hash-join node needs to pause emitting batches (after the hash table has been built; only applicable if we are doing spillover).
> Second, if the left input to the hash-join node is slow, we may want to pause reading from the right input while we wait for the left input to catch up (I may have this backwards). This is useful even if we don't have spillover.
> Finally, if we are busy spilling to disk, we may need to ask the source to pause (not clear if this will be needed or not).
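
The first element (pause emitting batches while the sink is slow) could be sketched as a small threshold counter. This is only an illustration under stated assumptions, not Arrow's implementation: `BackpressureGate`, its two thresholds, and the pause/resume callbacks are hypothetical names introduced here, and the sketch is single-threaded, so a real exec node would need synchronization around the counter.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper (not part of Arrow): counts batches queued toward a
// slow consumer and fires pause/resume callbacks when thresholds are crossed.
class BackpressureGate {
 public:
  BackpressureGate(std::size_t pause_above, std::size_t resume_below,
                   std::function<void()> pause, std::function<void()> resume)
      : pause_above_(pause_above),
        resume_below_(resume_below),
        pause_(std::move(pause)),
        resume_(std::move(resume)) {
    // The gap between the two thresholds avoids rapid pause/resume toggling.
    assert(resume_below_ <= pause_above_);
  }

  // Call when a batch has been handed to the consumer's queue.
  void OnBatchQueued() {
    ++queued_;
    if (!paused_ && queued_ > pause_above_) {
      paused_ = true;
      pause_();  // ask the producer to stop emitting
    }
  }

  // Call when the consumer has finished processing one batch.
  void OnBatchConsumed() {
    assert(queued_ > 0);
    --queued_;
    if (paused_ && queued_ < resume_below_) {
      paused_ = false;
      resume_();  // the queue has drained enough; emitting may continue
    }
  }

  bool paused() const { return paused_; }
  std::size_t queued() const { return queued_; }

 private:
  std::size_t pause_above_;
  std::size_t resume_below_;
  std::function<void()> pause_;
  std::function<void()> resume_;
  std::size_t queued_ = 0;
  bool paused_ = false;
};
```

The same gate could in principle sit in front of a spill writer for the third element, with the pause callback forwarded to the source, though whether that is needed is left open above.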


