Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2018/06/12 20:17:00 UTC

[jira] [Commented] (ARROW-501) [C++] Implement concurrent / buffering InputStream for streaming data use cases

    [ https://issues.apache.org/jira/browse/ARROW-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510145#comment-16510145 ] 

Wes McKinney commented on ARROW-501:
------------------------------------

I've been thinking about this lately. I would like to have a "Spooler" abstraction that performs IO or other concurrent work while another thread is doing something else. For example:

{code}
const int buffer_size = 1 << 20;
InputStreamSpooler spooler(stream, buffer_size);

// This will start reading the first chunk from stream in a background thread
spooler.Start();

std::shared_ptr<Buffer> chunk;

while (!spooler.is_finished()) {
  // This blocks until the chunk is ready, returns it, then begins reading the next chunk right away
  RETURN_NOT_OK(spooler.Next(&chunk));
  // Do something with chunk
}

spooler.Stop();
{code}

Are there concurrency primitives in libraries like facebook/folly that might help with this type of background work?
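
For comparison, the standard library alone may be enough for a first prototype. Below is a minimal sketch of the spooler idea using std::async and std::future. Everything in it is an assumption for illustration, not an existing Arrow API: the Spooler class, the ReadFn callback standing in for an InputStream, std::string standing in for Buffer, and a bool return from Next in place of a Status plus is_finished().

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>

// Hypothetical stand-in for an InputStream read: returns up to nbytes,
// or an empty string once the stream is exhausted.
using ReadFn = std::function<std::string(std::size_t)>;

// Hypothetical spooler: keeps exactly one read in flight on a background
// thread while the caller processes the previous chunk.
class Spooler {
 public:
  Spooler(ReadFn read, std::size_t buffer_size)
      : read_(std::move(read)), buffer_size_(buffer_size) {
    assert(buffer_size_ > 0);
  }

  // Begin reading the first chunk in the background.
  void Start() { Prefetch(); }

  // Block until the pending chunk is ready, hand it back, and immediately
  // start reading the next one. Returns false at end of stream or if no
  // read is pending.
  bool Next(std::string* chunk) {
    if (!pending_.valid()) return false;
    *chunk = pending_.get();  // get() leaves pending_ invalid
    if (chunk->empty()) return false;
    Prefetch();
    return true;
  }

  // Wait for the in-flight read, if any, and discard its result.
  void Stop() {
    if (pending_.valid()) {
      pending_.wait();
      pending_ = std::future<std::string>();
    }
  }

 private:
  void Prefetch() {
    // std::launch::async forces the read onto another thread.
    pending_ = std::async(std::launch::async, read_, buffer_size_);
  }

  ReadFn read_;
  std::size_t buffer_size_;
  std::future<std::string> pending_;
};
```

This only ever buffers one chunk ahead, and std::async may start a new thread per read. A production version would more likely use a long-lived worker thread and a bounded queue so that several chunks can be buffered up to a maximum size.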

> [C++] Implement concurrent / buffering InputStream for streaming data use cases
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-501
>                 URL: https://issues.apache.org/jira/browse/ARROW-501
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 0.11.0
>
>
> Related to ARROW-500, when processing an input data stream, we may wish to continue buffering input (up to a maximum buffer size) in between synchronous Read calls.


