You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/08/25 17:30:53 UTC

[GitHub] [arrow] pitrou opened a new pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

pitrou opened a new pull request #8052:
URL: https://github.com/apache/arrow/pull/8052


   The goal is to have a standardized ABI to communicate streams of homogeneous arrays or record batches (for example for database result sets).
   
   The trickiest part is error reporting.  This proposal tries to strike a compromise between simplicity (an integer error code mapping to errno values) and expressivity (an optional description string for application-specific and context-specific details).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] hannesmuehleisen commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
hannesmuehleisen commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696276127






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-680168437


   https://issues.apache.org/jira/browse/ARROW-9761


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] hannesmuehleisen commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
hannesmuehleisen commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696276127


   One thing we noticed: There is no way to `rewind` such a stream, but we would really like to be able to do so. For example, if someone defines an Arrow-backed SQL VIEW, it can be scanned multiple times. Once it has been read once, it is currently invalidated and cannot be used again. Would it be possible to add a callback for that? If not supported, it could be set to `NULL`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477293832



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.

Review comment:
       I don't how to phrase it: it returns value that are `errno` error codes (in case of error). A number of values are standard in C++: https://en.cppreference.com/w/cpp/error/errno_macros
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696357318


   Another thing that occurred to me is whether we want to enable batch-level metadata (which would be implementation-defined). This is supported in Flight for example 
   
   https://github.com/apache/arrow/blob/master/format/Flight.proto#L316


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696286568


   I think JDBC might in some cases allow for "scrolling" type of rewind vs restart (not saying this should support it but it is worth mentioning.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-680876865


   However, as said above, perhaps returning `-1` (rather than `0` for success or a positive `errno`-like value for errors) could mean an end of stream. Thoughts?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-702018712


   I don't think it makes sense to continue hesitating on the API. I think I'm going to merge the PR as-is. The interface is marked experimental so can still evolve (or even be removed).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou closed pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #8052:
URL: https://github.com/apache/arrow/pull/8052


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477023156



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
+
+  // Callback to get optional detailed error information.
+  // This must only be called if the last stream operation failed
+  // with a non-0 return code.  The returned pointer is only valid until
+  // the next operation on this stream (including release).
+  // If unavailable, NULL is returned.
+  const char* (*get_last_error)(struct ArrowArrayStream*);
+
+  // Release callback: release the stream's own resources.
+  // Note that arrays returned by `get_next` must be individually released.

Review comment:
       are arrays produced by this stream tied to the lifecycle of this stream (must they be released first?)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-689996230


   > Rebased. Does someone want to review the doc at https://pitrou.net/arrowdevdoc/format/CStreamInterface.html ?
   
   I took a quick look and seemed reasonable (nothing jumped out at me that needed changes).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696303007


   Rewinding doesn't strike me as something which needs to be part of the C stream protocol. APIs can still provide rewind and other semantics while using a simple-as-possible stream as a building block.
   
   In the case of a SQL view which can be viewed multiple times for example, let the SQL view need not be a stream. It could instead be a function returning streams (each beginning at the start of the view):
   ```diff
   -ArrowArrayStream* MakeSqlView(...);
   +SqlView* MakeSqlView(...);
   +ArrowArrayStream* GetStream(SqlView*);
   ```
   
   Rewind, scrolling, offsets, reverse iteration, etc can all be accommodated in this fashion so IMHO they don't belong in the protocol.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477023405



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)

Review comment:
       `released` is something defined in the ArrowArray-spec.  Is it stronger or weaker guarantee then returning a nullptr here?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696357318


   Another thing that occurred to me is whether we want to enable batch-level metadata (which would be implementation-defined). This is supported in Flight for example 
   
   https://github.com/apache/arrow/blob/master/format/Flight.proto#L316


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696286568






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-689575836


   Rebased. Does someone want to review the doc at https://pitrou.net/arrowdevdoc/format/CStreamInterface.html ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-680762232


   It's possible to have some chunks with 0 length in a stream, yes. I don't see any reason to forbid it in this API (and such corner cases are often annoying to deal with).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou edited a comment on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou edited a comment on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696276591


   Would `rewind` go back to the start of stream always?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
emkornfield commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696288457


   Starting simple sounds good to me.  At least for JDBC, I don't think it is required for driver implementations to support scrolling.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696183147


   Rebased again. We can probably merge soon, if CI agrees.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696276591


   Does `rewing` go back to the start of stream always?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477022712



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
+
+  // Callback to get optional detailed error information.
+  // This must only be called if the last stream operation failed
+  // with a non-0 return code.  The returned pointer is only valid until
+  // the next operation on this stream (including release).
+  // If unavailable, NULL is returned.
+  const char* (*get_last_error)(struct ArrowArrayStream*);
+
+  // Release callback: release the stream's own resources.
+  // Note that arrays returned by `get_next` must be individually released.
+  void (*release)(struct ArrowArrayStream*);
+  // Opaque producer-specific data

Review comment:
       nit ntew line here or remove the new line above?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-680167277


   cc @wesm 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477293832



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.

Review comment:
       I don't know how to phrase it: it returns value that are `errno` error codes (in case of error). A number of values are standard in C++: https://en.cppreference.com/w/cpp/error/errno_macros
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696287452


   Each environment (Python, C++, JDBC, etc.) has its own feature set for streams / iterators, which doesn't make our task easy :-/


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] bkietz commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
bkietz commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696303007


   Rewinding doesn't strike me as something which needs to be part of the C stream protocol. APIs can still provide rewind and other semantics while using a simple-as-possible stream as a building block.
   
   In the case of a SQL view which can be viewed multiple times for example, let the SQL view need not be a stream. It could instead be a function returning streams (each beginning at the start of the view):
   ```diff
   -ArrowArrayStream* MakeSqlView(...);
   +SqlView* MakeSqlView(...);
   +ArrowArrayStream* GetStream(SqlView*);
   ```
   
   Rewind, scrolling, offsets, reverse iteration, etc can all be accommodated in this fashion so IMHO they don't belong in the protocol.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696183147






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-684920129


   I added some docs, please take a look.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] emkornfield commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
emkornfield commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477022487



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.

Review comment:
       silly question is `errno`-compatible a well defined unix/windows term? someplace where I can read more about it?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] hannesmuehleisen commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
hannesmuehleisen commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696277621


   > Would `rewind` go back to the start of stream always?
   
   I would think so, no? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou edited a comment on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou edited a comment on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696276591


   Would `rewind` go back to the start of stream always?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477295669



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)

Review comment:
       A nullptr cannot be returned. The callback returns an `int`. However, we could say that returning `-1` means end of stream.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C stream inferface

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-696282466


   Ok, I just wanted to make sure you weren't looking to go back a number of rows given as argument :-)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#discussion_r477294326



##########
File path: cpp/src/arrow/c/abi.h
##########
@@ -60,6 +60,31 @@ struct ArrowArray {
   void* private_data;
 };
 
+// EXPERIMENTAL
+struct ArrowArrayStream {
+  // Callback to get the stream type
+  // (will be the same for all arrays in the stream).
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
+  // Callback to get the next array
+  // (if no error and the array is released, the stream has ended)
+  // Return value: 0 if successful, an `errno`-compatible error code otherwise.
+  int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
+
+  // Callback to get optional detailed error information.
+  // This must only be called if the last stream operation failed
+  // with a non-0 return code.  The returned pointer is only valid until
+  // the next operation on this stream (including release).
+  // If unavailable, NULL is returned.
+  const char* (*get_last_error)(struct ArrowArrayStream*);
+
+  // Release callback: release the stream's own resources.
+  // Note that arrays returned by `get_next` must be individually released.

Review comment:
       Hmm... I'd say no.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] hannesmuehleisen commented on pull request #8052: ARROW-9761: [C/C++] Add experimental C ArrowArrayStream ABI

Posted by GitBox <gi...@apache.org>.
hannesmuehleisen commented on pull request #8052:
URL: https://github.com/apache/arrow/pull/8052#issuecomment-680673004


   I have one suggestion: The proposal indicates that the stream is finished when the array resulting from get_next is released. This seems a bit odd, how about just setting its length to 0? Or is it possible in the stream API that individual chunks are of length 0 but subsequent chunks are not?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org