Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/08 19:09:54 UTC

[GitHub] [arrow] GavinRay97 commented on a change in pull request #12571: RFC: Add inlined data to flight.

GavinRay97 commented on a change in pull request #12571:
URL: https://github.com/apache/arrow/pull/12571#discussion_r821983337



##########
File path: format/Flight.proto
##########
@@ -301,6 +301,33 @@ message Location {
  */
 message Ticket {
   bytes ticket = 1;
+  // Data representing some part of the data retrievable by the ticket.
+  //
+  // `inlined_completeness` indicates what part of the data retriavable
+  // by the ticket this represents. This is provided as an optimization for
+  // client/server applications that want to reduce latency to first result
+  // without requiring another RPC round-trip to retrieve the ticket.  applications
+  // built on top of Flight are responsible for any negotiation necessary on whether
+  // inlining data is appropriate.
+  //
+  // The size of inlined_data is expected to be small (typically less then 1MB) 
+  // and inlining too much data across tickets can run into underlying transport 
+  // limitations.  Furthermore, since the data is expected to be small, implementations 
+  // are less likely to optimize for zero-copy in these cases.  
+  repeated FlightData inlined_data = 2;
+  enum InlinedCompleteness {
+    // Default is no data is inlined.  An UNDEFINED value is not provided because an enum
+    // for the data that the client isn't aware of makes the data unusable.
+    NO_INLINED_DATA = 0;
+    // The data present in inlined_data represents all data
+    // present for the ticket.
+    COMPLETE_DATA = 1;
+    // The data present for inlined_data represents only a partial
+    // sample of the data available for the ticket.  No guarantees on the 
+    // ordering are provided.
+    SAMPLE_DATA = 2;

Review comment:
       What would be a use case for `SAMPLE_DATA`?
   
   I'm thinking that something like `LIMIT` would probably come as part of the query, so for example with `LIMIT 10` the result would be `COMPLETE_DATA`, with the limit applied on the data side.
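
To make the `LIMIT` point concrete, here is a minimal Python sketch. It is not Flight API: `Ticket` is a hypothetical stand-in for the proposed message (plain dict rows replace `FlightData`), and `make_ticket`, `needs_do_get`, and `max_inline_rows` are names invented for illustration. It assumes the limit is applied before the inlining decision, so a limited result that fits is reported as `COMPLETE_DATA` and `SAMPLE_DATA` is never needed for that case.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class InlinedCompleteness(IntEnum):
    # Mirrors the enum values proposed in the diff above.
    NO_INLINED_DATA = 0
    COMPLETE_DATA = 1
    SAMPLE_DATA = 2


@dataclass
class Ticket:
    # Hypothetical stand-in for the proposed Ticket message.
    ticket: bytes
    inlined_data: List[dict] = field(default_factory=list)
    inlined_completeness: InlinedCompleteness = InlinedCompleteness.NO_INLINED_DATA


def make_ticket(rows: List[dict], limit: Optional[int] = None,
                max_inline_rows: int = 100) -> Ticket:
    """Hypothetical data-side helper: apply LIMIT, then decide whether to inline."""
    result = rows if limit is None else rows[:limit]
    if len(result) <= max_inline_rows:
        # The limited result is the whole answer to the query, so it is complete.
        return Ticket(b"opaque", list(result), InlinedCompleteness.COMPLETE_DATA)
    # Too large to inline: the client has to fetch the data with the ticket.
    return Ticket(b"opaque")


def needs_do_get(ticket: Ticket) -> bool:
    """Client side: anything other than COMPLETE_DATA still requires a fetch."""
    return ticket.inlined_completeness != InlinedCompleteness.COMPLETE_DATA
```

With 1000 rows and `limit=10`, `make_ticket` inlines the ten rows as `COMPLETE_DATA` and the client can skip the extra round trip; without the limit, nothing is inlined and the client fetches as usual.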
   
   



