You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "alamb (via GitHub)" <gi...@apache.org> on 2023/04/22 11:40:00 UTC

[GitHub] [arrow-rs] alamb commented on a diff in pull request #4101: feat: set FlightDescriptor on FlightDataEncoderBuilder

alamb commented on code in PR #4101:
URL: https://github.com/apache/arrow-rs/pull/4101#discussion_r1174386562


##########
arrow-flight/src/encode.rs:
##########
@@ -134,6 +137,14 @@ impl FlightDataEncoderBuilder {
         self
     }
 
+    pub fn with_flight_descriptor(

Review Comment:
   Can you please add doc strings to explain how this function works, to keep in line with the rest of the functions on this struct?
   
   Specifically that it is sent on the first FlightData as described in https://arrow.apache.org/docs/format/Flight.html#uploading-data



##########
arrow-flight/src/encode.rs:
##########
@@ -233,6 +256,24 @@ impl FlightDataEncoder {
         schema
     }
 
+    /// Encodes descriptor as a [`FlightData`] in self.queue.
+    /// Updates `self.descriptor` and returns the new descriptor
+    fn encode_descriptor(&mut self, descriptor: &FlightDescriptor) -> FlightDescriptor {
+        // The first message is the descriptor message, and all
+        // batches have the same descriptor
+        let descriptor = descriptor.clone();
+        let descriptor_flight_data = FlightData::new(
+            Some(descriptor.clone()),
+            IpcMessage(Bytes::new()),
+            vec![],
+            vec![],
+        );
+        self.queue_message(descriptor_flight_data);
+        // remember descriptor
+        self.descriptor = Some(descriptor.clone());

Review Comment:
   Why is the descriptor remembered?



##########
arrow-flight/src/encode.rs:
##########
@@ -194,12 +210,19 @@ impl FlightDataEncoder {
             app_metadata: Some(app_metadata),
             queue: VecDeque::new(),
             done: false,
+            descriptor: None,
         };
 
         // If schema is known up front, enqueue it immediately
         if let Some(schema) = schema {
             encoder.encode_schema(&schema);
         }
+
+        // If descriptor is known up front, enqueue it immediately
+        if let Some(descriptor) = descriptor {

Review Comment:
   As I understand the intent, the idea is that the descriptor is set on the first message that is sent this code will potentially send a separate message after the schema message, if the schema is known up front.
   
   Perhaps you could check  `self.descriptor` in `enqueue_messages` and attach the descriptor there



##########
arrow-flight/src/encode.rs:
##########
@@ -176,6 +189,8 @@ pub struct FlightDataEncoder {
     queue: VecDeque<FlightData>,
     /// Is this stream done (inner is empty or errored)
     done: bool,
+    /// descriptor, set after the first batch

Review Comment:
   What does "set after the first batch" mean? 
   
   I guess I expect this to be something more like "cleared after the first FlightData message"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org