You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/19 14:57:21 UTC

[GitHub] [arrow] lidavidm opened a new pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

lidavidm opened a new pull request #7224:
URL: https://github.com/apache/arrow/pull/7224


   This fixes a few things:
   - Sending/receiving text headers in Java
   - Iterating over all headers in Java (binary ones used to be filtered out)
   - Receiving binary headers in Python


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on a change in pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
pitrou commented on a change in pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#discussion_r428602803



##########
File path: python/pyarrow/_flight.pyx
##########
@@ -1785,15 +1786,16 @@ cdef dict convert_headers(const CCallHeaders& c_headers):
         CCallHeaders.const_iterator header_iter = c_headers.cbegin()
     headers = {}
     while header_iter != c_headers.cend():
-        # Headers in gRPC (and HTTP/1, HTTP/2) are required to be
-        # valid ASCII.
         header = c_string(deref(header_iter).first).decode("ascii")
+        if header not in headers:
+            headers[header] = []
+        value = c_string(deref(header_iter).second)
         if not header.endswith("-bin"):
-            # Ignore -bin (gRPC binary) headers
-            value = c_string(deref(header_iter).second).decode("ascii")
-            if header not in headers:
-                headers[header] = []
-            headers[header].append(value)
+            # Text header values in gRPC (and HTTP/1, HTTP/2) are
+            # required to be valid ASCII. Binary header values are
+            # exposed as bytes.
+            value = value.decode("ascii")
+        headers[header].append(value)

Review comment:
       `headers.setdefault(header, []).append(value)` automatically creates the empty list entry for you.

##########
File path: python/pyarrow/tests/test_flight.py
##########
@@ -483,6 +495,51 @@ def sending_headers(self):
         }
 
 
+class MultiHeaderClientMiddlewareFactory(ClientMiddlewareFactory):
+    """Test sending/receiving multiple (binary-valued) headers."""
+
+    def __init__(self):
+        self.last_headers = {}

Review comment:
       This isn't ever used, right?

##########
File path: python/pyarrow/tests/test_flight.py
##########
@@ -452,6 +453,17 @@ def do_action(self, context, action):
         return iter([flight.Result(b"")])
 
 
+class MultiHeaderFlightServer(FlightServerBase):
+    """Test sending/receiving multiple (binary-valued) headers."""
+
+    def do_action(self, context, action):
+        middleware = context.get_middleware("test")
+        if middleware:

Review comment:
       Is this condition useful?

##########
File path: python/pyarrow/tests/test_flight.py
##########
@@ -452,6 +453,17 @@ def do_action(self, context, action):
         return iter([flight.Result(b"")])
 
 
+class MultiHeaderFlightServer(FlightServerBase):
+    """Test sending/receiving multiple (binary-valued) headers."""
+
+    def do_action(self, context, action):
+        middleware = context.get_middleware("test")
+        if middleware:
+            headers = repr(middleware.client_headers).encode("utf-8")
+            return iter([flight.Result(headers)])

Review comment:
       Is `iter` required?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#issuecomment-630880344


   https://issues.apache.org/jira/browse/ARROW-8858


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#issuecomment-641961534


   > LGTM. Do you need someone to review the Java changes?
   > Also, can you rebase to try and fix the AppVeyor failure?
   
   Thanks! Rebased. Ryan already looked at the Java changes, and he's previously worked on this part of Flight.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] rymurr commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
rymurr commented on pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#issuecomment-631511937


   Hey @lidavidm, you are correct. I don't love the duplicate methods but its the easier than using the gRPC style or constantly switching between byte[]/String.
   
   I forgot the lifetime of the Metadata object was so short. The two usecases for metadata are quite different (error reporting vs middleware) so it doesn't make sense to mash them together. Thinking out loud withdrawn ;-)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#issuecomment-641603882


   Rebased to fix conflicts.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#issuecomment-631428845


   I could take a different approach and only expose bytes, but internally check if there is/isn't the "-bin" suffix and use the appropriate marshaller - does that sound better? It would be somewhat inefficient, as the user would convert strings to bytes only to have us convert them back to strings. 
   
   Unfortunately gRPC enforces that string keys must be read/set as strings and bytes keys must be read/set as bytes. It doesn't literally duplicate methods, but it effectively does by using generic parameters/typed key objects.
   
   `ErrorFlightMetadata` is different - a gRPC `Metadata` object actually has an implicit lifetime (you can't use it outside of the interceptor method where you received it), so we can't just wrap `Metadata`. And on the other hand, we _do_ want to wrap `Metadata` in middleware because we'd like to mutate the underlying object.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#issuecomment-630877429


   CC @rymurr, if you have time to take a look that would be much appreciated (since you've also worked on the metadata handling). grpc-java is very strict about using the proper metadata marshaller, so I undeprecated the String overloads of the relevant methods, or else we wouldn't be able to handle non-binary headers (e.g. user-agent, authorization).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#discussion_r428637185



##########
File path: python/pyarrow/tests/test_flight.py
##########
@@ -452,6 +453,17 @@ def do_action(self, context, action):
         return iter([flight.Result(b"")])
 
 
+class MultiHeaderFlightServer(FlightServerBase):
+    """Test sending/receiving multiple (binary-valued) headers."""
+
+    def do_action(self, context, action):
+        middleware = context.get_middleware("test")
+        if middleware:
+            headers = repr(middleware.client_headers).encode("utf-8")
+            return iter([flight.Result(headers)])

Review comment:
       It was before but I've changed it now so that it's not.

##########
File path: python/pyarrow/tests/test_flight.py
##########
@@ -452,6 +453,17 @@ def do_action(self, context, action):
         return iter([flight.Result(b"")])
 
 
+class MultiHeaderFlightServer(FlightServerBase):
+    """Test sending/receiving multiple (binary-valued) headers."""
+
+    def do_action(self, context, action):
+        middleware = context.get_middleware("test")
+        if middleware:

Review comment:
       It's not useful, thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#discussion_r428637508



##########
File path: python/pyarrow/tests/test_flight.py
##########
@@ -483,6 +495,51 @@ def sending_headers(self):
         }
 
 
+class MultiHeaderClientMiddlewareFactory(ClientMiddlewareFactory):
+    """Test sending/receiving multiple (binary-valued) headers."""
+
+    def __init__(self):
+        self.last_headers = {}

Review comment:
       It is used, but it is unclear, so I've added an explanation of where and how it's used.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#issuecomment-637026817


   @pitrou following up, any other comments here? Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] nealrichardson closed pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
nealrichardson closed pull request #7224:
URL: https://github.com/apache/arrow/pull/7224


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] lidavidm commented on a change in pull request #7224: ARROW-8858: [FlightRPC] ensure binary/multi-valued headers are properly exposed

Posted by GitBox <gi...@apache.org>.
lidavidm commented on a change in pull request #7224:
URL: https://github.com/apache/arrow/pull/7224#discussion_r428637604



##########
File path: python/pyarrow/_flight.pyx
##########
@@ -1785,15 +1786,16 @@ cdef dict convert_headers(const CCallHeaders& c_headers):
         CCallHeaders.const_iterator header_iter = c_headers.cbegin()
     headers = {}
     while header_iter != c_headers.cend():
-        # Headers in gRPC (and HTTP/1, HTTP/2) are required to be
-        # valid ASCII.
         header = c_string(deref(header_iter).first).decode("ascii")
+        if header not in headers:
+            headers[header] = []
+        value = c_string(deref(header_iter).second)
         if not header.endswith("-bin"):
-            # Ignore -bin (gRPC binary) headers
-            value = c_string(deref(header_iter).second).decode("ascii")
-            if header not in headers:
-                headers[header] = []
-            headers[header].append(value)
+            # Text header values in gRPC (and HTTP/1, HTTP/2) are
+            # required to be valid ASCII. Binary header values are
+            # exposed as bytes.
+            value = value.decode("ascii")
+        headers[header].append(value)

Review comment:
       Good catch, thank you!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org