You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@arrow.apache.org by "aiguofer (via GitHub)" <gi...@apache.org> on 2023/05/30 20:26:08 UTC

[GitHub] [arrow-adbc] aiguofer opened a new issue, #715: python(go?): Error when using TLS with FlightSQL ADBC driver

aiguofer opened a new issue, #715:
URL: https://github.com/apache/arrow-adbc/issues/715

   I've been trying to do some testing using the Python ADBC driver against our FlightSQL service and I've run into issues when TLS is enabled. If I disable TLS and test locally it works as expected, but with TLS enabled I get some very strange errors.
   
   If I try to hit our service in Prod, where the TLS handshake happens in our load balancer, this is the error on the client:
   
   ```
   Traceback (most recent call last):
     File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 49, in <module>
       main(sys.argv)
     File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 40, in main
       print(conn.adbc_get_info())
             ^^^^^^^^^^^^^^^^^^^^
     File "/Users/aiguofer/Library/Application Support/hatch/env/virtual/afs-example/vPTxZbV9/afs-example/lib/python3.11/site-packages/adbc_driver_manager/dbapi.py", line 348, in adbc_get_info
       handle = self._conn.get_info()
                ^^^^^^^^^^^^^^^^^^^^^
     File "adbc_driver_manager/_lib.pyx", line 688, in adbc_driver_manager._lib.AdbcConnection.get_info
     File "adbc_driver_manager/_lib.pyx", line 385, in adbc_driver_manager._lib.check_error
   adbc_driver_manager._lib.ProgrammingError: ADBC_STATUS_UNAUTHENTICATED (13): [FlightSQL] rpc error: code = Unauthenticated desc = Invalid credentials
   ```
   
   However, when I look in DataDog I see the request and the correct response with a status `0`. What's strange is that the `Invalid credentials` error is one that we actually do send back from the `authenticate` function, but I don't see any traces or logs in DataDog returning this error when I test this.
   
   I tried to debug this locally by enabling TLS with a self signed cert. When I try this, the logs in my service seem like everything worked fine:
   
   ```
   13:54:15.485 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[:status: 200, content-type: application/grpc, grpc-encoding: identity, grpc-accept-encoding: gzip] padding=0 endStream=false | MDC:  | extra: 
   13:54:15.485 [flight-server-default-executor-1] INFO  c.d.s.logging.LoggingMiddleware Call completed for method='GET_FLIGHT_INFO' | extra: {duration=2, callStatus={code=OK, cause=null, description=, metadata=org.apache.arrow.flight.ErrorFlightMetadata@12087791}}
   13:54:15.485 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND DATA: streamId=3 padding=0 endStream=false length=1143 bytes=00000004720aa007ffffffff980300001000000000000a000e0006000d0008000a000000000004001000000000010a000c000000080004000a00000008000000... | MDC:  | extra: 
   13:54:15.485 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[grpc-status: 0] padding=0 endStream=true | MDC:  | extra: 
   13:54:15.486 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] INBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=1143 | MDC:  | extra: 
   13:54:15.486 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] INBOUND PING: ack=false bytes=145258749040133895 | MDC:  | extra: 
   13:54:15.486 [grpc-nio-worker-ELG-3-3] DEBUG io.grpc.netty.NettyServerHandler [id: 0x86eb57a0, L:/127.0.0.1:9876 - R:/127.0.0.1:64926] OUTBOUND PING: ack=true bytes=145258749040133895 | MDC:  | extra: 
   ```
   
   But on the client side I get:
   
   ```
   Traceback (most recent call last):
     File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 52, in <module>
       main(sys.argv)
     File "/Users/aiguofer/work/dbtlabs/sl-example-clients/python/src/adbc_example.py", line 43, in main
       print(conn.adbc_get_info())
             ^^^^^^^^^^^^^^^^^^^^
     File "/Users/aiguofer/Library/Application Support/hatch/env/virtual/afs-example/vPTxZbV9/afs-example/lib/python3.11/site-packages/adbc_driver_manager/dbapi.py", line 348, in adbc_get_info
       handle = self._conn.get_info()
                ^^^^^^^^^^^^^^^^^^^^^
     File "adbc_driver_manager/_lib.pyx", line 688, in adbc_driver_manager._lib.AdbcConnection.get_info
     File "adbc_driver_manager/_lib.pyx", line 385, in adbc_driver_manager._lib.check_error
   adbc_driver_manager._lib.OperationalError: ADBC_STATUS_IO (10): [FlightSQL] rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: EOF"
   ``` 
   
   Here's the code for the example client, and it takes in a single argument which is a JDBC connection string like `jdbc:arrow-flight-sql://localhost:9876?customId=<id>&token=<token>`.
   
   ```
   import sys
   from dataclasses import dataclass
   from urllib.parse import parse_qs, urlparse
   
   from adbc_driver_flightsql import DatabaseOptions
   from adbc_driver_flightsql.dbapi import connect
   
   
   @dataclass
   class ConnAttr:
       host: str
       params: dict
       auth_header: str
   
   
   def parse_jdbc_uri(uri):
       parsed = urlparse(uri)
       params = {k.lower(): v[0] for k, v in parse_qs(parsed.query).items()}
       return ConnAttr(
           host=parsed.path.replace("arrow-flight-sql", "grpc")
           if params.pop("useencryption") == "false"
           else parsed.path.replace("arrow-flight-sql", "grpc+tls"),
           params=params,
           auth_header=f"Bearer {params.pop('token')}",
       )
   
   
   def main(args):
       conn_attr = parse_jdbc_uri(args[1])
       with connect(
           conn_attr.host,
           db_kwargs={
               DatabaseOptions.AUTHORIZATION_HEADER.value: conn_attr.auth_header,
               DatabaseOptions.TLS_SKIP_VERIFY.value: "true",
               **{
                   f"{DatabaseOptions.RPC_CALL_HEADER_PREFIX.value}{k}": v
                   for k, v in conn_attr.params.items()
               },
           },
       ) as conn:
           print(conn.adbc_get_info())
   
   
   if __name__ == "__main__":
       main(sys.argv)
   ```
   
   I feel like if I knew Go I might be able to try some debugging within the Go driver, but unfortunately I've got no experience with it. Any help on how I might further debug would be greatly appreciated!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] aiguofer commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "aiguofer (via GitHub)" <gi...@apache.org>.
aiguofer commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1569436236

   My initial guess was that it might be something related to the scheme, but it seems to establish the TLS connection just fine and the server handles the request correctly and responds correctly (based on logs). The issue seems to be while the client receives the response. Some logging would certainly help!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1570491592

   Unfortunately, yes. The JDBC driver is not compliant with the Flight SQL spec: https://github.com/apache/arrow/issues/34532


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1569320329

   Hmm, the Go driver needs to include more diagnostic info about which server it's talking to.
   
   But one off-the-top-of-my-head guess is that the FlightInfo might be using a `grpc://` scheme instead of `grpc+tls://` (or vice versa)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1570136354

   There's two requests. The initial connection may be fine but when the client goes to fetch the data, it connects based on the info in the server's FlightInfo response. If the info in that response is using the wrong scheme you may see errors like this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1569321750

   Also, I agree the driver should make it easier to debug what's going on...There is an issue for adding more logging in https://github.com/apache/arrow-adbc/issues/492


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] lidavidm commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1570494086

   And yes, stuff like this is why I'd like to improve the logging and add more context in error messages so it's clear what failed in a potentially multi-request chain (e.g. with a prepared statement, you might see 5 requests...)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] aiguofer closed issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "aiguofer (via GitHub)" <gi...@apache.org>.
aiguofer closed issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver
URL: https://github.com/apache/arrow-adbc/issues/715


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] aiguofer commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "aiguofer (via GitHub)" <gi...@apache.org>.
aiguofer commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1570500985

   Ahhh that makes sense! Thanks for the help! Good thing for us we're just using the same server for now, but if I get some time I might try to fix that on the JDBC side. The only potential problem would be if servers aren't correctly sending the location back in FlightInfo requests this will cause clients to break, so it might need some sort of fallback logic if it detects that the main server is hosted elsewhere and the FlightInfo specifies a localhost endpoint.
   
   Closing this issue now that I know what's going on. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-adbc] aiguofer commented on issue #715: python(go?): Error when using TLS with FlightSQL ADBC driver

Posted by "aiguofer (via GitHub)" <gi...@apache.org>.
aiguofer commented on issue #715:
URL: https://github.com/apache/arrow-adbc/issues/715#issuecomment-1570480287

   Ahhh that was it! We were using the same Location (`Location.forGrpcInsecure("0.0.0.0", config.applicationPort)`) for both `FlightServer.builder` and for the `FlightEndpoint` lists that we send back for `FlightInfo` requests in the `FlightSqlProducer`. The reason I was seeing the invalid credentials error when hitting prod is because I had my local server running, so the `DO_GET` request was going to my local machine using the prod credentials.
   
   Interestingly enough, this works fine when using the JDBC driver.... does that mean that the JDBC driver is ignoring the flight endpoints provided by `GET_FLIGHT_INFO`? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org