Posted to issues@arrow.apache.org by "maxfirman (via GitHub)" <gi...@apache.org> on 2024/02/21 12:46:41 UTC

[I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

maxfirman opened a new issue, #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559

   ### What would you like help with?
   
   I'm attempting to query Dremio using the ADBC Flight SQL client, but my code hangs when attempting to return a result set.
   
   The following code snippet will reproduce the issue:
   
   ```python
   from adbc_driver_flightsql.dbapi import connect
   
   
   with connect(
       uri="grpc://localhost:32010",
       db_kwargs={"username": "dremio", "password": "dremio123"},
   ) as connection:
       with connection.cursor() as cursor:
           cursor.execute("select 1 as foo")
           result = cursor.fetchall()  # <- hangs indefinitely
           print(result)
   ```
   
   I have been testing against a local Dremio instance running in the standalone Docker image:
   
   ```
   docker run -p 9047:9047 -p 31010:31010 -p 32010:32010 -p 45678:45678 dremio/dremio-oss
   ```
   
   Note you will have to create an initial user through the Dremio UI: `http://localhost:9047`
   
   Interestingly, I can see that the query is being executed successfully when I look at the jobs page in the Dremio UI. However, any subsequent attempt to retrieve the data using `fetchall`, `fetchone`, `fetch_arrow_table`, etc. hangs indefinitely.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "maxfirman (via GitHub)" <gi...@apache.org>.
maxfirman closed issue #1559: [Python] Querying Dremio with the ADBC Flight SQL client
URL: https://github.com/apache/arrow-adbc/issues/1559




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957278240

   It's possible we should catch and handle this explicitly. There's also https://github.com/apache/arrow/pull/40084 which is supposed to explicitly do what I think Dremio is trying to implicitly do.




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "maxfirman (via GitHub)" <gi...@apache.org>.
maxfirman commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1956897704

   ![profile](https://github.com/apache/arrow-adbc/assets/15067739/42f64f98-f9a0-4764-9e43-2340424b9c36)
   py-spy profile.




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "maxfirman (via GitHub)" <gi...@apache.org>.
maxfirman commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957747767

   Thanks @lidavidm, that is definitely the issue. I look forward to testing the fix once Dremio 24.3.3 lands.




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "stevelorddremio (via GitHub)" <gi...@apache.org>.
stevelorddremio commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957606475

   > So, Dremio is telling the client to connect to "0.0.0.0":
   > 
   > ```
   > >>> partitions, schema = cur.adbc_execute_partitions("SELECT 1")
   > >>> info = pyarrow.flight.FlightInfo.deserialize(partitions[0])
   > >>> info.endpoints[0].locations
   > [<pyarrow.flight.Location b'grpc+tcp://0.0.0.0:32010'>]
   > ```
   > 
   > Is it possible that one machine routes this to localhost and the other (correctly?) drops this?
   
   Yes, the behaviour will be as you described, which will likely be invalid.
   This will be fixed post Dremio 24.3.3.




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "stevelorddremio (via GitHub)" <gi...@apache.org>.
stevelorddremio commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957861910

   @maxfirman correct. It will be the next release(s) after 24.3.3. 




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "maxfirman (via GitHub)" <gi...@apache.org>.
maxfirman commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1956830941

   Thanks @lidavidm. 
   
   Curiously, I've just tested it on my home machine (Fedora) and it returns data as expected. The issue only seems to exist on my work machine (Ubuntu 22). 
   
   As far as I can see I have the same versions of adbc and dremio installed. One other difference is that my work machine uses brew as the package manager and my home machine uses dnf.
   
   My instinct is that there is some network policy on my work machine that is blocking the connection, although I am able to authenticate and execute the query (it's just returning the results that hangs). I'm also able to query and retrieve results from Dremio on my work machine using the approach documented here: https://github.com/dremio-hub/arrow-flight-client-examples/blob/main/python/example.py#L33
   
   I will work on getting a stack trace using py-spy.
   
   




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1956735610

   ```
   Python 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import adbc_driver_flightsql.dbapi
   >>> conn = adbc_driver_flightsql.dbapi.connect("grpc://localhost:32010", db_kwargs={"username": "dremio", "password": "dremio123"})
   /home/lidavidm/temp/venv/lib/python3.11/site-packages/adbc_driver_manager/dbapi.py:307: Warning: Cannot disable autocommit; conn will not be DB-API 2.0 compliant
     warnings.warn(
   >>> cur = conn.cursor()
   >>> cur.execute("SELECT 1")
   >>> cur.fetchall()
   [(1,)]
   ```
   
   What version of ADBC, Python, PyArrow, Dremio, etc. are you using, and on what platform?
   
   Are you able to get a stack trace? (You can use [py-spy](https://github.com/benfred/py-spy))
   
   This is the version of Dremio I have:
   
   ```
   Build
       24.3.2-202401241821100032-d2d8a497
   Edition
       Community Edition
   Build Time
       01/24/2024 18:34:32
   Change Hash
       d2d8a49790d59599d617f25f6020731f0260178d
   Change Time
       01/24/2024 18:11:23
   ```




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "maxfirman (via GitHub)" <gi...@apache.org>.
maxfirman commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957825794

   @stevelorddremio just to clarify by "post Dremio 24.3.3", do you mean to say the fix is included in 24.3.3 or the next release after 24.3.3?
   
   I'm assuming the latter, as I can't see any reference to this bug fix in the [release notes](https://docs.dremio.com/current/release-notes/version-240-release#2433-release-notes-february-2024-enterprise), and I just tested against our cluster (running 24.3.3 Enterprise) and still see the same issue.
   




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-2009774823

   Thanks for the follow-up!




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957018493

   Oh, sorry, `py-spy` has a `dump` command that should just give you a stack trace (and can be run in a separate shell against an existing process). That would directly show us where it got stuck.




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957021285

   `--native` would also help to show the C extension stack traces (which would hopefully illuminate a bit more)




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "lidavidm (via GitHub)" <gi...@apache.org>.
lidavidm commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957272022

   So, Dremio is telling the client to connect to "0.0.0.0":
   
   ```
   >>> partitions, schema = cur.adbc_execute_partitions("SELECT 1")
   >>> info = pyarrow.flight.FlightInfo.deserialize(partitions[0])
   >>> info.endpoints[0].locations
   [<pyarrow.flight.Location b'grpc+tcp://0.0.0.0:32010'>]
   ```
   
   Is it possible that one machine routes this to localhost and the other (correctly?) drops this?
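
To make the symptom above easier to spot, here is a minimal sketch of a check for endpoint locations that advertise an unspecified (wildcard) address such as `0.0.0.0`. The helper name `is_wildcard_location` is hypothetical and not part of ADBC or PyArrow. It takes the location URI as a `str`, so a `bytes` URI like the one shown above would need decoding first.

```python
import ipaddress
from urllib.parse import urlsplit


def is_wildcard_location(uri: str) -> bool:
    """Return True if the URI's host is an unspecified address (0.0.0.0 or ::)."""
    host = urlsplit(uri).hostname
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Not an IP literal (for example a DNS name), so not a wildcard address.
        return False


# The location reported by Dremio in this thread, and a routable alternative.
for uri in ("grpc+tcp://0.0.0.0:32010", "grpc+tcp://localhost:32010"):
    print(uri, is_wildcard_location(uri))
```

A wildcard address is meaningful as a listen address on the server, but how a client treats it as a destination depends on the operating system and network setup, which would be consistent with the different behaviour seen on the two machines.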




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "maxfirman (via GitHub)" <gi...@apache.org>.
maxfirman commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1957254650

   Thanks. Unfortunately running the dump command requires sudo permissions, which I don't have. 
   
   I'm going to have to bother someone in our IT department to grant me temporary sudo permissions so I can get the trace dump. 
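
When attaching to the process is not permitted, one possible alternative is Python's standard-library `faulthandler` module, which runs inside the process and so needs no extra privileges. The sketch below is an assumption about how it could be applied here, not something tried in this thread. The wrapper name `run_with_stack_dump` is hypothetical, and `slow_call` is a stand-in for the hanging `cursor.fetchall()`. Unlike `py-spy dump --native`, this shows Python frames only.

```python
import faulthandler
import sys
import tempfile
import time


def run_with_stack_dump(fn, timeout_s, file=sys.stderr):
    """Call fn(); if it is still running after timeout_s seconds,
    write the stacks of all Python threads to `file` and keep waiting."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=file)
    try:
        return fn()
    finally:
        # Disarm the watchdog once the call has returned or raised.
        faulthandler.cancel_dump_traceback_later()


def slow_call():
    # Stand-in for a blocking call such as cursor.fetchall().
    time.sleep(0.5)
    return "done"


# Write the dump to a temporary file so it can be read back and inspected.
with tempfile.TemporaryFile(mode="w+") as log:
    result = run_with_stack_dump(slow_call, timeout_s=0.1, file=log)
    log.seek(0)
    dump = log.read()

print(result)
print(dump)
```

In real use the `file` argument could be left at its default so the dump goes to standard error while the call is still blocked.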




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "mgross-ebner (via GitHub)" <gi...@apache.org>.
mgross-ebner commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-1962349797

   @zeroshade Looks like the issue we discussed in https://github.com/apache/arrow/pull/40090 will be fixed in Dremio.




Re: [I] [Python] Querying Dremio with the ADBC Flight SQL client [arrow-adbc]

Posted by "maxfirman (via GitHub)" <gi...@apache.org>.
maxfirman commented on issue #1559:
URL: https://github.com/apache/arrow-adbc/issues/1559#issuecomment-2009647520

   I can confirm that the following now works against Dremio 24.3.4:
   ```python
   with connect(
           uri="grpc+tls://<dremio-host>:32010",
           db_kwargs={
               "username": "<username>",
               "password": "<password>",
           },
   ) as connection:
       with connection.cursor() as cursor:
           cursor.execute("select 1")
           result = cursor.fetchall()
           print(result)
   ```
   
   Prints:
   ```
   [(1,)]
   ```
   

