You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/09/29 11:32:30 UTC

[GitHub] [airflow] alexandermalyga opened a new issue, #26774: Trino and Presto hooks do not properly execute statements other than SELECT

alexandermalyga opened a new issue, #26774:
URL: https://github.com/apache/airflow/issues/26774

   ### Apache Airflow Provider(s)
   
   presto, trino
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-trino==4.0.1
   apache-airflow-providers-presto==4.0.1
   
   ### Apache Airflow version
   
   2.4.0
   
   ### Operating System
   
   macOS 12.6
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   When using the TrinoHook (PrestoHook also applies), only the `get_records()` and `get_first()` methods work as expected, the  `run()` and `insert_rows()` do not.
   The SQL statements sent by the problematic methods reach the database (visible in logs and UI), but they don't get executed.
   
   The issue is caused by the hook not making the required subsequent requests to the Trino HTTP endpoints after the first request. More info [here](https://trino.io/docs/current/develop/client-protocol.html#overview-of-query-processing):
   
   > If the JSON document returned by the POST to /v1/statement does not contain a nextUri link, the query has completed, either successfully or unsuccessfully, and no additional requests need to be made. If the nextUri link is present in the document, there are more query results to be fetched. The client should loop executing a GET request to the nextUri returned in the QueryResults response object until nextUri is absent from the response.
   
   SQL statements made by methods like `get_records()` do get executed because internally they call `fetchone()` or `fetchmany()` on the cursor, which do make the subsequent requests.
   
   ### What you think should happen instead
   
   The Hook is able to execute SQL statements other than SELECT.
   
   ### How to reproduce
   
   Connect to a Trino or Presto instance and execute any SQL statement (INSERT or CREATE TABLE) using `TrinoHook.run()`, the statements will reach the API but they won't get executed.
   
   Then, provide a dummy handler function like this:
   `TrinoHook.run(..., handler=lambda cur: cur.description)`
   
   The description property internally iterates over the cursor requests and the initial statement will get executed.
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] alexandermalyga commented on issue #26774: Trino and Presto hooks do not properly execute statements other than SELECT

Posted by GitBox <gi...@apache.org>.
alexandermalyga commented on issue #26774:
URL: https://github.com/apache/airflow/issues/26774#issuecomment-1262405377

   Not sure about the approach here. `cur.fetchall()` could be added at the end of `run()` and `insert_rows()` but that would require copy-pasting DbApiHook's implementation and adding one line, which is not ideal. Any thoughts @uranusjr @potiuk @eladkal ? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on issue #26774: Trino and Presto hooks do not properly execute statements other than SELECT

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #26774:
URL: https://github.com/apache/airflow/issues/26774#issuecomment-1262760436

   Not sure about final solution, but right now we have "common.sql" package that allows much faster iteration on common code used by all sql providers, and it should be possible to add a new feature to the hook (callable post-run??) to the DBApiHook, bumping common.sql to 1.3 and make trino/presto depend on common.sql >=1.3.
   
   That was the main reason for having the common.sql separated to be able to iterate more quickly there and release new features/fixes faster to our users while avoiding relying on bumping Airflow core.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk closed issue #26774: Trino and Presto hooks do not properly execute statements other than SELECT

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #26774: Trino and Presto hooks do not properly execute statements other than SELECT
URL: https://github.com/apache/airflow/issues/26774


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] alexandermalyga commented on issue #26774: Trino and Presto hooks do not properly execute statements other than SELECT

Posted by GitBox <gi...@apache.org>.
alexandermalyga commented on issue #26774:
URL: https://github.com/apache/airflow/issues/26774#issuecomment-1285994361

   This was recently fixed on Trino's side in [this commit](https://github.com/trinodb/trino-python-client/commit/ed4f9daedad2646f9f68c0ca805a2cb94ab0b5ee). I'll open a PR to bump the client's version.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org