Posted to commits@airflow.apache.org by "flolas (via GitHub)" <gi...@apache.org> on 2023/09/24 15:32:55 UTC

[GitHub] [airflow] flolas commented on issue #34583: Support split_statements AthenaOperator [Amazon provider]

flolas commented on issue #34583:
URL: https://github.com/apache/airflow/issues/34583#issuecomment-1732598175

   @hussein-awala I see two use cases:
   * Multiple SQL statements need to run sequentially in a single operator; if a statement takes too long, it would be good to wait asynchronously with a trigger **[current scope of this issue]**
   * There's a need to process the results of the queries directly at the worker level.
   
   The existing AthenaOperator only waits for the query to complete; it doesn't retrieve or process the results.
   
   To address these cases, I propose two potential implementations:
   
   **1. AthenaOperator and AthenaHook with split_statements argument:** This would be ideal for running multiple SQL queries sequentially without returning results to the worker. The feature would also need to support asynchronous waiting. As per standard practice, idempotency would be defined at the operator level.
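
A minimal sketch of what option 1 could look like, not the actual provider code. The helper names `split_statements` and `run_sequentially` are hypothetical, and the splitter is deliberately naive: it only skips semicolons inside quotes and ignores SQL comments. The `client` argument is assumed to behave like a boto3 Athena client (`start_query_execution` / `get_query_execution`). The polling loop here blocks the worker; in the proposed feature that wait would be handed to a trigger instead.

```python
import time
from typing import Any, List


def split_statements(sql: str) -> List[str]:
    """Naively split a SQL script on semicolons that sit outside quotes.

    Hypothetical helper for illustration; it does not handle SQL comments.
    """
    statements: List[str] = []
    current: List[str] = []
    quote = None  # the quote character we are currently inside, if any
    for ch in sql:
        if quote:
            # Inside a quoted section: only watch for the closing quote.
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == ";":
            # Statement boundary: flush whatever has been collected so far.
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
            continue
        current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


def run_sequentially(
    client: Any, sql: str, output_location: str, poll_seconds: float = 5.0
) -> List[str]:
    """Run each statement in order, waiting for one to finish before the next.

    `client` is assumed to expose the boto3 Athena client methods used below.
    Returns the query execution IDs in the order they were submitted.
    """
    execution_ids: List[str] = []
    for statement in split_statements(sql):
        response = client.start_query_execution(
            QueryString=statement,
            ResultConfiguration={"OutputLocation": output_location},
        )
        query_id = response["QueryExecutionId"]
        while True:
            status = client.get_query_execution(QueryExecutionId=query_id)
            state = status["QueryExecution"]["Status"]["State"]
            if state == "SUCCEEDED":
                break
            if state in ("FAILED", "CANCELLED"):
                # Stop at the first failure so later statements never run.
                raise RuntimeError(f"Query {query_id} ended in state {state}")
            time.sleep(poll_seconds)
        execution_ids.append(query_id)
    return execution_ids
```

Stopping at the first failed statement keeps the behaviour predictable, but it also means a retry re-runs the whole script, which is why idempotency would have to be handled at the operator level.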
   
   **2. Extension of AthenaOperator and introduction of AthenaDbApiHook:** The AthenaOperator could be extended based on BaseSQLOperator, and a new hook, AthenaDbApiHook, could be introduced. This hook would be built on top of both PyAthena and AwsBaseHook. This approach would let the AthenaOperator work with either AthenaHook or AthenaDbApiHook, so it could cover both use cases.
   

