You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/09 12:00:18 UTC

[GitHub] [airflow] pankajastro opened a new pull request, #25619: Enable multiple query execution in RedshiftDataOperator

pankajastro opened a new pull request, #25619:
URL: https://github.com/apache/airflow/pull/25619

   Enable `RedshiftDataOperator` to execute a batch of SQL using `batch_execute_statement` boto3 API. 
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvement+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] kaxil merged pull request #25619: Enable multiple query execution in RedshiftDataOperator

Posted by GitBox <gi...@apache.org>.
kaxil merged PR #25619:
URL: https://github.com/apache/airflow/pull/25619


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pankajastro commented on a diff in pull request #25619: Enable multiple query execution in RedshiftDataOperator

Posted by GitBox <gi...@apache.org>.
pankajastro commented on code in PR #25619:
URL: https://github.com/apache/airflow/pull/25619#discussion_r941378712


##########
airflow/providers/amazon/aws/operators/redshift_data.py:
##########
@@ -43,6 +44,8 @@ class RedshiftDataOperator(BaseOperator):
     :param secret_arn: the name or ARN of the secret that enables db access
     :param statement_name: the name of the SQL statement
     :param with_event: indicates whether to send an event to EventBridge
+    :param enable_batch_execution: Enable executing multiple query.
+    :param sqls: List of sql statement. This is required when ``enable_batch_execution`` is ``true``

Review Comment:
   hmm, liked it. Thank you @eladkal . I'll update the PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on a diff in pull request #25619: Enable multiple query execution in RedshiftDataOperator

Posted by GitBox <gi...@apache.org>.
eladkal commented on code in PR #25619:
URL: https://github.com/apache/airflow/pull/25619#discussion_r941271161


##########
airflow/providers/amazon/aws/operators/redshift_data.py:
##########
@@ -43,6 +44,8 @@ class RedshiftDataOperator(BaseOperator):
     :param secret_arn: the name or ARN of the secret that enables db access
     :param statement_name: the name of the SQL statement
     :param with_event: indicates whether to send an event to EventBridge
+    :param enable_batch_execution: Enable executing multiple query.
+    :param sqls: List of sql statement. This is required when ``enable_batch_execution`` is ``true``

Review Comment:
   I'm not so happy with having both `sql` and `sqls`
   To me it seems like we also don't need `enable_batch_execution` as it can be deduced from the type user passed.
   if string passed to `sql` then it's regular mode
   if list of string passed to `sql` then it's batch mode.
   
   WDYT?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] eladkal commented on pull request #25619: Enable multiple query execution in RedshiftDataOperator

Posted by GitBox <gi...@apache.org>.
eladkal commented on PR #25619:
URL: https://github.com/apache/airflow/pull/25619#issuecomment-1209704688

   @ferruzzi @o-nikolas @vincbeck any further comments on this one?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] pankajastro commented on a diff in pull request #25619: Enable multiple query execution in RedshiftDataOperator

Posted by GitBox <gi...@apache.org>.
pankajastro commented on code in PR #25619:
URL: https://github.com/apache/airflow/pull/25619#discussion_r941632503


##########
airflow/providers/amazon/aws/operators/redshift_data.py:
##########
@@ -119,6 +120,22 @@ def execute_query(self):
         resp = self.hook.conn.execute_statement(**trim_none_values(kwargs))
         return resp['Id']
 
+    def execute_batch_query(self):
+        if self.sql is None:
+            AirflowException("Missing require param sqls")

Review Comment:
   - AWS API will through ```Invalid type for parameter Sqls, value: , type: <class 'str'>, valid types: <class 'list'>, <class 'tuple'>```
   
   - And if a user will not pass `sql` param then baseoperator will catch it ```airflow.exceptions.AirflowException: missing keyword argument 'sql'```
   
   So, yeah you are right this exception will not add value here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org