You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/03/10 18:31:04 UTC

[GitHub] [airflow] vincbeck opened a new pull request #22157: Fix RedshiftDataOperator and update doc

vincbeck opened a new pull request #22157:
URL: https://github.com/apache/airflow/pull/22157


   While testing `RedshiftDataOperator` operator as part of #22063 verification process I found out some bugs. The bugs are, for all optional parameters as part of `RedshiftDataOperator`, they should not be passed down to the boto3 client if no value is specified. Otherwise, such error message are thrown
   
   `Invalid type for parameter Parameters, value: None, type: <class 'NoneType'>, valid types: <class 'list'>, <class 'tuple'>.`
   
   I also updated the doc and sample dag for `RedshiftDataOperator`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] eladkal commented on a change in pull request #22157: Fix RedshiftDataOperator and update doc

Posted by GitBox <gi...@apache.org>.
eladkal commented on a change in pull request #22157:
URL: https://github.com/apache/airflow/pull/22157#discussion_r824148716



##########
File path: docs/apache-airflow-providers-amazon/operators/redshift_data.rst
##########
@@ -15,38 +15,37 @@
     specific language governing permissions and limitations
     under the License.
 
-.. _howto/operator:RedshiftDataOperator:
+Amazon Redshift Data Operators
+==============================
 
-RedshiftDataOperator
-====================
+Use the :class:`RedshiftDataOperator <airflow.providers.amazon.aws.operators.redshift_data>` to execute
+statements against an Amazon Redshift cluster.
 
-.. contents::
-  :depth: 1
-  :local:
+This differs from `RedshiftSQLOperator` in that it allows users to query and retrieve data via the AWS API and avoid the necessity of a Postgres connection.

Review comment:
       ```suggestion
   This differs from ``RedshiftSQLOperator`` in that it allows users to query and retrieve data via the AWS API and avoid the necessity of a Postgres connection.
   ```
   
   fix static checks




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] josh-fell commented on a change in pull request #22157: Fix RedshiftDataOperator and update doc

Posted by GitBox <gi...@apache.org>.
josh-fell commented on a change in pull request #22157:
URL: https://github.com/apache/airflow/pull/22157#discussion_r824040379



##########
File path: airflow/providers/amazon/aws/example_dags/example_redshift_data_execute_sql.py
##########
@@ -40,29 +40,26 @@
 POLL_INTERVAL = 10
 
 
-# [START howto_redshift_data]
-@dag(
-    dag_id='example_redshift_data',
-    schedule_interval=None,
-    start_date=datetime(2021, 1, 1),
-    dagrun_timeout=timedelta(minutes=60),
-    tags=['example'],
-    catchup=False,
-)

Review comment:
       Why update away from the `@dag` decorator? More curiosity than a suggestion.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] josh-fell commented on a change in pull request #22157: Fix RedshiftDataOperator and update doc

Posted by GitBox <gi...@apache.org>.
josh-fell commented on a change in pull request #22157:
URL: https://github.com/apache/airflow/pull/22157#discussion_r824041065



##########
File path: airflow/providers/amazon/aws/example_dags/example_redshift_data_execute_sql.py
##########
@@ -71,10 +68,8 @@ def output_results_fn(id):
         poll_interval=POLL_INTERVAL,
         await_result=True,
     )
+    # [END howto_redshift_data]
 
-    # Using a task-decorated function to output the list of tables in a Redshift cluster
-    output_results_fn(redshift_query.output)
-
+    task_output = output_query_results(task_query.output)
 
-example_redshift_data_dag = example_redshift_data()
-# [END howto_redshift_data]
+    chain(task_query, task_output)

Review comment:
       This is redundant. The dependencies between these tasks should be handle implicitly via the TaskFlow API in `output_query_results(task_query.output)`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] vincbeck commented on a change in pull request #22157: Fix RedshiftDataOperator and update doc

Posted by GitBox <gi...@apache.org>.
vincbeck commented on a change in pull request #22157:
URL: https://github.com/apache/airflow/pull/22157#discussion_r824061122



##########
File path: airflow/providers/amazon/aws/example_dags/example_redshift_data_execute_sql.py
##########
@@ -40,29 +40,26 @@
 POLL_INTERVAL = 10
 
 
-# [START howto_redshift_data]
-@dag(
-    dag_id='example_redshift_data',
-    schedule_interval=None,
-    start_date=datetime(2021, 1, 1),
-    dagrun_timeout=timedelta(minutes=60),
-    tags=['example'],
-    catchup=False,
-)

Review comment:
       Good question. We (AWS) are trying to be consistent across the Amazon providers package and decided to go with this format




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] vincbeck commented on a change in pull request #22157: Fix RedshiftDataOperator and update doc

Posted by GitBox <gi...@apache.org>.
vincbeck commented on a change in pull request #22157:
URL: https://github.com/apache/airflow/pull/22157#discussion_r824088553



##########
File path: airflow/providers/amazon/aws/example_dags/example_redshift_data_execute_sql.py
##########
@@ -71,10 +68,8 @@ def output_results_fn(id):
         poll_interval=POLL_INTERVAL,
         await_result=True,
     )
+    # [END howto_redshift_data]
 
-    # Using a task-decorated function to output the list of tables in a Redshift cluster
-    output_results_fn(redshift_query.output)
-
+    task_output = output_query_results(task_query.output)
 
-example_redshift_data_dag = example_redshift_data()
-# [END howto_redshift_data]
+    chain(task_query, task_output)

Review comment:
       Good to know! Thanks :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #22157: Fix RedshiftDataOperator and update doc

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #22157:
URL: https://github.com/apache/airflow/pull/22157


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #22157: Fix RedshiftDataOperator and update doc

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #22157:
URL: https://github.com/apache/airflow/pull/22157#issuecomment-1064370987


   The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org