You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/08/09 19:24:50 UTC

[GitHub] [airflow] frankcash opened a new pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

frankcash opened a new pull request #17515:
URL: https://github.com/apache/airflow/pull/17515


   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   related: #16544
   
   ---
   
   Adds `class RedshiftSqlHook`  that utilizes the `PostgresHook` to create a specific Hook for the Redshift SQL use case.  
   
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] JavierLopezT commented on pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on pull request #17515:
URL: https://github.com/apache/airflow/pull/17515#issuecomment-895784949


   I would call it just `RedshiftHook`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] JavierLopezT commented on pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT commented on pull request #17515:
URL: https://github.com/apache/airflow/pull/17515#issuecomment-908224826


   Given that all SQL-related hooks (Postgres, MySql, Snowflake...) don't have SQL in their names, and that most of the developers will start using RedshiftHook coming from PostgresHook, I would leave the code in a class called RedshiftHook. I think it can actually be in the same class as the other functions of redshift.
   
   Also, the static checks and docs are failing. I would recommend you to use breeze for being aware of these fails in your local computer https://github.com/apache/airflow/blob/main/CONTRIBUTORS_QUICK_START.rst#setting-up-breeze
   
   I have spent a lot of time making pull requests without Breeze, and believe me, it's worth the time you spend configuring everything because otherwise, you'll spend even more time making changes, uploading them to git, waiting for another run of the tests...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] JavierLopezT edited a comment on pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
JavierLopezT edited a comment on pull request #17515:
URL: https://github.com/apache/airflow/pull/17515#issuecomment-895784949


   What do you think about calling it just `RedshiftHook`?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] frankcash commented on pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
frankcash commented on pull request #17515:
URL: https://github.com/apache/airflow/pull/17515#issuecomment-895961108


   I have no problem with that, in-fact I thought about it.  I just feel it could be confusing as there already is a `RedshiftHook` <https://github.com/apache/airflow/blob/main/airflow/providers/amazon/aws/hooks/redshift.py#L25-L37>, but I am not sure of what type of applications there would be for having both of these hooks in the same task or even DAG as Redshift infra modifications are typically very slow


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #17515:
URL: https://github.com/apache/airflow/pull/17515#issuecomment-908235537


   While I understand the superiority in the design aspect of putting everything in `RedshiftHook`, it is quite resource-consuming to implement since `AwsBaseHook` and `PostgresHook` have very different features and you can’t easily subclass them both. So the only choice would be composition, i.e. something like
   
   ```python
   class RedshiftHook(BaseHook):
       def __init__(self, ...):
           ...
           self._aws_hook = AwsBaseHook(...)
           self._sql_hook = PostgresHook(...)
   
   	# Cluster functions...
       def cluster_status(self, cluster_identifier: str) -> str:
           return self._aws_hook.get_conn()...
   
   	# SQL functions...
       def get_records(self, sql, parameters=None):
           return self._sql_hook.get_records(sql, parameters=parameters)
   ```
   
   It would be non-trivial to get everything line up correctly and cause maintenance burden. Splitting the AWS and SQL functionalities into two hooks is the better engineering solution IMO.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] josh-fell commented on pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
josh-fell commented on pull request #17515:
URL: https://github.com/apache/airflow/pull/17515#issuecomment-926628914


   There may be some overlap between the desired functionality of this PR and #18447 which adds both a `RedshiftStatementHook` and `RedshiftOperator`. Just wanted to bring this up to the group.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] frankcash closed pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
frankcash closed pull request #17515:
URL: https://github.com/apache/airflow/pull/17515


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] frankcash closed pull request #17515: Initializes RedshiftSqlHook by extending the PostgresHook

Posted by GitBox <gi...@apache.org>.
frankcash closed pull request #17515:
URL: https://github.com/apache/airflow/pull/17515


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org