Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/02/17 11:50:44 UTC

[GitHub] [airflow] baolsen opened a new pull request #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client

baolsen opened a new pull request #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7441
 
 
   ---
   Issue link: WILL BE INSERTED BY [boring-cyborg](https://github.com/kaxil/boring-cyborg)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [X] Description above provides context of the change
   - [X] Commit message/PR title starts with `[AIRFLOW-NNNN]`. AIRFLOW-NNNN = JIRA ID<sup>*</sup>
   - [X] Unit tests coverage for changes (not needed for documentation changes)
   - [X] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [X] Relevant documentation is updated including usage instructions.
   - [X] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   <sup>*</sup> For document-only changes commit message can start with `[AIRFLOW-XXXX]`.
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] ashb edited a comment on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client

Posted by GitBox <gi...@apache.org>.
ashb edited a comment on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7441#issuecomment-588131525
 
 
   To duplicate my comments from the Jira issue: I feel we could do this more cleanly, with a lot less duplication.
   
   In AwsHook:
   ```
   from cached_property import cached_property
   
       @cached_property
       def conn(self):
           # Built on first access, then cached on the hook instance
           return self.get_client_type(self.client_type, region_name=self.region_name)
   
       def get_conn(self):
           # Compat shim
           return self.conn
   ```
   
   And then in each subclass we only need to define:
   
   ```
   class S3Hook(AwsHook):
       client_type = "s3"  # boto3 service names are lowercase
   ```
   
   for instance.
   
   The lambda hook would need a custom property as it takes an extra arg, but the others wouldn't.
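
A minimal, self-contained sketch of the pattern described above, assuming a stand-in `get_client_type` that fakes the client instead of calling boto3. It uses `functools.cached_property` (Python 3.8+) in place of the `cached_property` package. `FakeClient`, the `clients_created` counter, and passing `region_name` to the constructor are illustrative assumptions, not Airflow's actual code.

```python
from functools import cached_property  # stdlib counterpart of the cached_property package


class FakeClient:
    """Hypothetical stand-in for a boto3 client; records how it was built."""

    def __init__(self, client_type, region_name):
        self.client_type = client_type
        self.region_name = region_name


class AwsHook:
    client_type = None  # each subclass sets this

    def __init__(self, region_name=None):
        self.region_name = region_name
        self.clients_created = 0  # only here to make the caching observable

    def get_client_type(self, client_type, region_name=None):
        # The real hook would build a boto3 client here; this sketch fakes it.
        self.clients_created += 1
        return FakeClient(client_type, region_name)

    @cached_property
    def conn(self):
        # Runs once per hook instance; the result is stored on the instance.
        return self.get_client_type(self.client_type, region_name=self.region_name)

    def get_conn(self):
        # Compat shim for callers that still use get_conn()
        return self.conn


class S3Hook(AwsHook):
    client_type = "s3"


hook = S3Hook(region_name="eu-west-1")
first = hook.get_conn()
second = hook.get_conn()
```

Both calls return the same object, so the client is built only once per hook instance. A separate hook instance would still build its own client.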


[GitHub] [airflow] ashb commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7441#issuecomment-588131525
 
 
   To duplicate my comments from the Jira issue: I feel we could do this more cleanly, with a lot less duplication.
   
   In AwsHook:
   ```
   from cached_property import cached_property
   
       @cached_property
       def conn(self):
           return self.get_client_type(self.client_type)
   
       def get_conn(self):
           # Compat shim
           return self.conn
   ```
   
   And then in each subclass we only need to define:
   
   ```
   class S3Hook(AwsHook):
       client_type = "s3"  # boto3 service names are lowercase
   ```
   
   for instance.


[GitHub] [airflow] baolsen commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7441#issuecomment-588156126
 
 
   > To duplicate my comments from the Jira issue. I feel we could do this nicer with a lot less duplication:
   > 
   > In AwsHook:
   > 
   > ```
   > from cached_property import cached_property
   > 
   >     @cached_property
   >     def conn(self):
   >         return self.get_client_type(self.client_type, region_name=self.region_name)
   > 
   >     def get_conn(self):
   >         # Compat shim
   >         return self.conn
   > ```
   > 
   > And then in each subclass we only need to define:
   > 
   > ```
   > class S3Hook(AwsHook):
   >     client_type = "s3"
   > ```
   > 
   > for instance.
   > 
   > The lambda hook would need a custom property as it takes an extra arg, but the others wouldn't.
   
   Makes sense. I had some trouble caching the `get_client_type` method because of its parameters (region etc.), but your approach looks like it avoids that issue by wrapping the call nicely. Thanks.
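
To illustrate the difficulty mentioned above: caching `get_client_type` itself means the cache must be keyed on every argument, whereas a cached property wraps one fixed call. A rough sketch, assuming a hypothetical `ClientFactory` class and a placeholder `_build_client` method (neither is part of Airflow):

```python
class ClientFactory:
    """Hypothetical stand-in for a hook whose client factory takes parameters."""

    def __init__(self):
        self._clients = {}  # cache keyed on the full argument tuple

    def _build_client(self, client_type, region_name):
        # Placeholder for the real (expensive) client construction.
        return object()

    def get_client_type(self, client_type, region_name=None):
        # Every parameter has to take part in the cache key, otherwise a
        # client built for one region could be returned for another.
        key = (client_type, region_name)
        if key not in self._clients:
            self._clients[key] = self._build_client(client_type, region_name)
        return self._clients[key]


factory = ClientFactory()
a = factory.get_client_type("s3", region_name="eu-west-1")
b = factory.get_client_type("s3", region_name="eu-west-1")
c = factory.get_client_type("s3", region_name="us-east-1")
```

Here `a` and `b` are the same cached object, while `c` is a different one because the region differs.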


[GitHub] [airflow] baolsen closed pull request #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client

Posted by GitBox <gi...@apache.org>.
baolsen closed pull request #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7441
 
 
   


[GitHub] [airflow] baolsen commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7441#issuecomment-590211768
 
 
   Closing this PR as I've created a new branch; the changes diverged quite a bit from what I first had in mind.
   Will re-open when ready.


[GitHub] [airflow] baolsen commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client

Posted by GitBox <gi...@apache.org>.
baolsen commented on issue #7441: [AIRFLOW-6822] AWS hooks should cache boto3 client
URL: https://github.com/apache/airflow/pull/7441#issuecomment-588225074
 
 
   Hey @ashb 
   
   Can you advise me on something? Please check
   providers/amazon/aws/sensors/s3_prefix.py
   
   Within the `poke` method, the hook is recreated on every call. Is there a reason for this, maybe something to do with the new reschedule behavior? I'm not sure why it is written like that rather than just doing `self.hook = X` in the sensor's `__init__` method.
   
   A few of the AWS sensors have different implementations around this, so I thought it would be a good idea to change them to be more consistent as well.
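
One possible way to make the sensors consistent, sketched with stand-in classes: create the hook lazily on the first `poke` and reuse it afterwards, so nothing is built in `__init__`. `FakeS3Hook`, `PrefixSensor`, `get_hook`, and `prefix_exists` are hypothetical names for illustration only. The sketch does not settle whether the reschedule behavior is the reason for the current code.

```python
class FakeS3Hook:
    """Hypothetical stand-in for an AWS hook; counts how often it is created."""

    instances = 0

    def __init__(self, aws_conn_id):
        FakeS3Hook.instances += 1
        self.aws_conn_id = aws_conn_id

    def prefix_exists(self, prefix):
        # Placeholder for the real S3 lookup.
        return True


class PrefixSensor:
    """Hypothetical sensor that builds its hook on first use."""

    def __init__(self, prefix, aws_conn_id="aws_default"):
        self.prefix = prefix
        self.aws_conn_id = aws_conn_id
        self._hook = None  # deliberately not created at construction time

    def get_hook(self):
        # Create the hook once, on first use, and reuse it afterwards.
        if self._hook is None:
            self._hook = FakeS3Hook(self.aws_conn_id)
        return self._hook

    def poke(self, context):
        return self.get_hook().prefix_exists(self.prefix)


sensor = PrefixSensor("data/")
created_before_poke = FakeS3Hook.instances
results = [sensor.poke(context={}), sensor.poke(context={})]
```

Constructing the sensor creates no hook, and repeated `poke` calls on the same sensor object share one.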
