You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/08/20 00:00:45 UTC

[GitHub] [airflow] o-nikolas commented on a diff in pull request #25835: Use the extra_args in get_key()

o-nikolas commented on code in PR #25835:
URL: https://github.com/apache/airflow/pull/25835#discussion_r950615759


##########
airflow/providers/amazon/aws/hooks/s3.py:
##########
@@ -469,7 +469,7 @@ def get_key(self, key: str, bucket_name: Optional[str] = None) -> S3Transfer:
             verify=self.verify,
         )
         obj = s3_resource.Object(bucket_name, key)
-        obj.load()
+        obj.load(**self.extra_args)

Review Comment:
   A few things are getting conflated in this thread.
   #### 1) The PR code change:
   It looks like you're just unpacking the contents of `extra_args` into the `load` call. I agree with @Taragolis that I'm not sure `obj.load` expects all the things it could receive from that dict (or if any of them would even help with encryption). Object.load ([docs](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Object.load)) basically just calls s3client `head_object` ([docs](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.head_object)) which is one of those weird S3 APIs:
   >A HEAD request has the same options as a GET action on an object. The response is identical to the GET response except that there is no response body. Because of this, if the HEAD request generates an error, it returns a generic 404 Not Found or 403 Forbidden code. It is not possible to retrieve the exact exception beyond these error codes.
   
   To handle encryption when using that API you have to set request headers (see again the previous doc link):
   
   >If you encrypt an object by using server-side encryption with customer-provided encryption keys (SSE-C) when you store the object in Amazon S3, then when you retrieve the metadata from the object, you must use the following headers:
   
       x-amz-server-side-encryption-customer-algorithm
       x-amz-server-side-encryption-customer-key
       x-amz-server-side-encryption-customer-key-MD5
   
   #### 2) Andrey's suggested change:
   
   > At that moment in this method first we try to check if object exists before download and raise an error if not exists.
   This check could cause an issues in some cases.
   
   Just so that I fully understand your proposal, what are some cases you're thinking of? Also this change seems out of scope for what @moreaupascal56 is trying to achieve?
   
   Also, also, for your suggested code snippet, you're explicitly closing and unlinking the temp file, do you not just the context manager to do so?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org