Posted to commits@airflow.apache.org by po...@apache.org on 2022/08/19 16:58:15 UTC

[airflow] branch main updated: Improve error handling/messaging around bucket exist check (#25805)

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new 92fce4fe87 Improve error handling/messaging around bucket exist check (#25805)
92fce4fe87 is described below

commit 92fce4fe8786ae66ba60df94949dc41cbb3526ce
Author: Niko <on...@amazon.com>
AuthorDate: Fri Aug 19 09:58:06 2022 -0700

    Improve error handling/messaging around bucket exist check (#25805)
    
    The S3Hook.check_for_bucket() method uses the boto3 S3 client method `head_bucket`
    to check for bucket existence. Unlike most boto3 APIs, this client method
    returns only a small subset of error codes.
---
 airflow/providers/amazon/aws/hooks/s3.py | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/airflow/providers/amazon/aws/hooks/s3.py b/airflow/providers/amazon/aws/hooks/s3.py
index 149887f6c9..22398fc878 100644
--- a/airflow/providers/amazon/aws/hooks/s3.py
+++ b/airflow/providers/amazon/aws/hooks/s3.py
@@ -203,7 +203,19 @@ class S3Hook(AwsBaseHook):
             self.get_conn().head_bucket(Bucket=bucket_name)
             return True
         except ClientError as e:
-            self.log.error(e.response["Error"]["Message"])
+            # The head_bucket api is odd in that it cannot return proper
+            # exception objects, so error codes must be used. Only 200, 404 and 403
+            # are ever returned. See the following links for more details:
+            # https://github.com/boto/boto3/issues/2499
+            # https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.head_bucket
+            return_code = int(e.response['Error']['Code'])
+            if return_code == 404:
+                self.log.error('Bucket "%s" does not exist', bucket_name)
+            elif return_code == 403:
+                self.log.error(
+                    'Access to bucket "%s" is forbidden or there was an error with the request', bucket_name
+                )
+                self.log.error(e)
             return False
 
     @provide_bucket_name
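
The branching in the patch above can be sketched on its own, without boto3, as a function over the error response. This is only an illustration: `describe_head_bucket_error` is a hypothetical helper, not part of S3Hook, and the input is assumed to have the shape of `ClientError.response` as used in the diff (`{"Error": {"Code": ...}}`). Unlike the patch, the sketch guards the `int()` conversion, on the assumption that a non-numeric code could appear.

```python
def describe_head_bucket_error(error_response: dict, bucket_name: str) -> str:
    """Hypothetical helper: turn a head_bucket error response into a message.

    `error_response` is assumed to look like botocore's ClientError.response,
    i.e. {"Error": {"Code": "404", ...}}.
    """
    code = error_response.get("Error", {}).get("Code", "")
    try:
        status = int(code)
    except (TypeError, ValueError):
        # Non-numeric or missing code: report it instead of raising.
        return f'Unexpected error code {code!r} for bucket "{bucket_name}"'

    if status == 404:
        return f'Bucket "{bucket_name}" does not exist'
    if status == 403:
        return (
            f'Access to bucket "{bucket_name}" is forbidden '
            "or there was an error with the request"
        )
    # Any other numeric status is reported generically.
    return f'Unexpected status {status} for bucket "{bucket_name}"'


if __name__ == "__main__":
    print(describe_head_bucket_error({"Error": {"Code": "404"}}, "my-bucket"))
    print(describe_head_bucket_error({"Error": {"Code": "403"}}, "my-bucket"))
```

Returning the message rather than logging it keeps the sketch easy to test; in the hook itself the message would go to `self.log` as shown in the diff.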