Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/24 08:53:06 UTC

[GitHub] [airflow] Gabriel39 opened a new pull request #17201: [AIRFLOW-17200][WIP] Add Alibaba Cloud OSS support

Gabriel39 opened a new pull request #17201:
URL: https://github.com/apache/airflow/pull/17201


   
   Alibaba Cloud Object Storage Service (OSS) is an encrypted, secure, cost-effective, and easy-to-use object storage service that enables you to store, back up, and archive large amounts of data in the cloud, with a guaranteed durability of 99.9999999999% (twelve 9s). See https://www.alibabacloud.com/product/oss for details.
   
   This PR enables Airflow users to use OSS.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] Gabriel39 commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-890641672


   @vikramkoka @mik-laj This PR needs a review; this is a kind reminder. Thx :)





[GitHub] [airflow] mik-laj commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r677121404



##########
File path: docs/apache-airflow-providers-alibabacloud/operators/oss.rst
##########
@@ -0,0 +1,64 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Alibaba OSS Operators
+====================
+
+.. contents::
+  :depth: 1
+  :local:
+
+Overview
+--------
+
+Airflow to Alibaba Cloud Object Storage Service (OSS) integration provides several operators to create and interact with OSS buckets.
+
+ - :class:`~airflow.providers.alibabacloud.sensors.oss_key.OSSKeySensor`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_bucket.OSSCreateBucketOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_bucket.OSSDeleteBucketOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSUploadObjectOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSDownloadObjectOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSDeleteBatchObjectOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSDeleteObjectOperator`
+
+Two example_dags are provided which showcase these operators in action.
+
+ - example_oss_bucket.py
+ - example_oss_object.py

Review comment:
       Please remove it. The names of specific examples have no value for the end user, so we can omit them and just focus on describing specific operations.







[GitHub] [airflow] mik-laj commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r677121790



##########
File path: docs/apache-airflow-providers-alibabacloud/operators/oss.rst
##########
@@ -0,0 +1,64 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Alibaba OSS Operators
+====================
+
+.. contents::
+  :depth: 1
+  :local:
+
+Overview
+--------
+
+Airflow to Alibaba Cloud Object Storage Service (OSS) integration provides several operators to create and interact with OSS buckets.
+
+ - :class:`~airflow.providers.alibabacloud.sensors.oss_key.OSSKeySensor`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_bucket.OSSCreateBucketOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_bucket.OSSDeleteBucketOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSUploadObjectOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSDownloadObjectOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSDeleteBatchObjectOperator`
+ - :class:`~airflow.providers.alibabacloud.operators.oss_object.OSSDeleteObjectOperator`
+
+Two example_dags are provided which showcase these operators in action.
+
+ - example_oss_bucket.py
+ - example_oss_object.py
+
+.. _howto/operator:OSSCreateBucketOperator:
+.. _howto/operator:OSSDeleteBucketOperator:
+
+Create and Delete Alibaba Cloud OSS Buckets
+-----------------------------------
+
+Purpose
+"""""""
+
+This example dag ``example_oss_bucket.py`` uses ``OSSCreateBucketOperator`` and ``OSSDeleteBucketOperator`` to create a

Review comment:
       Please remove the reference to the example DAG from the description and focus only on describing operator usage.







[GitHub] [airflow] potiuk merged pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #17201:
URL: https://github.com/apache/airflow/pull/17201


   





[GitHub] [airflow] Gabriel39 commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-889194773


   cc: @vikramkoka thx :)





[GitHub] [airflow] mik-laj commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r677120851



##########
File path: airflow/providers/alibabacloud/sensors/oss_key.py
##########
@@ -0,0 +1,106 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+import oss2
+from typing import Optional
+from urllib.parse import urlparse
+
+from airflow.exceptions import AirflowException
+from airflow.providers.alibabacloud.hooks.oss import OSSHook
+from airflow.sensors.base import BaseSensorOperator
+
+
+class OSSKeySensor(BaseSensorOperator):
+    """
+    Waits for a key (a file-like instance on OSS) to be present in a OSS bucket.
+    OSS being a key/value it does not support folders. The path is just a key
+    a resource.
+
+    :param bucket_key: The key being waited on. Supports full oss:// style url
+        or relative path from root level. When it's specified as a full oss://
+        url, please leave bucket_name as `None`.
+    :type bucket_key: str
+    :param region: OSS region
+    :type region: str
+    :param bucket_name: OSS bucket name
+    :type bucket_name: str
+    :param oss_conn_id: The Airflow connection used for OSS credentials.
+    :type oss_conn_id: Optional[str]
+    :param auth: authentication for Alibaba Cloud user. If not specified fetched from connection.
+    :type auth: Optional[oss2.auth.Auth]
+    :param access_id: access key id for Alibaba Cloud user. If not specified fetched from connection.
+    :type access_id: Optional[str]
+    :param access_secret: access key secret for Alibaba Cloud user. If not specified fetched from connection.
+    :type access_secret: Optional[str]
+    """
+
+    template_fields = ('bucket_key', 'bucket_name')
+
+    def __init__(
+        self,
+        bucket_key: str,
+        region: str,
+        bucket_name: Optional[str] = None,
+        oss_conn_id: Optional[str] = 'oss_default',
+        auth: Optional[oss2.auth.Auth] = None,
+        access_id: Optional[str] = None,
+        access_secret: Optional[str] = None,
+        **kwargs,
+    ):
+        super().__init__(**kwargs)
+
+        self.bucket_name = bucket_name
+        self.bucket_key = bucket_key
+        self.region = region
+        self.oss_conn_id = oss_conn_id
+        self.auth = auth
+        self.access_id = access_id
+        self.access_secret = access_secret
+        self.hook: Optional[OSSHook] = None
+
+    def poke(self, context):
+
+        if self.bucket_name is None:
+            parsed_url = urlparse(self.bucket_key)
+            if parsed_url.netloc == '':
+                raise AirflowException('If key is a relative path from root, please provide a bucket_name')
+            self.bucket_name = parsed_url.netloc
+            self.bucket_key = parsed_url.path.lstrip('/')
+        else:
+            parsed_url = urlparse(self.bucket_key)
+            if parsed_url.scheme != '' or parsed_url.netloc != '':
+                raise AirflowException(
+                    'If bucket_name is provided, bucket_key'
+                    + ' should be relative path from root'
+                    + ' level, rather than a full oss:// url'
+                )
+
+        self.log.info('Poking for key : oss://%s/%s', self.bucket_name, self.bucket_key)
+        return self.get_hook().object_exists(
+            key=self.bucket_key,
+            bucket_name=self.bucket_name,
+            auth=self.auth,
+            access_id=self.access_id,
+            access_secret=self.access_secret)
+
+    def get_hook(self) -> OSSHook:
+        """Create and return an OSSHook"""
+        if self.hook:

Review comment:
       What do you think about cached_property?
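
A minimal sketch of what the `cached_property` suggestion could look like, under stated assumptions: `FakeOSSHook` and `SensorSketch` are hypothetical stand-ins (not the PR's classes) so the example runs without Airflow, and `functools.cached_property` requires Python 3.8 or newer, so older interpreters would need a backport.

```python
from functools import cached_property


class FakeOSSHook:
    """Hypothetical stand-in for OSSHook so the sketch runs without Airflow."""

    instances_created = 0

    def __init__(self, oss_conn_id: str, region: str) -> None:
        # Count constructions so the caching effect is observable.
        FakeOSSHook.instances_created += 1
        self.oss_conn_id = oss_conn_id
        self.region = region


class SensorSketch:
    """Hypothetical sensor-like class; only the hook caching is illustrated."""

    def __init__(self, region: str, oss_conn_id: str = "oss_default") -> None:
        self.region = region
        self.oss_conn_id = oss_conn_id

    @cached_property
    def hook(self) -> FakeOSSHook:
        # Built on first access, then stored on the instance and reused.
        return FakeOSSHook(oss_conn_id=self.oss_conn_id, region=self.region)


sensor = SensorSketch(region="some-region")
first = sensor.hook
second = sensor.hook
```

With this shape, the explicit `if self.hook:` check and the `self.hook = None` initialisation are no longer needed, because the property itself handles lazy creation.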







[GitHub] [airflow] ashb commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-896911672


   This makes _every_ (or almost every) test run try to reach out to alibaba cloud in the oss-cn-hangzhou region, and is currently failing builds on main https://github.com/apache/airflow/runs/3290445260
   
   ```
     E               Failed: Timeout >60.0s
     /usr/local/lib/python3.6/site-packages/urllib3/util/connection.py:86: Failed
   ```
   
   From what I can tell this bucket region is unreachable outside China, which means in the _best_ case these tests will be skipped, and in the worst case, as now, they will fail.
   
   I think we have to revert this PR until we have a fix for this. Or at the very least we will skip these tests and we can't release this provider until the tests are actually running.





[GitHub] [airflow] github-actions[bot] commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-894660937


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] Gabriel39 commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r679924721



##########
File path: CONTRIBUTING.rst
##########
@@ -654,6 +654,7 @@ Here is the list of packages and their extras:
 Package                    Extras
 ========================== ===========================
 airbyte                    http
+alibabacloud               oss2

Review comment:
       @kaxil Yea. good idea :) I have renamed it to alibaba.cloud and resubmitted. Thx 







[GitHub] [airflow] Gabriel39 commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r677443872



##########
File path: airflow/providers/alibabacloud/hooks/oss.py
##########
@@ -0,0 +1,441 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from functools import wraps
+from inspect import signature
+import oss2
+from oss2.exceptions import ClientError
+from typing import Callable, TYPE_CHECKING, TypeVar, cast, Optional
+from urllib.parse import urlparse
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base import BaseHook
+
+
+if TYPE_CHECKING:
+    from airflow.models.connection import Connection
+
+T = TypeVar("T", bound=Callable)
+
+
+def provide_bucket_name(func: T) -> T:
+    """
+    Function decorator that unifies bucket name and key taken from the key
+    in case no bucket name and at least a key has been passed to the function.
+    """
+    function_signature = signature(func)
+
+    @wraps(func)
+    def wrapper(*args, **kwargs) -> T:
+        bound_args = function_signature.bind(*args, **kwargs)
+        self = args[0]
+
+        def get_credential(oss_conn, arguments) -> oss2.auth.Auth:
+            extra_config = oss_conn.extra_dejson
+            auth_type = extra_config.get('auth_type', None)
+            if not auth_type:
+                raise Exception("No auth_type specified in extra_config, either 'AK' or 'STS'")
+
+            if auth_type == 'AK':
+                oss_access_key_id = arguments['access_id'] \
+                    if 'access_id' in bound_args.arguments and bound_args.arguments['access_id'] is not None \
+                    else extra_config.get('access_key_id', None)
+                oss_access_key_secret = arguments['access_secret'] \
+                    if 'access_secret' in bound_args.arguments and bound_args.arguments['access_secret'] is not None \
+                    else extra_config.get('access_key_secret', None)
+                if not oss_access_key_id:
+                    raise Exception("No access_key_id is specified for connection: " + self.conn_id)
+                if not oss_access_key_secret:
+                    raise Exception("No access_key_secret is specified for connection: " + self.conn_id)
+                return oss2.Auth(oss_access_key_id, oss_access_key_secret)
+            else:
+                raise Exception("Unsupported auth_type: " + auth_type)
+
+        if 'auth' not in bound_args.arguments or bound_args.arguments['auth'] is None:
+            bound_args.arguments['auth'] = get_credential(self.oss_conn, bound_args.arguments)
+
+        if 'bucket_name' not in bound_args.arguments or bound_args.arguments['bucket_name'] is None:
+            if self.oss_conn_id:
+                connection = self.get_connection(self.oss_conn_id)
+                if connection.schema:
+                    bound_args.arguments['bucket_name'] = connection.schema

Review comment:
       done







[GitHub] [airflow] mik-laj commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-887213468


   What do you think about adding system tests to these operators?





[GitHub] [airflow] Gabriel39 commented on pull request #17201: [AIRFLOW-17200][WIP] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-886023976


   FYI: This PR is WIP. For now, OSS support is almost done except for the documentation, which I will follow up on soon. Feel free to comment on the current work. Thx.





[GitHub] [airflow] Gabriel39 commented on pull request #17201: [AIRFLOW-17200][WIP] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-886024344


   cc @potiuk @kaxil Thx





[GitHub] [airflow] mik-laj commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r677120581



##########
File path: airflow/providers/alibabacloud/provider.yaml
##########
@@ -0,0 +1,52 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-alibabacloud
+name: Alibabacloud
+description: |
+    Alibaba Cloud integration (including `Alibaba Cloud <https://www.alibabacloud.com//>`__).
+
+versions:
+  - 2.1.0
+
+additional-dependencies:
+  - apache-airflow>=2.1.0
+
+integrations:
+  - integration-name: Alibaba Cloud OSS
+    external-doc-url: https://www.alibabacloud.com/help/product/31815.htm
+    tags: [alibabacloud]
+
+operators:
+  - integration-name: Alibaba Cloud OSS
+    python-modules:
+      - airflow.providers.alibabacloud.operators.oss_bucket
+      - airflow.providers.alibabacloud.operators.oss_object

Review comment:
       If possible, we should store operators/hooks/sensors in a file with the same name so that the user does not have to guess the name of the file.







[GitHub] [airflow] potiuk commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-898959652


   https://github.com/apache/airflow/pull/17616 should fix the instability, and #17617 is the longer-term change to use mocks.
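
As a rough illustration of the mock-based approach (not the content of either linked PR), the sketch below replaces a network-bound client with an autospec'd mock so the test never leaves the machine. `RemoteClient` and `key_is_present` are hypothetical names invented for this example; only `unittest.mock` from the standard library is used.

```python
from unittest import mock


class RemoteClient:
    """Hypothetical client whose real implementation would open a network connection."""

    def object_exists(self, key: str) -> bool:
        # Stand-in for a call that would time out when the region is unreachable.
        raise RuntimeError("network access is not available in unit tests")


def key_is_present(client: RemoteClient, key: str) -> bool:
    """Code under test: delegates the existence check to the client."""
    return client.object_exists(key)


def test_key_is_present_without_network() -> None:
    # The autospec'd mock keeps the real method signature but never hits the network.
    client = mock.create_autospec(RemoteClient, instance=True)
    client.object_exists.return_value = True

    assert key_is_present(client, "some/key.txt") is True
    client.object_exists.assert_called_once_with("some/key.txt")


test_key_is_present_without_network()
```

Because the mock is built from the class's spec, a call with the wrong arguments fails in the test rather than passing silently.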





[GitHub] [airflow] mik-laj commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r677120045



##########
File path: airflow/providers/alibabacloud/operators/oss_bucket.py
##########
@@ -0,0 +1,122 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+"""This module contains Alibaba Cloud OSS operators."""
+import oss2
+from typing import Optional
+
+from airflow.models import BaseOperator
+from airflow.providers.alibabacloud.hooks.oss import OSSHook
+
+
+class OSSCreateBucketOperator(BaseOperator):
+    """
+    This operator creates an OSS bucket
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:OSSCreateBucketOperator`
+
+    :param region: OSS region you want to create bucket
+    :type region: str
+    :param bucket_name: This is bucket name you want to create
+    :type bucket_name: str
+    :param oss_conn_id: The Airflow connection used for OSS credentials.
+    :type oss_conn_id: Optional[str]
+    :param auth: authentication for Alibaba Cloud user. If not specified fetched from connection.
+    :type auth: Optional[oss2.auth.Auth]
+    :param access_id: access key id for Alibaba Cloud user. If not specified fetched from connection.
+    :type access_id: Optional[str]
+    :param access_secret: access key secret for Alibaba Cloud user. If not specified fetched from connection.

Review comment:
       We should not allow credentials to be passed via parameters, as this is insecure. When we allow it, users very often save these credentials in the same file as the DAG, which may expose them to leakage. When credentials are saved with the connection, however, they are either encrypted in the database or can be stored securely in a secret backend.
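
A minimal sketch of the connection-only approach, assuming the access keys live in the connection's extra JSON under `access_key_id` and `access_key_secret` (the keys the quoted hook code reads). `ConnectionSketch` and `credentials_from_connection` are hypothetical names used so the example runs without Airflow.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ConnectionSketch:
    """Hypothetical stand-in for an Airflow connection record."""

    conn_id: str
    extra: str  # JSON string, as stored alongside the connection

    @property
    def extra_dejson(self) -> dict:
        # Decode the extra field; an empty string means no extra settings.
        return json.loads(self.extra) if self.extra else {}


def credentials_from_connection(conn: ConnectionSketch) -> Tuple[str, str]:
    """Read the access key pair from the connection only, never from task arguments."""
    extra = conn.extra_dejson
    access_key_id = extra.get("access_key_id")
    access_key_secret = extra.get("access_key_secret")
    if not access_key_id or not access_key_secret:
        raise ValueError(f"Connection {conn.conn_id!r} is missing access key fields")
    return access_key_id, access_key_secret
```

With this shape, a DAG file would carry only the connection id, and the secret values would stay in the metadata database or a secret backend.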







[GitHub] [airflow] mik-laj commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r677118461



##########
File path: airflow/providers/alibabacloud/hooks/oss.py
##########
@@ -0,0 +1,441 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from functools import wraps
+from inspect import signature
+import oss2
+from oss2.exceptions import ClientError
+from typing import Callable, TYPE_CHECKING, TypeVar, cast, Optional
+from urllib.parse import urlparse
+
+from airflow.exceptions import AirflowException
+from airflow.hooks.base import BaseHook
+
+
+if TYPE_CHECKING:
+    from airflow.models.connection import Connection
+
+T = TypeVar("T", bound=Callable)
+
+
+def provide_bucket_name(func: T) -> T:
+    """
+    Function decorator that unifies bucket name and key taken from the key
+    in case no bucket name and at least a key has been passed to the function.
+    """
+    function_signature = signature(func)
+
+    @wraps(func)
+    def wrapper(*args, **kwargs) -> T:
+        bound_args = function_signature.bind(*args, **kwargs)
+        self = args[0]
+
+        def get_credential(oss_conn, arguments) -> oss2.auth.Auth:
+            extra_config = oss_conn.extra_dejson
+            auth_type = extra_config.get('auth_type', None)
+            if not auth_type:
+                raise Exception("No auth_type specified in extra_config, either 'AK' or 'STS'")
+
+            if auth_type == 'AK':
+                oss_access_key_id = arguments['access_id'] \
+                    if 'access_id' in bound_args.arguments and bound_args.arguments['access_id'] is not None \
+                    else extra_config.get('access_key_id', None)
+                oss_access_key_secret = arguments['access_secret'] \
+                    if 'access_secret' in bound_args.arguments and bound_args.arguments['access_secret'] is not None \
+                    else extra_config.get('access_key_secret', None)
+                if not oss_access_key_id:
+                    raise Exception("No access_key_id is specified for connection: " + self.conn_id)
+                if not oss_access_key_secret:
+                    raise Exception("No access_key_secret is specified for connection: " + self.conn_id)
+                return oss2.Auth(oss_access_key_id, oss_access_key_secret)
+            else:
+                raise Exception("Unsupported auth_type: " + auth_type)
+
+        if 'auth' not in bound_args.arguments or bound_args.arguments['auth'] is None:
+            bound_args.arguments['auth'] = get_credential(self.oss_conn, bound_args.arguments)
+
+        if 'bucket_name' not in bound_args.arguments or bound_args.arguments['bucket_name'] is None:
+            if self.oss_conn_id:
+                connection = self.get_connection(self.oss_conn_id)
+                if connection.schema:
+                    bound_args.arguments['bucket_name'] = connection.schema

Review comment:
       Is this behavior documented? I don't see it in the ``docs/apache-airflow-providers-alibabacloud/connections/alibabacloud.rst`` file.
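
       For reference, the behavior in question is the fallback in the hunk above: when the caller passes no `bucket_name`, the decorator takes it from the connection's `schema` field. Below is a minimal, self-contained sketch of only that fallback. `FakeOSSHook` and its `object_path` method are hypothetical stand-ins for illustration and are not part of this PR. The hunk is cut off before the wrapped call, so the final call to `func` is an assumption about how the decorator finishes.

```python
from functools import wraps
from inspect import signature
from types import SimpleNamespace
from typing import Callable, Optional, TypeVar, cast

T = TypeVar("T", bound=Callable)


def provide_bucket_name(func: T) -> T:
    """Fill ``bucket_name`` from the connection schema when the caller omits it."""
    function_signature = signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound_args = function_signature.bind(*args, **kwargs)
        self = args[0]
        # bind() does not apply defaults, so an omitted bucket_name is simply absent.
        if bound_args.arguments.get("bucket_name") is None and self.oss_conn_id:
            connection = self.get_connection(self.oss_conn_id)
            if connection.schema:
                bound_args.arguments["bucket_name"] = connection.schema
        # Assumed completion: call the wrapped method with the updated arguments.
        return func(*bound_args.args, **bound_args.kwargs)

    return cast(T, wrapper)


class FakeOSSHook:
    """Hypothetical stand-in for the hook; it only mimics the attributes used above."""

    def __init__(self, oss_conn_id: str, schema: Optional[str]) -> None:
        self.oss_conn_id = oss_conn_id
        self._schema = schema

    def get_connection(self, conn_id: str) -> SimpleNamespace:
        # A real hook would look the connection up in Airflow's metadata store.
        return SimpleNamespace(conn_id=conn_id, schema=self._schema)

    @provide_bucket_name
    def object_path(self, key: str, bucket_name: Optional[str] = None) -> str:
        return f"{bucket_name}/{key}"
```

       With this sketch, an explicitly passed `bucket_name` always wins, and the schema is consulted only when none is given. That rule is what the connection documentation would need to state.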







[GitHub] [airflow] Gabriel39 commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-887510974


   > What do you think about adding system tests to these operators?
   
   Hi @mik-laj, I updated my code and resubmitted this PR. Thank you for your advice!





[GitHub] [airflow] potiuk commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r684635843



##########
File path: airflow/providers/alibaba/CHANGELOG.rst
##########
@@ -0,0 +1,25 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+
+Changelog
+---------
+
+2.1.0

Review comment:
       ```suggestion
   1.0.0
   ```

##########
File path: airflow/providers/alibaba/provider.yaml
##########
@@ -0,0 +1,54 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+---
+package-name: apache-airflow-providers-alibaba
+name: Alibaba
+description: |
+    Alibaba Cloud integration (including `Alibaba Cloud <https://www.alibabacloud.com//>`__).
+
+versions:
+  - 2.1.0

Review comment:
       ```suggestion
     - 1.0.0
   ```







[GitHub] [airflow] Gabriel39 commented on pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
Gabriel39 commented on pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#issuecomment-894660197


   > Looks fantastic. Just one small NIT - the provider version should be 1.0.0
   
   @potiuk Updated. Thx :)





[GitHub] [airflow] kaxil commented on a change in pull request #17201: [AIRFLOW-17200] Add Alibaba Cloud OSS support

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #17201:
URL: https://github.com/apache/airflow/pull/17201#discussion_r679568071



##########
File path: CONTRIBUTING.rst
##########
@@ -654,6 +654,7 @@ Here is the list of packages and their extras:
 Package                    Extras
 ========================== ===========================
 airbyte                    http
+alibabacloud               oss2

Review comment:
       What do you all think about having `alibaba.cloud`, similar to `google.cloud`, with a similar directory structure?
   
   cc @eladkal 



