You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/05/24 08:15:57 UTC

[GitHub] [airflow] potiuk opened a new pull request #8994: Fix name of google spreadsheets operator

potiuk opened a new pull request #8994:
URL: https://github.com/apache/airflow/pull/8994


   This change is based on #8991 so please take a look only at the last commit
   
   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633655831






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429681407



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:

Review comment:
       why do we need to add this one back?
   
   This means the API of the operator will be different between backport and eventual 2.0.
   
   If they want the old interface, user's can stick with the non backported op

##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       But 1.10.10 does support tags.
   
   (I can't work out from this fn where it is used.)

##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):

Review comment:
       these functions would be easier to understand if they had before -> after examples in a comment/docstring.
   
   could you add some please?

##########
File path: scripts/ci/in_container/run_test_package_import_all_classes.sh
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# shellcheck source=scripts/ci/in_container/_in_container_script_init.sh
+. "$( dirname "${BASH_SOURCE[0]}" )/_in_container_script_init.sh"
+
+echo
+echo "Testing if all classes in import packages can be imported"
+echo
+
+OUT_FILE=$(mktemp)
+
+if [[ ! ${INSTALL_AIRFLOW_VERSION:=""} =~ 1.10* ]]; then
+    echo
+    echo "ERROR! You can only install providers package in 1.10. airflow series."
+    echo "You have: ${INSTALL_AIRFLOW_VERSION}"
+    echo "Set INSTALL_AIRFLOW_VERSION variable to the version you want to install before running!"
+    exit 1
+else
+    pushd /airflow_sources || exit
+    echo
+    echo "Installing remaining packages from 'all' extras"
+    echo
+    pip install ".[all]" >>"${OUT_FILE}" 2>&1
+    echo
+    echo "Uninstalling airflow after that"
+    echo
+    pip uninstall -y apache-airflow >>"${OUT_FILE}"  2>&1
+    popd || exit
+    echo
+    echo "Install airflow from PyPI - ${INSTALL_AIRFLOW_VERSION}"
+    echo
+    pip install "apache-airflow==${INSTALL_AIRFLOW_VERSION}" >>"${OUT_FILE}" 2>&1

Review comment:
       how long does this test take? (can't check so easily on mobile.)
   
   I only wonder if this one is not worth running on every PR and commit.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429754775



##########
File path: scripts/ci/in_container/run_test_package_import_all_classes.sh
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# shellcheck source=scripts/ci/in_container/_in_container_script_init.sh
+. "$( dirname "${BASH_SOURCE[0]}" )/_in_container_script_init.sh"
+
+echo
+echo "Testing if all classes in import packages can be imported"
+echo
+
+OUT_FILE=$(mktemp)
+
+if [[ ! ${INSTALL_AIRFLOW_VERSION:=""} =~ 1.10* ]]; then
+    echo
+    echo "ERROR! You can only install providers package in 1.10. airflow series."
+    echo "You have: ${INSTALL_AIRFLOW_VERSION}"
+    echo "Set INSTALL_AIRFLOW_VERSION variable to the version you want to install before running!"
+    exit 1
+else
+    pushd /airflow_sources || exit
+    echo
+    echo "Installing remaining packages from 'all' extras"
+    echo
+    pip install ".[all]" >>"${OUT_FILE}" 2>&1
+    echo
+    echo "Uninstalling airflow after that"
+    echo
+    pip uninstall -y apache-airflow >>"${OUT_FILE}"  2>&1
+    popd || exit
+    echo
+    echo "Install airflow from PyPI - ${INSTALL_AIRFLOW_VERSION}"
+    echo
+    pip install "apache-airflow==${INSTALL_AIRFLOW_VERSION}" >>"${OUT_FILE}" 2>&1

Review comment:
       It's two orders of magnitude shorter than any other "real" tests. I am sure It's worth it for sure, It takes less than 2 minutes. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429798826



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):

Review comment:
       I added a lot of examples and explanations in #8991   - that's where it belongs. Note that this is just a follow-up after that one, so you can check it there @ashb .




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429680656



##########
File path: airflow/config_templates/config.yml
##########
@@ -672,7 +672,7 @@
       version_added: ~
       type: string
       example: ~
-      default: "Airflow HiveOperator task for {{hostname}}.{{dag_id}}.{{task_id}}.{{execution_date}}"

Review comment:
       What's this change for?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633276444


   Hey @kaxil  @ashb @BasPH and @mik-laj -> I fixed the remaining 20 class names now. The CI tests will fail in the future in case somebody tries to commit a wrongly named class as well :).
   
   ```
   All good! All 573 classes are properly named
   ```
   
   Plus we will see in the CI job colorful output of what has changed in each of the packages (in case there are some changes) :). 
   
   I fixed most of the tests and likely docs will be failing and need some fix - however @mik-laj -> maybe you can help. I see 14 tests failing in test_to_contrib and I am not sure where they are coming from - it all looks legit - I must have missed something. Can you help to fix it please? 
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633196635


   I want to check other names as well. Give me five minutes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633196635


   I want to check other names as well. Give me 15 minutes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429755224



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:

Review comment:
       Yes. @mik-laj explained it well. We have to do it this way,




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429800720



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       @ashb - also - note this tag removal will only apply to examples, not to the code itself. If you are using the backported operators in Airflow 1.10.10 you can still use tags. I explained that in the comments and examples I added in #8991 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633197256


   Could you look at the operators below?
   ```
   airflow.operators.generic_transfer.GenericTransfer
   airflow.providers.amazon.aws.operators.google_api_to_s3_transfer.GoogleApiToS3Transfer
   airflow.providers.amazon.aws.operators.redshift_to_s3.RedshiftToS3Transfer
   airflow.providers.amazon.aws.operators.s3_to_redshift.S3ToRedshiftTransfer
   airflow.providers.google.cloud.operators.gcs_to_gcs.GCSSynchronizeBuckets
   airflow.providers.google.suite.operators.sheets.GoogleSheetsCreateSpreadsheet
   airflow.providers.microsoft.azure.operators.oracle_to_azure_data_lake_transfer.OracleToAzureDataLakeTransfer
   airflow.providers.oracle.operators.oracle_to_oracle_transfer.OracleToOracleTransfer
   airflow.providers.snowflake.operators.s3_to_snowflake.S3ToSnowflakeTransfer
   airflow.providers.mysql.operators.presto_to_mysql.PrestoToMySqlTransfer
   airflow.providers.mysql.operators.s3_to_mysql.S3ToMySqlTransfer
   airflow.providers.mysql.operators.vertica_to_mysql.VerticaToMySqlTransfer
   airflow.providers.apache.druid.operators.hive_to_druid.HiveToDruidTransfer
   airflow.providers.apache.hive.operators.hive_to_mysql.HiveToMySqlTransfer
   airflow.providers.apache.hive.operators.mssql_to_hive.MsSqlToHiveTransfer
   airflow.providers.apache.hive.operators.mysql_to_hive.MySqlToHiveTransfer
   airflow.providers.apache.hive.operators.s3_to_hive.S3ToHiveTransfer
   airflow.providers.apache.hive.operators.vertica_to_hive.VerticaToHiveTransfer
   airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensorFolder
   airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensorRegex
   airflow.providers.amazon.aws.hooks.batch_client.AwsBatchClient
   airflow.providers.amazon.aws.hooks.batch_waiters.AwsBatchWaiters
   ```
   ```
   python scripts/list-integrations.py | tee /dev/tty > a.txt
   cat a.txt | grep -v "Hook$" | grep -v "Sensor$" | grep -v "Operator$" | grep -v "secrets"
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429887329



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       Ye. Will do. I have one more thing to fix so I add it there.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #8994:
URL: https://github.com/apache/airflow/pull/8994


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633282806


   For docs, you should use this file:
   https://raw.githubusercontent.com/PolideaInternal/airflow/fix-name-of-google-spreadsheets-operator/docs/howto/operator/gcp/sheets.rst
   <img width="817" alt="Screenshot 2020-05-24 at 21 27 59" src="https://user-images.githubusercontent.com/12058428/82763109-7e58df00-9e05-11ea-93be-082f1faa2f13.png">
   
   For the others, I'm waiting for the CI result.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633259658


   Automated check during readme generation: Fixing it now:
   
   ```
   The class airflow.providers.amazon.aws.operators.google_api_to_s3_transfer.GoogleApiToS3Transfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.amazon.aws.operators.redshift_to_s3.RedshiftToS3Transfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.amazon.aws.operators.s3_to_redshift.S3ToRedshiftTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.amazon.aws.hooks.batch_client.AwsBatchClient is wrongly named. It is one of the HOOKS so it should end with Hook
   The class airflow.providers.amazon.aws.hooks.batch_waiters.AwsBatchWaiters is wrongly named. It is one of the HOOKS so it should end with Hook
   
   ERROR! There are 5 classes badly named out of 77 classes for amazon
   
   The class airflow.providers.apache.druid.operators.hive_to_druid.HiveToDruidTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   
   ERROR! There are 1 classes badly named out of 5 classes for apache.druid
   
   The class airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensorRegex is wrongly named. It is one of the SENSORS so it should end with Sensor
   The class airflow.providers.apache.hdfs.sensors.hdfs.HdfsSensorFolder is wrongly named. It is one of the SENSORS so it should end with Sensor
   
   ERROR! There are 2 classes badly named out of 6 classes for apache.hdfs
   
   The class airflow.providers.apache.hive.operators.mysql_to_hive.MySqlToHiveTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.apache.hive.operators.mssql_to_hive.MsSqlToHiveTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.apache.hive.operators.s3_to_hive.S3ToHiveTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.apache.hive.operators.vertica_to_hive.VerticaToHiveTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.apache.hive.operators.hive_to_mysql.HiveToMySqlTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   
   ERROR! There are 5 classes badly named out of 14 classes for apache.hive
   
   The class airflow.providers.google.cloud.operators.gcs_to_gcs.GCSSynchronizeBuckets is wrongly named. It is one of the OPERATORS so it should end with Operator
   
   ERROR! There are 1 classes badly named out of 325 classes for google
   
   The class airflow.providers.microsoft.azure.operators.oracle_to_azure_data_lake_transfer.OracleToAzureDataLakeTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   
   ERROR! There are 1 classes badly named out of 20 classes for microsoft.azure
   
   The class airflow.providers.mysql.operators.presto_to_mysql.PrestoToMySqlTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.mysql.operators.s3_to_mysql.S3ToMySqlTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   The class airflow.providers.mysql.operators.vertica_to_mysql.VerticaToMySqlTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   
   ERROR! There are 3 classes badly named out of 5 classes for mysql
   
   The class airflow.providers.oracle.operators.oracle_to_oracle_transfer.OracleToOracleTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   
   ERROR! There are 1 classes badly named out of 3 classes for oracle
   
   The class airflow.providers.snowflake.operators.s3_to_snowflake.S3ToSnowflakeTransfer is wrongly named. It is one of the OPERATORS so it should end with Operator
   
   ERROR! There are 1 classes badly named out of 3 classes for snowflake
   
   
   ERROR! There are in total: 20 classes badly named out of 573 classes 
   
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429683224



##########
File path: airflow/config_templates/config.yml
##########
@@ -672,7 +672,7 @@
       version_added: ~
       type: string
       example: ~
-      default: "Airflow HiveOperator task for {{hostname}}.{{dag_id}}.{{task_id}}.{{execution_date}}"

Review comment:
       The double `{` character was used here, so everything should work.
   ```
   >>> "{{ AA }}".format("AA")
   '{ AA }'
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429683048



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:

Review comment:
       The PythonOperaotra interface has changed in Airflow 2.0. In airflow 1.10, you had to provide additional options for the context to be passed. In Airflow 2.0, this happens automatically. For the example to work properly on Airflow 1.10, we need to add this parameter.
   https://github.com/apache/airflow/pull/5990
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429755461



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       Exactly. We want those operators to also work in previous versions of  Airflow, for those users that cannot migrate.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429802260



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       Just FYI - In the provider packages  - examples are part of the providers and we want the examples to continue being runnable and importable as well for more than airflow 1.10.10. Especially that we are planning to add AIP-4 - System Test automation using those example dags (and so far we are using those example dags to run our manually-triggered system tests, that we run for more versions of Airflow than 1.10.10 .




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429824769



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:

Review comment:
       Yes, I'm aware of the change to PythonOperator.
   
   My point is: why do we need to reverse that change for backported operators? Isn't the point of backporting to make the new interface available?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429825301



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       Cool, I did wonder where this might be used, if it's examples only this sound good.
   
   Can you add the reason for doing this change as a comment.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429760841



##########
File path: scripts/ci/in_container/run_test_package_import_all_classes.sh
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# shellcheck source=scripts/ci/in_container/_in_container_script_init.sh
+. "$( dirname "${BASH_SOURCE[0]}" )/_in_container_script_init.sh"
+
+echo
+echo "Testing if all classes in import packages can be imported"
+echo
+
+OUT_FILE=$(mktemp)
+
+if [[ ! ${INSTALL_AIRFLOW_VERSION:=""} =~ 1.10* ]]; then
+    echo
+    echo "ERROR! You can only install providers package in 1.10. airflow series."
+    echo "You have: ${INSTALL_AIRFLOW_VERSION}"
+    echo "Set INSTALL_AIRFLOW_VERSION variable to the version you want to install before running!"
+    exit 1
+else
+    pushd /airflow_sources || exit
+    echo
+    echo "Installing remaining packages from 'all' extras"
+    echo
+    pip install ".[all]" >>"${OUT_FILE}" 2>&1
+    echo
+    echo "Uninstalling airflow after that"
+    echo
+    pip uninstall -y apache-airflow >>"${OUT_FILE}"  2>&1
+    popd || exit
+    echo
+    echo "Install airflow from PyPI - ${INSTALL_AIRFLOW_VERSION}"
+    echo
+    pip install "apache-airflow==${INSTALL_AIRFLOW_VERSION}" >>"${OUT_FILE}" 2>&1

Review comment:
       Testing for imports 45 seconds on my machine. All those tests are done on one job  Installation of all packages one-by-one takes a biit longer but still it is single minutes rather than 30-40 minutes for the whole suite of tests.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429683106



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       We tested operators on Airflow 1.10.6 and 1.10.10 to encourage more users to migrate.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj edited a comment on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
mik-laj edited a comment on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633282806


   For docs, you should update this file:
   https://raw.githubusercontent.com/PolideaInternal/airflow/fix-name-of-google-spreadsheets-operator/docs/howto/operator/gcp/sheets.rst
   <img width="817" alt="Screenshot 2020-05-24 at 21 27 59" src="https://user-images.githubusercontent.com/12058428/82763109-7e58df00-9e05-11ea-93be-082f1faa2f13.png">
   
   For the others, I'm waiting for the CI result.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429755461



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):
+        # noinspection PyUnusedLocal
+        def remove_super_init_call_modifier(node: LN, capture: Capture, filename: Filename) -> None:
+            for ch in node.post_order():
+                if isinstance(ch, Leaf) and ch.value == "super":
+                    if any(c.value for c in ch.parent.post_order() if isinstance(c, Leaf)):
+                        ch.parent.remove()
+
+        self.qry.select_subclass("BaseHook").modify(remove_super_init_call_modifier)
+
+    def remove_tags(self):
+        # noinspection PyUnusedLocal
+        def remove_tags_modifier(_: LN, capture: Capture, filename: Filename) -> None:
+            for node in capture['function_arguments'][0].post_order():
+                if isinstance(node, Leaf) and node.value == "tags" and node.type == TOKEN.NAME:

Review comment:
       Exactly. We want those operators to also work in previous versions of  Airflow, for those users that cannot migrate easily. Our compatibility tests were from the beginning done on 1.10.6 and  1.10.10. There are a lot of users out there who will not migrate for quite a while.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429754433



##########
File path: airflow/config_templates/config.yml
##########
@@ -672,7 +672,7 @@
       version_added: ~
       type: string
       example: ~
-      default: "Airflow HiveOperator task for {{hostname}}.{{dag_id}}.{{task_id}}.{{execution_date}}"

Review comment:
       Yeahj. I was strugling with this one. Hive operator is wrongly written. It expect the configuration to contain {} references to parameters it passes, but when airflow reads the values from config, replaces it from environment variables (which are missing). So basically setting configuration for hive operator is wrong right now. 
   
   The problem was that Hive Operator when imported tried to read that configuration and since it was missing, it failed. I think I have it right finally (fallback in the operator but no defaults in .yaml/cfg files). Let's see how the tests work it out. Eventually we might want to fix the operator.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633202361


   I will fix those and build in protection 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429757858



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:
+            fn_args = capture['function_arguments'][0]
+            fn_args.append_child(Comma())
+
+            provide_context_arg = KeywordArg(Name('provide_context'), Name('True'))
+            provide_context_arg.prefix = fn_args.children[0].prefix
+            fn_args.append_child(provide_context_arg)
+
+        (
+            self.qry.
+            select_function("PythonOperator").
+            is_call().
+            is_filename(include=r"mlengine_operator_utils.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+        (
+            self.qry.
+            select_function("BranchPythonOperator").
+            is_call().
+            is_filename(include=r"example_google_api_to_s3_transfer_advanced.py$").
+            modify(add_provide_context_to_python_operator)
+        )
+
+    def remove_super_init_call(self):

Review comment:
       Sure. No problem. I can actually take it from the output of refactor. While refactoring you see colorful diff of all the changes made. It's just the matter of copy&pasting to the right place.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429754775



##########
File path: scripts/ci/in_container/run_test_package_import_all_classes.sh
##########
@@ -0,0 +1,62 @@
+#!/usr/bin/env bash
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# shellcheck source=scripts/ci/in_container/_in_container_script_init.sh
+. "$( dirname "${BASH_SOURCE[0]}" )/_in_container_script_init.sh"
+
+echo
+echo "Testing if all classes in import packages can be imported"
+echo
+
+OUT_FILE=$(mktemp)
+
+if [[ ! ${INSTALL_AIRFLOW_VERSION:=""} =~ 1.10* ]]; then
+    echo
+    echo "ERROR! You can only install providers package in 1.10. airflow series."
+    echo "You have: ${INSTALL_AIRFLOW_VERSION}"
+    echo "Set INSTALL_AIRFLOW_VERSION variable to the version you want to install before running!"
+    exit 1
+else
+    pushd /airflow_sources || exit
+    echo
+    echo "Installing remaining packages from 'all' extras"
+    echo
+    pip install ".[all]" >>"${OUT_FILE}" 2>&1
+    echo
+    echo "Uninstalling airflow after that"
+    echo
+    pip uninstall -y apache-airflow >>"${OUT_FILE}"  2>&1
+    popd || exit
+    echo
+    echo "Install airflow from PyPI - ${INSTALL_AIRFLOW_VERSION}"
+    echo
+    pip install "apache-airflow==${INSTALL_AIRFLOW_VERSION}" >>"${OUT_FILE}" 2>&1

Review comment:
       It's two orders of magnitude shorter than any other tests. I am sure It's worth it for sure, It takes less than 2 minutes. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429833732



##########
File path: backport_packages/refactor_backport_packages.py
##########
@@ -0,0 +1,369 @@
+#!/usr/bin/env python3
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from os.path import dirname
+from shutil import copyfile, copytree, rmtree
+from typing import List
+
+from backport_packages.setup_backport_packages import (
+    get_source_airflow_folder, get_source_providers_folder, get_target_providers_folder,
+    get_target_providers_package_folder, is_bigquery_non_dts_module,
+)
+from bowler import LN, TOKEN, Capture, Filename, Query
+from fissix.fixer_util import Comma, KeywordArg, Name
+from fissix.pytree import Leaf
+
+CLASS_TYPES = ["hooks", "operators", "sensors", "secrets", "protocols"]
+
+
+def copy_provider_sources() -> None:
+    """
+    Copies provider sources to directory where they will be refactored.
+    """
+    def rm_build_dir() -> None:
+        """
+        Removes build directory.
+        """
+        build_dir = os.path.join(dirname(__file__), "build")
+        if os.path.isdir(build_dir):
+            rmtree(build_dir)
+
+    def ignore_bigquery_files(src: str, names: List[str]) -> List[str]:
+        """
+        Ignore files with bigquery
+        :param src: source file
+        :param names: Name of the file
+        :return:
+        """
+        ignored_names = []
+        if any([src.endswith(os.path.sep + class_type) for class_type in CLASS_TYPES]):
+            ignored_names = [name for name in names
+                             if is_bigquery_non_dts_module(module_name=name)]
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                file_path = src + os.path.sep + file_name
+                with open(file_path, "rt") as file:
+                    text = file.read()
+                if any([f"airflow.providers.google.cloud.{class_type}.bigquery" in text
+                        for class_type in CLASS_TYPES]) or "_to_bigquery" in text:
+                    print(f"Ignoring {file_path}")
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_kubernetes_files(src: str, names: List[str]) -> List[str]:
+        ignored_names = []
+        if src.endswith(os.path.sep + "example_dags"):
+            for file_name in names:
+                if "example_kubernetes" in file_name:
+                    ignored_names.append(file_name)
+        return ignored_names
+
+    def ignore_some_files(src: str, names: List[str]) -> List[str]:
+        ignored_list = ignore_bigquery_files(src=src, names=names)
+        ignored_list.extend(ignore_kubernetes_files(src=src, names=names))
+        return ignored_list
+
+    rm_build_dir()
+    package_providers_dir = get_target_providers_folder()
+    if os.path.isdir(package_providers_dir):
+        rmtree(package_providers_dir)
+    copytree(get_source_providers_folder(), get_target_providers_folder(), ignore=ignore_some_files)
+
+
+class RefactorBackportPackages:
+    """
+    Refactors the code of providers, so that it works in 1.10.
+
+    """
+
+    def __init__(self):
+        self.qry = Query()
+
+    def remove_class(self, class_name) -> None:
+        # noinspection PyUnusedLocal
+        def _remover(node: LN, capture: Capture, filename: Filename) -> None:
+            if node.type not in (300, 311):  # remove only definition
+                node.remove()
+
+        self.qry.select_class(class_name).modify(_remover)
+
+    def rename_deprecated_modules(self):
+        changes = [
+            ("airflow.operators.bash", "airflow.operators.bash_operator"),
+            ("airflow.operators.python", "airflow.operators.python_operator"),
+            ("airflow.utils.session", "airflow.utils.db"),
+            (
+                "airflow.providers.cncf.kubernetes.operators.kubernetes_pod",
+                "airflow.contrib.operators.kubernetes_pod_operator"
+            ),
+        ]
+        for new, old in changes:
+            self.qry.select_module(new).rename(old)
+
+    def add_provide_context_to_python_operators(self):
+        # noinspection PyUnusedLocal
+        def add_provide_context_to_python_operator(node: LN, capture: Capture, filename: Filename) -> None:

Review comment:
       Good question. And we have good answers I think:
   
   First of all: this change (see #8991 for examples) only applies to example days that use the Python operators.
   
   Secondly: We are not backporting (and we've never planned to backport)  Python operators - they are part of the "core" operators, not the providers. The discussion we had back then was that some of the core operators (Python/Bash/Branch/Dummy/LatestOnly/BranchPython, ShortCircuit/PythonVirtualEnv - I believe that's a complete list but I could miss one or two) are really part of the Airflow Core and should stay there. The idea is that without those operators Airflow is fairly useless so they were supposed to be delivered with Airflow itself. They were moved to another location to make the naming consistent (but with deprecation redirection from old locations). But if any example uses any of them (quite a few) we had to refactor the examples to make them work with the 1.10.* version of those operators. That was a similar thing with tags (which we removed from examples so that the examples are applicable to pre-1.10.* version of the providers as well).
   
   So the decision was that they are going to stay in core and that their version is linked to Airflow version they are from.
   
   Obviously we could revert that decision  - but if we do so - we should add another provider ("core" ?). And we probably need to re-open discussion about that - because we all agreed on that before when we presented list of operators and explained that those will stay in core. 
   
   I do not think it's a blocker for this backport package release I believe) - It can be done easily afterwards as a separate release. Let me know if you think otherwise.  
   
   Feel free to open discussion on that @ash on devlist if you think it makes sense to release some of those as separate "provider".
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #8994: Fixed name of 20 remaining wrongly named operators.

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#discussion_r429680819



##########
File path: airflow/config_templates/config.yml
##########
@@ -672,7 +672,7 @@
       version_added: ~
       type: string
       example: ~
-      default: "Airflow HiveOperator task for {{hostname}}.{{dag_id}}.{{task_id}}.{{execution_date}}"

Review comment:
       Since this goes on the default.cfg, which is then `format()`ed fairly sure you don't want this change




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #8994: Fix name of google spreadsheets operator

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #8994:
URL: https://github.com/apache/airflow/pull/8994#issuecomment-633282806


   For docs, you should use this file:
   https://raw.githubusercontent.com/PolideaInternal/airflow/fix-name-of-google-spreadsheets-operator/docs/howto/operator/gcp/sheets.rst
   <img width="817" alt="Screenshot 2020-05-24 at 21 27 59" src="https://user-images.githubusercontent.com/12058428/82763109-7e58df00-9e05-11ea-93be-082f1faa2f13.png">
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org