Posted to commits@airflow.apache.org by pi...@apache.org on 2023/01/12 00:00:55 UTC

[airflow] branch v2-5-test updated (93a7e5fc18 -> 2f5060edef)

This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a change to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


 discard 93a7e5fc18 Change Architecture and OperatingSystem classes into Enums (#28627)
 discard d65fd66e46 Update pre-commit hooks (#28567)
 discard 0902436028 Add back join to zombie query that was dropped in #28198 (#28544)
 discard 92c0b34571 Cleanup and do housekeeping with plugin examples (#28537)
 discard df9cd4c9b7 Improve provider validation pre-commit (#28516)
 discard 32a2bb67fd Remove extra H1 & improve formatting of Listeners docs page (#28450)
     new b607a287b8 Improve provider validation pre-commit (#28516)
     new 6d65da72db Add back join to zombie query that was dropped in #28198 (#28544)
     new f56fd84b4c Update pre-commit hooks (#28567)
     new 2f5060edef Change Architecture and OperatingSystem classes into Enums (#28627)

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (93a7e5fc18)
            \
             N -- N -- N   refs/heads/v2-5-test (2f5060edef)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .dockerignore                                      |   1 +
 .../airflow_breeze/utils/docker_command_utils.py   |   1 +
 dev/breeze/tests/test_commands.py                  |   8 +-
 docs/apache-airflow/empty_plugin/empty_plugin.py   |  60 -------
 docs/apache-airflow/howto/custom-view-plugin.rst   |  72 ++++++--
 docs/apache-airflow/listeners.rst                  |  54 +++---
 docs/apache-airflow/plugins.rst                    |  12 --
 .../empty_plugin => metastore_browser}/README.md   |   4 +-
 metastore_browser/hive_metastore.py                | 199 +++++++++++++++++++++
 .../templates/metastore_browser/base.html          |  20 ++-
 .../templates/metastore_browser/db.html            |  36 ++--
 .../templates/metastore_browser/dbs.html           |  11 +-
 .../templates/metastore_browser/table.html         | 152 ++++++++++++++++
 scripts/ci/docker-compose/local.yml                |   3 +
 14 files changed, 495 insertions(+), 138 deletions(-)
 delete mode 100644 docs/apache-airflow/empty_plugin/empty_plugin.py
 rename {docs/apache-airflow/empty_plugin => metastore_browser}/README.md (90%)
 create mode 100644 metastore_browser/hive_metastore.py
 rename docs/apache-airflow/empty_plugin/templates/empty_plugin/index.html => metastore_browser/templates/metastore_browser/base.html (57%)
 copy airflow/www/templates/airflow/xcom.html => metastore_browser/templates/metastore_browser/db.html (57%)
 copy airflow/www/templates/airflow/noaccess.html => metastore_browser/templates/metastore_browser/dbs.html (83%)
 create mode 100644 metastore_browser/templates/metastore_browser/table.html


[airflow] 03/04: Update pre-commit hooks (#28567)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit f56fd84b4c6f1d2a3d5491458df869263b6b3989
Author: KarshVashi <41...@users.noreply.github.com>
AuthorDate: Sat Dec 24 01:03:59 2022 +0000

    Update pre-commit hooks (#28567)
    
    (cherry picked from commit 837e0fe2ea8859ae879d8382142c29a6416f02b9)
---
 .pre-commit-config.yaml                                  |  8 ++++----
 airflow/www/fab_security/manager.py                      |  2 +-
 .../src/airflow_breeze/commands/testing_commands.py      |  2 +-
 dev/provider_packages/prepare_provider_packages.py       | 16 ++++++++--------
 docs/exts/docs_build/docs_builder.py                     |  4 ++--
 docs/exts/extra_files_with_substitutions.py              |  2 +-
 docs/exts/provider_init_hack.py                          |  2 +-
 kubernetes_tests/test_base.py                            |  2 +-
 tests/jobs/test_triggerer_job.py                         |  2 +-
 9 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index a6ed9b1f4d..577f0a1dda 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -148,7 +148,7 @@ repos:
           \.cfg$|\.conf$|\.ini$|\.ldif$|\.properties$|\.readthedocs$|\.service$|\.tf$|Dockerfile.*$
   # Keep version of black in sync wit blacken-docs and pre-commit-hook-names
   - repo: https://github.com/psf/black
-    rev: 22.3.0
+    rev: 22.12.0
     hooks:
       - id: black
         name: Run black (python formatter)
@@ -210,7 +210,7 @@ repos:
         pass_filenames: true
   # TODO: Bump to Python 3.8 when support for Python 3.7 is dropped in Airflow.
   - repo: https://github.com/asottile/pyupgrade
-    rev: v2.32.1
+    rev: v3.3.1
     hooks:
       - id: pyupgrade
         name: Upgrade Python code automatically
@@ -259,7 +259,7 @@ repos:
           ^airflow/_vendor/
         additional_dependencies: ['toml']
   - repo: https://github.com/asottile/yesqa
-    rev: v1.3.0
+    rev: v1.4.0
     hooks:
       - id: yesqa
         name: Remove unnecessary noqa statements
@@ -268,7 +268,7 @@ repos:
           ^airflow/_vendor/
         additional_dependencies: ['flake8>=4.0.1']
   - repo: https://github.com/ikamensh/flynt
-    rev: '0.76'
+    rev: '0.77'
     hooks:
       - id: flynt
         name: Run flynt string format converter for Python
diff --git a/airflow/www/fab_security/manager.py b/airflow/www/fab_security/manager.py
index 96649b046b..ea8918053c 100644
--- a/airflow/www/fab_security/manager.py
+++ b/airflow/www/fab_security/manager.py
@@ -1013,7 +1013,7 @@ class BaseSecurityManager:
 
     @staticmethod
     def ldap_extract(ldap_dict: dict[str, list[bytes]], field_name: str, fallback: str) -> str:
-        raw_value = ldap_dict.get(field_name, [bytes()])
+        raw_value = ldap_dict.get(field_name, [b""])
         # decode - if empty string, default to fallback, otherwise take first element
         return raw_value[0].decode("utf-8") or fallback
 
diff --git a/dev/breeze/src/airflow_breeze/commands/testing_commands.py b/dev/breeze/src/airflow_breeze/commands/testing_commands.py
index 33781e3373..58d0b509a9 100644
--- a/dev/breeze/src/airflow_breeze/commands/testing_commands.py
+++ b/dev/breeze/src/airflow_breeze/commands/testing_commands.py
@@ -181,7 +181,7 @@ def _run_test(
             for container_id in container_ids:
                 dump_path = FILES_DIR / f"container_logs_{container_id}_{date_str}.log"
                 get_console(output=output).print(f"[info]Dumping container {container_id} to {dump_path}")
-                with open(dump_path, "wt") as outfile:
+                with open(dump_path, "w") as outfile:
                     run_command(["docker", "logs", container_id], check=False, stdout=outfile)
     finally:
         run_command(
diff --git a/dev/provider_packages/prepare_provider_packages.py b/dev/provider_packages/prepare_provider_packages.py
index 2ef0859c89..ed1afb1e8f 100755
--- a/dev/provider_packages/prepare_provider_packages.py
+++ b/dev/provider_packages/prepare_provider_packages.py
@@ -1110,7 +1110,7 @@ def prepare_readme_file(context):
         template_name="PROVIDER_README", context=context, extension=".rst"
     )
     readme_file_path = os.path.join(TARGET_PROVIDER_PACKAGES_PATH, "README.rst")
-    with open(readme_file_path, "wt") as readme_file:
+    with open(readme_file_path, "w") as readme_file:
         readme_file.write(readme_content)
 
 
@@ -1182,7 +1182,7 @@ def mark_latest_changes_as_documentation_only(provider_package_id: str, latest_c
         "as doc-only changes!"
     )
     with open(
-        os.path.join(provider_details.source_provider_package_path, ".latest-doc-only-change.txt"), "tw"
+        os.path.join(provider_details.source_provider_package_path, ".latest-doc-only-change.txt"), "w"
     ) as f:
         f.write(latest_change.full_hash + "\n")
         # exit code 66 marks doc-only change marked
@@ -1311,7 +1311,7 @@ def replace_content(file_path, old_text, new_text, provider_package_id):
         try:
             if os.path.isfile(file_path):
                 copyfile(file_path, temp_file_path)
-            with open(file_path, "wt") as readme_file:
+            with open(file_path, "w") as readme_file:
                 readme_file.write(new_text)
             console.print()
             console.print(f"Generated {file_path} file for the {provider_package_id} provider")
@@ -1401,7 +1401,7 @@ def prepare_setup_py_file(context):
     setup_py_content = render_template(
         template_name=setup_py_template_name, context=context, extension=".py", autoescape=False
     )
-    with open(setup_py_file_path, "wt") as setup_py_file:
+    with open(setup_py_file_path, "w") as setup_py_file:
         setup_py_file.write(black_format(setup_py_content))
 
 
@@ -1415,7 +1415,7 @@ def prepare_setup_cfg_file(context):
         autoescape=False,
         keep_trailing_newline=True,
     )
-    with open(setup_cfg_file_path, "wt") as setup_cfg_file:
+    with open(setup_cfg_file_path, "w") as setup_cfg_file:
         setup_cfg_file.write(setup_cfg_content)
 
 
@@ -1434,7 +1434,7 @@ def prepare_get_provider_info_py_file(context, provider_package_id: str):
         autoescape=False,
         keep_trailing_newline=True,
     )
-    with open(get_provider_file_path, "wt") as get_provider_file:
+    with open(get_provider_file_path, "w") as get_provider_file:
         get_provider_file.write(black_format(get_provider_content))
 
 
@@ -1447,7 +1447,7 @@ def prepare_manifest_in_file(context):
         autoescape=False,
         keep_trailing_newline=True,
     )
-    with open(target, "wt") as fh:
+    with open(target, "w") as fh:
         fh.write(content)
 
 
@@ -1840,7 +1840,7 @@ def generate_new_changelog(package_id, provider_details, changelog_path, changes
         console.print(
             f"[green]Appending the provider {package_id} changelog for `{latest_version}` version.[/]"
         )
-    with open(changelog_path, "wt") as changelog:
+    with open(changelog_path, "w") as changelog:
         changelog.write("\n".join(new_changelog_lines))
         changelog.write("\n")
 
diff --git a/docs/exts/docs_build/docs_builder.py b/docs/exts/docs_build/docs_builder.py
index d6b01d7239..90baffe2ba 100644
--- a/docs/exts/docs_build/docs_builder.py
+++ b/docs/exts/docs_build/docs_builder.py
@@ -162,7 +162,7 @@ class AirflowDocsBuilder:
                 " ".join(shlex.quote(c) for c in build_cmd),
             )
             console.print(f"[info]{self.package_name:60}:[/] The output is hidden until an error occurs.")
-        with open(self.log_spelling_filename, "wt") as output:
+        with open(self.log_spelling_filename, "w") as output:
             completed_proc = run(
                 build_cmd,
                 cwd=self._src_dir,
@@ -241,7 +241,7 @@ class AirflowDocsBuilder:
                 f"[info]{self.package_name:60}:[/] Running sphinx. "
                 f"The output is hidden until an error occurs."
             )
-        with open(self.log_build_filename, "wt") as output:
+        with open(self.log_build_filename, "w") as output:
             completed_proc = run(
                 build_cmd,
                 cwd=self._src_dir,
diff --git a/docs/exts/extra_files_with_substitutions.py b/docs/exts/extra_files_with_substitutions.py
index 5cdaadd610..a2f0d8f9ce 100644
--- a/docs/exts/extra_files_with_substitutions.py
+++ b/docs/exts/extra_files_with_substitutions.py
@@ -38,7 +38,7 @@ def copy_docker_compose(app, exception):
         with open(os.path.join(app.outdir, os.path.dirname(path), os.path.basename(path))) as input_file:
             content = input_file.readlines()
         with open(
-            os.path.join(app.outdir, os.path.dirname(path), os.path.basename(path)), "wt"
+            os.path.join(app.outdir, os.path.dirname(path), os.path.basename(path)), "w"
         ) as output_file:
             for line in content:
                 output_file.write(line.replace("|version|", app.config.version))
diff --git a/docs/exts/provider_init_hack.py b/docs/exts/provider_init_hack.py
index e9ff142e82..be34d13b3a 100644
--- a/docs/exts/provider_init_hack.py
+++ b/docs/exts/provider_init_hack.py
@@ -37,7 +37,7 @@ def _create_init_py(app, config):
     del config
     # This file is deleted by /docs/build_docs.py. If you are not using the script, the file will be
     # deleted by pre-commit.
-    with open(PROVIDER_INIT_FILE, "wt"):
+    with open(PROVIDER_INIT_FILE, "w"):
         pass
 
 
diff --git a/kubernetes_tests/test_base.py b/kubernetes_tests/test_base.py
index a5a690881d..0601b2ff55 100644
--- a/kubernetes_tests/test_base.py
+++ b/kubernetes_tests/test_base.py
@@ -52,7 +52,7 @@ class TestBase(unittest.TestCase):
         ci = os.environ.get("CI")
         if ci and ci.lower() == "true":
             print("The resource dump will be uploaded as artifact of the CI job")
-        with open(output_file_path, "wt") as output_file:
+        with open(output_file_path, "w") as output_file:
             print("=" * 80, file=output_file)
             print(f"Describe resources for namespace {namespace}", file=output_file)
             print(f"Datetime: {datetime.utcnow()}", file=output_file)
diff --git a/tests/jobs/test_triggerer_job.py b/tests/jobs/test_triggerer_job.py
index b84392366a..5fa64c9c47 100644
--- a/tests/jobs/test_triggerer_job.py
+++ b/tests/jobs/test_triggerer_job.py
@@ -44,7 +44,7 @@ class TimeDeltaTrigger_(TimeDeltaTrigger):
         self.delta = delta
 
     async def run(self):
-        with open(self.filename, "at") as f:
+        with open(self.filename, "a") as f:
             f.write("hi\n")
         async for event in super().run():
             yield event
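
A note on the mechanical rewrites above: the open() mode changes are behavior-preserving, because text mode is the default, so "wt" and "at" are simply "w" and "a" with a redundant "t" (and bytes() is equal to b""). The rewrites presumably come from the hook bumps in this commit, pyupgrade in particular, which strips redundant open modes. A minimal sketch (throwaway temp file, not Airflow code) demonstrating the equivalence:

    # "w"/"wt" and "a"/"at" open a file identically: text mode is the default,
    # so the trailing "t" is redundant.
    import os
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    with open(path, "wt") as f:   # redundant "t"
        f.write("hello\n")
    with open(path, "w") as f:    # same behavior, truncates and writes again
        f.write("hello\n")
    with open(path, "a") as f:    # append, text mode implied
        f.write("again\n")
    with open(path) as f:
        print(f.read())           # "hello" followed by "again"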


[airflow] 02/04: Add back join to zombie query that was dropped in #28198 (#28544)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 6d65da72db04d0c68301ca265dd5da61097670f5
Author: Jed Cunningham <66...@users.noreply.github.com>
AuthorDate: Fri Dec 23 09:24:30 2022 -0600

    Add back join to zombie query that was dropped in #28198 (#28544)
    
    #28198 accidentally dropped a join in a query, leading to this:
    
        airflow/jobs/scheduler_job.py:1547 SAWarning: SELECT statement has a
        cartesian product between FROM element(s) "dag_run_1", "task_instance",
        "job" and FROM element "dag". Apply join condition(s) between each element to resolve.
    
    (cherry picked from commit a24d18a534ddbcefbcf0d8790d140ff496781f8b)
---
 airflow/jobs/scheduler_job.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/airflow/jobs/scheduler_job.py b/airflow/jobs/scheduler_job.py
index baeabdc2ec..b8b608efcd 100644
--- a/airflow/jobs/scheduler_job.py
+++ b/airflow/jobs/scheduler_job.py
@@ -1525,7 +1525,8 @@ class SchedulerJob(BaseJob):
             zombies: list[tuple[TI, str, str]] = (
                 session.query(TI, DM.fileloc, DM.processor_subdir)
                 .with_hint(TI, "USE INDEX (ti_state)", dialect_name="mysql")
-                .join(LocalTaskJob, TaskInstance.job_id == LocalTaskJob.id)
+                .join(LocalTaskJob, TI.job_id == LocalTaskJob.id)
+                .join(DM, TI.dag_id == DM.dag_id)
                 .filter(TI.state == TaskInstanceState.RUNNING)
                 .filter(
                     or_(
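
For context, the warning quoted in the commit message comes from SQLAlchemy's FROM-clause linter (enabled by default since SQLAlchemy 1.4): selecting entities from two tables without relating them leaves both in FROM with no join condition. A minimal, self-contained sketch with toy models (not Airflow's real models) shows the problem and the kind of fix restored above:

    # Toy models only: querying two unrelated tables emits the
    # "SELECT statement has a cartesian product ..." SAWarning;
    # an explicit join condition resolves it.
    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import Session, declarative_base

    Base = declarative_base()

    class TaskInstance(Base):
        __tablename__ = "task_instance"
        id = Column(Integer, primary_key=True)
        dag_id = Column(String)

    class DagModel(Base):
        __tablename__ = "dag"
        dag_id = Column(String, primary_key=True)
        fileloc = Column(String)

    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)

    with Session(engine) as session:
        # Missing join: cartesian product between task_instance and dag.
        session.query(TaskInstance, DagModel.fileloc).all()

        # Joined explicitly, as the zombie query is after this commit.
        session.query(TaskInstance, DagModel.fileloc).join(
            DagModel, TaskInstance.dag_id == DagModel.dag_id
        ).all()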


[airflow] 01/04: Improve provider validation pre-commit (#28516)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit b607a287b8f0e8da0e2af426486f3108095b3537
Author: Jarek Potiuk <ja...@potiuk.com>
AuthorDate: Thu Dec 22 03:25:08 2022 +0100

    Improve provider validation pre-commit (#28516)
    
    (cherry picked from commit e47c472e632effbfe3ddc784788a956c4ca44122)
---
 .pre-commit-config.yaml                            |  21 +-
 STATIC_CODE_CHECKS.rst                             |   2 +-
 airflow/cli/commands/info_command.py               |   1 +
 .../pre_commit_check_provider_yaml_files.py        | 417 ++-------------------
 .../run_provider_yaml_files_check.py}              |  96 ++++-
 5 files changed, 131 insertions(+), 406 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index 5df89e4fc7..a6ed9b1f4d 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -627,19 +627,6 @@ repos:
         entry: ./scripts/ci/pre_commit/pre_commit_check_providers_subpackages_all_have_init.py
         language: python
         require_serial: true
-      - id: check-provider-yaml-valid
-        name: Validate providers.yaml files
-        pass_filenames: false
-        entry: ./scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
-        language: python
-        require_serial: true
-        files: ^docs/|provider\.yaml$|^scripts/ci/pre_commit/pre_commit_check_provider_yaml_files\.py$
-        additional_dependencies:
-          - 'PyYAML==5.3.1'
-          - 'jsonschema>=3.2.0,<5.0.0'
-          - 'tabulate==0.8.8'
-          - 'jsonpath-ng==1.5.3'
-          - 'rich>=12.4.4'
       - id: check-pre-commit-information-consistent
         name: Update information re pre-commit hooks and verify ids and names
         entry: ./scripts/ci/pre_commit/pre_commit_check_pre_commit_hooks.py
@@ -888,6 +875,14 @@ repos:
         pass_filenames: true
         exclude: ^airflow/_vendor/
         additional_dependencies: ['rich>=12.4.4', 'inputimeout']
+      - id: check-provider-yaml-valid
+        name: Validate provider.yaml files
+        pass_filenames: false
+        entry: ./scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
+        language: python
+        require_serial: true
+        files: ^docs/|provider\.yaml$|^scripts/ci/pre_commit/pre_commit_check_provider_yaml_files\.py$
+        additional_dependencies: ['rich>=12.4.4', 'inputimeout', 'markdown-it-py']
       - id: update-migration-references
         name: Update migration ref doc
         language: python
diff --git a/STATIC_CODE_CHECKS.rst b/STATIC_CODE_CHECKS.rst
index 7495044f3d..b2b6081b5f 100644
--- a/STATIC_CODE_CHECKS.rst
+++ b/STATIC_CODE_CHECKS.rst
@@ -195,7 +195,7 @@ require Breeze Docker image to be build locally.
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
 | check-provide-create-sessions-imports                  | Check provide_session and create_session imports                 |         |
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
-| check-provider-yaml-valid                              | Validate providers.yaml files                                    |         |
+| check-provider-yaml-valid                              | Validate provider.yaml files                                     | *       |
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
 | check-providers-init-file-missing                      | Provider init file is missing                                    |         |
 +--------------------------------------------------------+------------------------------------------------------------------+---------+
diff --git a/airflow/cli/commands/info_command.py b/airflow/cli/commands/info_command.py
index 124271c8c9..a8a7c760ab 100644
--- a/airflow/cli/commands/info_command.py
+++ b/airflow/cli/commands/info_command.py
@@ -176,6 +176,7 @@ _MACHINE_TO_ARCHITECTURE = {
     "arm64": Architecture.ARM,
     "armv7": Architecture.ARM,
     "armv7l": Architecture.ARM,
+    "aarch64": Architecture.ARM,
 }
 
 
diff --git a/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py b/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
index 5622212f46..f188879200 100755
--- a/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
+++ b/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
@@ -17,392 +17,47 @@
 # under the License.
 from __future__ import annotations
 
-import json
-import pathlib
+import os
 import sys
-import textwrap
-from collections import Counter
-from itertools import chain, product
-from typing import Any, Iterable
+from pathlib import Path
 
-import jsonschema
-import yaml
-from jsonpath_ng.ext import parse
-from rich.console import Console
-from tabulate import tabulate
-
-try:
-    from yaml import CSafeLoader as SafeLoader
-except ImportError:
-    from yaml import SafeLoader  # type: ignore
-
-if __name__ != "__main__":
-    raise Exception(
+if __name__ not in ("__main__", "__mp_main__"):
+    raise SystemExit(
         "This file is intended to be executed as an executable program. You cannot use it as a module."
+        f"To run this script, run the ./{__file__} command"
     )
 
-ROOT_DIR = pathlib.Path(__file__).resolve().parents[3]
-DOCS_DIR = ROOT_DIR.joinpath("docs")
-PROVIDER_DATA_SCHEMA_PATH = ROOT_DIR.joinpath("airflow", "provider.yaml.schema.json")
-PROVIDER_ISSUE_TEMPLATE_PATH = ROOT_DIR.joinpath(
-    ".github", "ISSUE_TEMPLATE", "airflow_providers_bug_report.yml"
-)
-CORE_INTEGRATIONS = ["SQL", "Local"]
-
-errors = []
-
-
-def _filepath_to_module(filepath: pathlib.Path) -> str:
-    p = filepath.resolve().relative_to(ROOT_DIR).as_posix()
-    if p.endswith(".py"):
-        p = p[:-3]
-    return p.replace("/", ".")
-
-
-def _load_schema() -> dict[str, Any]:
-    with PROVIDER_DATA_SCHEMA_PATH.open() as schema_file:
-        content = json.load(schema_file)
-    return content
-
-
-def _load_package_data(package_paths: Iterable[str]):
-    schema = _load_schema()
-    result = {}
-    for provider_yaml_path in package_paths:
-        with open(provider_yaml_path) as yaml_file:
-            provider = yaml.load(yaml_file, SafeLoader)
-        rel_path = pathlib.Path(provider_yaml_path).relative_to(ROOT_DIR).as_posix()
-        try:
-            jsonschema.validate(provider, schema=schema)
-        except jsonschema.ValidationError:
-            raise Exception(f"Unable to parse: {rel_path}.")
-        result[rel_path] = provider
-    return result
-
-
-def get_all_integration_names(yaml_files) -> list[str]:
-    all_integrations = [
-        i["integration-name"] for f in yaml_files.values() if "integrations" in f for i in f["integrations"]
-    ]
-    all_integrations += ["Local"]
-    return all_integrations
-
-
-def check_integration_duplicates(yaml_files: dict[str, dict]):
-    """Integration names must be globally unique."""
-    print("Checking integration duplicates")
-    all_integrations = get_all_integration_names(yaml_files)
-
-    duplicates = [(k, v) for (k, v) in Counter(all_integrations).items() if v > 1]
-
-    if duplicates:
-        print(
-            "Duplicate integration names found. Integration names must be globally unique. "
-            "Please delete duplicates."
-        )
-        print(tabulate(duplicates, headers=["Integration name", "Number of occurrences"]))
-        sys.exit(3)
-
+AIRFLOW_SOURCES = Path(__file__).parents[3].resolve()
+GITHUB_REPOSITORY = os.environ.get("GITHUB_REPOSITORY", "apache/airflow")
+os.environ["SKIP_GROUP_OUTPUT"] = "true"
 
-def assert_sets_equal(set1, set2):
-    try:
-        difference1 = set1.difference(set2)
-    except TypeError as e:
-        raise AssertionError(f"invalid type when attempting set difference: {e}")
-    except AttributeError as e:
-        raise AssertionError(f"first argument does not support set difference: {e}")
-
-    try:
-        difference2 = set2.difference(set1)
-    except TypeError as e:
-        raise AssertionError(f"invalid type when attempting set difference: {e}")
-    except AttributeError as e:
-        raise AssertionError(f"second argument does not support set difference: {e}")
-
-    if not (difference1 or difference2):
-        return
-
-    lines = []
-    if difference1:
-        lines.append("    -- Items in the left set but not the right:")
-        for item in sorted(difference1):
-            lines.append(f"       {item!r}")
-    if difference2:
-        lines.append("    -- Items in the right set but not the left:")
-        for item in sorted(difference2):
-            lines.append(f"       {item!r}")
-
-    standard_msg = "\n".join(lines)
-    raise AssertionError(standard_msg)
-
-
-def check_if_objects_belongs_to_package(
-    object_names: set[str], provider_package: str, yaml_file_path: str, resource_type: str
-):
-    for object_name in object_names:
-        if not object_name.startswith(provider_package):
-            errors.append(
-                f"The `{object_name}` object in {resource_type} list in {yaml_file_path} does not start"
-                f" with the expected {provider_package}."
-            )
-
-
-def parse_module_data(provider_data, resource_type, yaml_file_path):
-    package_dir = ROOT_DIR.joinpath(yaml_file_path).parent
-    provider_package = pathlib.Path(yaml_file_path).parent.as_posix().replace("/", ".")
-    py_files = chain(
-        package_dir.glob(f"**/{resource_type}/*.py"),
-        package_dir.glob(f"{resource_type}/*.py"),
-        package_dir.glob(f"**/{resource_type}/**/*.py"),
-        package_dir.glob(f"{resource_type}/**/*.py"),
+if __name__ == "__main__":
+    sys.path.insert(0, str(AIRFLOW_SOURCES / "dev" / "breeze" / "src"))
+    from airflow_breeze.global_constants import MOUNT_SELECTED
+    from airflow_breeze.utils.console import get_console
+    from airflow_breeze.utils.docker_command_utils import get_extra_docker_flags
+    from airflow_breeze.utils.run_utils import get_ci_image_for_pre_commits, run_command
+
+    airflow_image = get_ci_image_for_pre_commits()
+    cmd_result = run_command(
+        [
+            "docker",
+            "run",
+            "-t",
+            *get_extra_docker_flags(MOUNT_SELECTED),
+            "-e",
+            "SKIP_ENVIRONMENT_INITIALIZATION=true",
+            "--pull",
+            "never",
+            airflow_image,
+            "-c",
+            "python3 /opt/airflow/scripts/in_container/run_provider_yaml_files_check.py",
+        ],
+        check=False,
     )
-    expected_modules = {_filepath_to_module(f) for f in py_files if f.name != "__init__.py"}
-    resource_data = provider_data.get(resource_type, [])
-    return expected_modules, provider_package, resource_data
-
-
-def check_completeness_of_list_of_hooks_sensors_hooks(yaml_files: dict[str, dict]):
-    print("Checking completeness of list of {sensors, hooks, operators}")
-    print(" -- {sensors, hooks, operators} - Expected modules (left) : Current modules (right)")
-    for (yaml_file_path, provider_data), resource_type in product(
-        yaml_files.items(), ["sensors", "operators", "hooks"]
-    ):
-        expected_modules, provider_package, resource_data = parse_module_data(
-            provider_data, resource_type, yaml_file_path
+    if cmd_result.returncode != 0:
+        get_console().print(
+            "[warning]If you see strange stacktraces above, "
+            "run `breeze ci-image build --python 3.7` and try again."
         )
-
-        current_modules = {str(i) for r in resource_data for i in r.get("python-modules", [])}
-        check_if_objects_belongs_to_package(current_modules, provider_package, yaml_file_path, resource_type)
-        try:
-            assert_sets_equal(set(expected_modules), set(current_modules))
-        except AssertionError as ex:
-            nested_error = textwrap.indent(str(ex), "  ")
-            errors.append(
-                f"Incorrect content of key '{resource_type}/python-modules' "
-                f"in file: {yaml_file_path}\n{nested_error}"
-            )
-
-
-def check_duplicates_in_integrations_names_of_hooks_sensors_operators(yaml_files: dict[str, dict]):
-    print("Checking for duplicates in list of {sensors, hooks, operators}")
-    for (yaml_file_path, provider_data), resource_type in product(
-        yaml_files.items(), ["sensors", "operators", "hooks"]
-    ):
-        resource_data = provider_data.get(resource_type, [])
-        current_integrations = [r.get("integration-name", "") for r in resource_data]
-        if len(current_integrations) != len(set(current_integrations)):
-            for integration in current_integrations:
-                if current_integrations.count(integration) > 1:
-                    errors.append(
-                        f"Duplicated content of '{resource_type}/integration-name/{integration}' "
-                        f"in file: {yaml_file_path}"
-                    )
-
-
-def check_completeness_of_list_of_transfers(yaml_files: dict[str, dict]):
-    print("Checking completeness of list of transfers")
-    resource_type = "transfers"
-
-    print(" -- Expected transfers modules(Left): Current transfers Modules(Right)")
-    for yaml_file_path, provider_data in yaml_files.items():
-        expected_modules, provider_package, resource_data = parse_module_data(
-            provider_data, resource_type, yaml_file_path
-        )
-
-        current_modules = {r.get("python-module") for r in resource_data}
-        check_if_objects_belongs_to_package(current_modules, provider_package, yaml_file_path, resource_type)
-        try:
-            assert_sets_equal(set(expected_modules), set(current_modules))
-        except AssertionError as ex:
-            nested_error = textwrap.indent(str(ex), "  ")
-            errors.append(
-                f"Incorrect content of key '{resource_type}/python-module' "
-                f"in file: {yaml_file_path}\n{nested_error}"
-            )
-
-
-def check_hook_classes(yaml_files: dict[str, dict]):
-    print("Checking connection classes belong to package")
-    resource_type = "hook-class-names"
-    for yaml_file_path, provider_data in yaml_files.items():
-        provider_package = pathlib.Path(yaml_file_path).parent.as_posix().replace("/", ".")
-        hook_class_names = provider_data.get(resource_type)
-        if hook_class_names:
-            check_if_objects_belongs_to_package(
-                hook_class_names, provider_package, yaml_file_path, resource_type
-            )
-
-
-def check_duplicates_in_list_of_transfers(yaml_files: dict[str, dict]):
-    print("Checking for duplicates in list of transfers")
-    errors = []
-    resource_type = "transfers"
-    for yaml_file_path, provider_data in yaml_files.items():
-        resource_data = provider_data.get(resource_type, [])
-
-        source_target_integrations = [
-            (r.get("source-integration-name", ""), r.get("target-integration-name", ""))
-            for r in resource_data
-        ]
-        if len(source_target_integrations) != len(set(source_target_integrations)):
-            for integration_couple in source_target_integrations:
-                if source_target_integrations.count(integration_couple) > 1:
-                    errors.append(
-                        f"Duplicated content of \n"
-                        f" '{resource_type}/source-integration-name/{integration_couple[0]}' "
-                        f" '{resource_type}/target-integration-name/{integration_couple[1]}' "
-                        f"in file: {yaml_file_path}"
-                    )
-
-
-def check_invalid_integration(yaml_files: dict[str, dict]):
-    print("Detect unregistered integrations")
-    all_integration_names = set(get_all_integration_names(yaml_files))
-
-    for (yaml_file_path, provider_data), resource_type in product(
-        yaml_files.items(), ["sensors", "operators", "hooks"]
-    ):
-        resource_data = provider_data.get(resource_type, [])
-        current_names = {r["integration-name"] for r in resource_data}
-        invalid_names = current_names - all_integration_names
-        if invalid_names:
-            errors.append(
-                f"Incorrect content of key '{resource_type}/integration-name' in file: {yaml_file_path}. "
-                f"Invalid values: {invalid_names}"
-            )
-
-    for (yaml_file_path, provider_data), key in product(
-        yaml_files.items(), ["source-integration-name", "target-integration-name"]
-    ):
-        resource_data = provider_data.get("transfers", [])
-        current_names = {r[key] for r in resource_data}
-        invalid_names = current_names - all_integration_names
-        if invalid_names:
-            errors.append(
-                f"Incorrect content of key 'transfers/{key}' in file: {yaml_file_path}. "
-                f"Invalid values: {invalid_names}"
-            )
-
-
-def check_doc_files(yaml_files: dict[str, dict]):
-    print("Checking doc files")
-    current_doc_urls: list[str] = []
-    current_logo_urls: list[str] = []
-    for provider in yaml_files.values():
-        if "integrations" in provider:
-            current_doc_urls.extend(
-                guide
-                for guides in provider["integrations"]
-                if "how-to-guide" in guides
-                for guide in guides["how-to-guide"]
-            )
-            current_logo_urls.extend(
-                integration["logo"] for integration in provider["integrations"] if "logo" in integration
-            )
-        if "transfers" in provider:
-            current_doc_urls.extend(
-                op["how-to-guide"] for op in provider["transfers"] if "how-to-guide" in op
-            )
-
-    expected_doc_urls = {
-        f"/docs/{f.relative_to(DOCS_DIR).as_posix()}"
-        for f in DOCS_DIR.glob("apache-airflow-providers-*/operators/**/*.rst")
-        if f.name != "index.rst" and "_partials" not in f.parts
-    } | {
-        f"/docs/{f.relative_to(DOCS_DIR).as_posix()}"
-        for f in DOCS_DIR.glob("apache-airflow-providers-*/operators.rst")
-    }
-    expected_logo_urls = {
-        f"/{f.relative_to(DOCS_DIR).as_posix()}"
-        for f in DOCS_DIR.glob("integration-logos/**/*")
-        if f.is_file()
-    }
-
-    try:
-        print(" -- Checking document urls: expected (left), current (right)")
-        assert_sets_equal(set(expected_doc_urls), set(current_doc_urls))
-
-        print(" -- Checking logo urls: expected (left), current (right)")
-        assert_sets_equal(set(expected_logo_urls), set(current_logo_urls))
-    except AssertionError as ex:
-        print(ex)
-        sys.exit(1)
-
-
-def check_unique_provider_name(yaml_files: dict[str, dict]):
-    provider_names = [d["name"] for d in yaml_files.values()]
-    duplicates = {x for x in provider_names if provider_names.count(x) > 1}
-    if duplicates:
-        errors.append(f"Provider name must be unique. Duplicates: {duplicates}")
-
-
-def check_providers_are_mentioned_in_issue_template(yaml_files: dict[str, dict]):
-    prefix_len = len("apache-airflow-providers-")
-    short_provider_names = [d["package-name"][prefix_len:] for d in yaml_files.values()]
-    # exclude deprecated provider that shouldn't be in issue template
-    deprecated_providers: list[str] = []
-    for item in deprecated_providers:
-        short_provider_names.remove(item)
-    jsonpath_expr = parse('$.body[?(@.attributes.label == "Apache Airflow Provider(s)")]..options[*]')
-    with PROVIDER_ISSUE_TEMPLATE_PATH.open() as issue_file:
-        issue_template = yaml.safe_load(issue_file)
-    all_mentioned_providers = [match.value for match in jsonpath_expr.find(issue_template)]
-    try:
-        print(
-            f" -- Checking providers: present in code (left), "
-            f"mentioned in {PROVIDER_ISSUE_TEMPLATE_PATH} (right)"
-        )
-        assert_sets_equal(set(short_provider_names), set(all_mentioned_providers))
-    except AssertionError as ex:
-        print(ex)
-        sys.exit(1)
-
-
-def check_providers_have_all_documentation_files(yaml_files: dict[str, dict]):
-    expected_files = ["commits.rst", "index.rst", "installing-providers-from-sources.rst"]
-    for package_info in yaml_files.values():
-        package_name = package_info["package-name"]
-        provider_dir = DOCS_DIR.joinpath(package_name)
-        for file in expected_files:
-            if not provider_dir.joinpath(file).is_file():
-                errors.append(
-                    f"The provider {package_name} misses `{file}` in documentation. "
-                    f"Please add the file to {provider_dir}"
-                )
-
-
-if __name__ == "__main__":
-    provider_files_pattern = pathlib.Path(ROOT_DIR).glob("airflow/providers/**/provider.yaml")
-    all_provider_files = sorted(str(path) for path in provider_files_pattern)
-
-    if len(sys.argv) > 1:
-        paths = sorted(sys.argv[1:])
-    else:
-        paths = all_provider_files
-
-    all_parsed_yaml_files: dict[str, dict] = _load_package_data(paths)
-
-    all_files_loaded = len(all_provider_files) == len(paths)
-    check_integration_duplicates(all_parsed_yaml_files)
-
-    check_completeness_of_list_of_hooks_sensors_hooks(all_parsed_yaml_files)
-    check_duplicates_in_integrations_names_of_hooks_sensors_operators(all_parsed_yaml_files)
-
-    check_completeness_of_list_of_transfers(all_parsed_yaml_files)
-    check_duplicates_in_list_of_transfers(all_parsed_yaml_files)
-    check_hook_classes(all_parsed_yaml_files)
-    check_unique_provider_name(all_parsed_yaml_files)
-    check_providers_are_mentioned_in_issue_template(all_parsed_yaml_files)
-    check_providers_have_all_documentation_files(all_parsed_yaml_files)
-
-    if all_files_loaded:
-        # Only check those if all provider files are loaded
-        check_doc_files(all_parsed_yaml_files)
-        check_invalid_integration(all_parsed_yaml_files)
-
-    if errors:
-        console = Console(width=400, color_system="standard")
-        console.print(f"[red]Found {len(errors)} errors in providers[/]")
-        for error in errors:
-            console.print(f"[red]Error:[/] {error}")
-        sys.exit(1)
+    sys.exit(cmd_result.returncode)
diff --git a/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py b/scripts/in_container/run_provider_yaml_files_check.py
similarity index 81%
copy from scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
copy to scripts/in_container/run_provider_yaml_files_check.py
index 5622212f46..cab365eb21 100755
--- a/scripts/ci/pre_commit/pre_commit_check_provider_yaml_files.py
+++ b/scripts/in_container/run_provider_yaml_files_check.py
@@ -17,11 +17,16 @@
 # under the License.
 from __future__ import annotations
 
+import importlib
+import inspect
 import json
+import os
 import pathlib
+import platform
 import sys
 import textwrap
 from collections import Counter
+from enum import Enum
 from itertools import chain, product
 from typing import Any, Iterable
 
@@ -31,6 +36,8 @@ from jsonpath_ng.ext import parse
 from rich.console import Console
 from tabulate import tabulate
 
+from airflow.cli.commands.info_command import Architecture
+
 try:
     from yaml import CSafeLoader as SafeLoader
 except ImportError:
@@ -41,7 +48,7 @@ if __name__ != "__main__":
         "This file is intended to be executed as an executable program. You cannot use it as a module."
     )
 
-ROOT_DIR = pathlib.Path(__file__).resolve().parents[3]
+ROOT_DIR = pathlib.Path(__file__).resolve().parents[2]
 DOCS_DIR = ROOT_DIR.joinpath("docs")
 PROVIDER_DATA_SCHEMA_PATH = ROOT_DIR.joinpath("airflow", "provider.yaml.schema.json")
 PROVIDER_ISSUE_TEMPLATE_PATH = ROOT_DIR.joinpath(
@@ -51,6 +58,8 @@ CORE_INTEGRATIONS = ["SQL", "Local"]
 
 errors = []
 
+console = Console(width=400, color_system="standard")
+
 
 def _filepath_to_module(filepath: pathlib.Path) -> str:
     p = filepath.resolve().relative_to(ROOT_DIR).as_posix()
@@ -136,15 +145,62 @@ def assert_sets_equal(set1, set2):
     raise AssertionError(standard_msg)
 
 
-def check_if_objects_belongs_to_package(
-    object_names: set[str], provider_package: str, yaml_file_path: str, resource_type: str
+class ObjectType(Enum):
+    MODULE = "module"
+    CLASS = "class"
+
+
+def check_if_object_exist(object_name: str, resource_type: str, yaml_file_path: str, object_type: ObjectType):
+    try:
+        if object_type == ObjectType.CLASS:
+            module_name, object_name = object_name.rsplit(".", maxsplit=1)
+            the_class = getattr(importlib.import_module(module_name), object_name)
+            if the_class and inspect.isclass(the_class):
+                return
+        elif object_type == ObjectType.MODULE:
+            module = importlib.import_module(object_name)
+            if inspect.ismodule(module):
+                return
+        else:
+            raise RuntimeError(f"Wrong enum {object_type}???")
+    except Exception as e:
+        if architecture == Architecture.ARM:
+            if "pymssql" in str(e) or "MySQLdb" in str(e):
+                console.print(
+                    f"[yellow]The imports fail on ARM: {object_name} in {resource_type} {e}, "
+                    f"but it is expected.[/]"
+                )
+                return
+        errors.append(
+            f"The `{object_name}` object in {resource_type} list in {yaml_file_path} does not exist "
+            f"or is not a class: {e}"
+        )
+    else:
+        errors.append(
+            f"The `{object_name}` object in {resource_type} list in {yaml_file_path} does not exist "
+            f"or is not a {object_type.value}."
+        )
+
+
+def check_if_objects_exist_and_belong_to_package(
+    object_names: set[str],
+    provider_package: str,
+    yaml_file_path: str,
+    resource_type: str,
+    object_type: ObjectType,
 ):
     for object_name in object_names:
+        if os.environ.get("VERBOSE"):
+            console.print(
+                f"[bright_blue]Checking if {object_name} of {resource_type} "
+                f"in {yaml_file_path} is {object_type.value} and belongs to {provider_package} package"
+            )
         if not object_name.startswith(provider_package):
             errors.append(
                 f"The `{object_name}` object in {resource_type} list in {yaml_file_path} does not start"
                 f" with the expected {provider_package}."
             )
+        check_if_object_exist(object_name, resource_type, yaml_file_path, object_type)
 
 
 def parse_module_data(provider_data, resource_type, yaml_file_path):
@@ -161,7 +217,7 @@ def parse_module_data(provider_data, resource_type, yaml_file_path):
     return expected_modules, provider_package, resource_data
 
 
-def check_completeness_of_list_of_hooks_sensors_hooks(yaml_files: dict[str, dict]):
+def check_correctness_of_list_of_sensors_operators_hook_modules(yaml_files: dict[str, dict]):
     print("Checking completeness of list of {sensors, hooks, operators}")
     print(" -- {sensors, hooks, operators} - Expected modules (left) : Current modules (right)")
     for (yaml_file_path, provider_data), resource_type in product(
@@ -172,7 +228,9 @@ def check_completeness_of_list_of_hooks_sensors_hooks(yaml_files: dict[str, dict
         )
 
         current_modules = {str(i) for r in resource_data for i in r.get("python-modules", [])}
-        check_if_objects_belongs_to_package(current_modules, provider_package, yaml_file_path, resource_type)
+        check_if_objects_exist_and_belong_to_package(
+            current_modules, provider_package, yaml_file_path, resource_type, ObjectType.MODULE
+        )
         try:
             assert_sets_equal(set(expected_modules), set(current_modules))
         except AssertionError as ex:
@@ -210,7 +268,9 @@ def check_completeness_of_list_of_transfers(yaml_files: dict[str, dict]):
         )
 
         current_modules = {r.get("python-module") for r in resource_data}
-        check_if_objects_belongs_to_package(current_modules, provider_package, yaml_file_path, resource_type)
+        check_if_objects_exist_and_belong_to_package(
+            current_modules, provider_package, yaml_file_path, resource_type, ObjectType.MODULE
+        )
         try:
             assert_sets_equal(set(expected_modules), set(current_modules))
         except AssertionError as ex:
@@ -222,14 +282,26 @@ def check_completeness_of_list_of_transfers(yaml_files: dict[str, dict]):
 
 
 def check_hook_classes(yaml_files: dict[str, dict]):
-    print("Checking connection classes belong to package")
+    print("Checking connection classes belong to package, exist and are classes")
     resource_type = "hook-class-names"
     for yaml_file_path, provider_data in yaml_files.items():
         provider_package = pathlib.Path(yaml_file_path).parent.as_posix().replace("/", ".")
         hook_class_names = provider_data.get(resource_type)
         if hook_class_names:
-            check_if_objects_belongs_to_package(
-                hook_class_names, provider_package, yaml_file_path, resource_type
+            check_if_objects_exist_and_belong_to_package(
+                hook_class_names, provider_package, yaml_file_path, resource_type, ObjectType.CLASS
+            )
+
+
+def check_extra_link_classes(yaml_files: dict[str, dict]):
+    print("Checking extra-links belong to package, exist and are classes")
+    resource_type = "extra-links"
+    for yaml_file_path, provider_data in yaml_files.items():
+        provider_package = pathlib.Path(yaml_file_path).parent.as_posix().replace("/", ".")
+        extra_links = provider_data.get(resource_type)
+        if extra_links:
+            check_if_objects_exist_and_belong_to_package(
+                extra_links, provider_package, yaml_file_path, resource_type, ObjectType.CLASS
             )
 
 
@@ -372,6 +444,8 @@ def check_providers_have_all_documentation_files(yaml_files: dict[str, dict]):
 
 
 if __name__ == "__main__":
+    architecture = Architecture().get_current()
+    console.print(f"Verifying packages on {architecture} architecture. Platform: {platform.machine()}.")
     provider_files_pattern = pathlib.Path(ROOT_DIR).glob("airflow/providers/**/provider.yaml")
     all_provider_files = sorted(str(path) for path in provider_files_pattern)
 
@@ -385,12 +459,13 @@ if __name__ == "__main__":
     all_files_loaded = len(all_provider_files) == len(paths)
     check_integration_duplicates(all_parsed_yaml_files)
 
-    check_completeness_of_list_of_hooks_sensors_hooks(all_parsed_yaml_files)
     check_duplicates_in_integrations_names_of_hooks_sensors_operators(all_parsed_yaml_files)
 
     check_completeness_of_list_of_transfers(all_parsed_yaml_files)
     check_duplicates_in_list_of_transfers(all_parsed_yaml_files)
     check_hook_classes(all_parsed_yaml_files)
+    check_extra_link_classes(all_parsed_yaml_files)
+    check_correctness_of_list_of_sensors_operators_hook_modules(all_parsed_yaml_files)
     check_unique_provider_name(all_parsed_yaml_files)
     check_providers_are_mentioned_in_issue_template(all_parsed_yaml_files)
     check_providers_have_all_documentation_files(all_parsed_yaml_files)
@@ -401,7 +476,6 @@ if __name__ == "__main__":
         check_invalid_integration(all_parsed_yaml_files)
 
     if errors:
-        console = Console(width=400, color_system="standard")
         console.print(f"[red]Found {len(errors)} errors in providers[/]")
         for error in errors:
             console.print(f"[red]Error:[/] {error}")
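
For context, the provider.yaml validation now runs inside the Breeze CI image (the pre-commit wrapper above just invokes the in-container script via docker run), and the in-container script gains check_if_object_exist: every module or class named in a provider.yaml must actually import. A standalone sketch of that importlib/inspect pattern (stdlib names used purely as examples) looks roughly like this:

    # Verify that a dotted name resolves to a real module or class - the same
    # probe the in-container check applies to provider.yaml entries.
    import importlib
    import inspect

    def object_exists(dotted_name: str, expect_class: bool) -> bool:
        try:
            if expect_class:
                module_name, attr = dotted_name.rsplit(".", maxsplit=1)
                obj = getattr(importlib.import_module(module_name), attr)
                return inspect.isclass(obj)
            return inspect.ismodule(importlib.import_module(dotted_name))
        except Exception:
            return False

    print(object_exists("json", expect_class=False))             # True
    print(object_exists("pathlib.Path", expect_class=True))      # True
    print(object_exists("no.such.module", expect_class=False))   # False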


[airflow] 04/04: Change Architecture and OperatingSystem classes into Enums (#28627)

Posted by pi...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

pierrejeambrun pushed a commit to branch v2-5-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 2f5060edefa36338081ed1ffd7e12d669b02fe82
Author: Jarek Potiuk <ja...@potiuk.com>
AuthorDate: Mon Jan 2 05:58:54 2023 +0100

    Change Architecture and OperatingSystem classes into Enums (#28627)
    
    Since they are objects already, there is very little overhead in
    making them Enums, and it has the nice property of allowing type
    hints for the returned values.
    
    (cherry picked from commit 8a15557f6fe73feab0e49f97b295160820ad7cfd)
---
 airflow/cli/commands/info_command.py               | 22 +++++++++++++---------
 .../in_container/run_provider_yaml_files_check.py  |  2 +-
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/airflow/cli/commands/info_command.py b/airflow/cli/commands/info_command.py
index a8a7c760ab..7261dfc484 100644
--- a/airflow/cli/commands/info_command.py
+++ b/airflow/cli/commands/info_command.py
@@ -23,6 +23,7 @@ import os
 import platform
 import subprocess
 import sys
+from enum import Enum
 from urllib.parse import urlsplit, urlunsplit
 
 import httpx
@@ -124,16 +125,17 @@ class PiiAnonymizer(Anonymizer):
         return urlunsplit((url_parts.scheme, netloc, url_parts.path, url_parts.query, url_parts.fragment))
 
 
-class OperatingSystem:
+class OperatingSystem(Enum):
     """Operating system."""
 
     WINDOWS = "Windows"
     LINUX = "Linux"
     MACOSX = "Mac OS"
     CYGWIN = "Cygwin"
+    UNKNOWN = "Unknown"
 
     @staticmethod
-    def get_current() -> str | None:
+    def get_current() -> OperatingSystem:
         """Get current operating system."""
         if os.name == "nt":
             return OperatingSystem.WINDOWS
@@ -143,24 +145,26 @@ class OperatingSystem:
             return OperatingSystem.MACOSX
         elif "cygwin" in sys.platform:
             return OperatingSystem.CYGWIN
-        return None
+        return OperatingSystem.UNKNOWN
 
 
-class Architecture:
+class Architecture(Enum):
     """Compute architecture."""
 
     X86_64 = "x86_64"
     X86 = "x86"
     PPC = "ppc"
     ARM = "arm"
+    UNKNOWN = "unknown"
 
     @staticmethod
-    def get_current():
+    def get_current() -> Architecture:
         """Get architecture."""
-        return _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+        current_architecture = _MACHINE_TO_ARCHITECTURE.get(platform.machine().lower())
+        return current_architecture if current_architecture else Architecture.UNKNOWN
 
 
-_MACHINE_TO_ARCHITECTURE = {
+_MACHINE_TO_ARCHITECTURE: dict[str, Architecture] = {
     "amd64": Architecture.X86_64,
     "x86_64": Architecture.X86_64,
     "i686-64": Architecture.X86_64,
@@ -259,8 +263,8 @@ class AirflowInfo:
         python_version = sys.version.replace("\n", " ")
 
         return [
-            ("OS", operating_system or "NOT AVAILABLE"),
-            ("architecture", arch or "NOT AVAILABLE"),
+            ("OS", operating_system.value),
+            ("architecture", arch.value),
             ("uname", str(uname)),
             ("locale", str(_locale)),
             ("python_version", python_version),
diff --git a/scripts/in_container/run_provider_yaml_files_check.py b/scripts/in_container/run_provider_yaml_files_check.py
index cab365eb21..c2cfe565ae 100755
--- a/scripts/in_container/run_provider_yaml_files_check.py
+++ b/scripts/in_container/run_provider_yaml_files_check.py
@@ -444,7 +444,7 @@ def check_providers_have_all_documentation_files(yaml_files: dict[str, dict]):
 
 
 if __name__ == "__main__":
-    architecture = Architecture().get_current()
+    architecture = Architecture.get_current()
     console.print(f"Verifying packages on {architecture} architecture. Platform: {platform.machine()}.")
     provider_files_pattern = pathlib.Path(ROOT_DIR).glob("airflow/providers/**/provider.yaml")
     all_provider_files = sorted(str(path) for path in provider_files_pattern)
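
The payoff described in the commit message is that callers now receive a typed Enum member instead of str | None, so static checkers can verify comparisons, and the UNKNOWN member removes the need for None handling. A small sketch (simplified, not the Airflow code) of the resulting call-site ergonomics:

    from __future__ import annotations

    import platform
    from enum import Enum

    class Architecture(Enum):
        X86_64 = "x86_64"
        ARM = "arm"
        UNKNOWN = "unknown"

        @staticmethod
        def get_current() -> Architecture:
            # Map platform.machine() to a member; unknown machines fall back
            # to UNKNOWN instead of None.
            mapping = {
                "amd64": Architecture.X86_64,
                "x86_64": Architecture.X86_64,
                "arm64": Architecture.ARM,
                "aarch64": Architecture.ARM,
            }
            return mapping.get(platform.machine().lower(), Architecture.UNKNOWN)

    arch = Architecture.get_current()
    print(arch, arch.value)            # e.g. Architecture.X86_64 x86_64
    if arch is Architecture.ARM:
        print("running on an ARM machine")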