Posted to commits@airflow.apache.org by ka...@apache.org on 2020/12/09 00:04:48 UTC

[airflow] branch master updated: Simplify publishing of documentation (#12892)

This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/master by this push:
     new e595d35  Simplify publishing of documentation (#12892)
e595d35 is described below

commit e595d35bf4b57865f930938df12a673c3792e35e
Author: Kamil BreguĊ‚a <mi...@users.noreply.github.com>
AuthorDate: Wed Dec 9 01:03:22 2020 +0100

    Simplify publishing of documentation (#12892)
    
    Close: #11423
    Close: #11152
---
 dev/README_RELEASE_AIRFLOW.md               |  40 +++++-
 dev/README_RELEASE_PROVIDER_PACKAGES.md     |  53 ++++++++
 docs/build_docs.py                          | 182 +++----------------------
 docs/exts/airflow_intersphinx.py            |  21 +--
 docs/exts/docs_build/code_utils.py          |  13 +-
 docs/exts/docs_build/docs_builder.py        | 203 ++++++++++++++++++++++++++++
 docs/exts/docs_build/github_action_utils.py |  38 ++++++
 docs/publish_docs.py                        |  97 +++++++++++++
 8 files changed, 468 insertions(+), 179 deletions(-)

diff --git a/dev/README_RELEASE_AIRFLOW.md b/dev/README_RELEASE_AIRFLOW.md
index 523b3fc..ddb025b 100644
--- a/dev/README_RELEASE_AIRFLOW.md
+++ b/dev/README_RELEASE_AIRFLOW.md
@@ -35,6 +35,7 @@
   - [Publish release to SVN](#publish-release-to-svn)
   - [Prepare PyPI "release" packages](#prepare-pypi-release-packages)
   - [Update CHANGELOG.md](#update-changelogmd)
+  - [Publish documentation](#publish-documentation)
   - [Notify developers of release](#notify-developers-of-release)
   - [Update Announcements page](#update-announcements-page)
 
@@ -551,6 +552,43 @@ At this point we release an official package:
 
 - Update CHANGELOG.md with the details, and commit it.
 
+## Publish documentation
+
+Documentation is an essential part of the product and should be made available to users.
+In our case, documentation for released versions is published in a separate repository, [`apache/airflow-site`](https://github.com/apache/airflow-site), while the documentation source code and build tools live in the `apache/airflow` repository, so you have to coordinate between the two repositories to build the documentation.
+
+Documentation for Apache Airflow can be found in the ``/docs/apache-airflow`` directory.
+
+- First, clone the airflow-site repository and set the environment variable ``AIRFLOW_SITE_DIRECTORY``.
+
+    ```shell script
+    git clone https://github.com/apache/airflow-site.git airflow-site
+    cd airflow-site
+    export AIRFLOW_SITE_DIRECTORY="$(pwd)"
+    ```
+
+- Then you can go to the Airflow repository and build the necessary documentation packages.
+
+    ```shell script
+    cd "${AIRFLOW_REPO_ROOT}"
+    ./breeze build-docs -- --package apache-airflow --for-production
+    ```
+
+- Now you can preview the documentation.
+
+    ```shell script
+    ./docs/start_doc_server.sh
+    ```
+
+- Copy the documentation to the ``airflow-site`` repository, create a commit, and push the changes.
+
+    ```shell script
+    ./docs/publish_docs.py --package apache-airflow
+    cd "${AIRFLOW_SITE_DIRECTORY}"
+    git commit -m "Add documentation for Apache Airflow ${VERSION}"
+    git push
+    ```
+
 ## Notify developers of release
 
 - Notify users@airflow.apache.org (cc'ing dev@airflow.apache.org and announce@apache.org) that
@@ -583,7 +621,7 @@ https://pypi.python.org/pypi/apache-airflow
 
 The documentation is available on:
 https://airflow.apache.org/
-https://airflow.apache.org/docs/${VERSION}/
+https://airflow.apache.org/docs/apache-airflow/${VERSION}/
 
 Find the CHANGELOG here for more details:
 
diff --git a/dev/README_RELEASE_PROVIDER_PACKAGES.md b/dev/README_RELEASE_PROVIDER_PACKAGES.md
index 274f241..1686b75 100644
--- a/dev/README_RELEASE_PROVIDER_PACKAGES.md
+++ b/dev/README_RELEASE_PROVIDER_PACKAGES.md
@@ -41,6 +41,7 @@
   - [Build and sign the source and convenience packages](#build-and-sign-the-source-and-convenience-packages-1)
   - [Commit the source packages to Apache SVN repo](#commit-the-source-packages-to-apache-svn-repo-1)
   - [Publish the Regular convenience package to PyPI](#publish-the-regular-convenience-package-to-pypi)
+  - [Publish documentation](#publish-documentation)
   - [Notify developers of release](#notify-developers-of-release)
 
 <!-- END doctoc generated TOC please keep comment here to allow auto update -->
@@ -884,6 +885,58 @@ twine upload -r pypi dist/*
 
 * Again, confirm that the packages are available under the links printed.
 
+## Publish documentation
+
+Documentation is an essential part of the product and should be made available to users.
+In our case, documentation for released versions is published in a separate repository, [`apache/airflow-site`](https://github.com/apache/airflow-site), while the documentation source code and build tools live in the `apache/airflow` repository, so you have to coordinate between the two repositories to build the documentation.
+
+Documentation for providers can be found in the `/docs/apache-airflow-providers` directory and in the `/docs/apache-airflow-providers-*/` directories. The first contains the list of package contents and should be updated every time a new version of a provider package is released.
+
+- First, clone the airflow-site repository and set the environment variable ``AIRFLOW_SITE_DIRECTORY``.
+
+    ```shell script
+    git clone https://github.com/apache/airflow-site.git airflow-site
+    cd airflow-site
+    export AIRFLOW_SITE_DIRECTORY="$(pwd)"
+    ```
+
+- Then you can go to the Airflow repository and build the necessary documentation packages.
+
+    ```shell script
+    cd "${AIRFLOW_REPO_ROOT}"
+    ./breeze build-docs -- \
+      --package apache-airflow-providers \
+      --package apache-airflow-providers-apache-airflow \
+      --package apache-airflow-providers-telegram \
+      --for-production
+    ```
+
+- Now you can preview the documentation.
+
+    ```shell script
+    ./docs/start_doc_server.sh
+    ```
+
+- Copy the documentation to the ``airflow-site`` repository.
+
+    ```shell script
+    ./docs/publish_docs.py \
+        --package apache-airflow-providers \
+        --package apache-airflow-providers-apache-airflow \
+        --package apache-airflow-providers-telegram
+
+    cd "${AIRFLOW_SITE_DIRECTORY}"
+    ```
+
+- If you publish a new package, you must add it to [the docs index](https://github.com/apache/airflow-site/blob/master/landing-pages/site/content/en/docs/_index.md).
+
+- Create a commit and push the changes.
+
+    ```shell script
+    git commit -m "Add documentation for provider packages - $(date "+%Y-%m-%d")"
+    git push
+    ```
+
 ## Notify developers of release
 
 - Notify users@airflow.apache.org (cc'ing dev@airflow.apache.org and announce@apache.org) that
diff --git a/docs/build_docs.py b/docs/build_docs.py
index 739c4a4..35fd353 100755
--- a/docs/build_docs.py
+++ b/docs/build_docs.py
@@ -17,44 +17,34 @@
 # under the License.
 import argparse
 import fnmatch
-import os
-import re
-import shlex
-import shutil
 import sys
 from collections import defaultdict
-from contextlib import contextmanager
-from glob import glob
-from subprocess import run
-from tempfile import NamedTemporaryFile, TemporaryDirectory
 from typing import Dict, List, Optional, Tuple
 
 from tabulate import tabulate
 
 from docs.exts.docs_build import dev_index_generator, lint_checks  # pylint: disable=no-name-in-module
+from docs.exts.docs_build.docs_builder import (  # pylint: disable=no-name-in-module
+    DOCS_DIR,
+    AirflowDocsBuilder,
+    get_available_packages,
+)
 from docs.exts.docs_build.errors import (  # pylint: disable=no-name-in-module
     DocBuildError,
     display_errors_summary,
-    parse_sphinx_warnings,
 )
+from docs.exts.docs_build.github_action_utils import with_group  # pylint: disable=no-name-in-module
 from docs.exts.docs_build.spelling_checks import (  # pylint: disable=no-name-in-module
     SpellingError,
     display_spelling_error_summary,
-    parse_spelling_warnings,
 )
-from docs.exts.provider_yaml_utils import load_package_data  # pylint: disable=no-name-in-module
 
 if __name__ != "__main__":
-    raise Exception(
+    raise SystemExit(
         "This file is intended to be executed as an executable program. You cannot use it as a module."
         "To run this script, run the ./build_docs.py command"
     )
 
-ROOT_PROJECT_DIR = os.path.abspath(os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir))
-ROOT_PACKAGE_DIR = os.path.join(ROOT_PROJECT_DIR, "airflow")
-DOCS_DIR = os.path.join(ROOT_PROJECT_DIR, "docs")
-ALL_PROVIDER_YAMLS = load_package_data()
-
 CHANNEL_INVITATION = """\
 If you need help, write to #documentation channel on Airflow's Slack.
 Channel link: https://apache-airflow.slack.com/archives/CJ1LVREHX
@@ -68,150 +58,6 @@ ERRORS_ELIGIBLE_TO_REBUILD = [
 ]
 
 
-@contextmanager
-def with_group(title):
-    """
-    If used in Github Action, creates an expandable group in the Github Action log.
-    Otherwise, dispaly simple text groups.
-
-    For more information, see:
-    https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#grouping-log-lines
-    """
-    if os.environ.get('GITHUB_ACTIONS', 'false') != "true":
-        print("#" * 20, title, "#" * 20)
-        yield
-        return
-    print(f"::group::{title}")
-    yield
-    print("\033[0m")
-    print("::endgroup::")
-
-
-class AirflowDocsBuilder:
-    """Documentation builder for Airflow."""
-
-    def __init__(self, package_name: str):
-        self.package_name = package_name
-
-    @property
-    def _doctree_dir(self) -> str:
-        return f"{DOCS_DIR}/_doctrees/docs/{self.package_name}"
-
-    @property
-    def _out_dir(self) -> str:
-        if self.package_name == 'apache-airflow-providers':
-            # Disable versioning. This documentation does not apply to any issued product and we can update
-            # it as needed, i.e. with each new package of providers.
-            return f"{DOCS_DIR}/_build/docs/{self.package_name}"
-        else:
-            return f"{DOCS_DIR}/_build/docs/{self.package_name}/latest"
-
-    @property
-    def _src_dir(self) -> str:
-        return f"{DOCS_DIR}/{self.package_name}"
-
-    def clean_files(self) -> None:
-        """Cleanup all artifacts generated by previous builds."""
-        api_dir = os.path.join(self._src_dir, "_api")
-
-        shutil.rmtree(api_dir, ignore_errors=True)
-        shutil.rmtree(self._out_dir, ignore_errors=True)
-        os.makedirs(api_dir, exist_ok=True)
-        os.makedirs(self._out_dir, exist_ok=True)
-
-        print(f"Recreated content of the {shlex.quote(self._out_dir)} and {shlex.quote(api_dir)} folders")
-
-    def check_spelling(self):
-        """Checks spelling."""
-        spelling_errors = []
-        with TemporaryDirectory() as tmp_dir, with_group(f"Check spelling: {self.package_name}"):
-            build_cmd = [
-                "sphinx-build",
-                "-W",  # turn warnings into errors
-                "-T",  # show full traceback on exception
-                "-b",  # builder to use
-                "spelling",
-                "-c",
-                DOCS_DIR,
-                "-d",  # path for the cached environment and doctree files
-                self._doctree_dir,
-                self._src_dir,  # path to documentation source files
-                tmp_dir,
-            ]
-            print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
-            env = os.environ.copy()
-            env['AIRFLOW_PACKAGE_NAME'] = self.package_name
-            completed_proc = run(  # pylint: disable=subprocess-run-check
-                build_cmd, cwd=self._src_dir, env=env
-            )
-            if completed_proc.returncode != 0:
-                spelling_errors.append(
-                    SpellingError(
-                        file_path=None,
-                        line_no=None,
-                        spelling=None,
-                        suggestion=None,
-                        context_line=None,
-                        message=(
-                            f"Sphinx spellcheck returned non-zero exit status: {completed_proc.returncode}."
-                        ),
-                    )
-                )
-                warning_text = ""
-                for filepath in glob(f"{tmp_dir}/**/*.spelling", recursive=True):
-                    with open(filepath) as speeling_file:
-                        warning_text += speeling_file.read()
-
-                spelling_errors.extend(parse_spelling_warnings(warning_text, self._src_dir))
-        return spelling_errors
-
-    def build_sphinx_docs(self) -> List[DocBuildError]:
-        """Build Sphinx documentation"""
-        build_errors = []
-        with NamedTemporaryFile() as tmp_file, with_group(f"Building docs: {self.package_name}"):
-            build_cmd = [
-                "sphinx-build",
-                "-T",  # show full traceback on exception
-                "--color",  # do emit colored output
-                "-b",  # builder to use
-                "html",
-                "-d",  # path for the cached environment and doctree files
-                self._doctree_dir,
-                "-c",
-                DOCS_DIR,
-                "-w",  # write warnings (and errors) to given file
-                tmp_file.name,
-                self._src_dir,  # path to documentation source files
-                self._out_dir,  # path to output directory
-            ]
-            print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
-            env = os.environ.copy()
-            env['AIRFLOW_PACKAGE_NAME'] = self.package_name
-            completed_proc = run(  # pylint: disable=subprocess-run-check
-                build_cmd, cwd=self._src_dir, env=env
-            )
-            if completed_proc.returncode != 0:
-                build_errors.append(
-                    DocBuildError(
-                        file_path=None,
-                        line_no=None,
-                        message=f"Sphinx returned non-zero exit status: {completed_proc.returncode}.",
-                    )
-                )
-            tmp_file.seek(0)
-            warning_text = tmp_file.read().decode()
-            # Remove 7-bit C1 ANSI escape sequences
-            warning_text = re.sub(r"\x1B[@-_][0-?]*[ -/]*[@-~]", "", warning_text)
-            build_errors.extend(parse_sphinx_warnings(warning_text, self._src_dir))
-        return build_errors
-
-
-def get_available_packages():
-    """Get list of all available packages to build."""
-    provider_package_names = [provider['package-name'] for provider in ALL_PROVIDER_YAMLS]
-    return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]
-
-
 def _get_parser():
     available_packages_list = " * " + "\n * ".join(get_available_packages())
     parser = argparse.ArgumentParser(
@@ -233,18 +79,25 @@ def _get_parser():
     parser.add_argument(
         '--spellcheck-only', dest='spellcheck_only', action='store_true', help='Only perform spellchecking'
     )
+    parser.add_argument(
+        '--for-production',
+        dest='for_production',
+        action='store_true',
+        help='Builds documentation for an official release, i.e. all links point to the stable version',
+    )
+
     return parser
 
 
 def build_docs_for_packages(
-    current_packages: List[str], docs_only: bool, spellcheck_only: bool
+    current_packages: List[str], docs_only: bool, spellcheck_only: bool, for_production: bool
 ) -> Tuple[Dict[str, List[DocBuildError]], Dict[str, List[SpellingError]]]:
     """Builds documentation for single package and returns errors"""
     all_build_errors: Dict[str, List[DocBuildError]] = defaultdict(list)
     all_spelling_errors: Dict[str, List[SpellingError]] = defaultdict(list)
     for package_name in current_packages:
         print("#" * 20, package_name, "#" * 20)
-        builder = AirflowDocsBuilder(package_name=package_name)
+        builder = AirflowDocsBuilder(package_name=package_name, for_production=for_production)
         builder.clean_files()
         if not docs_only:
             spelling_errors = builder.check_spelling()
@@ -309,6 +162,7 @@ def main():
     spellcheck_only = args.spellcheck_only
     disable_checks = args.disable_checks
     package_filters = args.package_filter
+    for_production = args.for_production
 
     print("Current package filters: ", package_filters)
     current_packages = (
@@ -326,6 +180,7 @@ def main():
         current_packages=current_packages,
         docs_only=docs_only,
         spellcheck_only=spellcheck_only,
+        for_production=for_production,
     )
     if package_build_errors:
         all_build_errors.update(package_build_errors)
@@ -347,6 +202,7 @@ def main():
             current_packages=to_retry_packages,
             docs_only=docs_only,
             spellcheck_only=spellcheck_only,
+            for_production=for_production,
         )
         if package_build_errors:
             all_build_errors.update(package_build_errors)
diff --git a/docs/exts/airflow_intersphinx.py b/docs/exts/airflow_intersphinx.py
index b090092..2438b57 100644
--- a/docs/exts/airflow_intersphinx.py
+++ b/docs/exts/airflow_intersphinx.py
@@ -28,11 +28,6 @@ DOCS_DIR = os.path.join(ROOT_DIR, 'docs')
 DOCS_PROVIDER_DIR = os.path.join(ROOT_DIR, 'docs')
 S3_DOC_URL = "http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com"
 
-# Customize build for readthedocs.io
-# See:
-# https://docs.readthedocs.io/en/stable/faq.html#how-do-i-change-behavior-when-building-with-read-the-docs
-IS_RTD = os.environ.get('READTHEDOCS') == 'True'
-
 
 def _create_init_py(app, config):
     del app
@@ -47,17 +42,15 @@ def _create_init_py(app, config):
 
 def _generate_provider_intersphinx_mapping():
     airflow_mapping = {}
+    for_production = os.environ.get('AIRFLOW_FOR_PRODUCTION', 'false') == 'true'
+    current_version = 'stable' if for_production else 'latest'
+
     for provider in load_package_data():
         package_name = provider['package-name']
         if os.environ.get('AIRFLOW_PACKAGE_NAME') == package_name:
             continue
 
-        # For local build and S3, use relative URLS.
-        # For RTD, use absolute URLs
-        if IS_RTD:
-            provider_base_url = f"{S3_DOC_URL}/docs/{package_name}/latest/"
-        else:
-            provider_base_url = f'/docs/{package_name}/latest/'
+        provider_base_url = f'/docs/{package_name}/{current_version}/'
 
         airflow_mapping[package_name] = (
             # base URI
@@ -70,14 +63,14 @@ def _generate_provider_intersphinx_mapping():
             # In this case, the local index will be read. If unsuccessful, the remote index
             # will be fetched.
             (
-                f'{DOCS_DIR}/_build/docs/{package_name}/latest/objects.inv',
+                f'{DOCS_DIR}/_build/docs/{package_name}/{current_version}/objects.inv',
                 f'{S3_DOC_URL}/docs/{package_name}/latest/objects.inv',
             ),
         )
     if os.environ.get('AIRFLOW_PACKAGE_NAME') != 'apache-airflow':
         airflow_mapping['apache-airflow'] = (
             # base URI
-            '/docs/apache-airflow/latest/',
+            f'/docs/apache-airflow/{current_version}/',
             # Index locations list
             # If passed None, this will try to fetch the index from `[base_url]/objects.inv`
             # If we pass a path containing `://` then we will try to index from the given address.
@@ -86,7 +79,7 @@ def _generate_provider_intersphinx_mapping():
             # In this case, the local index will be read. If unsuccessful, the remote index
             # will be fetched.
             (
-                f'{DOCS_DIR}/_build/docs/apache-airflow/latest/objects.inv',
+                f'{DOCS_DIR}/_build/docs/apache-airflow/{current_version}/objects.inv',
                 f'{S3_DOC_URL}/docs/apache-airflow/latest/objects.inv',
             ),
         )
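The version-selection logic this hunk introduces is driven entirely by the `AIRFLOW_FOR_PRODUCTION` environment variable. A minimal standalone sketch (the helper name `provider_base_url` is illustrative, not part of the patch):

```python
import os

def provider_base_url(package_name: str) -> str:
    # Production builds (AIRFLOW_FOR_PRODUCTION=true) link to the 'stable'
    # docs; development builds link to 'latest', mirroring the hunk above.
    for_production = os.environ.get('AIRFLOW_FOR_PRODUCTION', 'false') == 'true'
    current_version = 'stable' if for_production else 'latest'
    return f'/docs/{package_name}/{current_version}/'
```

With the variable unset this yields `/docs/apache-airflow/latest/`; with it set to `true`, `/docs/apache-airflow/stable/`.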
diff --git a/docs/exts/docs_build/code_utils.py b/docs/exts/docs_build/code_utils.py
index e35c8a4..e77d0b6 100644
--- a/docs/exts/docs_build/code_utils.py
+++ b/docs/exts/docs_build/code_utils.py
@@ -14,7 +14,7 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
-
+import os
 from contextlib import suppress
 
 
@@ -60,3 +60,14 @@ def prepare_code_snippet(file_path: str, line_no: int, context_lines_count: int
         # Join lines
         code = "\n".join(code_lines)
     return code
+
+
+def pretty_format_path(path: str, start: str) -> str:
+    """Formats the path by marking the important part in bold."""
+    end = '\033[0m'
+    bold = '\033[1m'
+
+    relpath = os.path.relpath(path, start)
+    if relpath == path:
+        return f"{bold}{path}{end}"
+    return f"{start}/{bold}{relpath}{end}"
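A self-contained sketch of the new helper follows; the first branch is written as `f"{bold}{path}{end}"` on the assumption that the literal `"path"` in the hunk above is a typo:

```python
import os

def pretty_format_path(path: str, start: str) -> str:
    """Format a path, rendering the part relative to `start` in bold."""
    end = '\033[0m'
    bold = '\033[1m'
    relpath = os.path.relpath(path, start)
    if relpath == path:
        # Path could not be expressed relative to `start`; bold the whole path.
        return f"{bold}{path}{end}"
    return f"{start}/{bold}{relpath}{end}"
```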
diff --git a/docs/exts/docs_build/docs_builder.py b/docs/exts/docs_build/docs_builder.py
new file mode 100644
index 0000000..1ee5614
--- /dev/null
+++ b/docs/exts/docs_build/docs_builder.py
@@ -0,0 +1,203 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+import re
+import shlex
+import shutil
+from glob import glob
+from subprocess import run
+from tempfile import NamedTemporaryFile, TemporaryDirectory
+from typing import List
+
+from docs.exts.docs_build.code_utils import pretty_format_path
+from docs.exts.docs_build.errors import DocBuildError, parse_sphinx_warnings
+from docs.exts.docs_build.github_action_utils import with_group
+from docs.exts.docs_build.spelling_checks import SpellingError, parse_spelling_warnings
+from docs.exts.provider_yaml_utils import load_package_data
+
+ROOT_PROJECT_DIR = os.path.abspath(
+    os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir, os.pardir, os.pardir)
+)
+DOCS_DIR = os.path.join(ROOT_PROJECT_DIR, "docs")
+ALL_PROVIDER_YAMLS = load_package_data()
+AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')
+
+
+class AirflowDocsBuilder:
+    """Documentation builder for Airflow."""
+
+    def __init__(self, package_name: str, for_production: bool):
+        self.package_name = package_name
+        self.for_production = for_production
+
+    @property
+    def _doctree_dir(self) -> str:
+        return f"{DOCS_DIR}/_doctrees/docs/{self.package_name}"
+
+    @property
+    def is_versioned(self):
+        """Is current documentation package versioned?"""
+        # Disable versioning. This documentation does not apply to any released product and we
+        # can update it as needed, i.e. with each new package of providers.
+        return self.package_name != 'apache-airflow-providers'
+
+    @property
+    def _build_dir(self) -> str:
+        if self.is_versioned:
+            version = "stable" if self.for_production else "latest"
+            return f"{DOCS_DIR}/_build/docs/{self.package_name}/{version}"
+        else:
+            return f"{DOCS_DIR}/_build/docs/{self.package_name}"
+
+    @property
+    def _current_version(self):
+        if not self.is_versioned:
+            raise Exception("This documentation package is not versioned")
+        if self.package_name == 'apache-airflow':
+            from airflow.version import version as airflow_version
+
+            return airflow_version
+        if self.package_name.startswith('apache-airflow-providers-'):
+            provider = next(p for p in ALL_PROVIDER_YAMLS if p['package-name'] == self.package_name)
+            return provider['versions'][0]
+        raise Exception(f"Unsupported package: {self.package_name}")
+
+    @property
+    def _publish_dir(self) -> str:
+        if self.is_versioned:
+            return f"docs-archive/{self.package_name}/{self._current_version}"
+        else:
+            return f"docs-archive/{self.package_name}"
+
+    @property
+    def _src_dir(self) -> str:
+        return f"{DOCS_DIR}/{self.package_name}"
+
+    def clean_files(self) -> None:
+        """Cleanup all artifacts generated by previous builds."""
+        api_dir = os.path.join(self._src_dir, "_api")
+
+        shutil.rmtree(api_dir, ignore_errors=True)
+        shutil.rmtree(self._build_dir, ignore_errors=True)
+        os.makedirs(api_dir, exist_ok=True)
+        os.makedirs(self._build_dir, exist_ok=True)
+
+    def check_spelling(self):
+        """Checks spelling."""
+        spelling_errors = []
+        with TemporaryDirectory() as tmp_dir, with_group(f"Check spelling: {self.package_name}"):
+            build_cmd = [
+                "sphinx-build",
+                "-W",  # turn warnings into errors
+                "-T",  # show full traceback on exception
+                "-b",  # builder to use
+                "spelling",
+                "-c",
+                DOCS_DIR,
+                "-d",  # path for the cached environment and doctree files
+                self._doctree_dir,
+                self._src_dir,  # path to documentation source files
+                tmp_dir,
+            ]
+            print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
+            env = os.environ.copy()
+            env['AIRFLOW_PACKAGE_NAME'] = self.package_name
+            if self.for_production:
+                env['AIRFLOW_FOR_PRODUCTION'] = 'true'
+            completed_proc = run(  # pylint: disable=subprocess-run-check
+                build_cmd, cwd=self._src_dir, env=env
+            )
+            if completed_proc.returncode != 0:
+                spelling_errors.append(
+                    SpellingError(
+                        file_path=None,
+                        line_no=None,
+                        spelling=None,
+                        suggestion=None,
+                        context_line=None,
+                        message=(
+                            f"Sphinx spellcheck returned non-zero exit status: {completed_proc.returncode}."
+                        ),
+                    )
+                )
+                warning_text = ""
+                for filepath in glob(f"{tmp_dir}/**/*.spelling", recursive=True):
+                    with open(filepath) as spelling_file:
+                        warning_text += spelling_file.read()
+
+                spelling_errors.extend(parse_spelling_warnings(warning_text, self._src_dir))
+        return spelling_errors
+
+    def build_sphinx_docs(self) -> List[DocBuildError]:
+        """Build Sphinx documentation"""
+        build_errors = []
+        with NamedTemporaryFile() as tmp_file, with_group(f"Building docs: {self.package_name}"):
+            build_cmd = [
+                "sphinx-build",
+                "-T",  # show full traceback on exception
+                "--color",  # do emit colored output
+                "-b",  # builder to use
+                "html",
+                "-d",  # path for the cached environment and doctree files
+                self._doctree_dir,
+                "-c",
+                DOCS_DIR,
+                "-w",  # write warnings (and errors) to given file
+                tmp_file.name,
+                self._src_dir,  # path to documentation source files
+                self._build_dir,  # path to output directory
+            ]
+            print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
+            env = os.environ.copy()
+            env['AIRFLOW_PACKAGE_NAME'] = self.package_name
+            if self.for_production:
+                env['AIRFLOW_FOR_PRODUCTION'] = 'true'
+
+            completed_proc = run(  # pylint: disable=subprocess-run-check
+                build_cmd, cwd=self._src_dir, env=env
+            )
+            if completed_proc.returncode != 0:
+                build_errors.append(
+                    DocBuildError(
+                        file_path=None,
+                        line_no=None,
+                        message=f"Sphinx returned non-zero exit status: {completed_proc.returncode}.",
+                    )
+                )
+            tmp_file.seek(0)
+            warning_text = tmp_file.read().decode()
+            # Remove 7-bit C1 ANSI escape sequences
+            warning_text = re.sub(r"\x1B[@-_][0-?]*[ -/]*[@-~]", "", warning_text)
+            build_errors.extend(parse_sphinx_warnings(warning_text, self._src_dir))
+        return build_errors
+
+    def publish(self):
+        """Copy documentation package files to the airflow-site repository."""
+        print(f"Publishing docs for {self.package_name}")
+        output_dir = os.path.join(AIRFLOW_SITE_DIR, self._publish_dir)
+        pretty_source = pretty_format_path(self._build_dir, os.getcwd())
+        pretty_target = pretty_format_path(output_dir, AIRFLOW_SITE_DIR)
+        print(f"Copy directory: {pretty_source} => {pretty_target}")
+        shutil.copytree(self._build_dir, output_dir)
+        print()
+
+
+def get_available_packages():
+    """Get list of all available packages to build."""
+    provider_package_names = [provider['package-name'] for provider in ALL_PROVIDER_YAMLS]
+    return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]
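The directory-layout rules encoded in `is_versioned`, `_build_dir`, and `_publish_dir` can be summarized in a few pure functions (names and the placeholder `DOCS_DIR` below are illustrative):

```python
DOCS_DIR = "/opt/airflow/docs"  # placeholder path, for illustration only

def is_versioned(package_name: str) -> bool:
    # 'apache-airflow-providers' is an unversioned index package that is
    # refreshed with every provider release; everything else is versioned.
    return package_name != "apache-airflow-providers"

def build_dir(package_name: str, for_production: bool) -> str:
    # Versioned packages build into a 'stable' (production) or
    # 'latest' (development) subdirectory.
    if is_versioned(package_name):
        version = "stable" if for_production else "latest"
        return f"{DOCS_DIR}/_build/docs/{package_name}/{version}"
    return f"{DOCS_DIR}/_build/docs/{package_name}"

def publish_dir(package_name: str, current_version: str) -> str:
    # Published docs are archived per released version; the unversioned
    # index package has no version directory.
    if is_versioned(package_name):
        return f"docs-archive/{package_name}/{current_version}"
    return f"docs-archive/{package_name}"
```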
diff --git a/docs/exts/docs_build/github_action_utils.py b/docs/exts/docs_build/github_action_utils.py
new file mode 100644
index 0000000..241ba64
--- /dev/null
+++ b/docs/exts/docs_build/github_action_utils.py
@@ -0,0 +1,38 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from contextlib import contextmanager
+
+
+@contextmanager
+def with_group(title):
+    """
+    If used in GitHub Actions, creates an expandable group in the GitHub Actions log.
+    Otherwise, displays simple text groups.
+
+    For more information, see:
+    https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#grouping-log-lines
+    """
+    if os.environ.get('GITHUB_ACTIONS', 'false') != "true":
+        print("#" * 20, title, "#" * 20)
+        yield
+        return
+    print(f"::group::{title}")
+    yield
+    print("\033[0m")
+    print("::endgroup::")
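For reference, a minimal usage sketch of the extracted helper (a self-contained copy, with an illustrative group title):

```python
import os
from contextlib import contextmanager

@contextmanager
def with_group(title):
    # Outside of GitHub Actions, fall back to a plain '#' banner.
    if os.environ.get('GITHUB_ACTIONS', 'false') != 'true':
        print('#' * 20, title, '#' * 20)
        yield
        return
    # Inside GitHub Actions, emit workflow commands that fold the log lines.
    print(f'::group::{title}')
    yield
    print('::endgroup::')

with with_group('Building docs: apache-airflow'):
    print('sphinx-build ...')
```

Everything printed inside the `with` block lands in the collapsible group when the script runs under GitHub Actions.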
diff --git a/docs/publish_docs.py b/docs/publish_docs.py
new file mode 100755
index 0000000..bab251f
--- /dev/null
+++ b/docs/publish_docs.py
@@ -0,0 +1,97 @@
+#!/usr/bin/env python
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import argparse
+import fnmatch
+import os
+
+from docs.exts.docs_build.docs_builder import AirflowDocsBuilder
+from docs.exts.provider_yaml_utils import load_package_data
+
+AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')
+
+if __name__ != "__main__":
+    raise SystemExit(
+        "This file is intended to be executed as an executable program. You cannot use it as a module. "
+        "To run this script, run the ./publish_docs.py command."
+    )
+
+if not (
+    AIRFLOW_SITE_DIR
+    and os.path.isdir(AIRFLOW_SITE_DIR)
+    and os.path.isdir(os.path.join(AIRFLOW_SITE_DIR, 'docs-archive'))
+):
+    raise SystemExit(
+        'Before using this script, set the environment variable AIRFLOW_SITE_DIRECTORY. This variable '
+        'should contain the path to the airflow-site repository directory. '
+        '${AIRFLOW_SITE_DIRECTORY}/docs-archive must exist.'
+    )
+
+ALL_PROVIDER_YAMLS = load_package_data()
+
+
+def get_available_packages():
+    """Get list of all available packages to build."""
+    provider_package_names = [provider['package-name'] for provider in ALL_PROVIDER_YAMLS]
+    return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]
+
+
+def _get_parser():
+    available_packages_list = " * " + "\n * ".join(get_available_packages())
+    parser = argparse.ArgumentParser(
+        description='Copies the built documentation to the airflow-site repository.',
+        epilog=f"List of supported documentation packages:\n{available_packages_list}",
+    )
+    parser.formatter_class = argparse.RawTextHelpFormatter
+    parser.add_argument(
+        '--disable-checks', dest='disable_checks', action='store_true', help='Disables extra checks'
+    )
+    parser.add_argument(
+        "--package-filter",
+        action="append",
+        help=(
+            "Filter specifying for which packages the documentation is to be built. Wildcards are supported."
+        ),
+    )
+
+    return parser
+
+
+def main():
+    """Main code"""
+    args = _get_parser().parse_args()
+    available_packages = get_available_packages()
+
+    package_filters = args.package_filter
+
+    current_packages = (
+        [p for p in available_packages if any(fnmatch.fnmatch(p, f) for f in package_filters)]
+        if package_filters
+        else available_packages
+    )
+    print(f"Publishing docs for {len(current_packages)} package(s)")
+    for pkg in current_packages:
+        print(f" - {pkg}")
+    print()
+    for package_name in current_packages:
+        builder = AirflowDocsBuilder(package_name=package_name, for_production=True)
+        builder.publish()
+
+
+main()
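The `--package-filter` matching in `main()` boils down to a small `fnmatch` filter, sketched here with an illustrative package list:

```python
import fnmatch
from typing import List, Optional

def filter_packages(available: List[str], filters: Optional[List[str]]) -> List[str]:
    # A package is kept if any wildcard pattern matches it;
    # with no filters given, every available package is published.
    if not filters:
        return list(available)
    return [p for p in available if any(fnmatch.fnmatch(p, f) for f in filters)]
```

For example, `filter_packages(available, ['apache-airflow-providers-*'])` selects only the individual provider packages, skipping `apache-airflow` and the unversioned `apache-airflow-providers` index.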