Posted to commits@airflow.apache.org by ka...@apache.org on 2020/12/09 00:04:48 UTC
[airflow] branch master updated: Simplify publishing of documentation (#12892)
This is an automated email from the ASF dual-hosted git repository.
kaxilnaik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/master by this push:
new e595d35 Simplify publishing of documentation (#12892)
e595d35 is described below
commit e595d35bf4b57865f930938df12a673c3792e35e
Author: Kamil Breguła <mi...@users.noreply.github.com>
AuthorDate: Wed Dec 9 01:03:22 2020 +0100
Simplify publishing of documentation (#12892)
Close: #11423
Close: #11152
---
dev/README_RELEASE_AIRFLOW.md | 40 +++++-
dev/README_RELEASE_PROVIDER_PACKAGES.md | 53 ++++++++
docs/build_docs.py | 182 +++----------------------
docs/exts/airflow_intersphinx.py | 21 +--
docs/exts/docs_build/code_utils.py | 13 +-
docs/exts/docs_build/docs_builder.py | 203 ++++++++++++++++++++++++++++
docs/exts/docs_build/github_action_utils.py | 38 ++++++
docs/publish_docs.py | 97 +++++++++++++
8 files changed, 468 insertions(+), 179 deletions(-)
diff --git a/dev/README_RELEASE_AIRFLOW.md b/dev/README_RELEASE_AIRFLOW.md
index 523b3fc..ddb025b 100644
--- a/dev/README_RELEASE_AIRFLOW.md
+++ b/dev/README_RELEASE_AIRFLOW.md
@@ -35,6 +35,7 @@
- [Publish release to SVN](#publish-release-to-svn)
- [Prepare PyPI "release" packages](#prepare-pypi-release-packages)
- [Update CHANGELOG.md](#update-changelogmd)
+ - [Publish documentation](#publish-documentation)
- [Notify developers of release](#notify-developers-of-release)
- [Update Announcements page](#update-announcements-page)
@@ -551,6 +552,43 @@ At this point we release an official package:
- Update CHANGELOG.md with the details, and commit it.
+## Publish documentation
+
+Documentation is an essential part of the product and should be made available to users.
+In our case, documentation for the released versions is published in a separate repository, [`apache/airflow-site`](https://github.com/apache/airflow-site), while the documentation source code and build tools live in the `apache/airflow` repository, so you have to coordinate between the two repositories to build the documentation.
+
+Documentation for Apache Airflow can be found in the ``/docs/apache-airflow`` directory.
+
+- First, clone the airflow-site repository and set the environment variable ``AIRFLOW_SITE_DIRECTORY``.
+
+ ```shell script
+ git clone https://github.com/apache/airflow-site.git airflow-site
+ cd airflow-site
+ export AIRFLOW_SITE_DIRECTORY="$(pwd)"
+ ```
+
+- Then go to the ``apache/airflow`` repository and build the necessary documentation packages.
+
+ ```shell script
+ cd "${AIRFLOW_REPO_ROOT}"
+ ./breeze build-docs -- --package apache-airflow --for-production
+ ```
+
+- Now you can preview the documentation.
+
+ ```shell script
+ ./docs/start_doc_server.sh
+ ```
+
+- Copy the documentation to the ``airflow-site`` repository, create a commit, and push the changes.
+
+ ```shell script
+ ./docs/publish_docs.py --package apache-airflow
+ cd "${AIRFLOW_SITE_DIRECTORY}"
+ git commit -m "Add documentation for Apache Airflow ${VERSION}"
+ git push
+ ```
+
## Notify developers of release
- Notify users@airflow.apache.org (cc'ing dev@airflow.apache.org and announce@apache.org) that
@@ -583,7 +621,7 @@ https://pypi.python.org/pypi/apache-airflow
The documentation is available on:
https://airflow.apache.org/
-https://airflow.apache.org/docs/${VERSION}/
+https://airflow.apache.org/docs/apache-airflow/${VERSION}/
Find the CHANGELOG here for more details:
diff --git a/dev/README_RELEASE_PROVIDER_PACKAGES.md b/dev/README_RELEASE_PROVIDER_PACKAGES.md
index 274f241..1686b75 100644
--- a/dev/README_RELEASE_PROVIDER_PACKAGES.md
+++ b/dev/README_RELEASE_PROVIDER_PACKAGES.md
@@ -41,6 +41,7 @@
- [Build and sign the source and convenience packages](#build-and-sign-the-source-and-convenience-packages-1)
- [Commit the source packages to Apache SVN repo](#commit-the-source-packages-to-apache-svn-repo-1)
- [Publish the Regular convenience package to PyPI](#publish-the-regular-convenience-package-to-pypi)
+ - [Publish documentation](#publish-documentation)
- [Notify developers of release](#notify-developers-of-release)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
@@ -884,6 +885,58 @@ twine upload -r pypi dist/*
* Again, confirm that the packages are available under the links printed.
+## Publish documentation
+
+Documentation is an essential part of the product and should be made available to users.
+In our case, documentation for the released versions is published in a separate repository, [`apache/airflow-site`](https://github.com/apache/airflow-site), while the documentation source code and build tools live in the `apache/airflow` repository, so you have to coordinate between the two repositories to build the documentation.
+
+Documentation for providers can be found in the `/docs/apache-airflow-providers` directory and the `/docs/apache-airflow-providers-*/` directories. The first directory contains the package index and should be updated every time a new version of the provider packages is released.
+
+- First, clone the airflow-site repository and set the environment variable ``AIRFLOW_SITE_DIRECTORY``.
+
+ ```shell script
+ git clone https://github.com/apache/airflow-site.git airflow-site
+ cd airflow-site
+ export AIRFLOW_SITE_DIRECTORY="$(pwd)"
+ ```
+
+- Then go to the ``apache/airflow`` repository and build the necessary documentation packages.
+
+ ```shell script
+ cd "${AIRFLOW_REPO_ROOT}"
+ ./breeze build-docs -- \
+ --package apache-airflow-providers \
+ --package apache-airflow-providers-apache-airflow \
+ --package apache-airflow-providers-telegram \
+ --for-production
+ ```
+
+- Now you can preview the documentation.
+
+ ```shell script
+ ./docs/start_doc_server.sh
+ ```
+
+- Copy the documentation to the ``airflow-site`` repository.
+
+ ```shell script
+ ./docs/publish_docs.py \
+ --package apache-airflow-providers \
+ --package apache-airflow-providers-apache-airflow \
+ --package apache-airflow-providers-telegram
+
+ cd "${AIRFLOW_SITE_DIRECTORY}"
+ ```
+
+- If you publish a new package, you must add it to [the docs index](https://github.com/apache/airflow-site/blob/master/landing-pages/site/content/en/docs/_index.md).
+
+- Create commit and push changes.
+
+ ```shell script
+ git commit -m "Add documentation for backport packages - $(date "+%Y-%m-%d")"
+ git push
+ ```
+
## Notify developers of release
- Notify users@airflow.apache.org (cc'ing dev@airflow.apache.org and announce@apache.org) that
diff --git a/docs/build_docs.py b/docs/build_docs.py
index 739c4a4..35fd353 100755
--- a/docs/build_docs.py
+++ b/docs/build_docs.py
@@ -17,44 +17,34 @@
# under the License.
import argparse
import fnmatch
-import os
-import re
-import shlex
-import shutil
import sys
from collections import defaultdict
-from contextlib import contextmanager
-from glob import glob
-from subprocess import run
-from tempfile import NamedTemporaryFile, TemporaryDirectory
from typing import Dict, List, Optional, Tuple
from tabulate import tabulate
from docs.exts.docs_build import dev_index_generator, lint_checks # pylint: disable=no-name-in-module
+from docs.exts.docs_build.docs_builder import ( # pylint: disable=no-name-in-module
+ DOCS_DIR,
+ AirflowDocsBuilder,
+ get_available_packages,
+)
from docs.exts.docs_build.errors import ( # pylint: disable=no-name-in-module
DocBuildError,
display_errors_summary,
- parse_sphinx_warnings,
)
+from docs.exts.docs_build.github_action_utils import with_group # pylint: disable=no-name-in-module
from docs.exts.docs_build.spelling_checks import ( # pylint: disable=no-name-in-module
SpellingError,
display_spelling_error_summary,
- parse_spelling_warnings,
)
-from docs.exts.provider_yaml_utils import load_package_data # pylint: disable=no-name-in-module
if __name__ != "__main__":
- raise Exception(
+ raise SystemExit(
"This file is intended to be executed as an executable program. You cannot use it as a module."
"To run this script, run the ./build_docs.py command"
)
-ROOT_PROJECT_DIR = os.path.abspath(os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir))
-ROOT_PACKAGE_DIR = os.path.join(ROOT_PROJECT_DIR, "airflow")
-DOCS_DIR = os.path.join(ROOT_PROJECT_DIR, "docs")
-ALL_PROVIDER_YAMLS = load_package_data()
-
CHANNEL_INVITATION = """\
If you need help, write to #documentation channel on Airflow's Slack.
Channel link: https://apache-airflow.slack.com/archives/CJ1LVREHX
@@ -68,150 +58,6 @@ ERRORS_ELIGIBLE_TO_REBUILD = [
]
-@contextmanager
-def with_group(title):
- """
- If used in Github Action, creates an expandable group in the Github Action log.
- Otherwise, dispaly simple text groups.
-
- For more information, see:
- https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#grouping-log-lines
- """
- if os.environ.get('GITHUB_ACTIONS', 'false') != "true":
- print("#" * 20, title, "#" * 20)
- yield
- return
- print(f"::group::{title}")
- yield
- print("\033[0m")
- print("::endgroup::")
-
-
-class AirflowDocsBuilder:
- """Documentation builder for Airflow."""
-
- def __init__(self, package_name: str):
- self.package_name = package_name
-
- @property
- def _doctree_dir(self) -> str:
- return f"{DOCS_DIR}/_doctrees/docs/{self.package_name}"
-
- @property
- def _out_dir(self) -> str:
- if self.package_name == 'apache-airflow-providers':
- # Disable versioning. This documentation does not apply to any issued product and we can update
- # it as needed, i.e. with each new package of providers.
- return f"{DOCS_DIR}/_build/docs/{self.package_name}"
- else:
- return f"{DOCS_DIR}/_build/docs/{self.package_name}/latest"
-
- @property
- def _src_dir(self) -> str:
- return f"{DOCS_DIR}/{self.package_name}"
-
- def clean_files(self) -> None:
- """Cleanup all artifacts generated by previous builds."""
- api_dir = os.path.join(self._src_dir, "_api")
-
- shutil.rmtree(api_dir, ignore_errors=True)
- shutil.rmtree(self._out_dir, ignore_errors=True)
- os.makedirs(api_dir, exist_ok=True)
- os.makedirs(self._out_dir, exist_ok=True)
-
- print(f"Recreated content of the {shlex.quote(self._out_dir)} and {shlex.quote(api_dir)} folders")
-
- def check_spelling(self):
- """Checks spelling."""
- spelling_errors = []
- with TemporaryDirectory() as tmp_dir, with_group(f"Check spelling: {self.package_name}"):
- build_cmd = [
- "sphinx-build",
- "-W", # turn warnings into errors
- "-T", # show full traceback on exception
- "-b", # builder to use
- "spelling",
- "-c",
- DOCS_DIR,
- "-d", # path for the cached environment and doctree files
- self._doctree_dir,
- self._src_dir, # path to documentation source files
- tmp_dir,
- ]
- print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
- env = os.environ.copy()
- env['AIRFLOW_PACKAGE_NAME'] = self.package_name
- completed_proc = run( # pylint: disable=subprocess-run-check
- build_cmd, cwd=self._src_dir, env=env
- )
- if completed_proc.returncode != 0:
- spelling_errors.append(
- SpellingError(
- file_path=None,
- line_no=None,
- spelling=None,
- suggestion=None,
- context_line=None,
- message=(
- f"Sphinx spellcheck returned non-zero exit status: {completed_proc.returncode}."
- ),
- )
- )
- warning_text = ""
- for filepath in glob(f"{tmp_dir}/**/*.spelling", recursive=True):
- with open(filepath) as speeling_file:
- warning_text += speeling_file.read()
-
- spelling_errors.extend(parse_spelling_warnings(warning_text, self._src_dir))
- return spelling_errors
-
- def build_sphinx_docs(self) -> List[DocBuildError]:
- """Build Sphinx documentation"""
- build_errors = []
- with NamedTemporaryFile() as tmp_file, with_group(f"Building docs: {self.package_name}"):
- build_cmd = [
- "sphinx-build",
- "-T", # show full traceback on exception
- "--color", # do emit colored output
- "-b", # builder to use
- "html",
- "-d", # path for the cached environment and doctree files
- self._doctree_dir,
- "-c",
- DOCS_DIR,
- "-w", # write warnings (and errors) to given file
- tmp_file.name,
- self._src_dir, # path to documentation source files
- self._out_dir, # path to output directory
- ]
- print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
- env = os.environ.copy()
- env['AIRFLOW_PACKAGE_NAME'] = self.package_name
- completed_proc = run( # pylint: disable=subprocess-run-check
- build_cmd, cwd=self._src_dir, env=env
- )
- if completed_proc.returncode != 0:
- build_errors.append(
- DocBuildError(
- file_path=None,
- line_no=None,
- message=f"Sphinx returned non-zero exit status: {completed_proc.returncode}.",
- )
- )
- tmp_file.seek(0)
- warning_text = tmp_file.read().decode()
- # Remove 7-bit C1 ANSI escape sequences
- warning_text = re.sub(r"\x1B[@-_][0-?]*[ -/]*[@-~]", "", warning_text)
- build_errors.extend(parse_sphinx_warnings(warning_text, self._src_dir))
- return build_errors
-
-
-def get_available_packages():
- """Get list of all available packages to build."""
- provider_package_names = [provider['package-name'] for provider in ALL_PROVIDER_YAMLS]
- return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]
-
-
def _get_parser():
available_packages_list = " * " + "\n * ".join(get_available_packages())
parser = argparse.ArgumentParser(
@@ -233,18 +79,25 @@ def _get_parser():
parser.add_argument(
'--spellcheck-only', dest='spellcheck_only', action='store_true', help='Only perform spellchecking'
)
+ parser.add_argument(
+ '--for-production',
+ dest='for_production',
+ action='store_true',
+ help=('Builds documentation for an official release, i.e. all links point to the stable version'),
+ )
+
return parser
def build_docs_for_packages(
- current_packages: List[str], docs_only: bool, spellcheck_only: bool
+ current_packages: List[str], docs_only: bool, spellcheck_only: bool, for_production: bool
) -> Tuple[Dict[str, List[DocBuildError]], Dict[str, List[SpellingError]]]:
"""Builds documentation for single package and returns errors"""
all_build_errors: Dict[str, List[DocBuildError]] = defaultdict(list)
all_spelling_errors: Dict[str, List[SpellingError]] = defaultdict(list)
for package_name in current_packages:
print("#" * 20, package_name, "#" * 20)
- builder = AirflowDocsBuilder(package_name=package_name)
+ builder = AirflowDocsBuilder(package_name=package_name, for_production=for_production)
builder.clean_files()
if not docs_only:
spelling_errors = builder.check_spelling()
@@ -309,6 +162,7 @@ def main():
spellcheck_only = args.spellcheck_only
disable_checks = args.disable_checks
package_filters = args.package_filter
+ for_production = args.for_production
print("Current package filters: ", package_filters)
current_packages = (
@@ -326,6 +180,7 @@ def main():
current_packages=current_packages,
docs_only=docs_only,
spellcheck_only=spellcheck_only,
+ for_production=for_production,
)
if package_build_errors:
all_build_errors.update(package_build_errors)
@@ -347,6 +202,7 @@ def main():
current_packages=to_retry_packages,
docs_only=docs_only,
spellcheck_only=spellcheck_only,
+ for_production=for_production,
)
if package_build_errors:
all_build_errors.update(package_build_errors)
diff --git a/docs/exts/airflow_intersphinx.py b/docs/exts/airflow_intersphinx.py
index b090092..2438b57 100644
--- a/docs/exts/airflow_intersphinx.py
+++ b/docs/exts/airflow_intersphinx.py
@@ -28,11 +28,6 @@ DOCS_DIR = os.path.join(ROOT_DIR, 'docs')
DOCS_PROVIDER_DIR = os.path.join(ROOT_DIR, 'docs')
S3_DOC_URL = "http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com"
-# Customize build for readthedocs.io
-# See:
-# https://docs.readthedocs.io/en/stable/faq.html#how-do-i-change-behavior-when-building-with-read-the-docs
-IS_RTD = os.environ.get('READTHEDOCS') == 'True'
-
def _create_init_py(app, config):
del app
@@ -47,17 +42,15 @@ def _create_init_py(app, config):
def _generate_provider_intersphinx_mapping():
airflow_mapping = {}
+ for_production = os.environ.get('AIRFLOW_FOR_PRODUCTION', 'false') == 'true'
+ current_version = 'stable' if for_production else 'latest'
+
for provider in load_package_data():
package_name = provider['package-name']
if os.environ.get('AIRFLOW_PACKAGE_NAME') == package_name:
continue
- # For local build and S3, use relative URLS.
- # For RTD, use absolute URLs
- if IS_RTD:
- provider_base_url = f"{S3_DOC_URL}/docs/{package_name}/latest/"
- else:
- provider_base_url = f'/docs/{package_name}/latest/'
+ provider_base_url = f'/docs/{package_name}/{current_version}/'
airflow_mapping[package_name] = (
# base URI
@@ -70,14 +63,14 @@ def _generate_provider_intersphinx_mapping():
# In this case, the local index will be read. If unsuccessful, the remote index
# will be fetched.
(
- f'{DOCS_DIR}/_build/docs/{package_name}/latest/objects.inv',
+ f'{DOCS_DIR}/_build/docs/{package_name}/{current_version}/objects.inv',
f'{S3_DOC_URL}/docs/{package_name}/latest/objects.inv',
),
)
if os.environ.get('AIRFLOW_PACKAGE_NAME') != 'apache-airflow':
airflow_mapping['apache-airflow'] = (
# base URI
- '/docs/apache-airflow/latest/',
+ f'/docs/apache-airflow/{current_version}/',
# Index locations list
# If passed None, this will try to fetch the index from `[base_url]/objects.inv`
# If we pass a path containing `://` then we will try to index from the given address.
@@ -86,7 +79,7 @@ def _generate_provider_intersphinx_mapping():
# In this case, the local index will be read. If unsuccessful, the remote index
# will be fetched.
(
- f'{DOCS_DIR}/_build/docs/apache-airflow/latest/objects.inv',
+ f'{DOCS_DIR}/_build/docs/apache-airflow/{current_version}/objects.inv',
f'{S3_DOC_URL}/docs/apache-airflow/latest/objects.inv',
),
)
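The change to `airflow_intersphinx.py` above can be summarized with a small sketch (illustrative only; `mapping_entry` is a hypothetical helper distilled from the extension code, not a function in the repository): the base URL and the local inventory path now track `AIRFLOW_FOR_PRODUCTION`, while the S3 fallback stays pinned to `latest`.

```python
import os

# S3 bucket used as the remote objects.inv fallback (from airflow_intersphinx.py).
S3_DOC_URL = "http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com"


def mapping_entry(package_name, docs_dir="docs"):
    # 'stable' for production builds, 'latest' otherwise -- mirrors the new logic.
    for_production = os.environ.get('AIRFLOW_FOR_PRODUCTION', 'false') == 'true'
    current_version = 'stable' if for_production else 'latest'
    return (
        f'/docs/{package_name}/{current_version}/',  # base URI
        (
            f'{docs_dir}/_build/docs/{package_name}/{current_version}/objects.inv',  # local index
            f'{S3_DOC_URL}/docs/{package_name}/latest/objects.inv',  # remote fallback
        ),
    )


print(mapping_entry("apache-airflow")[0])
```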
diff --git a/docs/exts/docs_build/code_utils.py b/docs/exts/docs_build/code_utils.py
index e35c8a4..e77d0b6 100644
--- a/docs/exts/docs_build/code_utils.py
+++ b/docs/exts/docs_build/code_utils.py
@@ -14,7 +14,7 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
-
+import os
from contextlib import suppress
@@ -60,3 +60,14 @@ def prepare_code_snippet(file_path: str, line_no: int, context_lines_count: int
# Join lines
code = "\n".join(code_lines)
return code
+
+
+def pretty_format_path(path: str, start: str) -> str:
+ """Formats the path by marking the important part in bold."""
+ end = '\033[0m'
+ bold = '\033[1m'
+
+ relpath = os.path.relpath(path, start)
+ if relpath == path:
+ return f"{bold}{path}{end}"
+ return f"{start}/{bold}{relpath}{end}"
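A standalone sketch of how the new `pretty_format_path()` helper renders paths (assuming the fallback branch is meant to bold the original path when it cannot be made relative to `start`):

```python
import os

BOLD, END = '\033[1m', '\033[0m'


def pretty_format_path(path: str, start: str) -> str:
    """Bold the part of `path` below `start`; bold the whole path otherwise."""
    relpath = os.path.relpath(path, start)
    if relpath == path:
        return f"{BOLD}{path}{END}"
    return f"{start}/{BOLD}{relpath}{END}"


# The portion under /repo is wrapped in ANSI bold escapes.
print(pretty_format_path("/repo/docs/_build", "/repo"))
```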
diff --git a/docs/exts/docs_build/docs_builder.py b/docs/exts/docs_build/docs_builder.py
new file mode 100644
index 0000000..1ee5614
--- /dev/null
+++ b/docs/exts/docs_build/docs_builder.py
@@ -0,0 +1,203 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+import re
+import shlex
+import shutil
+from glob import glob
+from subprocess import run
+from tempfile import NamedTemporaryFile, TemporaryDirectory
+from typing import List
+
+from docs.exts.docs_build.code_utils import pretty_format_path
+from docs.exts.docs_build.errors import DocBuildError, parse_sphinx_warnings
+from docs.exts.docs_build.github_action_utils import with_group
+from docs.exts.docs_build.spelling_checks import SpellingError, parse_spelling_warnings
+from docs.exts.provider_yaml_utils import load_package_data
+
+ROOT_PROJECT_DIR = os.path.abspath(
+ os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir, os.pardir, os.pardir)
+)
+DOCS_DIR = os.path.join(ROOT_PROJECT_DIR, "docs")
+ALL_PROVIDER_YAMLS = load_package_data()
+AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')
+
+
+class AirflowDocsBuilder:
+ """Documentation builder for Airflow."""
+
+ def __init__(self, package_name: str, for_production: bool):
+ self.package_name = package_name
+ self.for_production = for_production
+
+ @property
+ def _doctree_dir(self) -> str:
+ return f"{DOCS_DIR}/_doctrees/docs/{self.package_name}"
+
+ @property
+ def is_versioned(self):
+ """Is current documentation package versioned?"""
+ # Disable versioning. This documentation does not apply to any issued product and we can update
+ # it as needed, i.e. with each new package of providers.
+ return self.package_name != 'apache-airflow-providers'
+
+ @property
+ def _build_dir(self) -> str:
+ if self.is_versioned:
+ version = "stable" if self.for_production else "latest"
+ return f"{DOCS_DIR}/_build/docs/{self.package_name}/{version}"
+ else:
+ return f"{DOCS_DIR}/_build/docs/{self.package_name}"
+
+ @property
+ def _current_version(self):
+ if not self.is_versioned:
+ raise Exception("This documentation package is not versioned")
+ if self.package_name == 'apache-airflow':
+ from airflow.version import version as airflow_version
+
+ return airflow_version
+ if self.package_name.startswith('apache-airflow-providers-'):
+ provider = next(p for p in ALL_PROVIDER_YAMLS if p['package-name'] == self.package_name)
+ return provider['versions'][0]
+ raise Exception(f"Unsupported package: {self.package_name}")
+
+ @property
+ def _publish_dir(self) -> str:
+ if self.is_versioned:
+ return f"docs-archive/{self.package_name}/{self._current_version}"
+ else:
+ return f"docs-archive/{self.package_name}"
+
+ @property
+ def _src_dir(self) -> str:
+ return f"{DOCS_DIR}/{self.package_name}"
+
+ def clean_files(self) -> None:
+ """Cleanup all artifacts generated by previous builds."""
+ api_dir = os.path.join(self._src_dir, "_api")
+
+ shutil.rmtree(api_dir, ignore_errors=True)
+ shutil.rmtree(self._build_dir, ignore_errors=True)
+ os.makedirs(api_dir, exist_ok=True)
+ os.makedirs(self._build_dir, exist_ok=True)
+
+ def check_spelling(self):
+ """Checks spelling."""
+ spelling_errors = []
+ with TemporaryDirectory() as tmp_dir, with_group(f"Check spelling: {self.package_name}"):
+ build_cmd = [
+ "sphinx-build",
+ "-W", # turn warnings into errors
+ "-T", # show full traceback on exception
+ "-b", # builder to use
+ "spelling",
+ "-c",
+ DOCS_DIR,
+ "-d", # path for the cached environment and doctree files
+ self._doctree_dir,
+ self._src_dir, # path to documentation source files
+ tmp_dir,
+ ]
+ print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
+ env = os.environ.copy()
+ env['AIRFLOW_PACKAGE_NAME'] = self.package_name
+ if self.for_production:
+ env['AIRFLOW_FOR_PRODUCTION'] = 'true'
+ completed_proc = run( # pylint: disable=subprocess-run-check
+ build_cmd, cwd=self._src_dir, env=env
+ )
+ if completed_proc.returncode != 0:
+ spelling_errors.append(
+ SpellingError(
+ file_path=None,
+ line_no=None,
+ spelling=None,
+ suggestion=None,
+ context_line=None,
+ message=(
+ f"Sphinx spellcheck returned non-zero exit status: {completed_proc.returncode}."
+ ),
+ )
+ )
+ warning_text = ""
+ for filepath in glob(f"{tmp_dir}/**/*.spelling", recursive=True):
+ with open(filepath) as spelling_file:
+ warning_text += spelling_file.read()
+
+ spelling_errors.extend(parse_spelling_warnings(warning_text, self._src_dir))
+ return spelling_errors
+
+ def build_sphinx_docs(self) -> List[DocBuildError]:
+ """Build Sphinx documentation"""
+ build_errors = []
+ with NamedTemporaryFile() as tmp_file, with_group(f"Building docs: {self.package_name}"):
+ build_cmd = [
+ "sphinx-build",
+ "-T", # show full traceback on exception
+ "--color", # do emit colored output
+ "-b", # builder to use
+ "html",
+ "-d", # path for the cached environment and doctree files
+ self._doctree_dir,
+ "-c",
+ DOCS_DIR,
+ "-w", # write warnings (and errors) to given file
+ tmp_file.name,
+ self._src_dir, # path to documentation source files
+ self._build_dir, # path to output directory
+ ]
+ print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
+ env = os.environ.copy()
+ env['AIRFLOW_PACKAGE_NAME'] = self.package_name
+ if self.for_production:
+ env['AIRFLOW_FOR_PRODUCTION'] = 'true'
+
+ completed_proc = run( # pylint: disable=subprocess-run-check
+ build_cmd, cwd=self._src_dir, env=env
+ )
+ if completed_proc.returncode != 0:
+ build_errors.append(
+ DocBuildError(
+ file_path=None,
+ line_no=None,
+ message=f"Sphinx returned non-zero exit status: {completed_proc.returncode}.",
+ )
+ )
+ tmp_file.seek(0)
+ warning_text = tmp_file.read().decode()
+ # Remove 7-bit C1 ANSI escape sequences
+ warning_text = re.sub(r"\x1B[@-_][0-?]*[ -/]*[@-~]", "", warning_text)
+ build_errors.extend(parse_sphinx_warnings(warning_text, self._src_dir))
+ return build_errors
+
+ def publish(self):
+ """Copy documentation packages files to airflow-site repository."""
+ print(f"Publishing docs for {self.package_name}")
+ output_dir = os.path.join(AIRFLOW_SITE_DIR, self._publish_dir)
+ pretty_source = pretty_format_path(self._build_dir, os.getcwd())
+ pretty_target = pretty_format_path(output_dir, AIRFLOW_SITE_DIR)
+ print(f"Copy directory: {pretty_source} => {pretty_target}")
+ shutil.copytree(self._build_dir, output_dir)
+ print()
+
+
+def get_available_packages():
+ """Get list of all available packages to build."""
+ provider_package_names = [provider['package-name'] for provider in ALL_PROVIDER_YAMLS]
+ return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]
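The directory layout the new `AirflowDocsBuilder` produces can be sketched as follows (a simplified distillation of `_build_dir` and `_publish_dir`, not the class itself):

```python
def build_dir(package_name: str, for_production: bool, docs_dir: str = "docs") -> str:
    # The 'apache-airflow-providers' index is unversioned and updated in place;
    # every other package builds into a 'stable' (production) or 'latest' subdirectory.
    if package_name == "apache-airflow-providers":
        return f"{docs_dir}/_build/docs/{package_name}"
    version = "stable" if for_production else "latest"
    return f"{docs_dir}/_build/docs/{package_name}/{version}"


def publish_dir(package_name: str, version: str) -> str:
    # Versioned packages are archived in airflow-site under their concrete version.
    if package_name == "apache-airflow-providers":
        return f"docs-archive/{package_name}"
    return f"docs-archive/{package_name}/{version}"


print(build_dir("apache-airflow", for_production=True))   # docs/_build/docs/apache-airflow/stable
print(publish_dir("apache-airflow", "2.0.0"))             # docs-archive/apache-airflow/2.0.0
```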
diff --git a/docs/exts/docs_build/github_action_utils.py b/docs/exts/docs_build/github_action_utils.py
new file mode 100644
index 0000000..241ba64
--- /dev/null
+++ b/docs/exts/docs_build/github_action_utils.py
@@ -0,0 +1,38 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+from contextlib import contextmanager
+
+
+@contextmanager
+def with_group(title):
+ """
+ If used in a GitHub Action, creates an expandable group in the GitHub Actions log.
+ Otherwise, displays simple text groups.
+
+ For more information, see:
+ https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#grouping-log-lines
+ """
+ if os.environ.get('GITHUB_ACTIONS', 'false') != "true":
+ print("#" * 20, title, "#" * 20)
+ yield
+ return
+ print(f"::group::{title}")
+ yield
+ print("\033[0m")
+ print("::endgroup::")
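Outside of GitHub Actions the context manager above degrades to a plain banner. A quick sketch of the local mode (reproducing the logic rather than importing the new module):

```python
import io
import os
from contextlib import contextmanager, redirect_stdout


@contextmanager
def with_group(title):
    # Plain banner locally; ::group::/::endgroup:: markers on GitHub Actions.
    if os.environ.get('GITHUB_ACTIONS', 'false') != "true":
        print("#" * 20, title, "#" * 20)
        yield
        return
    print(f"::group::{title}")
    yield
    print("\033[0m")
    print("::endgroup::")


buf = io.StringIO()
os.environ.pop('GITHUB_ACTIONS', None)  # simulate a local (non-CI) run
with redirect_stdout(buf):
    with with_group("Building docs"):
        print("sphinx-build ...")
print(buf.getvalue())
```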
diff --git a/docs/publish_docs.py b/docs/publish_docs.py
new file mode 100755
index 0000000..bab251f
--- /dev/null
+++ b/docs/publish_docs.py
@@ -0,0 +1,97 @@
+#!/usr/bin/env python
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import argparse
+import fnmatch
+import os
+
+from docs.exts.docs_build.docs_builder import AirflowDocsBuilder
+from docs.exts.provider_yaml_utils import load_package_data
+
+AIRFLOW_SITE_DIR = os.environ.get('AIRFLOW_SITE_DIRECTORY')
+
+if __name__ != "__main__":
+ raise SystemExit(
+ "This file is intended to be executed as an executable program. You cannot use it as a module. "
+ "To run this script, run the ./publish_docs.py command"
+ )
+
+if not (
+ AIRFLOW_SITE_DIR
+ and os.path.isdir(AIRFLOW_SITE_DIR)
+ and os.path.isdir(os.path.join(AIRFLOW_SITE_DIR, 'docs-archive'))
+):
+ raise SystemExit(
+ 'Before using this script, set the environment variable AIRFLOW_SITE_DIRECTORY. This variable '
+ 'should contain the path to the airflow-site repository directory. '
+ '${AIRFLOW_SITE_DIRECTORY}/docs-archive must exist.'
+ )
+
+ALL_PROVIDER_YAMLS = load_package_data()
+
+
+def get_available_packages():
+ """Get list of all available packages to build."""
+ provider_package_names = [provider['package-name'] for provider in ALL_PROVIDER_YAMLS]
+ return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]
+
+
+def _get_parser():
+ available_packages_list = " * " + "\n * ".join(get_available_packages())
+ parser = argparse.ArgumentParser(
+ description='Copies the built documentation to airflow-site repository.',
+ epilog=f"List of supported documentation packages:\n{available_packages_list}" "",
+ )
+ parser.formatter_class = argparse.RawTextHelpFormatter
+ parser.add_argument(
+ '--disable-checks', dest='disable_checks', action='store_true', help='Disables extra checks'
+ )
+ parser.add_argument(
+ "--package-filter",
+ action="append",
+ help=(
+ "Filter specifying for which packages the documentation is to be built. Wildcards are supported."
+ ),
+ )
+
+ return parser
+
+
+def main():
+ """Main code"""
+ args = _get_parser().parse_args()
+ available_packages = get_available_packages()
+
+ package_filters = args.package_filter
+
+ current_packages = (
+ [p for p in available_packages if any(fnmatch.fnmatch(p, f) for f in package_filters)]
+ if package_filters
+ else available_packages
+ )
+ print(f"Publishing docs for {len(current_packages)} package(s)")
+ for pkg in current_packages:
+ print(f" - {pkg}")
+ print()
+ for package_name in current_packages:
+ builder = AirflowDocsBuilder(package_name=package_name, for_production=True)
+ builder.publish()
+
+
+main()
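Both `build_docs.py` and the new `publish_docs.py` select packages the same way; the `--package-filter` matching boils down to this minimal sketch using the same `fnmatch` wildcard semantics:

```python
import fnmatch


def filter_packages(available, package_filters):
    # No filters means every package; otherwise keep packages matching ANY wildcard.
    if not package_filters:
        return list(available)
    return [p for p in available if any(fnmatch.fnmatch(p, f) for f in package_filters)]


available = [
    "apache-airflow",
    "apache-airflow-providers",
    "apache-airflow-providers-telegram",
]
print(filter_packages(available, ["apache-airflow-providers-*"]))
# → ['apache-airflow-providers-telegram']
```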