You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/17 10:36:35 UTC

[GitHub] [airflow] uranusjr commented on a change in pull request #19210: Optimizes building of Production image for non-modified www files

uranusjr commented on a change in pull request #19210:
URL: https://github.com/apache/airflow/pull/19210#discussion_r751081095



##########
File path: Dockerfile
##########
@@ -160,13 +160,21 @@ ARG INSTALL_PROVIDERS_FROM_SOURCES="false"
 # But it also can be `.` from local installation or GitHub URL pointing to specific branch or tag
 # Of Airflow. Note That for local source installation you need to have local sources of
 # Airflow checked out together with the Dockerfile and AIRFLOW_SOURCES_FROM and AIRFLOW_SOURCES_TO
-# set to "." and "/opt/airflow" respectively.
+# set to "." and "/opt/airflow" respectively. Similarly AIRFLOW_SOURCES_WWW_FROM/TO are set to right source
+# and destination
 ARG AIRFLOW_INSTALLATION_METHOD="apache-airflow"
 # By default latest released version of airflow is installed (when empty) but this value can be overridden
 # and we can install version according to specification (For example ==2.0.2 or <3.0.0).
 ARG AIRFLOW_VERSION_SPECIFICATION=""
 # By default we do not upgrade to latest dependencies
 ARG UPGRADE_TO_NEWER_DEPENDENCIES="false"
+# By default we install latest airflow from PyPI so we do not need to copy sources of Airflow
+# www to compile the assets but in case of breeze/CI builds we use latest sources and we override those
+# those SOURCES_FROM/TO with "airflow/wwww" and "/opt/airflow/airflow/wwww" respectively.
+# This is to rebuild the assets only when any of the www sources change
+ARG AIRFLOW_SOURCES_WWW_FROM="empty"
+ARG AIRFLOW_SOURCES_WWW_TO="/empty"

Review comment:
       ```suggestion
   # By default we install latest airflow from PyPI so we do not need to copy sources of Airflow
   # www to compile the assets but in case of breeze/CI builds we use latest sources and we override those
   # those SOURCES_FROM/TO with "airflow/www" and "/opt/airflow/airflow/www" respectively.
   # This is to rebuild the assets only when any of the www sources change
   ARG AIRFLOW_SOURCES_WWW_FROM="empty"
   ARG AIRFLOW_SOURCES_WWW_TO="/empty"
   ```
   
   Also what does `empty` mean?

##########
File path: scripts/ci/libraries/_verify_image.sh
##########
@@ -340,7 +340,7 @@ function verify_image::verify_prod_image_as_root() {
 function verify_image::verify_production_image_has_dist_folder() {
     start_end::group_start "Verify prod image has dist folder (compiled www assets): ${DOCKER_IMAGE}"
     # shellcheck disable=SC2016
-    verify_image::check_command "Dist folder" '[ -f $(python -m site --user-site)/airflow/www/static/dist/manifest.json ] || exit 1'
+    verify_image::check_command "Dist folder" '[ -f /.venv/lib/python$(python -c "import sys; print(str(sys.version_info[0])+\".\"+str(sys.version_info[1]))")/site-packages/airflow/www/static/dist/manifest.json ] || exit 1'

Review comment:
       Probably easier to use `importlib.resources` instead of manually constructing the path.

##########
File path: airflow/operators/python.py
##########
@@ -380,6 +390,8 @@ def __init__(
                 self.requirements.append('lazy-object-proxy')
             if self.use_dill and 'dill' not in self.requirements:
                 self.requirements.append('dill')
+        # Only allows to use clone_virtualenv_packages if no python version is specified
+        self.clone_virtualenv_packages = clone_airflow_virtualenv if not python_version else False

Review comment:
       I think we should
   
   1. Raise an exception if both arguments are set
   2. Maybe ignore `python_version` if it matches `sys.version_info`...?

##########
File path: airflow/utils/python_virtualenv.py
##########
@@ -75,7 +85,11 @@ def remove_task_decorator(python_source: str, task_decorator_name: str) -> str:
 
 
 def prepare_virtualenv(
-    venv_directory: str, python_bin: str, system_site_packages: bool, requirements: List[str]
+    venv_directory: str,
+    python_bin: str,
+    system_site_packages: bool,
+    requirements: List[str],
+    clone_virtualenv_packages: bool = False,

Review comment:
       I feel we may want to infer this from `system_site_packages` by default. I’d imagine most existing usages of this function specify `system_site_packages` because they want to access Airflow, but that would be broken by this PR since Airflow is no longer a part of the site-packages directory.

##########
File path: scripts/docker/install_from_docker_context_files.sh
##########
@@ -88,14 +87,14 @@ function install_airflow_and_providers_from_docker_context_files(){
             --constraint /tmp/constraints.txt
         rm /tmp/constraints.txt
         # make sure correct PIP version is used \
-        pip install ${AIRFLOW_INSTALL_USER_FLAG} --upgrade "pip==${AIRFLOW_PIP_VERSION}"
+        pip install --upgrade "pip==${AIRFLOW_PIP_VERSION}"
         # then upgrade if needed without using constraints to account for new limits in setup.py
-        pip install ${AIRFLOW_INSTALL_USER_FLAG} --upgrade --upgrade-strategy only-if-needed \
+        pip install --upgrade --upgrade-strategy only-if-needed \
              ${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages}
     fi
 
     # make sure correct PIP version is left installed
-    pip install ${AIRFLOW_INSTALL_USER_FLAG} --upgrade "pip==${AIRFLOW_PIP_VERSION}"
+    pip install --upgrade "pip==${AIRFLOW_PIP_VERSION}"

Review comment:
       ```suggestion
       pip install "pip==${AIRFLOW_PIP_VERSION}"
   ```

##########
File path: airflow/utils/python_virtualenv.py
##########
@@ -89,10 +103,17 @@ def prepare_virtualenv(
     :type system_site_packages: bool
     :param requirements: List of additional python packages
     :type requirements: List[str]
+    :param clone_virtualenv_packages: whether to clone aurflow virtualenv package if airflow is run
+         in virtualenv - default is False
+    :type clone_virtualenv_packages: bool
     :return: Path to a binary file with Python in a virtual environment.
     :rtype: str
     """
-    virtualenv_cmd = _generate_virtualenv_cmd(venv_directory, python_bin, system_site_packages)
+    virtualenv_cmd = _generate_virtualenv_cmd(
+        venv_directory, python_bin, system_site_packages, clone_virtualenv_packages
+    )
+    # Virtualenv-clone requires the directory to be non-existing
+    shutil.rmtree(path=venv_directory, ignore_errors=True)

Review comment:
       We may want to explicitly error out here instead to avoid the user accidentally nuking something. A `force` flag can be added if needed.

##########
File path: scripts/docker/install_airflow_dependencies_from_branch_tip.sh
##########
@@ -39,11 +39,11 @@ function install_airflow_dependencies_from_branch_tip() {
     fi
     # Install latest set of dependencies using constraints. In case constraints were upgraded and there
     # are conflicts, this might fail, but it should be fixed in the following installation steps
-    pip install ${AIRFLOW_INSTALL_USER_FLAG} \
+    pip install \
       "https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
       --constraint "${AIRFLOW_CONSTRAINTS_LOCATION}" || true
     # make sure correct PIP version is used
-    pip install ${AIRFLOW_INSTALL_USER_FLAG} --upgrade "pip==${AIRFLOW_PIP_VERSION}"
+    pip install --upgrade "pip==${AIRFLOW_PIP_VERSION}"

Review comment:
       ```suggestion
       pip install "pip==${AIRFLOW_PIP_VERSION}"
   ```
   
   `--upgrade` is useless when a version is explicitly specified.

##########
File path: scripts/docker/install_pip_version.sh
##########
@@ -30,11 +30,12 @@
 . "$( dirname "${BASH_SOURCE[0]}" )/common.sh"
 
 function install_pip_version() {
-    pip install --no-cache-dir --upgrade "pip==${AIRFLOW_PIP_VERSION}" && mkdir -p /root/.local/bin
+    pip install --no-cache-dir --upgrade "pip==${AIRFLOW_PIP_VERSION}"

Review comment:
       ```suggestion
       pip install --no-cache-dir "pip==${AIRFLOW_PIP_VERSION}"
   ```

##########
File path: setup.cfg
##########
@@ -157,6 +157,7 @@ install_requires =
     termcolor>=1.1.0
     typing-extensions>=3.7.4;python_version<"3.8"
     unicodecsv>=0.14.1
+    virtualenv-clone

Review comment:
       We should also add `virtualenv>=20`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org