Posted to commits@airflow.apache.org by po...@apache.org on 2022/04/21 13:59:09 UTC

[airflow] branch main updated: add script to initialise virtualenv (#22971)

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new 03bef084b3 add script to initialise virtualenv (#22971)
03bef084b3 is described below

commit 03bef084b3f1611e1becdd6ad0ff4c0d2dd909ac
Author: Joppe Vos <44...@users.noreply.github.com>
AuthorDate: Thu Apr 21 15:59:03 2022 +0200

    add script to initialise virtualenv (#22971)
    
    
    
    Co-authored-by: Jarek Potiuk <ja...@potiuk.com>
---
 BREEZE.rst                                         |   4 +-
 CONTRIBUTING.rst                                   |  39 ++---
 CONTRIBUTORS_QUICK_START.rst                       |   4 +-
 dev/provider_packages/README.md                    |   6 +-
 .../installation/installing-from-pypi.rst          |   3 +
 scripts/tools/initialize_virtualenv.py             | 186 +++++++++++++++++++++
 6 files changed, 214 insertions(+), 28 deletions(-)

diff --git a/BREEZE.rst b/BREEZE.rst
index 1c60c270ca..ff585346f7 100644
--- a/BREEZE.rst
+++ b/BREEZE.rst
@@ -454,7 +454,7 @@ Regular development tasks:
 * Enter interactive shell in CI container when ``shell`` (or no command) is specified
 * Start containerised, development-friendly airflow installation with ``breeze start-airflow`` command
 * Build documentation with ``breeze build-docs`` command
-* Initialize local virtualenv with ``./breeze-legacy initialize-local-virtualenv`` command
+* Initialize local virtualenv with ``./scripts/tools/initialize_virtualenv.py`` command
 * Build CI docker image with ``breeze build-image`` command
 * Cleanup breeze with ``breeze cleanup`` command
 * Run static checks with autocomplete support ``breeze static-check`` command
@@ -969,7 +969,7 @@ To use your host IDE with Breeze:
 
 .. code-block:: bash
 
-  ./breeze-legacy initialize-local-virtualenv --python 3.8
+   ./scripts/tools/initialize_virtualenv.py
 
 .. warning::
    Make sure that you use the right Python version in this command - matching the Python version you have
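The version-matching warning above can be made mechanical. A minimal sketch (hypothetical helper, not part of this commit) that checks the active interpreter against a required major.minor version:

```python
import sys


def python_version_matches(required: str) -> bool:
    """Return True when the running interpreter is the given 'major.minor' version."""
    current = f"{sys.version_info[0]}.{sys.version_info[1]}"
    return current == required


# The check passes for the interpreter that runs it...
print(python_version_matches(f"{sys.version_info[0]}.{sys.version_info[1]}"))  # True
# ...and fails for any other version string.
print(python_version_matches("0.0"))  # False
```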
diff --git a/CONTRIBUTING.rst b/CONTRIBUTING.rst
index 6f10d485e0..2c31c2cdcd 100644
--- a/CONTRIBUTING.rst
+++ b/CONTRIBUTING.rst
@@ -257,13 +257,14 @@ to make them immediately visible in the environment.
 
 .. code-block:: bash
 
-   mkvirtualenv myenv --python=python3.7
+   mkvirtualenv myenv --python=python3.9
 
 5. Initialize the created environment:
 
 .. code-block:: bash
 
-   ./breeze-legacy initialize-local-virtualenv --python 3.7
+   ./scripts/tools/initialize_virtualenv.py
+
 
 6. Open your IDE (for example, PyCharm) and select the virtualenv you created
    as the project's default virtualenv in your IDE.
@@ -886,39 +887,33 @@ There are several sets of constraints we keep:
 
 We also have constraints with "source-providers" but they are used i
 
-The first ones can be used as constraints file when installing Apache Airflow in a repeatable way.
+The first two can be used as a constraints file when installing Apache Airflow in a repeatable way.
 It can be done from the sources:
 
-.. code-block:: bash
+from the PyPI package:
 
-  pip install -e . \
-    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
+.. code-block:: bash
 
+  pip install apache-airflow==2.2.5 \
+    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.5/constraints-3.7.txt"
 
-or from the PyPI package:
+When you install airflow from sources (in editable mode), you should use "constraints-source-providers"
+instead (this accounts for the case when some providers have not yet been released and have conflicting
+requirements).
 
 .. code-block:: bash
 
-  pip install apache-airflow \
-    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
+  pip install -e . \
+    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
 
 
 This also works with extras - for example:
 
 .. code-block:: bash
 
-  pip install .[ssh] \
-    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
-
-
-As of apache-airflow 1.10.12 it is also possible to use constraints directly from GitHub using specific
-tag/hash name. We tag commits working for particular release with constraints-<version> tag. So for example
-fixed valid constraints 1.10.12 can be used by using ``constraints-1.10.12`` tag:
-
-.. code-block:: bash
+  pip install ".[ssh]" \
+    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
 
-  pip install apache-airflow[ssh]==1.10.12 \
-      --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-1.10.12/constraints-3.7.txt"
 
 There are different sets of fixed constraint files for different Python major/minor versions and you should
 use the right file for the right Python version.
@@ -940,7 +935,9 @@ if the tests are successful.
 Documentation
 =============
 
-Documentation for ``apache-airflow`` package and other packages that are closely related to it ie. providers packages are in ``/docs/`` directory. For detailed information on documentation development, see: `docs/README.rst <docs/README.rst>`_
+Documentation for the ``apache-airflow`` package and other packages closely related to it (i.e.
+provider packages) is in the ``/docs/`` directory. For detailed information on documentation development,
+see: `docs/README.rst <docs/README.rst>`_
 
 Static code checks
 ==================
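The constraint URLs in the section above follow one naming scheme: a branch or tag, then a file named for an optional flavor and the Python major.minor version. A hypothetical helper (not part of this commit) that builds such URLs:

```python
import sys


def constraints_url(ref: str, flavor: str = "") -> str:
    """Build a constraints-file URL for the running Python's major.minor version.

    ref:    a constraints branch or tag, e.g. "constraints-main" or "constraints-2.2.5"
    flavor: "" for the plain constraints, or "source-providers" for editable installs
    """
    python = f"{sys.version_info[0]}.{sys.version_info[1]}"
    name = f"constraints-{flavor}-{python}" if flavor else f"constraints-{python}"
    return f"https://raw.githubusercontent.com/apache/airflow/{ref}/{name}.txt"


print(constraints_url("constraints-2.2.5"))
print(constraints_url("constraints-main", "source-providers"))
```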
diff --git a/CONTRIBUTORS_QUICK_START.rst b/CONTRIBUTORS_QUICK_START.rst
index 2558d255c7..ba4bce2526 100644
--- a/CONTRIBUTORS_QUICK_START.rst
+++ b/CONTRIBUTORS_QUICK_START.rst
@@ -321,7 +321,7 @@ Installing airflow in the local virtual environment ``airflow-env`` with breeze.
 
 .. code-block:: bash
 
-  $ ./breeze-legacy initialize-local-virtualenv --python 3.8
+  $ ./scripts/tools/initialize_virtualenv.py
 
 3. Add following line to ~/.bashrc in order to call breeze command from anywhere.
 
@@ -1132,7 +1132,7 @@ Installing airflow in the local virtual environment ``airflow-env`` with breeze.
 .. code-block:: bash
 
   $ sudo apt-get install sqlite libsqlite3-dev default-libmysqlclient-dev postgresql
-  $ ./breeze-legacy initialize-local-virtualenv --python 3.8
+  $ ./scripts/tools/initialize_virtualenv.py
 
 
 2. Add following line to ~/.bashrc in order to call breeze command from anywhere.
diff --git a/dev/provider_packages/README.md b/dev/provider_packages/README.md
index e46c1dfe94..408d6c16cd 100644
--- a/dev/provider_packages/README.md
+++ b/dev/provider_packages/README.md
@@ -227,19 +227,19 @@ that any new added providers are not added as packages (in case they are not yet
 
 ```shell script
 INSTALL_PROVIDERS_FROM_SOURCES="true" pip install -e ".[devel_all]" \
-    --constraint https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.6.txt
+    --constraint https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt
 ```
 
 Note that you might need to add some extra dependencies to your system to install "devel_all" - many
 dependencies are needed to make a clean install - the `Breeze` environment has all the
 dependencies installed in case you have problem with setting up your local virtualenv.
 
-You can also use `breeze` to prepare your virtualenv (it will print extra information if some
+You can also use the script `initialize_virtualenv.py` to prepare your virtualenv (it will print extra
 information if some dependencies are missing or installation fails, and it will also reset your SQLite
 test db in the `${HOME}/airflow` directory):
 
 ```shell script
-./breeze initialize-local-virtualenv
+./scripts/tools/initialize_virtualenv.py
 ```
 
 You can find description of all the commands and more information about the "prepare"
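The editable-install commands in this README can also be assembled programmatically. A sketch (hypothetical helper, not part of this commit) that builds, but does not run, the pip invocation:

```python
import shlex
import sys


def editable_install_command(extras: str, constraint: str) -> list:
    """Build (but do not run) a `pip install -e` command with a constraints file."""
    return [sys.executable, "-m", "pip", "install", "-e", f".[{extras}]", "--constraint", constraint]


cmd = editable_install_command(
    "devel_all",
    "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt",
)
print(" ".join(shlex.quote(part) for part in cmd))
```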
diff --git a/docs/apache-airflow/installation/installing-from-pypi.rst b/docs/apache-airflow/installation/installing-from-pypi.rst
index 82dc838b39..ed59d21fb1 100644
--- a/docs/apache-airflow/installation/installing-from-pypi.rst
+++ b/docs/apache-airflow/installation/installing-from-pypi.rst
@@ -136,6 +136,9 @@ the time of preparing of the airflow version. However, usually you can use "main
 to install the latest version of providers. Usually the providers work with most versions of Airflow; if there
 are any incompatibilities, they will be captured as package dependencies.
 
+Note that "main" is just an example - you might need to choose a specific Airflow version to install
+providers in a specific version.
+
 .. code-block:: bash
 
     PYTHON_VERSION="$(python --version | cut -d " " -f 2 | cut -d "." -f 1-2)"
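The `PYTHON_VERSION` pipeline above parses `python --version` output with `cut`; the same value is available directly from the interpreter. A quick equivalence check (illustration only):

```python
import subprocess
import sys

# What the shell pipeline computes by parsing `python --version` output...
out = subprocess.run([sys.executable, "--version"], capture_output=True, text=True).stdout
parsed = ".".join(out.split()[1].split(".")[:2])

# ...matches what sys.version_info reports directly.
direct = f"{sys.version_info[0]}.{sys.version_info[1]}"
print(parsed == direct)  # True
```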
diff --git a/scripts/tools/initialize_virtualenv.py b/scripts/tools/initialize_virtualenv.py
new file mode 100755
index 0000000000..2b32717e62
--- /dev/null
+++ b/scripts/tools/initialize_virtualenv.py
@@ -0,0 +1,186 @@
+#!/usr/bin/env python3
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import os
+import shlex
+import shutil
+import subprocess
+import sys
+from pathlib import Path
+
+if __name__ not in ("__main__", "__mp_main__"):
+    raise SystemExit(
+        "This file is intended to be executed as an executable program. You cannot use it as a module. "
+        f"To run this script, run the ./{__file__} command"
+    )
+
+
+def clean_up_airflow_home(airflow_home: Path):
+    if airflow_home.exists():
+        print(f"Removing {airflow_home}")
+        shutil.rmtree(airflow_home, ignore_errors=True)
+
+
+def check_if_in_virtualenv() -> bool:
+    return hasattr(sys, 'real_prefix') or (hasattr(sys, 'base_prefix') and sys.base_prefix != sys.prefix)
+
+
+def check_for_package_extras() -> str:
+    """
+    Check if the user provided any extra packages to install.
+    Defaults to the 'devel' extra.
+    """
+    if len(sys.argv) > 1:
+        if len(sys.argv) > 2:
+            print("Provide extras as 1 argument like: \"devel,google,snowflake\"")
+            sys.exit(1)
+        return sys.argv[1]
+    return "devel"
+
+
+def pip_install_requirements() -> int:
+    """
+    Install the requirements for the current Python version.
+    Return 0 on success; anything else is an error.
+    """
+
+    extras = check_for_package_extras()
+    print(
+        f"""
+Installing requirements.
+
+Airflow is installed with "{extras}" extra.
+
+----------------------------------------------------------------------------------------
+
+IMPORTANT NOTE ABOUT EXTRAS !!!
+
+You can specify extras as a single comma-separated parameter to install. For example:
+
+* google,amazon,microsoft.azure
+* devel_all
+
+Note that "devel_all" installs all possible dependencies and we have > 600 of them,
+which might not be possible to install cleanly on your host because of lack of
+system packages. It's easier to install extras one-by-one as needed.
+
+----------------------------------------------------------------------------------------
+
+"""
+    )
+    version = get_python_version()
+    constraint = (
+        f"https://raw.githubusercontent.com/apache/airflow/constraints-main/"
+        f"constraints-source-providers-{version}.txt"
+    )
+    pip_install_command = ["pip", "install", "-e", f".[{extras}]", "--constraint", constraint]
+    quoted_command = " ".join([shlex.quote(parameter) for parameter in pip_install_command])
+    print()
+    print(f"Running command: \n   {quoted_command}\n")
+    e = subprocess.run(pip_install_command)
+    return e.returncode
+
+
+def get_python_version() -> str:
+    """
+    Return the major.minor version of the running Python interpreter.
+    """
+    major = sys.version_info[0]
+    minor = sys.version_info[1]
+    return f"{major}.{minor}"
+
+
+def main():
+    """
+    Set up the local virtual environment.
+    """
+    airflow_home_dir = Path(os.environ.get("AIRFLOW_HOME", Path.home() / "airflow"))
+    airflow_sources = str(Path(__file__).parents[2])
+
+    if not check_if_in_virtualenv():
+        print(
+            "Local virtual environment not activated.\nPlease create and activate it "
+            "first (for example using 'python3 -m venv venv && source venv/bin/activate')."
+        )
+        sys.exit(1)
+
+    print("Initializing environment...")
+    print(f"This will remove the folder {airflow_home_dir} and reset all the databases!")
+    response = input("Are you sure? (y/N/q) ")
+    if response != "y":
+        sys.exit(2)
+
+    print(f"\nWiping and recreating {airflow_home_dir}")
+
+    if str(airflow_home_dir) == airflow_sources:
+        print("AIRFLOW_HOME and the source code are in the same path")
+        print(
+            f"Running this script would delete all files in {airflow_home_dir} "
+            "to clear dynamic files like config/logs/db"
+        )
+        print("Please move the airflow source code elsewhere to avoid deletion")
+
+        sys.exit(3)
+
+    clean_up_airflow_home(airflow_home_dir)
+
+    return_code = pip_install_requirements()
+
+    if return_code != 0:
+        print(
+            "To solve persistent issues with the installation, you might need the "
+            "prerequisites installed on your system.\n"
+            "Try running the command below and rerun the virtualenv installation.\n"
+        )
+
+        os_type = sys.platform
+        if os_type == "darwin":
+            print("brew install sqlite mysql postgresql openssl")
+            print("export LDFLAGS=\"-L/usr/local/opt/openssl/lib\"")
+            print("export CPPFLAGS=\"-I/usr/local/opt/openssl/include\"")
+        else:
+            print(
+                "sudo apt install build-essential python3-dev libsqlite3-dev openssl "
+                "sqlite default-libmysqlclient-dev libmysqlclient-dev postgresql"
+            )
+        sys.exit(4)
+
+    print("\nResetting AIRFLOW sqlite database...")
+    env = os.environ.copy()
+    env["AIRFLOW__CORE__LOAD_EXAMPLES"] = "False"
+    env["AIRFLOW__CORE__UNIT_TEST_MODE"] = "False"
+    env["AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED"] = "False"
+    env["AIRFLOW__CORE__DAGS_FOLDER"] = f"{airflow_sources}/empty"
+    env["AIRFLOW__CORE__PLUGINS_FOLDER"] = f"{airflow_sources}/empty"
+    subprocess.run(["airflow", "db", "reset", "--yes"], env=env)
+
+    print("\nResetting AIRFLOW sqlite unit test database...")
+    env = os.environ.copy()
+    env["AIRFLOW__CORE__LOAD_EXAMPLES"] = "True"
+    env["AIRFLOW__CORE__UNIT_TEST_MODE"] = "False"
+    env["AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED"] = "False"
+    env["AIRFLOW__CORE__DAGS_FOLDER"] = f"{airflow_sources}/empty"
+    env["AIRFLOW__CORE__PLUGINS_FOLDER"] = f"{airflow_sources}/empty"
+    subprocess.run(["airflow", "db", "reset", "--yes"], env=env)
+
+    print("\nInitialization of environment complete! Go ahead and develop Airflow!")
+
+
+if __name__ == "__main__":
+    main()
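The two `airflow db reset` calls at the end of the script differ only in their environment overrides. A refactoring sketch (not what the commit ships) that factors the overrides into one helper:

```python
import os


def db_reset_env(airflow_sources: str, load_examples: bool) -> dict:
    """Environment overrides mirroring the script's two `airflow db reset` calls."""
    env = os.environ.copy()
    env["AIRFLOW__CORE__LOAD_EXAMPLES"] = "True" if load_examples else "False"
    env["AIRFLOW__CORE__UNIT_TEST_MODE"] = "False"
    env["AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_ENABLED"] = "False"
    env["AIRFLOW__CORE__DAGS_FOLDER"] = f"{airflow_sources}/empty"
    env["AIRFLOW__CORE__PLUGINS_FOLDER"] = f"{airflow_sources}/empty"
    return env


env = db_reset_env("/opt/airflow", load_examples=False)
print(env["AIRFLOW__CORE__DAGS_FOLDER"])  # /opt/airflow/empty
```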