You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/11/28 20:01:47 UTC

[GitHub] [airflow] potiuk opened a new pull request #19867: Initial commit for new Breeze project

potiuk opened a new pull request #19867:
URL: https://github.com/apache/airflow/pull/19867


   It includes:
   
   * proposal for initial ADRs (Architecture Decision records)
     where we will keep decision records about both - Breeze2 and CI
   * scaffolding for the new breeze command including command line,
     pre-commit checks, automated tests in CI and requirements
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-982329436


   I have more thoughts on environment provision (still don’t like automatically setting up virtual environments much, it can go wrong way to often), but let’s merge this first since those don’t really affect the “actual Breeze” functionalities.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758683713



##########
File path: dev/breeze/doc/adr/0001-record-architecture-decisions.md
##########
@@ -0,0 +1,48 @@
+<!--

Review comment:
       One more comment - ADR is basically the same concept that RFCs are for and I think they have proven their worth over time. Similarly PEPs for Python. This is far smaller scale and limited to CI/dev enviroment only but I would love to see if theyr value apply in this scale as well (I do believe so).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758229725



##########
File path: dev/Breeze/conftest.py
##########
@@ -0,0 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import sys
+from pathlib import Path
+
+ROOT_DIR = str(Path(__file__))
+
+sys.path.insert(1, ROOT_DIR)

Review comment:
       This is for now only for testing which we want to do during the CI builds, so this is needed as those sources are in the same monorepo (which I think is something we should keeep for as long as humanly possible).  But we would not include the file in the package even if we decide to have the package (and we do not need to run tests when it is released as package).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981653216


   Note that `execv` does not work on Windows, so that’d be a problem if we’re moving toward supporting Windows.
   
   You mention pipx earlier, and I feel it might be a better setup for contributors? Just `pipx install apache-airflow-breeze` and run `breeze` in the repository root. This is also more friendly to Windows since pipx (actually pip) would automatically create an exe for the command.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758229725



##########
File path: dev/Breeze/conftest.py
##########
@@ -0,0 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import sys
+from pathlib import Path
+
+ROOT_DIR = str(Path(__file__))
+
+sys.path.insert(1, ROOT_DIR)

Review comment:
       This is for now only for testing which we want to do during the CI builds, so this is needed as those sources are in the same monorepo (which I think is something we should keep for as long as humanly possible).  But we would not include the file in the package even if we decide to have the package (and we do not need to run tests when it is released as package).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758663294



##########
File path: Breeze2
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib
+if sys.version_info.major != 3 or sys.version_info.minor < 7:
+    print("ERROR! Make sure you use Python 3.7+ !!")
+    sys.exit(1)
+
+import subprocess
+from os import execv
+from pathlib import Path
+
+AIRFLOW_SOURCES_DIR = Path(__file__).parent.resolve()
+BUILD_DIR = AIRFLOW_SOURCES_DIR / ".build"
+BUILD_BREEZE_DIR = BUILD_DIR / "breeze2"
+BUILD_BREEZE_CFG_SAVED = BUILD_BREEZE_DIR / "setup.cfg.saved"
+BUILD_BREEZE_VENV_DIR = BUILD_BREEZE_DIR / "venv"
+BUILD_BREEZE_VENV_BIN_DIR = BUILD_BREEZE_VENV_DIR / "bin"
+BUILD_BREEZE_VENV_PIP = BUILD_BREEZE_VENV_BIN_DIR / "pip"
+BUILD_BREEZE_VENV_BREEZE = BUILD_BREEZE_VENV_BIN_DIR / "Breeze2"
+
+BREEZE_SOURCE_PATH = AIRFLOW_SOURCES_DIR / "dev" / "breeze"
+BREEZE_SETUP_CFG_PATH = BREEZE_SOURCE_PATH / "setup.cfg"
+
+BUILD_BREEZE_DIR.mkdir(parents=True, exist_ok=True)
+
+
+def needs_installation() -> bool:
+    """Returns true if Breeze's virtualenv needs (re)installation"""
+    if not BUILD_BREEZE_VENV_DIR.exists() or not BUILD_BREEZE_CFG_SAVED.exists():
+        return True
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED) as saved:
+        current_config, saved_config = current.read(), saved.read()
+    return current_config != saved_config

Review comment:
       ```suggestion
       return BREEZE_SETUP_CFG_PATH.read_text() != BUILD_BREEZE_CFG_SAVED.read_text()
   ```
   or
   ```suggestion
       return BREEZE_SETUP_CFG_PATH.read_bytes() != BUILD_BREEZE_CFG_SAVED.read_bytes()
   ```
   
   We use pathlib so we can write it a bit easier.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758681427



##########
File path: dev/breeze/doc/adr/0001-record-architecture-decisions.md
##########
@@ -0,0 +1,48 @@
+<!--

Review comment:
       Let's see.
   
   I saw a few failures, and a few successes. 
   
   I've also head a few complaints that Breeze's/CI design decisions were not documented (in fact they are all documented and explained in those documents: 
   * BREEZE.rst
   * CI.rst
   * IMAGES.rst
   * PULL_REQUEST_WORKFLOW.rst
   * LOCAL_VIRTUALENV.rst
    
   and possibly few other places.
   
   I think there is clearly a need to understand why certain choices were made and at least in semi-structured way that will be easy to track. This is an attempt to not mix design decisions, with resulting user documentation (which was actually the main problem with the previous approach).
   
   So I think a good way to capture desing decisions and keep them close to the code is good. This is because if we keep them in Google Docs, they will be easily lost. They also have one problem which ADRs handle  - you can easily create an ADR where the decision is changed or updated, and adr tool has a nice way to create new ADR wchih supersedes the previous one (with automated references in both docs). This is kinda pain to maintain if you have "live" documentation in docs, because either you keep on manually adding new docs  and try to refer between them somehow (which one is the last one? is this the most recent decision?) or they contain only the latest decision with lack of easy visble evolution of the decision. Or - more lkely -  they are never updated and contain only the old decisions.
   
   I think Google Docs are one of the ways (similarly as comments in PRs/issues and slack discussions) to actually "do" the discussion, but I find ADRs as particularly good to permanently "capture" the decisions for future self and people who will be trying to find out some WHYs in the future.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758257939



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       Because `python -m` adds the current working directory to `sys.path`, while an entry point (which is a script) does not. This is one of the lesser known Python quirk; I don’t know what the actual name is but personally call it “interpreter mode” (which adds cwd) and “script mode” (which does not).
   
   Pytest actually has specific documentation on this: https://docs.pytest.org/en/6.2.x/goodpractices.html#choosing-a-test-layout-import-rules
   
   This is why having a pip-installable package helps; we can lay out `dev/breeze` as a Python package:
   
   ```
   airflow/dev/breeze
    |-- pyproject.toml
    |-- setup.py  # assuming we’ll use setuptools...
    |-- src/
    |    `-- breeze/  # importable module
    `-- tests/
         |-- conftest.py
         `-- test_breeze.py
   ```
   
   And `pip install -e ./dev/breeze` + `pytest dev/breeze/tests` would work automatically.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981630444


   Using Flit + PEP 621 would eliminate much of the guess work how to get things right 😉 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981693655


   > Note that `execv` does not work on Windows, so that’d be a problem if we’re moving toward supporting Windows.
   
   This is not what the documentation tells:  https://docs.python.org/3/library/os.html#os.execv
   
   | Availability: Unix, Windows.
   
   I think we'd have to check whether to add `.exe` or not (but I think it should work without). But I see no reason why it would not work.
   
   > You mention pipx earlier, and I feel it might be a better setup for contributors? Just `pipx install apache-airflow-breeze` and run `breeze` in the repository root. This is also more friendly to Windows since pipx (actually pip) would automatically create an exe for the command.
   
   What I really want, is to be able to run Breeze with all the latest modifications included so I really want the workflow:
   
   1) `git checkout main && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> run all latest and greates Breeze. I do not want any "install" command between those two. 
   
   What's even more, I want to be able to switch to another branch and do:
   
   1) `git checkout v2-2-test && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> Run latest breeze from v2-2 branch` 
   
   Without having to reinstall anything. 
   
   Any extra steps will add confusion and people (casual users especially) will run older Breeze versions with newer code and the other way round, which is exactly what I want to avoid.
   
   > This actually gives me a thought, since you feel it’s important to make the virtual environment (for Breeze) invisibal to the user. Maybe we should build Breeze entirely upon an existing task runner solution, instead of rolling our own infrastructure. [Nox](https://nox.thea.codes/en/stable/) has pretty much have virtual environment management and task invocation figured out—we can make Breeze a wrapper to `noxfile.py`, and just write subcommands as Nox tasks.
   
   I think nox mainly focuses on creating temporary venvs, and also I **really** do not want any dependencies to be able to run the above workflow. 
   
   I want precisely three things:
   
   * python3.7+
   * docker (TBD version)
   * docker-compose (which soon will be part of docker anyway) 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] leahecole commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
leahecole commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758758692



##########
File path: pyproject.toml
##########
@@ -16,5 +16,5 @@
 # under the License.
 [tool.black]
 line-length = 110
-target-version = ['py36', 'py37', 'py38']
+target-version = ['py36', 'py37', 'py38', 'py39']

Review comment:
       Totally missed the hierarchy - makes sense. Glad it's on the horizon! 🐍 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #19867:
URL: https://github.com/apache/airflow/pull/19867


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758664110



##########
File path: Breeze2
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib
+if sys.version_info.major != 3 or sys.version_info.minor < 7:
+    print("ERROR! Make sure you use Python 3.7+ !!")
+    sys.exit(1)
+
+import subprocess
+from os import execv
+from pathlib import Path
+
+AIRFLOW_SOURCES_DIR = Path(__file__).parent.resolve()
+BUILD_DIR = AIRFLOW_SOURCES_DIR / ".build"
+BUILD_BREEZE_DIR = BUILD_DIR / "breeze2"
+BUILD_BREEZE_CFG_SAVED = BUILD_BREEZE_DIR / "setup.cfg.saved"
+BUILD_BREEZE_VENV_DIR = BUILD_BREEZE_DIR / "venv"
+BUILD_BREEZE_VENV_BIN_DIR = BUILD_BREEZE_VENV_DIR / "bin"
+BUILD_BREEZE_VENV_PIP = BUILD_BREEZE_VENV_BIN_DIR / "pip"
+BUILD_BREEZE_VENV_BREEZE = BUILD_BREEZE_VENV_BIN_DIR / "Breeze2"
+
+BREEZE_SOURCE_PATH = AIRFLOW_SOURCES_DIR / "dev" / "breeze"
+BREEZE_SETUP_CFG_PATH = BREEZE_SOURCE_PATH / "setup.cfg"
+
+BUILD_BREEZE_DIR.mkdir(parents=True, exist_ok=True)
+
+
+def needs_installation() -> bool:
+    """Returns true if Breeze's virtualenv needs (re)installation"""
+    if not BUILD_BREEZE_VENV_DIR.exists() or not BUILD_BREEZE_CFG_SAVED.exists():
+        return True
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED) as saved:
+        current_config, saved_config = current.read(), saved.read()
+    return current_config != saved_config
+
+
+def save_config():
+    """Saves cfg file to virtualenv to check if there is a need for reinstallation of the virtualenv"""
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED, "w") as saved:
+        saved.write(current.read())

Review comment:
       ```suggestion
       BUILD_BREEZE_CFG_SAVED.write_text(BREEZE_SETUP_CFG_PATH.read_text())
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981656776


   This actually gives me a thought, since you feel it’s important to make the virtual environment (for Breeze) invisibal to the user. Maybe we should build Breeze entirely upon an existing task runner solution, instead of rolling our own infrastructure. [Nox](https://nox.thea.codes/en/stable/) has pretty much have virtual environment management and task invocation figured out—we can make Breeze a wrapper to `noxfile.py`, and just write subcommands as Nox tasks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981693655


   > Note that `execv` does not work on Windows, so that’d be a problem if we’re moving toward supporting Windows.
   
   This is not what the documentation tells:  https://docs.python.org/3/library/os.html#os.execv
   
   | Availability: Unix, Windows.
   
   I think we'd have to check whether to add `.exe` or not (but I think it should work without). But I see no reason why it would not work.
   
   > You mention pipx earlier, and I feel it might be a better setup for contributors? Just `pipx install apache-airflow-breeze` and run `breeze` in the repository root. This is also more friendly to Windows since pipx (actually pip) would automatically create an exe for the command.
   
   What I really want, is to be able to run Breeze with all the latest modifications included so I really want the workflow:
   
   1) `git checkout main && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> run all latest and greates Breeze. I do not want any "install" command between those two. 
   
   What's even more, I want to be able to switch to another branch and do:
   
   1) `git checkout v2-2-test && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> Run latest Breeze from v2-2 branch` 
   
   Without having to reinstall anything (including when any dependencies change for Breeze between those versions)
   
   Any extra steps will add confusion and people (casual users especially) will run older Breeze versions with newer code and the other way round, which is exactly what I want to avoid.
   
   > This actually gives me a thought, since you feel it’s important to make the virtual environment (for Breeze) invisibal to the user. Maybe we should build Breeze entirely upon an existing task runner solution, instead of rolling our own infrastructure. [Nox](https://nox.thea.codes/en/stable/) has pretty much have virtual environment management and task invocation figured out—we can make Breeze a wrapper to `noxfile.py`, and just write subcommands as Nox tasks.
   
   I think nox mainly focuses on creating temporary venvs, and also I **really** do not want any dependencies to be able to run the above workflow. 
   
   I want precisely three things:
   
   * python3.7+
   * docker (TBD version)
   * docker-compose (which soon will be part of docker anyway) 
   
   See here: https://github.com/apache/airflow/pull/19867/files#diff-4d391b1ed30cbb2a992cddb4e9cb7d2f2f43a202bae3dd74d6dd44b0751c430dR116


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r759342666



##########
File path: Breeze2
##########
@@ -0,0 +1,55 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib

Review comment:
       Oh yeah. That was a mentl shortcut. The comment should be:
   
   > We do the check here, before importing pathlib becasue Python <3.4 does not have pathlib and import would fail with cryptic "missing pathlib module" error.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758227000



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       We might consider that indeed.
   
   I think there is a certain simplicity that the code to manage development is inside the repo with the code. It solves a number of problems, and especially mkes it easier to use versioning. I would treat that as an option to follow in the future and the way how we "release it" eventually when we get to the point where it is is "ready to use". 
   
   One potential problem with it is that you have to actively "install" the latest version if you want new features and by having everything in the same repo, of the code that is used to manage it, you have it "by default" (which also means that  it is less stable). 
   
   Keeping it under "dev/Breeze" folder does not preclude having a separate package and PyPI release. There could be two modes of runnign Breeze - one from the sources, and one from the package. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758241077



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       Still would not help with `conftest.py` running tests  where conftest.py is there mostly to overcome default import behaviour of `pytest`. The funny thing is that `python -m pytest` works fine, but when you try to run tests with `pytest`, the imports are not working the way you'd think.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758323488



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       OK. Updated it to follow your suggestions @uranusjr - I like it much more too. Can you please take a look ? Setuptools and configuration of it is kinda bad documented so some of the stuff there is a bit guessing.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981606775


   I am not sure if we want to release the whole `airlfow-breeze` package on Pypi, but at least the way it is setup now is more "standard" and does not require any hacks indeed. (especially the dir structure and pytest discovery).
   
   I am not sure yet how to approach the "small bootstrap" package - I like how the install would work now with editable installed Breeze in local virtualenv (where I think we want to have the whole of Breeze installed rather than the "bootstrap" script only).
   
   Maybe when we install it as "editable" locally it should install the whole of breeze (all packages) but then when we distribute it in  PyPI we only install the "bootstrap" script there. I guess that would be possible? Happy to brainstorm on this one @uranusjr 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981652299


   (and we'd need to add some protection against old python version)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981693655


   > Note that `execv` does not work on Windows, so that’d be a problem if we’re moving toward supporting Windows.
   
   This is not what the documentation tells:  https://docs.python.org/3/library/os.html#os.execv
   
   | Availability: Unix, Windows.
   
   I think we'd have to check whether to add `.exe` or not (but I think it should work without). But I see no reason why it would not work.
   
   > You mention pipx earlier, and I feel it might be a better setup for contributors? Just `pipx install apache-airflow-breeze` and run `breeze` in the repository root. This is also more friendly to Windows since pipx (actually pip) would automatically create an exe for the command.
   
   What I really want, is to be able to run Breeze with all the latest modifications included so I really want the workflow:
   
   1) `git checkout main && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> run all latest and greates Breeze. I do not want any "install" command between those two. 
   
   What's even more, I want to be able to switch to another branch and do:
   
   1) git checkout v2-2-test && git pull
   2) `./Breeze2` (or `python Breeze2` for Windows) -> Run latest breeze from v2-2 branch` 
   
   Without having to reinstall anything. 
   
   Any extra steps will add confusion and people (casual users especially) will run older Breeze versions with newer code and the other way round, which is exactly what I want to avoid.
   
   > This actually gives me a thought, since you feel it’s important to make the virtual environment (for Breeze) invisibal to the user. Maybe we should build Breeze entirely upon an existing task runner solution, instead of rolling our own infrastructure. [Nox](https://nox.thea.codes/en/stable/) has pretty much have virtual environment management and task invocation figured out—we can make Breeze a wrapper to `noxfile.py`, and just write subcommands as Nox tasks.
   
   I think nox mainly focuses on creating temporary venvs, and also I **really** do not want any dependencies to be able to run the above workflow. 
   
   I want precisely three things:
   
   * python3.7+
   * docker (TBD version)
   * docker-compose (which soon will be part of docker anyway) 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981710071


   Also we could nicely build a very, very small `./Breeze2.exe` in the main dir  of Airflow - and have it as part of the repo if we want to make it really Windows-friendly. That's rather easy.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981804657


   I have also added current stats of the languages used - I beleive sh should be less than 300 lines of code (< 0.1% when we finish).
   
   ```
   SLOC    Directory         SLOC-by-Language (Sorted)
   
   144905  tests             python=144761,xml=132,sh=12
   130115  airflow           python=127249,javascript=2827,sh=39
   12052   docs              javascript=8977,python=2931,sh=144
   9073    scripts           sh=7457,python=1616
   6314    chart             python=6218,sh=96
   3665    top_dir           sh=2896,python=769
   3102    dev               python=2938,sh=164
   1723    kubernetes_tests  python=1723
   280     docker_tests      python=280
   140     metastore_browser python=140
   109     clients           sh=109
   28      images              sh=28
   
   Totals grouped by language (dominant language first):
   python:      288625 (92.65%)
   javascript:     11804 (3.79%)
   sh:           10945 (3.51%)
   xml:            132 (0.04%)
   ```
   
   I am pretty happy with what we have now as the first PR:
   
   * automated bootstrapping of virtualenv installation for `Breeze2` command
   * the bootstrapping is resistant to any missing dependencies. Currently it only requires Python3.7+ to be installed locally and nothing else.
   * the bootstrapping will automatically update the `Breeze2` venv whenever necesseary (first time, and whenever setup.cfg changes, including checking out branch).
   * standardised package layout and using `pip install -e .` to install virtualenb
   * likely Windows support for bootstraping (to be tested on Windows)
   * it's possible to add `Breeze2.exe` later on generated from `Breeze2` later on if we want to make it really nicely work on Windows (currently `python Breeze2` should be used to start Breeze on Windows)
    
   I don't think we need to have another Breeze package in PyPI with this setup, it's simply not needed as the `Breeze2` bootstraping should do the work nicely for the user. The nice we have is that the standard package structure will help with development (pip install -e .) even if we won't build and publish the package in PyPI.
   
   I already nicely integrated it as module in my IntelliJ without extra "magic" using the bootstrapped virtualenv - should nicely work also on Windows. @edithturn @Bowrna - maybe you could check it out and check it on your windows machines to see how easy it is to bootstrap (also maybe you could try to generate ./Breeze2.exe using `pyinstaller` and see if it woudl work? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-982505277


   > I have more thoughts on environment provision (still don’t like automatically setting up virtual environments much, it can go wrong way to often), but let’s merge this first since those don’t really affect the “actual Breeze” functionalities.
   
   Yep. happy to improve it later on :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981703120


   > > This is not what the documentation tells: [docs.python.org/3/library/os.html#os.execv](https://docs.python.org/3/library/os.html#os.execv)
   > > | Availability: Unix, Windows.
   > 
   > I was inaccurate. The command works, but probably won’t do what you want slightly_smiling_face
   > 
   > [BPO-19124](https://bugs.python.org/issue19124) (and the linked [BPO-19066](https://bugs.python.org/issue19066)) has some detailed information on this. Quoting some dire but accurate comments:
   
   Ah yeah. then we cn replace it with subprocess call I think - the stdin/out handles will be fine, the only problem will be an extra "small" process running 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-982515916


   > I have more thoughts on environment provision (still don’t like automatically setting up virtual environments much, it can go wrong way to often), but let’s merge this first since those don’t really affect the “actual Breeze” functionalities.
   
   One more comment: It actually might well be that when `Breeze2` reaches maturity and we will not update is as frequently, installing via PyPI and discovery of Airflow might be `better` workflow - so let's keep it as a possibility. But for now I think especially when we will develop it, keeping it the way where you can switch to latest version of Breeze2 by just `git-pull` is a very much needed feature.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r759343500



##########
File path: dev/breeze/doc/adr/0001-record-architecture-decisions.md
##########
@@ -0,0 +1,48 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+<!-- START doctoc generated TOC please keep comment here to allow auto update -->
+<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
+**Table of Contents**  *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [1. Record architecture decisions](#1-record-architecture-decisions)
+  - [Status](#status)
+  - [Context](#context)
+  - [Decision](#decision)
+  - [Consequences](#consequences)
+
+# 1. Record architecture decisions

Review comment:
       Yep. I thin you were the one who motivated me to do it :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758241077



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       Still would not help with running tests via CI where conftest.py is there mostly to overcome default import behaviour of `pytest` when it comes to importing. The funny thing is that `python -m pytest` works fine, but when you try to run tests with `pytest`, the imports are not working the way you'd think.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758241077



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       Still would not help with `conftest.py` running tests  where conftest.py is there mostly to overcome default import behaviour of `pytest` when it comes to importing. The funny thing is that `python -m pytest` works fine, but when you try to run tests with `pytest`, the imports are not working the way you'd think.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981601611


   It's not exactly what you proposed, but I think with this setup we will be able to modify (in the future) the entrypoint script, to be able to create venv and run equivalent of `pip install -e .` in this venv if someone runs it without installing it first and then we could simply run breeze from there. WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981695138


   See the latest one (except the execv question on windows  - it is very simple (50 lines)  and it handles all that I want:
   
   * nice message when old python version is used
   * automated venv installation and activation
   * checking if upgrade is needed and doing so


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-982046722


   Any more thought @uranusjr ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758239569



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       Ah I see. I misunderstood that. Hmm. Might be a good idea for kind of "bootstraping" of breeze. I like the idea.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981650929


   Sounds simple enough:
   
   ```
   #!/usr/bin/env python3
   import subprocess
   import sys
   from os import execv
   from pathlib import Path
   
   AIRFLOW_SOURCES_DIR = Path(__file__).parent.resolve()
   BUILD_DIR = AIRFLOW_SOURCES_DIR / ".build"
   BREEZE_VENV_DIR = BUILD_DIR / "breeze2" / "venv"
   BREEZE_VENV_BIN_DIR = BREEZE_VENV_DIR / "bin"
   BREEZE_VENV_PIP = BREEZE_VENV_BIN_DIR / "pip"
   BREEZE_VENV_BREEZE = BREEZE_VENV_BIN_DIR / "Breeze2"
   
   BREEZE_SOURCE_PATH = AIRFLOW_SOURCES_DIR / "dev" / "breeze"
   
   if not BREEZE_VENV_DIR.exists():
       BREEZE_VENV_DIR.mkdir(parents=True, exist_ok=True)
       subprocess.Popen([sys.executable, "-m", "venv", f"{BREEZE_VENV_DIR}"]).wait()
       subprocess.Popen([f"{BREEZE_VENV_PIP}", "install", "-e", "."], cwd=BREEZE_SOURCE_PATH).wait()
   
   execv(f"{BREEZE_VENV_BREEZE}", [f"{BREEZE_VENV_BREEZE}"] + sys.argv[1:])
   
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981693655


   > Note that `execv` does not work on Windows, so that’d be a problem if we’re moving toward supporting Windows.
   
   This is not what the documentation tells:  https://docs.python.org/3/library/os.html#os.execv
   
   | Availability: Unix, Windows.
   
   I think we'd have to check whether to add `.exe` or not (but I think it should work without). But I see no reason why it would not work.
   
   > You mention pipx earlier, and I feel it might be a better setup for contributors? Just `pipx install apache-airflow-breeze` and run `breeze` in the repository root. This is also more friendly to Windows since pipx (actually pip) would automatically create an exe for the command.
   
   What I really want, is to be able to run Breeze with all the latest modifications included so I really want the workflow:
   
   1) `git checkout main && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> run all latest and greates Breeze. I do not want any "install" command between those two. 
   
   What's even more, I want to be able to switch to another branch and do:
   
   1) `git checkout v2-2-test && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> Run latest Breeze from v2-2 branch` 
   
   Without having to reinstall anything (including when any dependencies change for Breeze between those versions)
   
   Any extra steps will add confusion and people (casual users especially) will run older Breeze versions with newer code and the other way round, which is exactly what I want to avoid.
   
   > This actually gives me a thought, since you feel it’s important to make the virtual environment (for Breeze) invisibal to the user. Maybe we should build Breeze entirely upon an existing task runner solution, instead of rolling our own infrastructure. [Nox](https://nox.thea.codes/en/stable/) has pretty much have virtual environment management and task invocation figured out—we can make Breeze a wrapper to `noxfile.py`, and just write subcommands as Nox tasks.
   
   I think nox mainly focuses on creating temporary venvs, and also I **really** do not want any dependencies to be able to run the above workflow. 
   
   I want precisely three things:
   
   * python3.7+
   * docker (TBD version)
   * docker-compose (which soon will be part of docker anyway) 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981700207


   > This is not what the documentation tells: [docs.python.org/3/library/os.html#os.execv](https://docs.python.org/3/library/os.html#os.execv)
   > 
   > | Availability: Unix, Windows.
   
   I was inaccurate. The command works, but probably won’t do what you want 🙂 
   
   [BPO-19124](https://bugs.python.org/issue19124) (and the linked [BPO-19066](https://bugs.python.org/issue19066)) has some detailed information on this. Quoting some dire but accurate comments:
   
   > [C]ontrol is returned to cmd when the child process *starts* (and afterwards you have cmd and the child connected to the same console).
   
   > This is why I said that execv() is useless on Windows and that you should just use subprocess instead.
   
   > The exec functions provided by the Windows C runtime really are practically useless, due to creating an orphaned process, disrupting synchronous operation, and returning the wrong status code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758200977



##########
File path: dev/Breeze/conftest.py
##########
@@ -0,0 +1,23 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+import sys
+from pathlib import Path
+
+ROOT_DIR = str(Path(__file__))
+
+sys.path.insert(1, ROOT_DIR)

Review comment:
       This would also not be needed is Breeze is an installable package!

##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       I wonder if it’s be a good idea to distribute Breeze 2 on PyPI. The package would contain metadata (including dependencies) and a very small `Breeze` command that simply finds code in the checked-out Airflow repository to execute (so the bulk of the logic still lives with the Airflow source code and can be updated more easily). That’d make setup very straightforward.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758231392



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       We should make an ADR at some point about it at a later phase weighting pros/cons and generally deciding about the "distribution" methonds




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981804657


   I have also added current stats of the languages used - I beleive it shoudl be less than 300 lines of code (< 0.1% when we finish).
   
   ```
   SLOC    Directory         SLOC-by-Language (Sorted)
   
   144905  tests             python=144761,xml=132,sh=12
   130115  airflow           python=127249,javascript=2827,sh=39
   12052   docs              javascript=8977,python=2931,sh=144
   9073    scripts           sh=7457,python=1616
   6314    chart             python=6218,sh=96
   3665    top_dir           sh=2896,python=769
   3102    dev               python=2938,sh=164
   1723    kubernetes_tests  python=1723
   280     docker_tests      python=280
   140     metastore_browser python=140
   109     clients           sh=109
   28      images              sh=28
   
   Totals grouped by language (dominant language first):
   python:      288625 (92.65%)
   javascript:     11804 (3.79%)
   sh:           10945 (3.51%)
   xml:            132 (0.04%)
   ```
   
   I am pretty happy with what we have now as the first PR:
   
   * automated bootstrapping of virtualenv installation for `Breeze2` command
   * the bootstrapping is resistant to any missing dependencies. Currently it only requires Python3.7+ to be installed locally and nothing else.
   * the bootstrapping will automatically update the `Breeze2` venv whenever necesseary (first time, and whenever setup.cfg changes, including checking out branch).
   * standardised package layout and using `pip install -e .` to install virtualenb
   * likely Windows support for bootstraping (to be tested on Windows)
   * it's possible to add `Breeze2.exe` later on generated from `Breeze2` later on if we want to make it really nicely work on Windows (currently `python Breeze2` should be used to start Breeze on Windows)
    
   I don't think we need to have another Breeze package in PyPI with this setup, it's simply not needed as the `Breeze2` bootstraping should do the work nicely for the user. The nice we have is that the standard package structure will help with development (pip install -e .) even if we won't build and publish the package in PyPI.
   
   I already nicely integrated it as module in my IntelliJ without extra "magic" using the bootstrapped virtualenv - should nicely work also on Windows. @edithturn @Bowrna - maybe you could check it out and check it on your windows machines to see how easy it is to bootstrap (also maybe you could try to generate ./Breeze2.exe using `pyinstaller` and see if it woudl work? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r759342666



##########
File path: Breeze2
##########
@@ -0,0 +1,55 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib

Review comment:
       Oh yeah. That was a mentl shortcut. The comment should be:
   
   | We do the check here, before importing pathlib becasue Python <3.4 does not have pathlib and import would fail with cryptic "missing pathlib module" error.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981705118


   Smth like that in the last commit.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758667680



##########
File path: Breeze2
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib
+if sys.version_info.major != 3 or sys.version_info.minor < 7:
+    print("ERROR! Make sure you use Python 3.7+ !!")
+    sys.exit(1)
+
+import subprocess
+from os import execv
+from pathlib import Path
+
+AIRFLOW_SOURCES_DIR = Path(__file__).parent.resolve()
+BUILD_DIR = AIRFLOW_SOURCES_DIR / ".build"
+BUILD_BREEZE_DIR = BUILD_DIR / "breeze2"
+BUILD_BREEZE_CFG_SAVED = BUILD_BREEZE_DIR / "setup.cfg.saved"
+BUILD_BREEZE_VENV_DIR = BUILD_BREEZE_DIR / "venv"
+BUILD_BREEZE_VENV_BIN_DIR = BUILD_BREEZE_VENV_DIR / "bin"
+BUILD_BREEZE_VENV_PIP = BUILD_BREEZE_VENV_BIN_DIR / "pip"
+BUILD_BREEZE_VENV_BREEZE = BUILD_BREEZE_VENV_BIN_DIR / "Breeze2"
+
+BREEZE_SOURCE_PATH = AIRFLOW_SOURCES_DIR / "dev" / "breeze"
+BREEZE_SETUP_CFG_PATH = BREEZE_SOURCE_PATH / "setup.cfg"
+
+BUILD_BREEZE_DIR.mkdir(parents=True, exist_ok=True)
+
+
+def needs_installation() -> bool:
+    """Returns true if Breeze's virtualenv needs (re)installation"""
+    if not BUILD_BREEZE_VENV_DIR.exists() or not BUILD_BREEZE_CFG_SAVED.exists():
+        return True
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED) as saved:
+        current_config, saved_config = current.read(), saved.read()
+    return current_config != saved_config
+
+
+def save_config():
+    """Saves cfg file to virtualenv to check if there is a need for reinstallation of the virtualenv"""
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED, "w") as saved:
+        saved.write(current.read())

Review comment:
       TIL :). The Path is amazing :D




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758663294



##########
File path: Breeze2
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib
+if sys.version_info.major != 3 or sys.version_info.minor < 7:
+    print("ERROR! Make sure you use Python 3.7+ !!")
+    sys.exit(1)
+
+import subprocess
+from os import execv
+from pathlib import Path
+
+AIRFLOW_SOURCES_DIR = Path(__file__).parent.resolve()
+BUILD_DIR = AIRFLOW_SOURCES_DIR / ".build"
+BUILD_BREEZE_DIR = BUILD_DIR / "breeze2"
+BUILD_BREEZE_CFG_SAVED = BUILD_BREEZE_DIR / "setup.cfg.saved"
+BUILD_BREEZE_VENV_DIR = BUILD_BREEZE_DIR / "venv"
+BUILD_BREEZE_VENV_BIN_DIR = BUILD_BREEZE_VENV_DIR / "bin"
+BUILD_BREEZE_VENV_PIP = BUILD_BREEZE_VENV_BIN_DIR / "pip"
+BUILD_BREEZE_VENV_BREEZE = BUILD_BREEZE_VENV_BIN_DIR / "Breeze2"
+
+BREEZE_SOURCE_PATH = AIRFLOW_SOURCES_DIR / "dev" / "breeze"
+BREEZE_SETUP_CFG_PATH = BREEZE_SOURCE_PATH / "setup.cfg"
+
+BUILD_BREEZE_DIR.mkdir(parents=True, exist_ok=True)
+
+
+def needs_installation() -> bool:
+    """Returns true if Breeze's virtualenv needs (re)installation"""
+    if not BUILD_BREEZE_VENV_DIR.exists() or not BUILD_BREEZE_CFG_SAVED.exists():
+        return True
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED) as saved:
+        current_config, saved_config = current.read(), saved.read()
+    return current_config != saved_config

Review comment:
       ```suggestion
       return BREEZE_SETUP_CFG_PATH.read_text() != BUILD_BREEZE_CFG_SAVED.read_text()
   ```
   or
   ```suggestion
       return BREEZE_SETUP_CFG_PATH. read_bytes() != BUILD_BREEZE_CFG_SAVED. read_bytes()
   ```
   
   We use pathlib so we can write it a bit easier.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758659262



##########
File path: pyproject.toml
##########
@@ -16,5 +16,5 @@
 # under the License.
 [tool.black]
 line-length = 110
-target-version = ['py36', 'py37', 'py38']
+target-version = ['py36', 'py37', 'py38', 'py39']

Review comment:
       No yet. This is Airflow pyproject, not Breeze's one and currently py3.10 is not supported for airflow officially duet to mssql libraries not yet available (shoudl be there in January). But good eyes :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981966575


   The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758231392



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       We should make an ADR at some point about it at a later phase weighting pros/cons and generally deciding about the "distribution" methods




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758670585



##########
File path: dev/breeze/doc/adr/0001-record-architecture-decisions.md
##########
@@ -0,0 +1,48 @@
+<!--

Review comment:
       I have nothing against this experiment, but in one project with which I collaborated, I heard that there was an attempt to introduce ADR, but it failed. Simple Google Docs has proven to be a more effective tool.
   CC: @turbaszek 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981601611


   It's not exactly what you proposed @uranusjr , but I think with this setup we will be able to modify (in the future) the entrypoint script, to be able to create venv and run equivalent of `pip install -e .` in this venv if someone runs it without installing it first and then we could simply run breeze from there. WDYT?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981700207


   > This is not what the documentation tells: [docs.python.org/3/library/os.html#os.execv](https://docs.python.org/3/library/os.html#os.execv)
   > 
   > | Availability: Unix, Windows.
   
   I was inaccurate. The command works, but probably won’t do what you want 🙂 
   
   [BPO-19124](https://bugs.python.org/issue19124) (and the linked [BPO-19066](https://bugs.python.org/issue19066)) has some detailed information on this. Quoting some dire but accurate comments:
   
   > [C]ontrol is returned to cmd when the child process *starts* (and afterwards you have cmd and the child connected to the same console).
   > […]
   > This is why I said that execv() is useless on Windows and that you should just use subprocess instead.
   
   > The exec functions provided by the Windows C runtime really are practically useless, due to creating an orphaned process, disrupting synchronous operation, and returning the wrong status code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981695138


   See the latest one (except the execv question on windows  - it is very simple (50 lines)  and it handles all that I want:
   
   * automated venv installation and activation
   * checking if upgrade is needed and doing so


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758270021



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       So - what you suggest is to have "pip install -e ./dev/breeze` in a virtualenv? I like it.
   
   What I really want (as end-goal) is something that we have with the old breeze where in order to use it, you do not worry about having a separate venv specifically prepared manually by the user (it should be created automatically in ~/.build/ folder and activated when breeze commands are run. 
   
   All the installation should happen under-the-hood and you should be able to run both breeze and tests without having to switch to that venv (kind of what pipx does). And autocomplete should also work with the command.
   
   Do you think that would work in this case (I think so) with the "package" structure you propose ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981652299


   (and we'd need to add some protection against old python verion)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] leahecole commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
leahecole commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758654751



##########
File path: dev/breeze/doc/adr/0002-implement-standalone-python-command.md
##########
@@ -0,0 +1,178 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+<!-- START doctoc generated TOC please keep comment here to allow auto update -->
+<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
+**Table of Contents**  *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [2. Implement standalone python command](#2-implement-standalone-python-command)
+  - [Status](#status)
+  - [Context](#context)
+  - [Decision](#decision)
+  - [Consequences](#consequences)
+
+<!-- END doctoc generated TOC please keep comment here to allow auto update -->
+
+# 2. Implement standalone python command
+
+Date: 2021-11-28
+
+## Status
+
+Draft
+
+## Context
+
+The [Breeze](https://github.com/apache/airflow/blob/main/BREEZE.rst) is
+a command line development environment for Apache Airflow that makes
+it easy to setup Airflow development and test environment easily
+(< 10 minutes is the goal) and enable contributors to run any subset
+of tests that are executed in our CI environment easily.
+
+The environment has proven to be very useful (it has successfully onboarded
+a number of new contributors, and it makes the development environment of
+even seasoned contributors much easier as it provides a very easy
+replication of the CI environment as well as very easy to setup test
+environment that can be used to run:
+
+* Unit tests
+* Integration tests
+* Kubernetes/Helm tests
+* System tests
+
+It also serves as a base for our CI execution environment. The same scripts and tools are used
+in our CI (based on GitHub actions). A lot of common code and function between CI and Breeze are
+shared between the CI and Breeze. All those tools are held in "ci" package.
+
+Unfortunately, Breeze is largely based on Bash code - for which very few people (except maybe the
+Breeze creator - Jarek Potiuk, the author of this document) have any other feeling that uneasiness,
+disgust and fear of it :).  Since Airflow is largely based on Python, the common consensus is that
+Breeze should be rewritten in Python.
+
+In November 2021, Outreachy sponsored two internship for two interns: @Bowrna and @edithturn were assigned to
+the projects:
+
+* Convert Airflow Local Development environment `Breeze` - from Bash-based to Python-based
+* Rewrite Github Action workflows to Python
+
+With @potiuk, @eladkal and @xurror as mentors.
+
+The long-standing issues about those two projects are (and we hope to close the projects during the
+three months internship - December 2021 - March 2022):
+
+* https://github.com/apache/airflow/issues/12282
+* https://github.com/apache/airflow/issues/13182
+
+There are a number of problems with Bash scripts:
+
+* They are difficult to understand, modify and debug as Bash "magic" is somewhat arcane
+* They are difficult to implement complex logic with
+* Navigating common code that is used from the scripts is cumbersome and lack IDE/tools support
+* Default Bash on MacOS is very old (from 3.* line) and it will not be updated to a newer version
+  which impacts cross-platform Breeze applicability
+* Bash only work for Windows well in WSL2 environment, which further undermines cross-platform

Review comment:
       ```suggestion
   * Bash only works well for Windows in WSL2 environment, which further undermines cross-platform
   ```
   
   small grammar nit

##########
File path: pyproject.toml
##########
@@ -16,5 +16,5 @@
 # under the License.
 [tool.black]
 line-length = 110
-target-version = ['py36', 'py37', 'py38']
+target-version = ['py36', 'py37', 'py38', 'py39']

Review comment:
       Do we also want to add 3.10, or do you only want to keep it in line with what is currently supported in Airflow?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758257939



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       Because `python -m` adds the current working directory to `sys.path`, while an entry point (which is a script) does not. This is one of the lesser known Python quirk; I don’t know what the actual name is but personally call it “interpreter mode” (which adds cwd) and “script mode” (which does not).
   
   Pytest actually has specific documentation on this: https://docs.pytest.org/en/6.2.x/goodpractices.html#choosing-a-test-layout-import-rules
   
   This is why having a pip-installable package helps; we can lay out `dev/breeze` as a Python package:
   
   ```
   airflow/dev/breeze
    |-- pyproject.toml
    |-- setup.py  # assuming we’ll use setuptools...
    |-- src/
    |    `-- breeze/  # Actual Breeze implementation that the PyPI pacakge will also use.
    |         `-- __init__.py
    `-- tests/
         |-- conftest.py
         `-- test_breeze.py
   ```
   
   And `pip install -e ./dev/breeze` + `pytest dev/breeze/tests` would work automatically.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758665058



##########
File path: Breeze2
##########
@@ -0,0 +1,58 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib
+if sys.version_info.major != 3 or sys.version_info.minor < 7:
+    print("ERROR! Make sure you use Python 3.7+ !!")
+    sys.exit(1)
+
+import subprocess
+from os import execv
+from pathlib import Path
+
+AIRFLOW_SOURCES_DIR = Path(__file__).parent.resolve()
+BUILD_DIR = AIRFLOW_SOURCES_DIR / ".build"
+BUILD_BREEZE_DIR = BUILD_DIR / "breeze2"
+BUILD_BREEZE_CFG_SAVED = BUILD_BREEZE_DIR / "setup.cfg.saved"
+BUILD_BREEZE_VENV_DIR = BUILD_BREEZE_DIR / "venv"
+BUILD_BREEZE_VENV_BIN_DIR = BUILD_BREEZE_VENV_DIR / "bin"
+BUILD_BREEZE_VENV_PIP = BUILD_BREEZE_VENV_BIN_DIR / "pip"
+BUILD_BREEZE_VENV_BREEZE = BUILD_BREEZE_VENV_BIN_DIR / "Breeze2"
+
+BREEZE_SOURCE_PATH = AIRFLOW_SOURCES_DIR / "dev" / "breeze"
+BREEZE_SETUP_CFG_PATH = BREEZE_SOURCE_PATH / "setup.cfg"
+
+BUILD_BREEZE_DIR.mkdir(parents=True, exist_ok=True)
+
+
+def needs_installation() -> bool:
+    """Returns true if Breeze's virtualenv needs (re)installation"""
+    if not BUILD_BREEZE_VENV_DIR.exists() or not BUILD_BREEZE_CFG_SAVED.exists():
+        return True
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED) as saved:
+        current_config, saved_config = current.read(), saved.read()
+    return current_config != saved_config
+
+
+def save_config():
+    """Saves cfg file to virtualenv to check if there is a need for reinstallation of the virtualenv"""
+    with open(BREEZE_SETUP_CFG_PATH) as current, open(BUILD_BREEZE_CFG_SAVED, "w") as saved:
+        saved.write(current.read())
+
+
+if needs_installation():
+    print(f"(Re)Installing Breeze's virtualenv in {BUILD_BREEZE_VENV_DIR}")
+    BUILD_BREEZE_VENV_DIR.mkdir(parents=True, exist_ok=True)
+    subprocess.Popen([sys.executable, "-m", "venv", f"{BUILD_BREEZE_VENV_DIR}"]).wait()
+    subprocess.Popen(
+        [f"{BUILD_BREEZE_VENV_PIP}", "install", "--upgrade", "-e", "."], cwd=BREEZE_SOURCE_PATH
+    ).wait()

Review comment:
       ```suggestion
       subprocess.run([sys.executable, "-m", "venv", f"{BUILD_BREEZE_VENV_DIR}"], check=True)
       subprocess.run(
           [f"{BUILD_BREEZE_VENV_PIP}", "install", "--upgrade", "-e", "."], cwd=BREEZE_SOURCE_PATH,
           check=True
       )
   ```
   
   > The recommended approach to invoking subprocesses is to use the run() function for all use cases it can handle. For more advanced use cases, the underlying Popen interface can be used directly.
   > 
   > The run() function was added in Python 3.5; if you need to retain compatibility with older versions, see the Older high-level API section.
   
   https://docs.python.org/3/library/subprocess.html




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
uranusjr commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r758233927



##########
File path: dev/Breeze/requirements.txt
##########
@@ -0,0 +1,4 @@
+click
+pytest
+pytest-xdist
+rich

Review comment:
       > One potential problem with it is that you have to actively "install" the latest version if you want new features and by having everything in the same repo
   
   I mentioned keeping the `Breeze` command small so this won’t be the case. The command will mostly just contain
   
   ```python
   def _find_airflow_repo_root():
       ...
   
   if __name__ == "__main__":
       import sys
   
       root = _find_airflow_repo_root()
       if root is None:
           # emit some sort of error message...
           sys.exit(1)
       sys.path.insert(0, root / "dev")
   
       import breeze
       breeze.run()
   ```
   
   and an upgrade is only needed if we need to change dependencies, which should be seldom (and when it happens you need to pip-install an in-tree Breeze’s requirements.txt anyway).




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981649965


   I just added the `Breeze2` script as I'd see it - we do not even have to install the package in PIP this way - see the latest commit @uranusjr (with the caveat that we'd have to add upgrade whenever setup.cfg changes but that should be easy). 
   
   I think that might be nice and portable and "self-running" without extra deps like flit/PEP 621. For me this is really the case of having a runnable "devel" environment. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981693655


   > Note that `execv` does not work on Windows, so that’d be a problem if we’re moving toward supporting Windows.
   
   This is not what the documentation tells:  https://docs.python.org/3/library/os.html#os.execv
   
   | Availability: Unix, Windows.
   
   I think we'd have to check whether to add `.exe` or not (but I think it should work without). But I see no reason why it would not work.
   
   > You mention pipx earlier, and I feel it might be a better setup for contributors? Just `pipx install apache-airflow-breeze` and run `breeze` in the repository root. This is also more friendly to Windows since pipx (actually pip) would automatically create an exe for the command.
   
   What I really want, is to be able to run Breeze with all the latest modifications included so I really want the workflow:
   
   1) `git checkout main && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> run all latest and greates Breeze. I do not want any "install" command between those two. 
   
   What's even more, I want to be able to switch to another branch and do:
   
   1) `git checkout v2-2-test && git pull`
   2) `./Breeze2` (or `python Breeze2` for Windows) -> Run latest Breeze from v2-2 branch` 
   
   Without having to reinstall anything. 
   
   Any extra steps will add confusion and people (casual users especially) will run older Breeze versions with newer code and the other way round, which is exactly what I want to avoid.
   
   > This actually gives me a thought, since you feel it’s important to make the virtual environment (for Breeze) invisibal to the user. Maybe we should build Breeze entirely upon an existing task runner solution, instead of rolling our own infrastructure. [Nox](https://nox.thea.codes/en/stable/) has pretty much have virtual environment management and task invocation figured out—we can make Breeze a wrapper to `noxfile.py`, and just write subcommands as Nox tasks.
   
   I think nox mainly focuses on creating temporary venvs, and also I **really** do not want any dependencies to be able to run the above workflow. 
   
   I want precisely three things:
   
   * python3.7+
   * docker (TBD version)
   * docker-compose (which soon will be part of docker anyway) 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r759198273



##########
File path: dev/breeze/doc/adr/0001-record-architecture-decisions.md
##########
@@ -0,0 +1,48 @@
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements.  See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership.  The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied.  See the License for the
+ specific language governing permissions and limitations
+ under the License.
+ -->
+
+<!-- START doctoc generated TOC please keep comment here to allow auto update -->
+<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
+**Table of Contents**  *generated with [DocToc](https://github.com/thlorenz/doctoc)*
+
+- [1. Record architecture decisions](#1-record-architecture-decisions)
+  - [Status](#status)
+  - [Context](#context)
+  - [Decision](#decision)
+  - [Consequences](#consequences)
+
+# 1. Record architecture decisions

Review comment:
       Nice!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#discussion_r759197693



##########
File path: Breeze2
##########
@@ -0,0 +1,55 @@
+#!/usr/bin/env python3
+# isort: skip
+import os
+import sys
+
+# Python <3.4 does not have pathlib

Review comment:
       Comment doesn't line up with code :)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-981726448


   > Also we could nicely build a very, very small `./Breeze2.exe` in the main dir of Airflow - and have it as part of the repo if we want to make it really Windows-friendly. That's rather easy.
   
   Yeah. Simply running `pyinstaller ./Breeze2` on Windows should produce a nice executable that we could commit to the repo (and rebuild when `Breeze2` script changes which will be extremely rare). 
   
   Note, that it's perfectly OK to add such binary to the repo according to ASF rules (we just have to make sure it is not added to released source package).
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-982515916


   > I have more thoughts on environment provision (still don’t like automatically setting up virtual environments much, it can go wrong way to often), but let’s merge this first since those don’t really affect the “actual Breeze” functionalities.
   
   One more comment: It actually might well be that when `Breeze2` reaches maturity and we will not update is as frequently, installing via PyPI and discovery of Airflow might be `better` workflow - so let's keep it as a possibility. But for now I think especially when we will develop it, keeping it the way where you can switch to latest version of Breeze2 by just `git pull` is a very much needed feature.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-982515916


   > I have more thoughts on environment provision (still don’t like automatically setting up virtual environments much, it can go wrong way to often), but let’s merge this first since those don’t really affect the “actual Breeze” functionalities.
   
   One more comment: It actually might well be that when `Breeze2` reaches maturity and we will not update is as frequently, installing via PyPI and discovery of Airflow might be `better` workflow - so let's keep it as a possibility. But for now I think especially when we will develop it, keeping it the way where you can switch to latest version of breeze by just `git-pull` is a very much needed feature.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #19867: Initial commit for new Breeze project

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #19867:
URL: https://github.com/apache/airflow/pull/19867#issuecomment-982513372


   BTW. Following @mik-laj's advice I reserved PyPI package: the https://pypi.org/project/apache-airflow-breeze/ as well. Maybe we can figure out other /parallell workflows with installig the package from PyPI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org