You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/11 04:14:21 UTC

[GitHub] [airflow] uranusjr opened a new pull request #15317: Delete chart/tests/__init__.py

uranusjr opened a new pull request #15317:
URL: https://github.com/apache/airflow/pull/15317


   Since we're now Python 3 only, this is not needed to make the directory importable. And by deleting it, Pytest won't refuse to collect `chart/tests/conftest.py` and `tests/conftest.py` at the same time due to duplicated import path.
   
   cc @andrewgodwin (iirc you had this issue a while ago?)
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817325708


   Hmm… so `--import-mode=importlib` would cause issues for pytest to resolve `from tests.helm_template_generator` to `/chart/tests/helm_template_generator.py`, since `/chart` is not in `sys.path` in `importlib` mode.
   
   I still think the removal of `/chart/tests/__init__.py` is correct and not problematic, since the directory does not become a namespace package without the file, but instead not a package at all. But if the decision is to still not do it, I guess I can use workarounds to use the very useful `-k` option 🤷 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817300262


   I agree with @mik-laj we should  be careful with namespaces. The question is what we want to achieve with it. 
   
   Do we really want to convert the test/charts into a namespace package ? Why?  Is the only reason because we want to run all regular tests and chart test with the same pytest command (I believe we should not do it because those are completely different sets of tests that SHOULD be run separately and they are part of the "chart" 'sub-project' not airflow ( they even never import airflow and never should). 
   
   There are couple of reasons we should be careful with the implicit namespace packages.
   
   1. Not all the tools are handling implicit namespace packages properly..
   
   For example we have to dynamically add  (and remove) __init__.py now when we are running airflow docs: (https://github.com/apache/airflow/blob/master/docs/exts/provider_init_hack.py) and pylint https://github.com/apache/airflow/blob/master/scripts/in_container/run_pylint.sh#L22  
   
   We even have a pre-commit to check if it is not a left-over so that you would not committ it accidentally:
   https://github.com/apache/airflow/blob/master/scripts/ci/pre_commit/pre_commit_check_providers_init.sh
   
   2. Namespace packages should be used to separate whole "packages" not just a folder in a subdirectory
   
   This is where namespace packages are described https://packaging.python.org/guides/packaging-namespace-packages/. If you look at the examples, you will see why they are introduced in a first place.
   
   There is a misconception (which I also believed in for some time) that after you move to python3.3+, you can simply remove all the __init__.py. This is not true. Namespace packages are really useful when you want to separate your projects into smaller "separately installable" and versioned packages. Very similar to what we do with providers. With the current providers - at least for now - we chose a bit different path where we have a bit hybrid solution (we have namespaces but they are not "complete" separate packages on their own. we might eventually converge into full namespace-compliant way though. That will require a few more iterations, moving the setup dependencies, docs inside each package etc.  - we are not very far from that, but there is a number of non-trivial changes to implement to make it happen (likely something we might consider for 2.1).  
   
   The namespace packages are not really good for separating a single tests subdirectory. Why?
   
   3. Consequences of having separate namespaces. 
   
   There are a number of implications of not having __init__.py (i.e. having an implicit package) - like where they can be installed, how long it takes to search for the packages when you import them (much longer), but also what does it mean for the test code. I had the case when you place the tests in a different namespace and for example you wanted to check if two classes were objects of the same clas/inheritnce - they were different because they were imported in  a different namespace. Those are non-obvious problems that might be alien to an average python developer.
   
   If you look at PEP420 https://www.python.org/dev/peps/pep-0420/#specification  the regular packages are still supposed to contain `__init__.py`:
   
   ```
   Regular packages will continue to have an __init__.py and will reside in a single directory.
   ```
   And 
   ``
   Namespace packages cannot contain an __init__.py. 
   ```
   so i think converting "chart/tests" into a namespace package is not a good idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817269747


   [The Workflow run](https://github.com/apache/airflow/actions/runs/737764070) is cancelling this PR. Building images for the PR has failed. Follow the workflow link to check the reason.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr closed pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
uranusjr closed pull request #15317:
URL: https://github.com/apache/airflow/pull/15317


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817300262


   I agree with @mik-laj we should  be careful with namespaces. The question is what we want to achieve with it. 
   
   Do we really want to convert the test/charts into a namespace package ? Why?  Is the only reason because we want to run all regular tests and chart test with the same pytest command (I believe we should not do it because those are completely different sets of tests that SHOULD be run separately and they are part of the "chart" 'sub-project' not airflow ( they even never import airflow and never should). 
   
   There are couple of reasons we should be careful with the implicit namespace packages.
   
   1. Not all the tools are handling implicit namespace packages properly..
   
   For example we have to dynamically add  (and remove) __init__.py now when we are running airflow docs: (https://github.com/apache/airflow/blob/master/docs/exts/provider_init_hack.py) and pylint https://github.com/apache/airflow/blob/master/scripts/in_container/run_pylint.sh#L22  
   
   We even have a pre-commit to check if it is not a left-over so that you would not committ it accidentally:
   https://github.com/apache/airflow/blob/master/scripts/ci/pre_commit/pre_commit_check_providers_init.sh
   
   2. Namespace packages should be used to separate whole "packages" not just a folder in a subdirectory
   
   This is where namespace packages are described https://packaging.python.org/guides/packaging-namespace-packages/. If you look at the examples, you will see why they are introduced in a first place.
   
   There is a misconception (which I also believed in for some time) that after you move to python3.3+, you can simply remove all the __init__.py. This is not true. Namespace packages are really useful when you want to separate your projects into smaller "separately installable" and versioned packages. Very similar to what we do with providers. With the current providers - at least for now - we chose a bit different path where we have a bit hybrid solution (we have namespaces but they are not "complete" separate packages on their own. we might eventually converge into full namespace-compliant way though. That will require a few more iterations, moving the setup dependencies, docs inside each package etc.  - we are not very far from that, but there is a number of non-trivial changes to implement to make it happen (likely something we might consider for 2.1).  
   
   The namespace packages are not really good for separating a single tests subdirectory. Why?
   
   3. Consequences of having separate namespaces. 
   
   There are a number of implications of not having __init__.py (i.e. having an implicit package) - like where they can be installed, how long it takes to search for the packages when you import them (much longer), but also what does it mean for the test code. I had the case when you place the tests in a different namespace and for example you wanted to check if two classes were objects of the same clas/inheritnce - they were different because they were imported in  a different namespace. Those are non-obvious problems that might be alien to an average python developer.
   
   If you look at PEP420 https://www.python.org/dev/peps/pep-0420/#specification  the regular packages are still supposed to contain `__init__.py`:
   
   ```
   Regular packages will continue to have an __init__.py and will reside in a single directory.
   ```
   And 
   ```
   Namespace packages cannot contain an __init__.py. 
   ```
   
   So i think converting "chart/tests" into a namespace package is not a good idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817323364


   The pytest issue this intends to fix is:
   
   ```
   $ ./breeze tests -- -k test_refresh_py_dag
   ...
   ________________________________________ ERROR collecting test session _________________________________________
   /usr/local/lib/python3.6/site-packages/_pytest/config/__init__.py:570: in _importconftest
       mod = import_path(conftestpath, mode=importmode)
   /usr/local/lib/python3.6/site-packages/_pytest/pathlib.py:551: in import_path
       raise ImportPathMismatchError(module_name, module_file, path)
   E   _pytest.pathlib.ImportPathMismatchError: ('tests.conftest', '/opt/airflow/tests/conftest.py', PosixPath('/opt/airflow/chart/tests/conftest.py'))
   ```
   
   This is because on test collection, both `/opt/airflow` (project root) and `/opt/airflow/chart` are in `sys.path`, so both `/opt/airflow/tests/conftest.py` and `/opt/airflow/chart/tests/conftest.py` becomes `tests.conftest` to pytest. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817262847


   Static checks are sad. Can you fix it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817300262


   I agree with @mik-laj we should  be careful with namespaces. The question is what we want to achieve with it. 
   
   Do we really want to convert the test/charts into a namespace package ? Why?  Is the only reason because we want to run all regular tests and chart test with the same pytest command (I believe we should not do it because those are completely different sets of tests that SHOULD be run separately and they are part of the "chart" 'sub-project' not airflow ( they even never import airflow and never should). 
   
   There are couple of reasons we should be careful with the implicit namespace packages.
   
   1. Not all the tools are handling implicit namespace packages properly..
   
   For example we have to dynamically add  (and remove) __init__.py now when we are running airflow docs: (https://github.com/apache/airflow/blob/master/docs/exts/provider_init_hack.py) and pylint https://github.com/apache/airflow/blob/master/scripts/in_container/run_pylint.sh#L22  
   
   We even have a pre-commit to check if it is not a left-over so that you would not committ it accidentally:
   https://github.com/apache/airflow/blob/master/scripts/ci/pre_commit/pre_commit_check_providers_init.sh
   
   2. Namespace packages should be used to separate whole "packages" not just a folder in a subdirectory
   
   This is where namespace packages are described https://packaging.python.org/guides/packaging-namespace-packages/. If you look at the examples, you will see why they are introduced in a first place.
   
   There is a misconception (which I also believed in for some time) that after you move to python3.3+, you can simply remove all the __init__.py. This is not true. Namespace packages are really useful when you want to separate your projects into smaller "separately installable" and versioned packages. Very similar to what we do with providers. With the current providers - at least for now - we chose a bit different path where we have a bit hybrid solution (we have namespaces but they are not "complete" separate packages on their own. we might eventually converge into full namespace-compliant way though. That will require a few more iterations, moving the setup dependencies, docs inside each package etc.  - we are not very far from that, but there is a number of non-trivial changes to implement to make it happen (likely something we might consider for 2.1).  
   
   The namespace packages are not really good for separating a single tests subdirectory. Why?
   
   3. Consequences of having separate namespaces. 
   
   There are a number of implications of not having __init__.py (i.e. having an implicit namespace package) - like where they can be installed, how long it takes to search for the packages when you import them (much longer), but also what does it mean for the test code. I had the case when you place the tests in a different namespace and for example you wanted to check if two classes were objects of the same clas/inheritnce - they were different because they were imported in  a different namespace. Those are non-obvious problems that might be alien to an average python developer.
   
   If you look at PEP420 https://www.python.org/dev/peps/pep-0420/#specification  the regular packages are still supposed to contain `__init__.py`:
   
   ```
   Regular packages will continue to have an __init__.py and will reside in a single directory.
   ```
   And 
   ```
   Namespace packages cannot contain an __init__.py. 
   ```
   
   So i think converting "chart/tests" into a namespace package is not a good idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817317151






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817317151


   Pytest by default does *not* load a `tests` directory without a `__init__.py` as a namespace package; it has its own module identification logic. This is also why I chanced the `helm_template_generator` imports—by removing the `__init__.py` file, the whole `tests` directory is no longer considered a package (neither namespace nor regular) by pytest.
   
   But if you’re really against removing the file, maybe we can change how pytest loads tests instead, with the new `importlib` strategy introduced in pytest 6.0. I actually just learned about this right now myself as well. Pytest traditionally expects users to put tests in plain directories (not packages) and had a lot of issues dealing with the `__init__.py` files in the tests directory (and thus strongly *discouraged* the setup)), but seems to have become better with it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] mik-laj commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
mik-laj commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817271758


   Besides, I have concerns about whether this kind of packages will work properly with all the static checks we use. We had a discussion about the use of implicate namespaces in the Airflow codebase. See: https://lists.apache.org/thread.html/b461c5d03c2f32d3379a1f9ddacc36e5ef077dccda0e37843064e409%40%3Cdev.airflow.apache.org%3E I don't remember the conclusion, but I think we rejected the idea because we didn't merge the change.
   
   CC: @potiuk 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] andrewgodwin commented on pull request #15317: Delete chart/tests/__init__.py

Posted by GitBox <gi...@apache.org>.
andrewgodwin commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817335834


   Yeah, I don't think it warrants removal without a lot more careful work and checking given there's still _some_ level of importing going on, and directories in Python 3 become namespace packages almost accidentally if an import to them is left/generated anywhere in the code...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org