You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/11 12:32:44 UTC

[GitHub] [airflow] potiuk commented on pull request #15317: Delete chart/tests/__init__.py

potiuk commented on pull request #15317:
URL: https://github.com/apache/airflow/pull/15317#issuecomment-817300262


   I agree with @mik-laj we should  be careful with namespaces. The question is what we want to achieve with it. 
   
   Do we really want to convert the test/charts into a namespace package ? Why?  Is the only reason because we want to run all regular tests and chart test with the same pytest command (I believe we should not do it because those are completely different sets of tests that SHOULD be run separately and they are part of the "chart" 'sub-project' not airflow ( they even never import airflow and never should). 
   
   There are couple of reasons we should be careful with the implicit namespace packages.
   
   1. Not all the tools are handling implicit namespace packages properly..
   
   For example we have to dynamically add  (and remove) __init__.py now when we are running airflow docs: (https://github.com/apache/airflow/blob/master/docs/exts/provider_init_hack.py) and pylint https://github.com/apache/airflow/blob/master/scripts/in_container/run_pylint.sh#L22  
   
   We even have a pre-commit to check if it is not a left-over so that you would not committ it accidentally:
   https://github.com/apache/airflow/blob/master/scripts/ci/pre_commit/pre_commit_check_providers_init.sh
   
   2. Namespace packages should be used to separate whole "packages" not just a folder in a subdirectory
   
   This is where namespace packages are described https://packaging.python.org/guides/packaging-namespace-packages/. If you look at the examples, you will see why they are introduced in a first place.
   
   There is a misconception (which I also believed in for some time) that after you move to python3.3+, you can simply remove all the __init__.py. This is not true. Namespace packages are really useful when you want to separate your projects into smaller "separately installable" and versioned packages. Very similar to what we do with providers. With the current providers - at least for now - we chose a bit different path where we have a bit hybrid solution (we have namespaces but they are not "complete" separate packages on their own. we might eventually converge into full namespace-compliant way though. That will require a few more iterations, moving the setup dependencies, docs inside each package etc.  - we are not very far from that, but there is a number of non-trivial changes to implement to make it happen (likely something we might consider for 2.1).  
   
   The namespace packages are not really good for separating a single tests subdirectory. Why?
   
   3. Consequences of having separate namespaces. 
   
   There are a number of implications of not having __init__.py (i.e. having an implicit package) - like where they can be installed, how long it takes to search for the packages when you import them (much longer), but also what does it mean for the test code. I had the case when you place the tests in a different namespace and for example you wanted to check if two classes were objects of the same clas/inheritnce - they were different because they were imported in  a different namespace. Those are non-obvious problems that might be alien to an average python developer.
   
   If you look at PEP420 https://www.python.org/dev/peps/pep-0420/#specification  the regular packages are still supposed to contain `__init__.py`:
   
   ```
   Regular packages will continue to have an __init__.py and will reside in a single directory.
   ```
   And 
   ``
   Namespace packages cannot contain an __init__.py. 
   ```
   so i think converting "chart/tests" into a namespace package is not a good idea.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org