You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/06/22 00:43:53 UTC

[GitHub] [airflow] kaxil opened a new pull request #16577: Exclude ``yarn.lock`` from built Python wheel file

kaxil opened a new pull request #16577:
URL: https://github.com/apache/airflow/pull/16577


   Same as https://github.com/apache/airflow/pull/16494 - However that PR had to be reverted in https://github.com/apache/airflow/pull/16518 as it failed building of PROD image, this PR/commit will fix it.
   
   PROBLEM: Currently the airflow wheel is built with the yarn.lock which is not actually used by the airflow itself. Having this file in the docker image causes the clair and trivy scanners to fail
   
   FIX: The fix is to exclude the yarn.lock by specifying it in the manifest.in
   
   <!--
   Thank you for contributing! Please make sure that your code changes
   are covered with tests. And in case of new features or big changes
   remember to adjust the documentation.
   
   Feel free to ping committers for the review!
   
   In case of existing issue, reference it using one of the following:
   
   closes: #ISSUE
   related: #ISSUE
   
   How to write a good git commit message:
   http://chris.beams.io/posts/git-commit/
   -->
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/main/UPDATING.md).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on a change in pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#discussion_r656083874



##########
File path: airflow/www/compile_assets.sh
##########
@@ -31,4 +31,9 @@ fi
 yarn install --frozen-lockfile
 yarn run build
 
-find package.json yarn.lock static/css static/js -type f | sort | xargs md5sum > "${MD5SUM_FILE}"
+YARN_LOCK_FILE="yarn.lock"
+if [[ -f "$YARN_LOCK_FILE" ]]; then

Review comment:
       updated in https://github.com/apache/airflow/pull/16577/commits/11c927fc07a53f2a9334121f06e4426c8bb438e0




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#discussion_r656083475



##########
File path: airflow/www/compile_assets.sh
##########
@@ -31,4 +31,9 @@ fi
 yarn install --frozen-lockfile
 yarn run build
 
-find package.json yarn.lock static/css static/js -type f | sort | xargs md5sum > "${MD5SUM_FILE}"
+YARN_LOCK_FILE="yarn.lock"
+if [[ -f "$YARN_LOCK_FILE" ]]; then

Review comment:
       Yep. This was what I fixed in my PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865993731


   > Could you clarify this a bit? It’s not clear where `compile_assets.sh` was not included in from your sentence. (And `compile_assets.sh` is definitely included in wheels currently, I just downloaded the 2.1.0 wheel from PyPI.org, it contains `airflow/www/compile_assets.sh`.)
   
   And yeah - this is a different 'compile assets" script - that's why it is included. This one is used  when you are inside the docker container or in your local environment when you iterate over the .js files and want to recompile them. I tend to forget about this one because I almost never do that. For that much better option is `start-airflow` command in Breeze because it starts the `development` mode of the server which reloads and recompiles stuff dynamically when it's changed I believe (It was added by James some time ago I think).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865906553


   Thanks for the clarification. Excluding the file from sdist sounds reasonable to me from the justification above, so +1 to this PR on its own. The only remaining issue I have is why `compile_assets.sh` is currently included in *wheels* right now, and why this PR can successfully exclude the file, since my understanding is MANIFEST.in is not supposed to affect that. There are probably some weird setuptools mechanisms I’m not understanding correctly, I will need to dig into it more.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed pull request #16577: Exclude ``yarn.lock`` from built Python wheel file

Posted by GitBox <gi...@apache.org>.
potiuk closed pull request #16577:
URL: https://github.com/apache/airflow/pull/16577


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865988674


   I hope it's clearer now :). 
   
   There are many things happening in our CI - and things are pretty complex: on one hand we want to use PR sources to verify if they work, on the other we want to build airflow from packages. 
   
   But when you want to test or run the PROD/CI image locally during development (for your tests) you cannot really go through the extra hoop of preparing the packages to install them because you want to mount the sources directly to the image in order to be able to modify the sources. Having this possibility of building the images either from packages or from sources makes it much easier to iterate over your tests inside the image without having to rebuild it every time - this is basically what breeze does - builds CI/PROD image locally in the way that you can mount the local sources and continue developing locally with mounted sources.
   
   This saves A TON of time for iterations over your tests especially when you work with the integrations of Airflow (Kerberos/mongo/redis whatever) but also when you work with errors that manifest only for specific database versions - ability of having all that up-and-running and tests running with exactly the same configuration as on CI makes it super easy to reproduce any errors you see in CI as well (and you can even add `--github-image-id` parameter when your CI build fails and have the exact replication of what happened in CI locally - and you can iterate and reproduce the problems there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865862558


   > > @potiuk @uranusjr Did you guys actually try out? I had, unless I missed something (which is also possible :) )
   > 
   > Yep. I tried. I think the main point (and @urunsjr I believe agrees with me on that) is that we should NOT exclude yarn.lock from the sdist package. It's quite useful to have it ithere, because this will allow the users to repeatably build the package on their own using sources.
   
   Well that defeats the purpose of what this PR is solving, clair and trivy and other scanners will find it.
   
   For ASF project, if someone wants to build from source they can use `apache-airflow-VERSION-source.tar.gz`, example https://dist.apache.org/repos/dist/release/airflow/2.1.0/apache-airflow-2.1.0-source.tar.gz


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865979733


   The error we had when running the compile assets was from the sources of airlflow. When we build the image from sources, we also run the `compile` script after running `pip install .`. This is only when you install airflow from sources (which happens in CI image and in Prod image when you build locally in breeze).
   
   In CI, the images are build from packages generated using PR sources (because we want to run K8S tests using airflow production image build from PR sources AND we want to install prod image using packages similarly when we release the image). 
   
   So in CI those things happen:
   1) first I build airflow package (using sources from the incoming PR) - here assets are compiled during preparing the package
   2) then I build all provider packages (again using sources from the incoming PR)
   3) then PROD image is built using those locally built packages rather than PyPI ones (they are copied to docker-context-files and installed from there).
   
   The problem was that during step 3, there was an extra step run to compile the assets again. This was done using compile_assets scripts from docker folder: `https://github.com/apache/airflow/blob/main/scripts/docker/compile_www_assets.sh` . The "scripts/docker" folder contains all the scripts that are used by the Dockerfile build. It is not part of the package, and should not be added. So possibly this folder is added there by mistake :D. I don't think any of the scripts are needed in sdist package. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #16577: Exclude ``yarn.lock`` from built Python wheel file

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865447739


   I'm also curious why excluding the file from MANIFEST.in changes the wheel content. That file is for sdist inclusion and doesn't generally affect the wheel. There is something in the build steps I'm not understanding, probably.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from built Python wheel file

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865737335


   Sorry for closing it. I believe #16580 is the right fix (including prod Dockerfile fix), feel free to either approve that one or incorporate here. I don't mind either (or if you find a better way of excluding the lock file only for whl package I am happy as well).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil merged pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #16577:
URL: https://github.com/apache/airflow/pull/16577


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from built Python wheel file

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865686240


   @uraunsjr is right. `MANIFEST.in` is for sdist. For wheel package, we are including everything in the packages (yarn.lock is placed in `airflow.www` package (__init__.py file is in the same folder as yarn.lock). Therefore we need to exclude the file via `exclude_package_data`. 
   
   Closing in favour of #16580 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#discussion_r656083810



##########
File path: airflow/www/compile_assets.sh
##########
@@ -31,4 +31,9 @@ fi
 yarn install --frozen-lockfile
 yarn run build
 
-find package.json yarn.lock static/css static/js -type f | sort | xargs md5sum > "${MD5SUM_FILE}"
+YARN_LOCK_FILE="yarn.lock"
+if [[ -f "$YARN_LOCK_FILE" ]]; then

Review comment:
       the whole script is not run when we install from packages.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865906553


   Thanks for the clarification. Excluding the file from sdist sounds reasonable to me from the justification above, so +1 to this PR on its own. The only remaining issue I have is why `compile_assets.sh` is currently included in *wheels* right now (it’s not listed in `package_data`), and why this PR can successfully exclude the file (MANIFEST.in is not supposed to affect that). There are probably some weird setuptools mechanisms I’m not understanding correctly, I will need to dig into it more.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865861315


   > @potiuk @uranusjr Did you guys actually try out? I had, unless I missed something (which is also possible :) )
   
   Yep. I tried. I think the main point (and @urunsjr I believe agrees with me on that) is that ~we should NOT exclude yarn.lock from the sdist package. It's quite useful to have it ithere, because this will allow the users to repeatably build the package on their own using sources.~
   
   UPDATE: Stupid me. Mixed sources with sdist :( 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] kaxil commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
kaxil commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865854877


   @potiuk @uranusjr Did you guys actually try out? I had, unless I missed something (which is also possible :) )
   
   The same link that you used Jarek explain the three options:
   
   https://setuptools.readthedocs.io/en/latest/userguide/datafiles.html
   
   ![image](https://user-images.githubusercontent.com/8811558/122906976-98d3f380-d34a-11eb-94ee-e8ee8e368366.png)
   
   If a package has `include_package_data` as `True`, `MANIFEST.in` is also used. Also summarized somewhat in https://github.com/pypa/setuptools/issues/511#issuecomment-570177072
   
   @potiuk I would want to continue with this PR and you can optimize it in yours. I have updated the PR title and description to say I want to exclude yarn.lock file from both tar.gz and .whl


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865914572


   > FYI the compile_assets.sh was not included.
   
   Could you clarify this a bit? It’s not clear where `compile_assets.sh` was not included in from your sentence. (And `compile_assets.sh` is definitely included in wheels currently, I just downloaded the 2.1.0 wheel from PyPI.org, it contains `airflow/www/compile_assets.sh`.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865979733


   The error we had when running the compile assets was from the sources of airlflow. When we build the image from sources, we also run the `compile` script after running `pip install .`. This is only when you install airflow from sources (which happens in CI image and in Prod image when you build locally in breeze).
   
   In CI, the images are build from packages generated using PR sources (because we want to run K8S tests using airflow production image build from PR sources AND we want to install prod image using packages similarly when we release the image). 
   
   So in CI those things happen:
   1) first I build airflow package (using sources from the incoming PR) - here assets are compiled during preparing the package
   2) then I build all provider packages (again using sources from the incoming PR)
   3) then PROD image is built using those locally built packages rather than PyPI ones (they are copied to docker-context-files and installed from there).
   
   The problem was that during step 3, there was an extra step run to compile the assets again (that was a mistake - fixed in this PR). This was done using compile_assets scripts from docker folder: `https://github.com/apache/airflow/blob/main/scripts/docker/compile_www_assets.sh` . The "scripts/docker" folder contains all the scripts that are used by the Dockerfile build. It is not part of the package, and should not be added. So possibly this folder is added there by mistake :D. I don't think any of the scripts are needed in sdist package. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865863373


   > If we include the compiled assets, then we should not include the yarn.lock.
   > 
   > Users don't rebuild form the sdist -- they rebuild from the _source_.
   
   Ah you are right. I forgot that in airflow sdist and sources are separate packages (in providers they are the same). You are right I withdraw my comment and close PR :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865906553


   Thanks for the clarification. Excluding the file from sdist sounds reasonable to me from the justification above, so +1 to this PR on its own. The only remaining issue I have is why `compile_assets.sh` is currently included in *wheels* right now (it shouldn’t since it’s not listed in `package_data`), and why this PR can successfully exclude the file (it shouldn’t since MANIFEST.in is not supposed to affect wheel content). There are probably some weird setuptools mechanisms I’m not understanding correctly, I will need to dig into it more.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865910593


   FYI the compile_assets.sh was not included. It was run during image build after the package has been installed. But it's now fixed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865993731


   > Could you clarify this a bit? It’s not clear where `compile_assets.sh` was not included in from your sentence. (And `compile_assets.sh` is definitely included in wheels currently, I just downloaded the 2.1.0 wheel from PyPI.org, it contains `airflow/www/compile_assets.sh`.)
   
   And yeah - this is a different 'compile assets" script - that's why it is included. This one is used  when you are inside the docker container or in your local environment when you iterate over the .js files and want to recompile them. I tend to forget about this one because I almost never do that. For that much better option is `start-airflow` command in Breeze because it starts the `development` mode of the server which reloads and recompiles stuff dynamically when it's changed I believe (It was added by James some time ago I think).
   
   It should probably be indeed excluded.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865861315


   > @potiuk @uranusjr Did you guys actually try out? I had, unless I missed something (which is also possible :) )
   
   Yep. I tried. I think the main point (and @urunsjr I believe agrees with me on that) is that we should NOT exclude yarn.lock from the sdist package. It's quite useful to have it ithere, because this will allow the users to repeatably built the package on their own using sources. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865863373


   > If we include the compiled assets, then we should not include the yarn.lock.
   > 
   > Users don't rebuild form the sdist -- they rebuild from the _source_.
   >
   > For ASF project, if someone wants to build from source they can use apache-airflow-VERSION-source.tar.gz, example https://dist.apache.org/repos/dist/release/airflow/2.1.0/apache-airflow-2.1.0-source.tar.gz
   
   Ah you are right. I forgot that in airflow sdist and sources are separate packages (in providers they are the same). You are right I withdraw my comment and close PR :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865906553


   Thanks for the clarification. Excluding the file from sdist sounds reasonable to me from the justification above, so +1 to this PR on its own. The only remaining issue I have is why `compile_assets.sh` is currently included in *wheels* right now (it shouldn’t since it’s not listed in `package_data`), and why this PR can successfully exclude the file (it shouldn’t since MANIFEST.in is not supposed to affect wheel content without `include_package_data` set). There are probably some weird setuptools mechanisms I’m not understanding correctly, I will need to dig into it more.
   
   Edit: Oh wait we do have `include_package_data` set. I missed that since that line was declared far away from `package_data`, and it is generally problematic to declare both `package_data` and `include_package_data` together so I naively assumed we didn’t have `include_package_data = true` when I saw `package_data` 🤦
   
   I will probably work on cleaning up the configs then.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #16577: Exclude ``yarn.lock`` from built Python wheel file

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865445851






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on pull request #16577: Exclude ``yarn.lock`` from built Python wheel file

Posted by GitBox <gi...@apache.org>.
uranusjr commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865445851


   > FIX: The fix is to exclude the yarn.lock by specifying it in the manifest.in
   
   This would also exclude the file from .tar.gz (aka sdist). Is this OK to do?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#discussion_r656081985



##########
File path: airflow/www/compile_assets.sh
##########
@@ -31,4 +31,9 @@ fi
 yarn install --frozen-lockfile
 yarn run build
 
-find package.json yarn.lock static/css static/js -type f | sort | xargs md5sum > "${MD5SUM_FILE}"
+YARN_LOCK_FILE="yarn.lock"
+if [[ -f "$YARN_LOCK_FILE" ]]; then

Review comment:
       We shouldn't be running this script if there is no lock file -- else we have run `yarn install --frozen-lockfile` which will (I hope!) fail.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
uranusjr edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865906553


   Thanks for the clarification. Excluding the file from sdist sounds reasonable to me from the justification above, so +1 to this PR on its own. The only remaining issue I have is why `compile_assets.sh` is currently included in *wheels* right now (it shouldn’t since it’s not listed in `package_data`), and why this PR can successfully exclude the file (it shouldn’t since MANIFEST.in is not supposed to affect wheel content without `include_package_data` set). There are probably some weird setuptools mechanisms I’m not understanding correctly, I will need to dig into it more.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk edited a comment on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865861315


   > @potiuk @uranusjr Did you guys actually try out? I had, unless I missed something (which is also possible :) )
   
   Yep. I tried. I think the main point (and @urunsjr I believe agrees with me on that) is that we should NOT exclude yarn.lock from the sdist package. It's quite useful to have it ithere, because this will allow the users to repeatably build the package on their own using sources. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #16577: Exclude ``yarn.lock`` from release packages

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #16577:
URL: https://github.com/apache/airflow/pull/16577#issuecomment-865862140


   If we include the compiled assets, then we should not include the yarn.lock.
   
   Users don't rebuild form the sdist -- they rebuild from the _source_.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org