You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/18 13:05:08 UTC

[GitHub] [airflow] potiuk opened a new pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

potiuk opened a new pull request #9378:
URL: https://github.com/apache/airflow/pull/9378


   ---
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Target Github ISSUE in description if exists
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
ashb commented on pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#issuecomment-646019785


   @potiuk As a general request when fixing bugs: please could you link to the original PR that this is fixing for in the PR body?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442214812



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       good question !




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442214500



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       Because we are configuring a lot of that via `[webserver]` section and some of it also via breeze args I think.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442212373



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       Oh wait.. but that means it will overwrite existing cmd args?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442270993



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       > I used the env var approach because that is the easiest way if you do not have access to the env directly.
   
   @feluelle @ashb -> the ENV approach is good to go. Seems this has the right characteristics.  - it will take precedence over other methods (conf/commandline) but you can still specify other options. So I think for CI image it's good to go -  but for production image definitely this approach is better:
   
   ```
   export GUNICORN_CMD_ARGS="${GUNICORN_CMD_ARGS:-} --worker-tmp-dir /dev/shm"
   ```
   
   I also decided to use /dev/shm as suggested by https://docs.gunicorn.org/en/stable/faq.html#how-do-i-avoid-gunicorn-excessively-blocking-in-os-fchmod:




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#issuecomment-646034614


   > @potiuk As a general request when fixing bugs: please could you link to the original PR that this is fixing for in the PR body?
   
   Sure. Will do it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#issuecomment-646057378


   > @potiuk As a general request when fixing bugs: please could you link to the original PR that this is fixing for in the PR body?
   
   @ashb -> updated with reference.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442227174



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       Can I ask a different question: why do we need to mount /tmp from the host?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
ashb commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442231439



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       Other option. Set this in the entrypoint as:
   
   
   ```
   export GUNICORN_CMD_ARGS="${GUNICORN_CMD_ARGS:-} --worker-tmp-dir /opt/airflow/tmp"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] feluelle commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
feluelle commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442216176



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       I used the env var approach because that is the easiest way if you do not have access to the env directly.
   
   If we would just pass `--worker-tmp-dir` as a breeze cli argument that would probably be better to handle?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk merged pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #9378:
URL: https://github.com/apache/airflow/pull/9378


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442247207



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       It's ok. But I change it to /dev/shm -> just tested that it works nicely




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on a change in pull request #9378: Fixed crashing webserver after /tmp is mounted from the host

Posted by GitBox <gi...@apache.org>.
potiuk commented on a change in pull request #9378:
URL: https://github.com/apache/airflow/pull/9378#discussion_r442266574



##########
File path: Dockerfile.ci
##########
@@ -315,6 +315,9 @@ WORKDIR ${AIRFLOW_SOURCES}
 
 ENV PATH="${HOME}:${PATH}"
 
+# Needed to stop Gunicorn from crashing when /tmp is now mounted from host
+ENV GUNICORN_CMD_ARGS="--worker-tmp-dir /opt/airflow/tmp"

Review comment:
       @ashb - the root cause was the removal of Java from the CI image. 
   
   @mik-laj spend quite some time after it was removed to make it works and that involved mapping /tmp from host to both Breeze image and to Java run via the docker image and forwarded docker socket.
   
   This is all because we have automated system tests in Google Provder. Previously we just run java in Breeze when we needed it  and we used java to prepare packages to be sent to DataFlow (it needs to package java code and some jars and submit them to DataFlow) 
   
   Previously when java was embedded in the image it "just worked". But when you removed it from the image (which was a good thing BTW.) we needed to find another solution. We used the approach that I piloted with the gcloud CLI - where we have script that runs external tool via Docker run in the docker engine via the forwarded socket. The problem with that was that we had to pass some of the temporary files we created in Python unit tests to the java command line (that was also similar problem for other code ours where we uploaded temporary files to GCS via gcloud command line). 
   
   The easiest way was to map ${AIRFLOW_SOURCES}/tmp to both - the CI airflow container  and to all the tools we run via the forwarded socket. This way the /tmo (we also did it for  /opt/airflow) is shared between the CI container and the gcloud, java, terraform .and all the other tools which we now do not have to embed into the CI image - they are all pulled on demand via docker.
   
   Since we switched to run those tools "docker via 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org