You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/07/23 13:59:02 UTC

[GitHub] [airflow] THISisZOLI opened a new pull request, #25253: Change stdout and stderr access mode to append in commands

THISisZOLI opened a new pull request, #25253:
URL: https://github.com/apache/airflow/pull/25253

   Starting Airflow components using the `--stdout` and `--stderr` arguments redirects the output streams to a log file. Currently the access type for these log files is `w+`.
   
   My problem with this access mode is that it can create sparse files. If the log file is moved or deleted, a new one will be created and the stdout/stderr will be written into it, but not from the start of the file. Instead if the original log file was x characters long, the new log file will contain x null bytes, and the new lines will be appended after that. To fix this problem we need to open the file with an access mode that has the `O_APPEND` flag enabled.
   
   Short list of access modes and flags:
   
   | fopen() mode | open() flags |
   | --- | --- |
   | w+ | O_RDWR \| O_CREAT \| O_TRUNC |
   | w | O_WRONLY \| O_CREAT \| O_TRUNC |
   | a | O_WRONLY \| O_CREAT \| O_APPEND |
   | a+ | O_RDWR \| O_CREAT \| O_APPEND |
   
   Based on this I propose changing the access mode for these output files from `w+` to `a`. I don't think there's any need to have read access so `+` can be dismissed.
   
   The problem with this solution is that the `a` mode does not have the `O_TRUNC` flag enabled, which means that the output file is not truncated when a component is restarted. To preserve this behaviour we can manually truncate the files with `handler.truncate(0)`. We could also set the flags manually for the handlers but I don't think that it will make the solution more readable.
   
   ---
   **^ Add meaningful description above**
   
   Read the **[Pull Request Guidelines](https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#pull-request-guidelines)** for more information.
   In case of fundamental code changes, an Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in [newsfragments](https://github.com/apache/airflow/tree/main/newsfragments).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1199041058

   Awesome work, congrats on your first merged pull request!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1196771429

   From the issue description:
   
   > What’s the issue preventing `w` from being used here?
   
   My problem with this access mode is that it can create sparse files. If the log file is moved or deleted, a new one will be created and the stdout/stderr will be written into it, but not from the start of the file. Instead if the original log file was x characters long, the new log file will contain x null bytes, and the new lines will be appended after that. To fix this problem we need to open the file with an access mode that has the `O_APPEND` flag enabled.
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1198370609

   > Hey, sorry for not responding, I didn't have computer access for a few days. Will look into the failed tests soon.
   
   No worries :). It's open source, you fix things when you can!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr merged pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
uranusjr merged PR #25253:
URL: https://github.com/apache/airflow/pull/25253


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1194196317

   Some tests needs fixing (and rebase woudl be great).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1199040970

   Just simple failures so I fixed them.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] uranusjr commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
uranusjr commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1196351785

   What’s the issue preventing `w` from being used here?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1193130041

   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] THISisZOLI commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
THISisZOLI commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1198365643

   Hey, sorry for not responding, I didn't have computer access for a few days. Will look into the failed tests soon.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1193141196

   I really like this approach It makes a lot of sense. I would love others to comment on it but it gets quite a big +1 from me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] potiuk commented on pull request #25253: Change stdout and stderr access mode to append in commands

Posted by GitBox <gi...@apache.org>.
potiuk commented on PR #25253:
URL: https://github.com/apache/airflow/pull/25253#issuecomment-1198137072

   CAn you fix the test failure @THISisZOLI ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org