You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/20 17:52:41 UTC

[GitHub] [airflow] davidcaron opened a new pull request #13793: Fix returning last log line in docker operator

davidcaron opened a new pull request #13793:
URL: https://github.com/apache/airflow/pull/13793


   Iterable values in `cli.attach(..., stream=True)` are not necessarily single lines. 
   Lines can be merged together or split because of buffering.
   When this split happens just before the last newline character, the returned value of `execute` is an empty byte string.
   
   Also, fix the return type annotation (the return type is not changed, only the annotation).
   
   I used these files to recreate the issue:
   
   `Dockerfile`
   ```dockerfile
   FROM python:3.8-slim-buster
   WORKDIR /code
   ADD main.py .
   CMD [ "python", "main.py" ]
   ```
   
   `main.py`
   ```python
   from random import randint
   
   for _ in range(randint(1, 100)):
       print(f"output")
   
   print("last_line")
   ```
   
   `script.py`
   ```python
   from docker import APIClient
   
   
   cli = APIClient()
   
   
   def run_image():
       container = cli.create_container(
           command="python main.py",
           host_config=cli.create_host_config(auto_remove=False),
           image="docker-bug",
           tty=True,
       )
   
       lines = cli.attach(container=container["Id"], stdout=True, stderr=True, stream=True)
   
       cli.start(container["Id"])
   
       line = list(lines)[-1]
   
       cli.wait(container["Id"])
   
       print("--- LAST LINES ---")
       print(repr(line))
       print(repr(cli.logs(container=container["Id"], tail=1).strip()))
   
       cli.remove_container(container["Id"])
   
   
   if __name__ == "__main__":
       for i in range(1000):
           run_image()
   ```
   
   When running:
   ```bash
   $ docker build -t docker-bug .
   $ python script.py
   ```
   
   The output will look something like:
   
   ```
   --- LAST LINES ---
   b'output\r\noutput\r\noutput\r\noutput\r\noutput\r\noutput\r\noutput\r\nlast_line\r\n'
   b'last_line'
   --- LAST LINES ---
   b'output\r\noutput\r\noutput\r\noutput\r\noutput\r\noutput\r\noutput\r\noutput\r\noutput\r\nlast_line\r\n'
   b'last_line'
   --- LAST LINES ---
   b'output\r\noutput\r\nlast_line\r\n'
   b'last_line'
   ```
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#discussion_r570707630



##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])

Review comment:
       why not strip here also?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#issuecomment-763823410


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, pylint and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/master/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/master/BREEZE.rst) for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better πŸš€.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#issuecomment-804494619


   This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#discussion_r570735141



##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])
+                else:
+                    ret = self.cli.logs(container=self.container['Id'], tail=1).strip()
 
             if self.auto_remove:
                 self.cli.remove_container(self.container['Id'])
 
             return ret
 
-    def execute(self, context) -> Optional[str]:
+    def execute(self, context) -> Optional[bytes]:

Review comment:
       coincidence someone is doing exactly that: https://github.com/apache/airflow/pull/13536/files




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] davidcaron commented on a change in pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
davidcaron commented on a change in pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#discussion_r570977882



##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])

Review comment:
       I agree it should probably be stripped, but my intention was to keep the current behavior when `do_xcom_push` and `xcom_all` are all true.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] davidcaron commented on pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
davidcaron commented on pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#issuecomment-774046182


   @dstandish Thanks for your review.
   
   If #13536 gets merged, this one can be closed.
   
   I would personally consider this PR a bug fix, and #13536 a breaking change.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] davidcaron commented on pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
davidcaron commented on pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#issuecomment-763835615


   Related, but different issue: https://github.com/apache/airflow/pull/13536 (converts return type to `str`, but this PR keeps `bytes`)
   
   I believe `execute` should return a decoded `str`... but I kept the current return type, so there should be no breaking change in this PR.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#discussion_r570707630



##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])

Review comment:
       why not strip here also?

##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])
+                else:
+                    ret = self.cli.logs(container=self.container['Id'], tail=1).strip()
 
             if self.auto_remove:
                 self.cli.remove_container(self.container['Id'])
 
             return ret
 
-    def execute(self, context) -> Optional[str]:
+    def execute(self, context) -> Optional[bytes]:

Review comment:
       should we perhaps decode utf-8 (assuming that's what comes out of logs, so that we return a string type? sincere question... πŸ€”

##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])
+                else:
+                    ret = self.cli.logs(container=self.container['Id'], tail=1).strip()
 
             if self.auto_remove:
                 self.cli.remove_container(self.container['Id'])
 
             return ret
 
-    def execute(self, context) -> Optional[str]:
+    def execute(self, context) -> Optional[bytes]:

Review comment:
       coincidence someone is doing exactly that: https://github.com/apache/airflow/pull/13536/files




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] dstandish commented on a change in pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
dstandish commented on a change in pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#discussion_r570708462



##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])
+                else:
+                    ret = self.cli.logs(container=self.container['Id'], tail=1).strip()
 
             if self.auto_remove:
                 self.cli.remove_container(self.container['Id'])
 
             return ret
 
-    def execute(self, context) -> Optional[str]:
+    def execute(self, context) -> Optional[bytes]:

Review comment:
       should we perhaps decode utf-8 (assuming that's what comes out of logs, so that we return a string type? sincere question... πŸ€”




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] davidcaron commented on a change in pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
davidcaron commented on a change in pull request #13793:
URL: https://github.com/apache/airflow/pull/13793#discussion_r570980429



##########
File path: airflow/providers/docker/operators/docker.py
##########
@@ -269,14 +269,17 @@ def _run_image(self) -> Optional[str]:
             # duplicated conditional logic because of expensive operation
             ret = None
             if self.do_xcom_push:
-                ret = self.cli.logs(container=self.container['Id']) if self.xcom_all else line.encode('utf-8')
+                if self.xcom_all:
+                    ret = self.cli.logs(container=self.container['Id'])
+                else:
+                    ret = self.cli.logs(container=self.container['Id'], tail=1).strip()
 
             if self.auto_remove:
                 self.cli.remove_container(self.container['Id'])
 
             return ret
 
-    def execute(self, context) -> Optional[str]:
+    def execute(self, context) -> Optional[bytes]:

Review comment:
       Yes, and I think this would make more sense, but again I wanted to avoid a breaking change...
   
   I would actually be happy with this other PR being merged, as it now includes the fix in my PR.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] closed pull request #13793: Fix returning last log line in docker operator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #13793:
URL: https://github.com/apache/airflow/pull/13793


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org