Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/06/27 14:59:43 UTC

[GitHub] [airflow] pasalkarsachin1 opened a new issue, #24681: Docker is not pushing last line over xcom

pasalkarsachin1 opened a new issue, #24681:
URL: https://github.com/apache/airflow/issues/24681

   ### Apache Airflow Provider(s)
   
   docker
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-docker==2.7.0
   docker==5.0.3
   
   ### Apache Airflow version
   
   2.3.2 (latest released)
   
   ### Operating System
   
   Ubuntu 20.04.4 LTS (Focal Fossa)
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   Deployed using docker compose command
   
   ### What happened
   
   Below is my `DockerOperator` code:
   ```
   extract_data_from_presto = DockerOperator(
               task_id='download_data',
               image=IMAGE_NAME,
               api_version='auto',
               auto_remove=True,
               mount_tmp_dir=False,
               docker_url='unix://var/run/docker.sock',
               network_mode="host",
               tty=True,
               xcom_all=False,
               mounts=MOUNTS,
               environment={
                   "PYTHONPATH": "/opt",
               },
               command="test.py",
               retries=3,
               dag=dag,
           )
   ```
   The last line printed in the Docker container is not being pushed to XCom. In my case, the last line is:
   
   `[2022-06-27, 08:31:34 UTC] {docker.py:312} INFO - {"day": 20220627, "batch": 1656318682, "source": "all",  "os": "ubuntu"}`
   
   However, the XCom return value shown in the UI is empty:
   <img width="1329" alt="image" src="https://user-images.githubusercontent.com/25153155/175916850-8f50c579-9d26-44bc-94ae-6d072701ff0b.png">
   
   
   
   ### What you think should happen instead
   
   It should have returned `{"day": 20220627, "batch": 1656318682, "source": "all",  "os": "ubuntu"}` as the `return_value`.
   
   ### How to reproduce
   
   I have not been able to reproduce it with a minimal example, but it fails with my application. To investigate, I extended the `DockerOperator` class in my code, copied the `_run_image_with_mounts` method, and added two print statements:
   ```
                   print(f"log lines from attach {log_lines}")
                   try:
                       if self.xcom_all:
                           return [stringify(line).strip() for line in self.cli.logs(**log_parameters)]
                       else:
                           lines = [stringify(line).strip() for line in self.cli.logs(**log_parameters, tail=1)]
                           print(f"lines from logs: {lines}")
   ```
   The value of `log_lines` comes from this [line](https://github.com/apache/airflow/blob/main/airflow/providers/docker/operators/docker.py#L309).
   
   The output is shown below. The first line is the last line printed by my Docker code:
   ```
   [2022-06-27, 14:43:26 UTC] {pipeline.py:103} INFO - {"day": 20220627, "batch": 1656340990, "os": "ubuntu", "source": "all"}
   [2022-06-27, 14:43:27 UTC] {logging_mixin.py:115} INFO - log lines from attach ['2022-06-27, 14:43:15 UTC - root - read_from_presto - INFO - Processing datetime is 2022-06-27 14:43:10.755685', '2022-06-27, 14:43:15 UTC - pyhive.presto - presto - INFO - SHOW COLUMNS FROM <truncated data as it's too long>, '{"day": 20220627, "batch": 1656340990, "os": "ubuntu", "source": "all"}']
   [2022-06-27, 14:43:27 UTC] {logging_mixin.py:115} INFO - lines from logs: ['{', '"', 'd', 'a', 'y', '"', ':', '', '2', '0', '2', '2', '0', '6', '2', '7', ',', '', '"', 'b', 'a', 't', 'c', 'h', '"', ':', '', '1', '6', '5', '6', '3', '4', '0', '9', '9', '0', ',', '', '"', 'o', 's', '"', ':', '', '"', 'u', 'b', 'u', 'n', 't', 'u', '"', ',', '', '"', 's', 'o', 'u', 'r', 'c', 'e', '"', ':', '', '"', 'a', 'l', 'l', '"', '}', '', '']
   
   ``` 
   
   As shown above, `self.cli.logs(**log_parameters, tail=1)` returns a list of single characters for some unknown reason. This behavior was introduced by this [change](https://github.com/apache/airflow/commit/2f4a3d4d4008a95fc36971802c514fef68e8a5d4). Before that, the value was taken from `log_lines`.
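
The empty XCom value is consistent with the debug output above: if the log call yields one-character chunks, stripping each chunk and taking the last element gives an empty string. Below is a minimal sketch that simulates this without Docker. `fake_tail_stream` and `last_line_current` are hypothetical names, and the per-byte chunking with a trailing `\r\n` is an assumption inferred from the printed list, not confirmed behavior of the docker library.

```python
from typing import Iterable, Iterator, Optional, Union


def stringify(line: Union[str, bytes]) -> str:
    """Decode bytes to str and pass str through (stand-in for the operator's helper)."""
    if isinstance(line, bytes):
        return line.decode("utf-8", errors="surrogateescape")
    return line


def fake_tail_stream(last_line: str) -> Iterator[bytes]:
    """Hypothetical stand-in for self.cli.logs(**log_parameters, tail=1).

    Assumption: the stream yields one byte per chunk and ends with "\r\n",
    which would match the two trailing empty strings in the debug output.
    """
    for byte in (last_line + "\r\n").encode("utf-8"):
        yield bytes([byte])


def last_line_current(stream: Iterable[Union[str, bytes]]) -> Optional[str]:
    """Mirror the logic under test: strip every chunk, then take the last one."""
    lines = [stringify(chunk).strip() for chunk in stream]
    return lines[-1] if lines else None


payload = '{"day": 20220627, "batch": 1656340990, "os": "ubuntu", "source": "all"}'
result = last_line_current(fake_tail_stream(payload))
print(repr(result))  # the last chunk is a newline, which strips to ''
```

Under these assumptions the last chunk is a line terminator, so the value handed to XCom would be `''`, which matches the empty `return_value` in the UI.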
   
   My suggestion is to modify the code as below:
   ```
                       if self.xcom_all:
                           return [stringify(line).strip() for line in log_lines]
                       else:
                           lines = [stringify(line).strip() for line in log_lines]
                           return lines[-1] if lines else None
   
   ```
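
To check the suggested logic in isolation, it can be written as a small standalone function. This is only a sketch: `xcom_value` is a hypothetical name, `stringify` is a stand-in for the operator's helper, and the sample `log_lines` reuse lines from the debug output above.

```python
from typing import List, Optional, Union


def stringify(line: Union[str, bytes]) -> str:
    """Decode bytes to str and pass str through (stand-in for the operator's helper)."""
    if isinstance(line, bytes):
        return line.decode("utf-8", errors="surrogateescape")
    return line


def xcom_value(
    log_lines: List[Union[str, bytes]], xcom_all: bool
) -> Union[List[str], Optional[str]]:
    """Return every collected line, or only the last one, as suggested above."""
    lines = [stringify(line).strip() for line in log_lines]
    if xcom_all:
        return lines
    return lines[-1] if lines else None


# Lines already collected while attached to the container (sample data).
log_lines = [
    "2022-06-27, 14:43:15 UTC - root - read_from_presto - INFO - "
    "Processing datetime is 2022-06-27 14:43:10.755685",
    '{"day": 20220627, "batch": 1656340990, "os": "ubuntu", "source": "all"}',
]

print(xcom_value(log_lines, xcom_all=False))
```

Because the lines are taken from the list gathered during the attach loop, this variant does not depend on how a second log call chunks its output.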
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] boring-cyborg[bot] commented on issue #24681: Docker is not pushing last line over xcom

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #24681:
URL: https://github.com/apache/airflow/issues/24681#issuecomment-1167462483

   Thanks for opening your first issue here! Be sure to follow the issue template!
   




[GitHub] [airflow] potiuk closed issue #24681: Docker is not pushing last line over xcom

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #24681: Docker is not pushing last line over xcom
URL: https://github.com/apache/airflow/issues/24681

