You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "Ash Berlin-Taylor (JIRA)" <ji...@apache.org> on 2018/09/13 14:57:00 UTC

[jira] [Commented] (AIRFLOW-3058) Airflow log & multi-threading

    [ https://issues.apache.org/jira/browse/AIRFLOW-3058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613600#comment-16613600 ] 

Ash Berlin-Taylor commented on AIRFLOW-3058:
--------------------------------------------

Just to check the obvious first: are all your nodes running ntp/etc, and do they have the same time?

> Airflow log & multi-threading
> -----------------------------
>
>                 Key: AIRFLOW-3058
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3058
>             Project: Apache Airflow
>          Issue Type: Task
>            Reporter: jack
>            Priority: Major
>         Attachments: 456.PNG, Sni.PNG
>
>
> The airflow log does not show messages in real time when executing scripts with Multi-threading.
>  
> for example:
>  
> The left is the Airflow log time. the right is the actual time of the print in my code. If I would execute the script without airflow the console will show the times on the right.
> !Sni.PNG!
> {code:java}
> 2018-09-13 14:19:17,325] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 14:14:55.230044 Thread: Thread-1 Generate page: #0 run #0 with URL: http://...&cultureid=2&offset=0&limit=1000
> [2018-09-13 14:19:17,325] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 14:14:55.231635 Thread: Thread-2 Generate page: #1 run #0 with URL: http://...&cultureid=2&offset=1000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 14:14:55.233226 Thread: Thread-3 Generate page: #2 run #0 with URL: http://...&cultureid=2&offset=2000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,289] {bash_operator.py:101} INFO - 2018-09-13 14:14:55.234020 Thread: Thread-4 Generate page: #3 run #0 with URL: http://...&cultureid=2&offset=3000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:43.100122 Thread: Thread-1 page 0 finished. Length is 1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:43.100877 Thread: Thread-1 Generate page: #4 run #0 with URL: http://...&cultureid=2&offset=4000&limit=1000
> [2018-09-13 14:19:17,326] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:46.254536 Thread: Thread-3 page 2 finished. Length is 1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:46.255508 Thread: Thread-3 Generate page: #5 run #0 with URL: http://...&cultureid=2&offset=5000&limit=1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:51.096360 Thread: Thread-2 page 1 finished. Length is 1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:51.097269 Thread: Thread-2 Generate page: #6 run #0 with URL: http://...&cultureid=2&offset=6000&limit=1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:53.112621 Thread: Thread-4 page 3 finished. Length is 1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:15:53.113455 Thread: Thread-4 Generate page: #7 run #0 with URL: http://...&cultureid=2&offset=7000&limit=1000
> [2018-09-13 14:19:17,327] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:16:37.345343 Thread: Thread-3 Generate page: #8 run #0 with URL: http://...&cultureid=2&offset=8000&limit=1000
> [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,290] {bash_operator.py:101} INFO - 2018-09-13 14:16:37.701201 Thread: Thread-2 Generate page: #9 run #0 with URL: http://...&cultureid=2&offset=9000&limit=1000
> [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,291] {bash_operator.py:101} INFO - 2018-09-13 14:16:47.283796 Thread: Thread-1 page 4 finished. Length is 1000
> [2018-09-13 14:19:17,328] {base_task_runner.py:98} INFO - Subtask: [2018-09-13 14:19:17,291] {bash_operator.py:101} INFO - 2018-09-13 14:17:27.169359 Thread: Thread-2 page 9 finished. Length is 1000
>  
> {code}
> This never happens when executing regular code.. Happens only with multi-threading. I have some other scripts that the airflow print appears after more than 30 minutes.
>  
>  Check this one:
> hours of delay and then printing everything together. These are not real time. the prints in the log has no correlation to the actual time the command was executed.
>  
> !456.PNG!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)