You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/11/21 15:21:07 UTC

[GitHub] [airflow] ashb commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.

ashb commented on issue #6627: [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution.
URL: https://github.com/apache/airflow/pull/6627#issuecomment-557132298
 
 
   My benchmarking results of this change below.
   
   All my tests were of `BashOperator(..., bash_command="true")`. This change helps bring down the average (14.062799955 to 0.9318446075, x15 faster) and the maximum time (from 22.927918 to 1.865752, x12 faster!)
   
   ** Without  change, 40 dag runs**:
   ```
   $ psql airflow -c "select dag_id, task_id, execution_date, state, count(*), min(duration) as min_dur, avg(duration) as avg_dur, max(duration) as max_dur, sum(duration) as running_time, max(end_date)-min(start_date) as total_duration, avg(start_date - queued_dttm) as queue_delay from task_instance where state is not null group by grouping sets ((dag_id, task_id, state), (dag_id,execution_date, state), (dag_id, state)) order by 1,2,3"
   
      dag_id    |   task_id    |     execution_date     |  state  | count |  min_dur  |   avg_dur    |  max_dur  | running_time | total_duration  |   queue_delay
   -------------+--------------+------------------------+---------+-------+-----------+--------------+-----------+--------------+-----------------+-----------------
    example_dag | print_date1  |                        | success |    10 | 13.540218 |   14.2083814 | 14.745729 |   142.083814 | 00:00:17.130217 | 00:00:14.927132
    example_dag | print_date10 |                        | success |    10 | 11.480039 |   13.4748527 | 14.403048 |   134.748527 | 00:01:45.278854 | 00:09:42.822216
    example_dag | print_date11 |                        | success |    10 | 10.084806 |   13.2927993 | 14.148362 |   132.927993 | 00:01:43.667161 | 00:09:44.954266
    example_dag | print_date12 |                        | success |    10 |  7.529389 |   13.1063045 | 14.152888 |   131.063045 | 00:01:41.729084 | 00:09:48.473788
    example_dag | print_date13 |                        | success |    10 | 13.352645 |    14.171775 | 15.294458 |    141.71775 | 00:05:34.40845  | 00:07:12.078185
    example_dag | print_date14 |                        | success |    10 | 12.685846 |   14.1814867 |  15.92092 |   141.814867 | 00:05:54.331766 | 00:07:00.183005
    example_dag | print_date15 |                        | success |    10 | 13.261651 |   13.6241582 |   14.4745 |   136.241582 | 00:06:01.605317 | 00:06:46.927206
    example_dag | print_date16 |                        | success |    10 | 13.463467 |   13.9461268 | 14.327437 |   139.461268 | 00:06:03.748068 | 00:06:43.670468
    example_dag | print_date17 |                        | success |    10 | 12.838915 |   13.7431276 | 14.127847 |   137.431276 | 00:06:16.971633 | 00:06:39.727049
    example_dag | print_date18 |                        | success |    10 | 13.478566 |   16.0986632 | 19.783983 |   160.986632 | 00:06:39.378521 | 00:06:20.19122
    example_dag | print_date19 |                        | success |    10 | 13.427385 |   15.7544223 | 22.927918 |   157.544223 | 00:06:40.99252  | 00:06:52.527796
    example_dag | print_date2  |                        | success |    10 | 13.692921 |   14.3700994 | 14.829447 |   143.700994 | 00:00:46.73697  | 00:00:24.922475
    example_dag | print_date20 |                        | success |    10 | 13.749788 |   15.9622433 |  21.03353 |   159.622433 | 00:06:46.79555  | 00:06:47.494261
    example_dag | print_date21 |                        | success |    10 | 13.533584 |   15.3698928 | 20.041363 |   153.698928 | 00:07:01.543776 | 00:06:40.238809
    example_dag | print_date22 |                        | success |    10 | 13.292851 |   14.3784663 | 15.897209 |   143.784663 | 00:07:07.038294 | 00:06:22.919986
    example_dag | print_date23 |                        | success |    10 | 13.571403 |   14.7570801 | 16.540063 |   147.570801 | 00:06:58.41123  | 00:06:17.252477
    example_dag | print_date24 |                        | success |    10 | 13.698992 |   14.5122418 | 17.094243 |   145.122418 | 00:06:51.823565 | 00:06:14.655425
    example_dag | print_date25 |                        | success |    10 | 11.696784 |   13.6606921 | 14.262454 |   136.606921 | 00:07:14.91126  | 00:06:10.941776
    example_dag | print_date26 |                        | success |    10 | 13.115977 |   13.8040275 | 14.581811 |   138.040275 | 00:07:22.865953 | 00:05:55.113162
    example_dag | print_date27 |                        | success |    10 | 13.203122 |   13.8406964 | 14.061522 |   138.406964 | 00:07:21.852219 | 00:05:50.572565
    example_dag | print_date28 |                        | success |    10 | 13.684101 |   14.1242237 | 14.625835 |   141.242237 | 00:07:52.563805 | 00:05:45.37103
    example_dag | print_date29 |                        | success |    10 | 13.392751 |   13.8983641 | 14.356764 |   138.983641 | 00:07:43.298189 | 00:05:33.6447
    example_dag | print_date3  |                        | success |    10 | 11.072175 |   13.7151706 | 14.351104 |   137.151706 | 00:00:47.050922 | 00:00:40.070479
    example_dag | print_date30 |                        | success |    10 | 13.220995 |   13.6871487 | 14.455784 |   136.871487 | 00:08:08.386634 | 00:05:19.734116
    example_dag | print_date31 |                        | success |    10 | 12.529552 |   13.7031299 |  14.13769 |   137.031299 | 00:08:00.575368 | 00:05:07.899519
    example_dag | print_date32 |                        | success |    10 | 13.318386 |   14.0317772 | 14.622018 |   140.317772 | 00:08:23.871494 | 00:05:02.388743
    example_dag | print_date33 |                        | success |    10 | 13.380067 |   14.1291853 | 14.981017 |   141.291853 | 00:08:24.882959 | 00:04:52.273283
    example_dag | print_date34 |                        | success |    10 | 12.821078 |   13.7153179 | 14.443366 |   137.153179 | 00:08:25.688936 | 00:04:45.993094
    example_dag | print_date35 |                        | success |    10 | 12.781473 |    13.671214 | 14.770308 |    136.71214 | 00:08:54.559094 | 00:04:38.23618
    example_dag | print_date36 |                        | success |    10 | 13.293374 |   14.0982102 | 14.837228 |   140.982102 | 00:08:54.508936 | 00:04:27.575895
    example_dag | print_date37 |                        | success |    10 | 13.657723 |   13.9828665 | 14.399264 |   139.828665 | 00:08:44.443426 | 00:04:24.720492
    example_dag | print_date38 |                        | success |    10 | 12.415427 |   13.4307027 | 14.742275 |   134.307027 | 00:09:11.21312  | 00:04:07.207969
    example_dag | print_date39 |                        | success |    10 | 12.977849 |   13.9103038 | 14.494109 |   139.103038 | 00:10:45.931352 | 00:06:18.113138
    example_dag | print_date4  |                        | success |    10 | 13.027779 |   13.9621166 | 14.589971 |   139.621166 | 00:01:59.062323 | 00:09:15.243934
    example_dag | print_date40 |                        | success |    10 | 12.899318 |   13.6532077 |   14.3119 |   136.532077 | 00:05:36.128265 | 00:07:45.713126
    example_dag | print_date5  |                        | success |    10 | 13.364638 |   13.9671966 | 14.606845 |   139.671966 | 00:01:57.868484 | 00:09:15.995714
    example_dag | print_date6  |                        | success |    10 | 13.454817 |   14.1199325 | 14.922081 |   141.199325 | 00:01:50.984789 | 00:09:18.331126
    example_dag | print_date7  |                        | success |    10 | 13.316182 |   14.0112888 | 14.782732 |   140.112888 | 00:01:49.556653 | 00:09:20.466958
    example_dag | print_date8  |                        | success |    10 | 10.542196 |   13.6500996 | 14.425894 |   136.500996 | 00:00:46.269235 | 00:00:15.901673
    example_dag | print_date9  |                        | success |    10 |  3.733766 |   12.7930044 | 14.344923 |   127.930044 | 00:01:36.054977 | 00:09:49.170081
    example_dag |              | 2018-01-01 00:00:00+00 | success |    40 | 12.695903 |  14.51603295 | 20.041363 |   580.641318 | 00:13:04.943902 | 00:04:04.183146
    example_dag |              | 2018-01-01 00:05:00+00 | success |    40 | 11.072175 | 14.490386525 | 19.999951 |   579.615461 | 00:13:04.031953 | 00:04:01.462174
    example_dag |              | 2018-01-01 00:10:00+00 | success |    40 | 10.542196 |  14.67187005 | 22.927918 |   586.874802 | 00:12:53.483644 | 00:04:01.159616
    example_dag |              | 2018-01-01 00:15:00+00 | success |    40 | 11.696784 |   14.1309867 | 18.610719 |   565.239468 | 00:13:09.216955 | 00:04:06.713355
    example_dag |              | 2018-01-01 00:20:00+00 | success |    40 | 12.838915 |  13.97867895 | 19.082492 |   559.147158 | 00:12:32.573585 | 00:06:25.172815
    example_dag |              | 2018-01-01 00:25:00+00 | success |    40 |  3.733766 |   13.2246647 | 14.763435 |   528.986588 | 00:13:11.267078 | 00:07:08.041724
    example_dag |              | 2018-01-01 00:30:00+00 | success |    40 | 12.786672 |  13.91106905 | 14.638537 |   556.442762 | 00:12:38.990129 | 00:06:58.554466
    example_dag |              | 2018-01-01 00:35:00+00 | success |    40 | 12.415427 |  13.84971725 | 14.589971 |    553.98869 | 00:11:49.876577 | 00:06:52.566075
    example_dag |              | 2018-01-01 00:40:00+00 | success |    40 | 12.685846 |  13.92356865 | 14.829447 |   556.942746 | 00:12:01.941327 | 00:08:30.816995
    example_dag |              | 2018-01-01 00:45:00+00 | success |    40 | 13.027779 | 13.931024725 | 14.770308 |   557.240989 | 00:12:05.666341 | 00:08:49.990763
    example_dag |              |                        | success |   400 |  3.733766 | 14.062799955 | 22.927918 |  5625.119982 | 00:13:13.096719 | 00:06:05.866113
   ```
   
   **With change, 40 dag runs**
   ```
   $ psql airflow -c "select dag_id, task_id, execution_date, state, count(*), min(duration) as min_dur, avg(duration) as avg_dur, max(duration) as max_dur, sum(duration) as running_time, max(end_date)-min(start_date) as total_duration, avg(start_date - queued_dttm) as queue_delay from task_instance where state is not null group by grouping sets ((dag_id, task_id, state), (dag_id,execution_date, state), (dag_id, state)) order by 1,2,3"
   
      dag_id    |   task_id    |     execution_date     |  state  | count | min_dur  |   avg_dur    | max_dur  | running_time | total_duration  |   queue_delay
   -------------+--------------+------------------------+---------+-------+----------+--------------+----------+--------------+-----------------+-----------------
    example_dag | print_date1  |                        | success |    10 | 1.146504 |    1.4268349 | 1.694726 |    14.268349 | 00:00:02.171718 | 00:00:13.903464
    example_dag | print_date10 |                        | success |    10 | 0.593307 |    0.8229632 | 1.042284 |     8.229632 | 00:00:41.72518  | 00:06:17.844413
    example_dag | print_date11 |                        | success |    10 |  0.62847 |    0.8636334 | 1.173789 |     8.636334 | 00:00:40.747711 | 00:06:18.975095
    example_dag | print_date12 |                        | success |    10 | 0.396318 |    0.8446297 | 1.207533 |     8.446297 | 00:00:38.101728 | 00:06:20.102727
    example_dag | print_date13 |                        | success |    10 | 0.845284 |    1.0616951 | 1.417735 |    10.616951 | 00:03:26.884824 | 00:04:13.506178
    example_dag | print_date14 |                        | success |    10 | 0.710254 |    0.9734616 | 1.304478 |     9.734616 | 00:03:30.503875 | 00:04:08.587778
    example_dag | print_date15 |                        | success |    10 | 0.729193 |    0.9978077 | 1.659975 |     9.978077 | 00:03:34.779666 | 00:04:02.680916
    example_dag | print_date16 |                        | success |    10 | 0.736481 |    1.1479496 | 1.660697 |    11.479496 | 00:03:38.17759  | 00:03:58.284443
    example_dag | print_date17 |                        | success |    10 | 0.721791 |     1.089326 | 1.865752 |     10.89326 | 00:03:43.899201 | 00:03:52.045602
    example_dag | print_date18 |                        | success |    10 |  0.82254 |    1.0861933 | 1.602212 |    10.861933 | 00:03:46.563687 | 00:03:45.934179
    example_dag | print_date19 |                        | success |    10 | 0.694369 |    0.9150857 | 1.185493 |     9.150857 | 00:03:52.568222 | 00:04:02.353
    example_dag | print_date2  |                        | success |    10 | 0.806452 |    1.0095171 | 1.230248 |    10.095171 | 00:00:04.82439  | 00:00:20.988461
    example_dag | print_date20 |                        | success |    10 | 0.743303 |    0.8742865 | 0.983655 |     8.742865 | 00:03:53.99485  | 00:03:58.726793
    example_dag | print_date21 |                        | success |    10 | 0.684588 |    0.8367918 | 1.023203 |     8.367918 | 00:03:55.738761 | 00:03:54.74774
    example_dag | print_date22 |                        | success |    10 | 0.589924 |    0.8825712 | 1.029147 |     8.825712 | 00:03:58.548224 | 00:03:50.84068
    example_dag | print_date23 |                        | success |    10 | 0.767636 |    0.8957238 | 0.986097 |     8.957238 | 00:04:04.217391 | 00:03:46.144873
    example_dag | print_date24 |                        | success |    10 |  0.74223 |    0.9251518 | 1.239134 |     9.251518 | 00:04:05.765597 | 00:03:42.429237
    example_dag | print_date25 |                        | success |    10 | 0.594681 |    0.8781399 | 1.121106 |     8.781399 | 00:04:05.490946 | 00:03:39.28868
    example_dag | print_date26 |                        | success |    10 | 0.663151 |    0.8689899 | 1.052992 |     8.689899 | 00:04:09.399829 | 00:03:34.559888
    example_dag | print_date27 |                        | success |    10 | 0.531863 |    0.8011121 | 1.065627 |     8.011121 | 00:04:13.508844 | 00:03:28.977669
    example_dag | print_date28 |                        | success |    10 | 0.623941 |    0.8106821 | 1.027548 |     8.106821 | 00:04:14.959353 | 00:03:25.053474
    example_dag | print_date29 |                        | success |    10 | 0.744605 |    0.8441154 | 0.964904 |     8.441154 | 00:04:11.377597 | 00:03:22.705902
    example_dag | print_date3  |                        | success |    10 | 0.442053 |    0.9375549 | 1.567445 |     9.375549 | 00:00:15.142371 | 00:00:15.495181
    example_dag | print_date30 |                        | success |    10 | 0.675152 |    0.8852712 | 1.107291 |     8.852712 | 00:04:14.102078 | 00:03:18.933734
    example_dag | print_date31 |                        | success |    10 | 0.519751 |     0.843545 | 1.042148 |      8.43545 | 00:04:24.862378 | 00:03:12.841289
    example_dag | print_date32 |                        | success |    10 | 0.483924 |    0.8193908 | 0.998283 |     8.193908 | 00:04:25.487382 | 00:03:08.866348
    example_dag | print_date33 |                        | success |    10 | 0.562085 |    0.9288077 | 1.227881 |     9.288077 | 00:04:23.784358 | 00:03:06.368832
    example_dag | print_date34 |                        | success |    10 | 0.744678 |    1.0033072 | 1.408322 |    10.033072 | 00:04:44.668717 | 00:02:55.888481
    example_dag | print_date35 |                        | success |    10 | 0.652233 |    0.9915692 | 1.152128 |     9.915692 | 00:05:01.192531 | 00:02:48.305163
    example_dag | print_date36 |                        | success |    10 | 0.553713 |     0.933974 | 1.250523 |      9.33974 | 00:05:02.069341 | 00:02:40.600184
    example_dag | print_date37 |                        | success |    10 | 0.688596 |    1.0893281 | 1.532913 |    10.893281 | 00:05:20.416081 | 00:02:29.20499
    example_dag | print_date38 |                        | success |    10 | 0.565303 |    0.9242697 | 1.503941 |     9.242697 | 00:06:15.347112 | 00:04:09.462437
    example_dag | print_date39 |                        | success |    10 | 0.558524 |    0.8705852 | 1.049251 |     8.705852 | 00:03:20.690685 | 00:04:42.223374
    example_dag | print_date4  |                        | success |    10 | 0.581263 |     0.895164 | 1.130376 |      8.95164 | 00:00:41.89009  | 00:05:49.37534
    example_dag | print_date40 |                        | success |    10 |  0.65816 |    0.9504808 | 1.267472 |     9.504808 | 00:03:23.057506 | 00:04:39.000299
    example_dag | print_date5  |                        | success |    10 | 0.615587 |    0.8474517 | 1.051941 |     8.474517 | 00:00:42.64173  | 00:05:50.834835
    example_dag | print_date6  |                        | success |    10 | 0.675339 |    0.9569241 | 1.506232 |     9.569241 | 00:00:42.380933 | 00:05:51.989674
    example_dag | print_date7  |                        | success |    10 |   0.7404 |    0.9408764 | 1.439553 |     9.408764 | 00:00:41.822069 | 00:05:53.02817
    example_dag | print_date8  |                        | success |    10 | 0.514494 |    0.8535814 |  1.28338 |     8.535814 | 00:00:16.014045 | 00:00:12.423858
    example_dag | print_date9  |                        | success |    10 | 0.354399 |    0.7450411 | 0.949153 |     7.450411 | 00:00:35.280334 | 00:06:20.324074
    example_dag |              | 2018-01-01 00:00:00+00 | success |    40 | 0.354399 |  0.903148125 | 1.621628 |    36.125925 | 00:07:43.652252 | 00:02:38.275005
    example_dag |              | 2018-01-01 00:05:00+00 | success |    40 | 0.627125 |  0.967237575 | 1.532913 |    38.689503 | 00:07:31.482163 | 00:02:36.899465
    example_dag |              | 2018-01-01 00:10:00+00 | success |    40 | 0.442053 |   0.94155605 |  1.52581 |    37.662242 | 00:07:09.337594 | 00:02:37.102932
    example_dag |              | 2018-01-01 00:15:00+00 | success |    40 | 0.520419 |  0.896940775 | 1.452449 |    35.877631 | 00:07:34.200548 | 00:02:41.155998
    example_dag |              | 2018-01-01 00:20:00+00 | success |    40 |  0.60063 |  0.878752225 | 1.230248 |    35.150089 | 00:07:14.674988 | 00:03:55.471032
    example_dag |              | 2018-01-01 00:25:00+00 | success |    40 | 0.554367 |  0.981219525 | 1.621297 |    39.248781 | 00:07:12.30197  | 00:04:14.648725
    example_dag |              | 2018-01-01 00:30:00+00 | success |    40 | 0.558524 |    0.9800735 | 1.660697 |     39.20294 | 00:07:18.737654 | 00:04:14.88136
    example_dag |              | 2018-01-01 00:35:00+00 | success |    40 | 0.483924 |    0.9204226 | 1.603143 |    36.816904 | 00:07:25.533756 | 00:04:15.256257
    example_dag |              | 2018-01-01 00:40:00+00 | success |    40 | 0.664791 |     0.977641 | 1.865752 |     39.10564 | 00:07:39.449438 | 00:05:15.616032
    example_dag |              | 2018-01-01 00:45:00+00 | success |    40 | 0.515129 |    0.8714547 | 1.282473 |    34.858188 | 00:07:40.570824 | 00:05:26.655057
    example_dag |              |                        | success |   400 | 0.354399 | 0.9318446075 | 1.865752 |   372.737843 | 00:07:44.01714  | 00:03:47.596186
   (51 rows)
   ```
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services