Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2019/08/15 17:48:37 UTC

[GitHub] [airflow] darrenleeweber commented on a change in pull request #5825: [AIRFLOW-5218] less polling for AWS Batch status

darrenleeweber commented on a change in pull request #5825: [AIRFLOW-5218] less polling for AWS Batch status
URL: https://github.com/apache/airflow/pull/5825#discussion_r314423519
 
 

 ##########
 File path: airflow/contrib/operators/awsbatch_operator.py
 ##########
 @@ -105,6 +106,7 @@ def execute(self, context):
             self.jobId = response['jobId']
             self.jobName = response['jobName']
 
+            sleep(randint(10, 60))
 
 Review comment:
   - There is already a log, i.e. `self.log.info('AWS Batch Job started: %s', response)`
   - The use of a random interval decreases the chances of exceeding an AWS API throttle limit
     - The concern about it is reasonable. It is there as an extra measure to avoid throttle limits when hundreds or thousands of concurrent tasks all try to start batch jobs at about the same time. A relatively wide window was selected in the hope that it staggers the API calls randomly across about a minute, assuming the API throttle limits are assessed per second. The initial delay of at least 10 seconds gives the batch job a bit of time to spin up; maybe it could be longer, but there it is.
   - This call to the random sleep should move into the wait method (in a follow-up amended commit)
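
   A minimal sketch of what moving the random sleep into the wait step could look like. The names `jittered_delay` and `wait_with_jitter`, and the `poll` and `sleeper` parameters, are hypothetical and not part of the operator in this pull request. Only the 10 to 60 second window comes from the diff above.

```python
from random import randint
from time import sleep


def jittered_delay(min_seconds=10, max_seconds=60):
    """Return a random whole-second delay in [min_seconds, max_seconds].

    The defaults mirror the randint(10, 60) window in the diff; the
    function itself is a hypothetical helper for illustration.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    return randint(min_seconds, max_seconds)


def wait_with_jitter(poll, min_seconds=10, max_seconds=60, sleeper=sleep):
    """Sleep for a random interval, then call poll() once and return its result.

    `poll` stands in for whatever status check the wait method performs
    (an assumption); `sleeper` is injectable so the delay can be
    observed in tests without actually sleeping.
    """
    sleeper(jittered_delay(min_seconds, max_seconds))
    return poll()
```

   Keeping the delay inside the wait step means many tasks that start at the same moment each make their first status call at a different second within the window, rather than all at once.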
