Posted to commits@airflow.apache.org by "dstandish (via GitHub)" <gi...@apache.org> on 2023/02/27 17:54:52 UTC

[GitHub] [airflow] dstandish commented on pull request #28336: Fixed hanged KubernetesPodOperator

dstandish commented on PR #28336:
URL: https://github.com/apache/airflow/pull/28336#issuecomment-1446786308

   Hi, I dismissed my old review, so it's not blocking.
   
   I do have a suggestion, though I'm sorry if it's a bit late in the game.  And maybe it doesn't have to be done in this PR.
   
   The thing that stuck out to me when looking at this is that we do a kube API call (in `logs_available`) for every chunk in the log stream.  This seems like it could result in a lot of calls and, depending on how many such processes are running on the cluster, could cause problems.  Just a hunch, I guess.  To avoid this, perhaps you could run the `logs_available` check in a thread, have it run periodically (once every 30 seconds or so), and when it returns false, set a `stop` boolean on the consumer so that it knows to exit the loop.  This decouples the checking from the log stream, so that the number of checks does not increase in response to log volume.
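
   A rough sketch of what I mean, not the actual operator code.  The names `PeriodicCheck` and `consume_logs` are made up for illustration, `logs_available` is passed in as a plain callable standing in for the real kube API check, and I've used a `threading.Event` for the `stop` flag rather than a bare boolean:

```python
import threading
from typing import Callable, Iterable


class PeriodicCheck:
    """Hypothetical helper: call `check` every `interval` seconds in a
    background thread and set `stop` the first time it returns False."""

    def __init__(self, check: Callable[[], bool], interval: float = 30.0) -> None:
        self._check = check
        self._interval = interval
        self.stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait() returns False on timeout (time for another check)
        # and True as soon as `stop` is set, which ends the loop.
        while not self.stop.wait(self._interval):
            if not self._check():
                self.stop.set()

    def __enter__(self) -> "PeriodicCheck":
        self._thread.start()
        return self

    def __exit__(self, *exc_info) -> None:
        # Wake the checker thread so it exits promptly, then wait for it.
        self.stop.set()
        self._thread.join()


def consume_logs(
    chunks: Iterable[bytes],
    logs_available: Callable[[], bool],
    handle: Callable[[bytes], None],
    interval: float = 30.0,
) -> int:
    """Consume the log stream until it ends or the checker says stop.

    Returns the number of chunks handled. The API check runs on its own
    schedule, so its call rate does not depend on log volume.
    """
    handled = 0
    with PeriodicCheck(logs_available, interval) as checker:
        for chunk in chunks:
            if checker.stop.is_set():
                break
            handle(chunk)
            handled += 1
    return handled
```

   One caveat with this sketch: the flag is only looked at between chunks, so if the stream itself blocks waiting for the next chunk, the consumer would still need a read timeout (or the checker would need to close the stream) in order to exit.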


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org