Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/01/25 09:14:11 UTC
[GitHub] [airflow] alittlesliceoftom opened a new issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
alittlesliceoftom opened a new issue #13888:
URL: https://github.com/apache/airflow/issues/13888
The Kubernetes pod monitor runs every second (plus overhead), as referenced below:
https://github.com/apache/airflow/blob/31b956c6c22476d109c45c99d8a325c5c1e0fd45/airflow/kubernetes/pod_launcher.py#L137
While this is fine in many situations, any slow, long-running job ends up with excessive logs. It would be useful to be able to pass an argument that sets the log refresh frequency.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] boring-cyborg[bot] commented on issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #13888:
URL: https://github.com/apache/airflow/issues/13888#issuecomment-766670553
Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] alittlesliceoftom edited a comment on issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
Posted by GitBox <gi...@apache.org>.
alittlesliceoftom edited a comment on issue #13888:
URL: https://github.com/apache/airflow/issues/13888#issuecomment-891757648
Hey @kdunee, here is a very crude workaround for #14259: I replace the command I run with one that also echoes a heartbeat every 30 seconds alongside the main task.
I prepend this boilerplate to my Kubernetes pod calls:
`cmd_boilerplate = "while true; do echo heartbeat; sleep 30; done &"`
Then my KubernetesPodOperator calls look like:
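```
# Assumed import (the original snippet omits it); requires a provider
# release that still ships the `kubernetes_pod` module.
from airflow.providers.cncf.kubernetes.operators import kubernetes_pod

kubernetes_pod.KubernetesPodOperator(
    name="name",
    # The ID specified for the task.
    task_id="task_id",
    # `cmds` takes a list of strings (the container entrypoint).
    cmds=["bash"],
    arguments=[
        "-c",
        # The heartbeat loop runs in the background; the main command follows.
        f"{cmd_boilerplate} do your thing here",
    ],
    get_logs=True,
    # Presumably a dict of shared keyword arguments defined elsewhere.
    **default_kube_args,
)
```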
Publishing a bad answer to the internet in hopes of getting a better one :D
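The heartbeat wrapping described above could also be factored into a small helper. This is only a sketch: `with_heartbeat` and its `interval_seconds` parameter are hypothetical names, not part of Airflow or of the original comment.

```python
from typing import List


def with_heartbeat(command: str, interval_seconds: int = 30) -> List[str]:
    """Build `bash -c` arguments that run `command` next to a heartbeat loop.

    Hypothetical helper: it only assembles strings and calls no Airflow API.
    """
    # The loop is sent to the background with `&`, so the main command
    # starts immediately after it in the same shell.
    heartbeat = f"while true; do echo heartbeat; sleep {interval_seconds}; done &"
    return ["-c", f"{heartbeat} {command}"]


if __name__ == "__main__":
    print(with_heartbeat("python main.py"))
```

The returned list would be passed as `arguments=` with `cmds=["bash"]`, as in the operator call above.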
[GitHub] [airflow] alittlesliceoftom commented on issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
Posted by GitBox <gi...@apache.org>.
alittlesliceoftom commented on issue #13888:
URL: https://github.com/apache/airflow/issues/13888#issuecomment-833490622
Hi both,
Thanks so much for the response. Reading your point, @jedcunningham, you're right: I don't get that log output.
I think I should actually have referenced https://github.com/apache/airflow/blob/31b956c6c22476d109c45c99d8a325c5c1e0fd45/airflow/kubernetes/pod_launcher.py#L156,
as this is what is called when `get_logs=False`.
I no longer have the issue, because I now always set `get_logs=True`.
That said, I think there may be benefit in making the sleep time configurable for the case where you are not fetching logs, if only to keep the Airflow logs a bit shorter. I was typically using `get_logs=False` as a hack in response to #14259. I now use a different hack so that I can still get the logs.
I think there is some benefit in making the 2s state update configurable, but it might be marginal. I suspect most users want to get logs in most situations.
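For illustration, a configurable state-polling loop might look roughly like the sketch below. It is not Airflow's implementation: `wait_while_running`, `is_running`, and `poll_interval` are hypothetical names, and the 2-second default simply mirrors the interval mentioned above.

```python
import time
from typing import Callable


def wait_while_running(
    is_running: Callable[[], bool],
    poll_interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    log: Callable[[str], None] = print,
) -> int:
    """Poll `is_running` until it returns False; return the number of polls.

    One log line is written per poll, so a longer `poll_interval` directly
    shortens the task log of a long-running pod.
    """
    polls = 0
    while is_running():
        log("Pod is still running")
        polls += 1
        sleep(poll_interval)
    return polls
```

With a 30-second interval, a one-hour job would write about 120 "still running" lines instead of about 1800 at 2 seconds.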
[GitHub] [airflow] jedcunningham commented on issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
Posted by GitBox <gi...@apache.org>.
jedcunningham commented on issue #13888:
URL: https://github.com/apache/airflow/issues/13888#issuecomment-813712563
@alittlesliceoftom, are you seeing a bunch of occurrences of this log line?
https://github.com/apache/airflow/blob/c73052fcb1d39dd8fa4012721f025c375f67f72c/airflow/providers/cncf/kubernetes/utils/pod_launcher.py#L141
We follow the log stream by passing `follow=True` to `read_namespaced_pod_log`, so unless the connection keeps getting reset, we shouldn't be hitting that sleep all that often.
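As a rough sketch of what following the stream means, the helper below calls `read_namespaced_pod_log` with `follow=True` on a `CoreV1Api`-like object. The name `stream_pod_logs` is hypothetical, and the `"base"` container name and the use of `_preload_content=False` to get an iterable streaming response are assumptions about the setup, not a copy of the launcher code.

```python
from typing import Any, Iterator


def stream_pod_logs(
    core_v1: Any, name: str, namespace: str, container: str = "base"
) -> Iterator[str]:
    """Yield log lines from a pod as they arrive.

    `core_v1` is expected to behave like `kubernetes.client.CoreV1Api`.
    """
    response = core_v1.read_namespaced_pod_log(
        name=name,
        namespace=namespace,
        container=container,
        follow=True,  # keep the connection open and stream new lines
        _preload_content=False,  # return the raw response instead of a string
    )
    # The raw response yields bytes, one chunk per log line.
    for raw_line in response:
        yield raw_line.decode("utf-8", errors="replace").rstrip("\r\n")
```

Because the loop blocks on the open connection, no sleep is needed between lines; a retry sleep only matters if the connection drops and the stream has to be reopened.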
[GitHub] [airflow] jhtimmins commented on issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
Posted by GitBox <gi...@apache.org>.
jhtimmins commented on issue #13888:
URL: https://github.com/apache/airflow/issues/13888#issuecomment-833228966
@alittlesliceoftom Following up on this. Are you still interested in working on this?
[GitHub] [airflow] kdunee commented on issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
Posted by GitBox <gi...@apache.org>.
kdunee commented on issue #13888:
URL: https://github.com/apache/airflow/issues/13888#issuecomment-899438075
Thanks @alittlesliceoftom! As bad as it is, I'm afraid it's the best answer on the internet and the only one that helped me ;)
[GitHub] [airflow] kdunee commented on issue #13888: Kubernetes Launchers Log Check in Time is Not Configurable - Excess Logs Slow Airflow
Posted by GitBox <gi...@apache.org>.
kdunee commented on issue #13888:
URL: https://github.com/apache/airflow/issues/13888#issuecomment-887386830
@alittlesliceoftom Hi, can you please share which hack you are using? I'm currently considering disabling `get_logs` to avoid https://github.com/apache/airflow/issues/14259 ;)
> I now do a different hack so I can get the logs.