Posted to commits@airflow.apache.org by "Aakcht (via GitHub)" <gi...@apache.org> on 2023/02/01 08:52:05 UTC

[GitHub] [airflow] Aakcht opened a new issue, #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator

Aakcht opened a new issue, #29282:
URL: https://github.com/apache/airflow/issues/29282

   ### Apache Airflow Provider(s)
   
   ssh
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-ssh>=3.3.0
   
   ### Apache Airflow version
   
   2.5.0
   
   ### Operating System
   
   debian "11 (bullseye)"
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   I have an SSH operator task whose command can take a long time. In recent SSH provider versions (>=3.3.0) it stopped working, which I suspect is because of #27184. After this change the timeout appears to be 10 seconds: once no output has been provided through SSH for 10 seconds, I get the following error:
   
   ```
   [2023-01-26, 11:49:57 UTC] {taskinstance.py:1772} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/ssh/operators/ssh.py", line 171, in execute
       result = self.run_ssh_client_command(ssh_client, self.command, context=context)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/ssh/operators/ssh.py", line 156, in run_ssh_client_command
       exit_status, agg_stdout, agg_stderr = self.ssh_hook.exec_ssh_client_command(
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/ssh/hooks/ssh.py", line 521, in exec_ssh_client_command
       raise AirflowException("SSH command timed out")
   airflow.exceptions.AirflowException: SSH command timed out
   ```
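The failure mode described above, where a command is aborted once it has been silent for the timeout period, can be illustrated with a minimal standard-library sketch. This is not the provider's code: `run_with_idle_timeout` is a hypothetical helper, it runs a local subprocess instead of an SSH channel, and it assumes a POSIX system where `select` works on pipes.

```python
import os
import select
import subprocess


def run_with_idle_timeout(argv, idle_timeout):
    """Run argv and return (exit_code, output_bytes).

    Raises TimeoutError if the process writes nothing for longer than
    idle_timeout seconds. idle_timeout=None waits indefinitely.
    """
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    fd = proc.stdout.fileno()
    chunks = []
    try:
        while True:
            # Block until output is readable or the idle timeout expires.
            ready, _, _ = select.select([fd], [], [], idle_timeout)
            if not ready:
                raise TimeoutError("command timed out")
            data = os.read(fd, 4096)
            if not data:  # EOF: the process closed its output
                break
            chunks.append(data)
    except TimeoutError:
        proc.kill()
        raise
    finally:
        proc.stdout.close()
        proc.wait()
    return proc.returncode, b"".join(chunks)
```

With `idle_timeout=10`, a command such as `sleep 15` would raise `TimeoutError` in this sketch, while `idle_timeout=None` would let it finish.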
   
   
   At first I thought that this was OK, since I could just set the `conn_timeout` extra parameter in my SSH connection. But then I noticed that this parameter from the connection is not used anywhere, so this doesn't work, and you have to modify your task code to set the needed value of this parameter in the SSH operator. What's more, even with modified task code it's not possible to restore the previous behavior (when this parameter was not set), since it is now set to 10 when you pass None as the value.
   
   ### What you think should happen instead
   
   I think it should be possible to pass the timeout parameter through the connection extra field for the SSH operator (including a None value, meaning no timeout).
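One way to support both "not specified" and an explicit None is a sentinel default. The sketch below is only an illustration of that idea, not the provider's implementation: the `_NOTSET` sentinel, the `resolve_cmd_timeout` helper, and the `cmd_timeout` extra key are all hypothetical names, and the default of 10 is taken from the behavior described in this issue.

```python
import json

_NOTSET = object()  # hypothetical sentinel meaning "the caller did not specify a value"
DEFAULT_CMD_TIMEOUT = 10  # default observed in this issue


def resolve_cmd_timeout(operator_value=_NOTSET, extra_json=None):
    """Pick a timeout: operator argument first, then connection extra, then the default.

    A value of None at either level means "no timeout" and is returned as-is.
    """
    if operator_value is not _NOTSET:
        return operator_value
    extra = json.loads(extra_json) if extra_json else {}
    if "cmd_timeout" in extra:  # hypothetical extra key
        return extra["cmd_timeout"]  # JSON null is parsed as None
    return DEFAULT_CMD_TIMEOUT
```

Under these assumptions, a connection extra of `{"cmd_timeout": null}` yields None (no timeout), a missing key yields the default, and an explicit operator argument, including None, takes precedence over the connection.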
   
   ### How to reproduce
   
   Add a simple DAG that sleeps for more than 10 seconds, for example:
   
   ```python
   
   #  this DAG only works for SSH provider versions <=3.2.0 
   
   from airflow.models import DAG
   from airflow.contrib.operators.ssh_operator import SSHOperator
   from airflow.utils.dates import days_ago
   
   args = {
       'owner': 'airflow',
       'start_date': days_ago(2),
   }
   
   dag = DAG(
       default_args=args,
       dag_id="test_ssh",
       max_active_runs=1,
       catchup=False,
       schedule_interval="@hourly"
   )
   task0 = SSHOperator(ssh_conn_id='ssh_localhost',
                       task_id="test_sleep",
                       command='sleep 15s',
                       dag=dag)
   
   task0
   ```
   
   Try configuring the `ssh_localhost` connection to make the DAG work using the extra `conn_timeout`, extra `timeout`, or other parameters.
   
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] Aakcht commented on issue #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator

Posted by "Aakcht (via GitHub)" <gi...@apache.org>.
Aakcht commented on issue #29282:
URL: https://github.com/apache/airflow/issues/29282#issuecomment-1473973114

   > Btw, prior to my fix, timeouts were essentially not working at all, i.e. the same as "infinity".
   
   Yeah, that's why it affected me: my DAGs expected an infinite timeout, and suddenly it became 10 seconds, and I wasn't able to restore the previous behavior through connection parameters. So please do not remove the possibility to set "infinity" through connection parameters in your new PR 😅




[GitHub] [airflow] punx120 commented on issue #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator

Posted by "punx120 (via GitHub)" <gi...@apache.org>.
punx120 commented on issue #29282:
URL: https://github.com/apache/airflow/issues/29282#issuecomment-1473963126

   Well, `conn_timeout` makes sense to be on `SSHHook`, but the command timeout is different, and it makes sense to have a different timeout per operator. I'll try to look at a PR over the weekend.
   
   Btw, prior to my fix, timeouts were essentially not working at all, i.e. the same as "infinity".




[GitHub] [airflow] Aakcht commented on issue #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator

Posted by "Aakcht (via GitHub)" <gi...@apache.org>.
Aakcht commented on issue #29282:
URL: https://github.com/apache/airflow/issues/29282#issuecomment-1473954726

   @punx120 Interesting use case, I didn't think of it. However, as far as I can see, other parameters also work the same way (`ssh_conn_id`/`conn_timeout`/`banner_timeout`), so I'm not sure whether it's correct or widely supported. I think the maintainers should have more insight into this, so you could try submitting another issue/discussion (or PR) for this case to get the discussion going with the maintainers.
   
   Yeah, and as for `conn_timeout` I guessed as much, so in my PR I left it as it is.




[GitHub] [airflow] Taragolis closed issue #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis closed issue #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator
URL: https://github.com/apache/airflow/issues/29282




[GitHub] [airflow] Taragolis commented on issue #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator

Posted by "Taragolis (via GitHub)" <gi...@apache.org>.
Taragolis commented on issue #29282:
URL: https://github.com/apache/airflow/issues/29282#issuecomment-1411765848

   > ```from airflow.contrib.operators.ssh_operator import SSHOperator```
   
   BTW `airflow.contrib` has been deprecated since Airflow 2.0: use `from airflow.providers.ssh.operators.ssh import SSHOperator`
   




[GitHub] [airflow] punx120 commented on issue #29282: Ssh connection extra parameter conn_timeout doesn't work with ssh operator

Posted by "punx120 (via GitHub)" <gi...@apache.org>.
punx120 commented on issue #29282:
URL: https://github.com/apache/airflow/issues/29282#issuecomment-1473863561

   I wrote the PR you refer to. One issue I have now is that the timeout cannot be specified per SSH Operator, but only per Hook.
   
   In my DAG, I usually create the hook once and then use it in different operators:
   ```
   hook = SSHHook(...)
   op1 = SSHOperator(hook, cmd_timeout=20)
   op2 = SSHOperator(hook, cmd_timeout=60)
   ```
   I think there's value in having `cmd_timeout` per operator. You also mention `conn_timeout`; I'm not sure whether it's used or not, but judging from the name, it should only be used to establish the connection.
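The distinction between a connection timeout and a command timeout can be shown with plain sockets. This is a rough sketch only: `connect_then_read` is a hypothetical helper, and a raw TCP read stands in for waiting on command output over SSH.

```python
import socket


def connect_then_read(address, conn_timeout, cmd_timeout):
    """Connect using conn_timeout, then wait for data using cmd_timeout.

    conn_timeout bounds only the connection attempt. cmd_timeout bounds the
    wait for data afterwards; None means wait indefinitely.
    """
    sock = socket.create_connection(address, timeout=conn_timeout)
    try:
        # From here on the connection timeout no longer applies.
        sock.settimeout(cmd_timeout)
        return sock.recv(1024)
    finally:
        sock.close()
```

In this sketch a short `conn_timeout` combined with `cmd_timeout=None` fails fast on an unreachable host but never interrupts a slow command, which is the separation the two parameter names suggest.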
   
   
   

