Posted to issues@hbase.apache.org by "Kiran Kumar Maturi (Jira)" <ji...@apache.org> on 2022/04/20 06:10:00 UTC

[jira] [Assigned] (HBASE-18045) Add ' -o ConnectTimeout=10' to the ssh command we use in ITBLL chaos monkeys

     [ https://issues.apache.org/jira/browse/HBASE-18045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kiran Kumar Maturi reassigned HBASE-18045:
------------------------------------------

    Assignee: Kiran Kumar Maturi

> Add ' -o ConnectTimeout=10' to the ssh command we use in ITBLL chaos monkeys
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-18045
>                 URL: https://issues.apache.org/jira/browse/HBASE-18045
>             Project: HBase
>          Issue Type: Improvement
>          Components: integration tests
>            Reporter: Michael Stack
>            Assignee: Kiran Kumar Maturi
>            Priority: Trivial
>
> Monkeys hang on me in long-running tests. I've not spent too much time on it since it is rare enough, but I just went through a spate of them. When a monkey's kill ssh hangs, all killing stops, which can give a false sense of victory when you wake up in the morning and your job 'passed'. I also see monkeys kill all servers in a cluster and fail to bring them back, which causes the job to fail as no one is serving data. The latter may actually be another issue, but for the former I've had some success adding -o ConnectTimeout=10 as an option on ssh. You can do it easily enough via config, but this issue is to suggest that we add it in code.
> Here is how you add it via config if interested:
> <property>
>   <name>hbase.it.clustermanager.ssh.opts</name>
>   <value> -o ConnectTimeout=10 </value>
> </property>
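
As a rough illustration of what "adding it in code" could look like, here is a minimal Java sketch. It is not HBase's actual cluster manager code: the class SshOptions and its methods withConnectTimeout and buildCommand are hypothetical names invented for this example. The only items taken from the issue are the ssh option -o ConnectTimeout=10 and the idea that user-supplied options (such as the value of hbase.it.clustermanager.ssh.opts) should still be honored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT part of HBase: shows one way to default
// ConnectTimeout in code while respecting options supplied via config.
final class SshOptions {
  // Default taken from the issue text.
  static final String DEFAULT_CONNECT_TIMEOUT = "ConnectTimeout=10";

  private SshOptions() {}

  /**
   * Returns the configured ssh options with "-o ConnectTimeout=10" appended,
   * unless the configured options already mention ConnectTimeout.
   */
  static String withConnectTimeout(String configuredOpts) {
    String opts = configuredOpts == null ? "" : configuredOpts.trim();
    // ssh option keywords are case-insensitive, so compare in lower case.
    if (opts.toLowerCase(Locale.ROOT).contains("connecttimeout")) {
      return opts; // an explicit user setting wins
    }
    String timeout = "-o " + DEFAULT_CONNECT_TIMEOUT;
    return opts.isEmpty() ? timeout : opts + " " + timeout;
  }

  /**
   * Builds the argument list for running remoteCmd on host over ssh.
   * Assumption: options are split on whitespace, so quoted option values
   * containing spaces are not supported by this sketch.
   */
  static List<String> buildCommand(String configuredOpts, String host, String remoteCmd) {
    List<String> argv = new ArrayList<>();
    argv.add("ssh");
    argv.addAll(Arrays.asList(withConnectTimeout(configuredOpts).split("\\s+")));
    argv.add(host);
    argv.add(remoteCmd);
    return argv;
  }
}
```

With no options configured, buildCommand("", "host1", "uptime") yields the arguments ssh, -o, ConnectTimeout=10, host1, uptime. If the config already sets a timeout, for example " -o ConnectTimeout=30 ", that value is kept and no second timeout is added.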



--
This message was sent by Atlassian Jira
(v8.20.7#820007)