Posted to dev@oozie.apache.org by "WangMeng (JIRA)" <ji...@apache.org> on 2016/04/01 10:55:25 UTC
[jira] [Updated] (OOZIE-2495) change action status from
ErrorType.NON_TRANSIENT to TRANSIENT when SSH action occurs AUTH_FAILED
occasionally
[ https://issues.apache.org/jira/browse/OOZIE-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
WangMeng updated OOZIE-2495:
----------------------------
Flags: Patch
> change action status from ErrorType.NON_TRANSIENT to TRANSIENT when SSH action occurs AUTH_FAILED occasionally
> ---------------------------------------------------------------------------------------------------------------
>
> Key: OOZIE-2495
> URL: https://issues.apache.org/jira/browse/OOZIE-2495
> Project: Oozie
> Issue Type: Improvement
> Components: action
> Affects Versions: 4.2.0
> Reporter: WangMeng
>
> The SSH action occasionally fails with the following exception:
> AUTH_FAILED: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 user@XXX.XX.XX.XXX mkdir -p oozie-oozi/0000067-130808155814753-oozie-oozi-W/sshjob--ssh/ ] | EErrorStream: Warning: Permanently added (RSA) to the list of known hosts.
> When I run the same command by hand on the Oozie server host, it works.
> Besides incorrect SSH settings, the exception may also be caused by excessive load on the SSH client at connection time, network jitter, or other intermittent conditions.
> Once the connection fails, Oozie marks the action as ErrorType.NON_TRANSIENT, suspends it, and does not retry it, even though I have configured a retry count.
> When this occurs, I think changing the action status from ErrorType.NON_TRANSIENT to TRANSIENT would be better: the action could then retry automatically before being suspended, which would handle occasional connection errors.
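The proposed behaviour can be illustrated with a minimal, self-contained Java sketch. This is not Oozie's code and not the attached patch: the class SshRetrySketch, the methods classify and runWithRetries, and the local ErrorType enum are hypothetical names that only mirror the two error types mentioned in the description. The sketch assumes an attempt reports either success (null) or an error code such as AUTH_FAILED.

```java
import java.util.function.Supplier;

// Hypothetical sketch; none of these names are taken from the Oozie code base.
class SshRetrySketch {

    // Mirrors only the two error types named in the issue description.
    enum ErrorType { TRANSIENT, NON_TRANSIENT }

    // Assumption: AUTH_FAILED is treated as retryable, as the issue proposes.
    // Every other error code stays non-transient in this sketch.
    static ErrorType classify(String errorCode) {
        return "AUTH_FAILED".equals(errorCode)
                ? ErrorType.TRANSIENT
                : ErrorType.NON_TRANSIENT;
    }

    // Runs the attempt once, then retries up to maxRetries more times while
    // the error is classified as transient. Returns "OK" on success and
    // "SUSPENDED" when retries are exhausted or the error is non-transient.
    static String runWithRetries(Supplier<String> attempt, int maxRetries) {
        for (int retriesUsed = 0; ; retriesUsed++) {
            String errorCode = attempt.get(); // null means the command succeeded
            if (errorCode == null) {
                return "OK";
            }
            boolean retryable = classify(errorCode) == ErrorType.TRANSIENT;
            if (!retryable || retriesUsed >= maxRetries) {
                return "SUSPENDED";
            }
        }
    }

    public static void main(String[] args) {
        // Simulated SSH command that fails twice with AUTH_FAILED, then succeeds.
        int[] calls = {0};
        Supplier<String> flaky = () -> calls[0]++ < 2 ? "AUTH_FAILED" : null;
        String result = runWithRetries(flaky, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With AUTH_FAILED classified as non-transient, the same simulated command would be suspended after its first failure, which corresponds to the behaviour reported above.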
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)