Posted to issues@airavata.apache.org by "Eroma (JIRA)" <ji...@apache.org> on 2018/06/13 20:30:00 UTC

[jira] [Updated] (AIRAVATA-2826) Helix participant server was stopped and started while experiments are launched and job submissions to Jetstream cluster failed

     [ https://issues.apache.org/jira/browse/AIRAVATA-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eroma updated AIRAVATA-2826:
----------------------------
    Environment: https://staging.seagrid.org/

> Helix participant server was stopped and started while experiments are launched and job submissions to Jetstream cluster failed
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRAVATA-2826
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-2826
>             Project: Airavata
>          Issue Type: Bug
>          Components: helix implementation
>    Affects Versions: 0.18
>         Environment: https://staging.seagrid.org/
>            Reporter: Eroma
>            Assignee: Dimuthu Upeksha
>            Priority: Major
>             Fix For: 0.18
>
>
> # Experiments were launched while the Helix participant was being stopped and restarted.
>  # Once the Helix participant was started again, jobs failed, particularly those submitted to Jetstream.
>  # Job submission failed because environment setup on Jetstream failed with error [1].
> [1]
> org.apache.airavata.helix.impl.task.TaskOnFailException: Error Code : 658d46e9-b08b-46c0-9701-4bf5eeb23134, Task TASK_f4e3eccf-3e03-4d34-9cf0-7028efd09a40 failed due to Failed to setup environment of task TASK_f4e3eccf-3e03-4d34-9cf0-7028efd09a40, net.schmizz.sshj.connection.ConnectionException: [CONNECTION_LOST] Did not receive any keep-alive response for 25 seconds
>     at org.apache.airavata.helix.impl.task.AiravataTask.onFail(AiravataTask.java:102)
>     at org.apache.airavata.helix.impl.task.env.EnvSetupTask.onRun(EnvSetupTask.java:55)
>     at org.apache.airavata.helix.impl.task.AiravataTask.onRun(AiravataTask.java:311)
>     at org.apache.airavata.helix.core.AbstractTask.run(AbstractTask.java:90)
>     at org.apache.helix.task.TaskRunner.run(TaskRunner.java:71)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.airavata.agents.api.AgentException: net.schmizz.sshj.connection.ConnectionException: [CONNECTION_LOST] Did not receive any keep-alive response for 25 seconds
>     at org.apache.airavata.helix.adaptor.SSHJAgentAdaptor.createDirectory(SSHJAgentAdaptor.java:146)
>     at org.apache.airavata.helix.impl.task.env.EnvSetupTask.onRun(EnvSetupTask.java:51)
>     ... 10 more
> Caused by: net.schmizz.sshj.connection.ConnectionException: [CONNECTION_LOST] Did not receive any keep-alive response for 25 seconds
>     at net.schmizz.keepalive.KeepAliveRunner.checkMaxReached(KeepAliveRunner.java:64)
>     at net.schmizz.keepalive.KeepAliveRunner.doKeepAlive(KeepAliveRunner.java:56)
>     at net.schmizz.keepalive.KeepAlive.run(KeepAlive.java:63)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)