Posted to issues@airavata.apache.org by "Eroma (Jira)" <ji...@apache.org> on 2020/03/26 18:45:00 UTC
[jira] [Updated] (AIRAVATA-2941) Experiments fail to submit jobs to
HPC cluster queues due to queue reaching the max job limit per user.
[ https://issues.apache.org/jira/browse/AIRAVATA-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eroma updated AIRAVATA-2941:
----------------------------
Labels: gsoc2020 (was: )
> Experiments fail to submit jobs to HPC cluster queues due to queue reaching the max job limit per user.
> -------------------------------------------------------------------------------------------------------
>
> Key: AIRAVATA-2941
> URL: https://issues.apache.org/jira/browse/AIRAVATA-2941
> Project: Airavata
> Issue Type: Bug
> Components: GFac, helix implementation
> Affects Versions: 0.18
> Environment: https://staging.ultrascan.scigap.org & https://ultrascan.scigap.org/
> Reporter: Eroma
> Assignee: Shameera
> Priority: Major
> Labels: gsoc2020
> Fix For: 0.18
>
>
> Currently, experiments fail in the following sequence:
> # The HPC queue reaches its maximum number of jobs per user.
> # The job submission fails and the HPC cluster returns a job submission response [1]; Airavata then tags the experiment as FAILED.
> # The only option for the gateway user is to submit the experiment again.
> The required fix is for Airavata to have internal queues, or some other way to hold such experiments until the HPC queue can accept jobs again, instead of marking the experiment as FAILED.
>
> [1]
> This example is from Stampede2:
> -----------------------------------------------------------------
> Welcome to the Stampede2 Supercomputer
> -----------------------------------------------------------------
> No reservation for this job
> --> Verifying valid submit host (login3)...OK
> --> Verifying valid jobname...OK
> --> Enforcing max jobs per user...FAILED
> [*] Too many simultaneous jobs in queue.
> --> Max job limits for us3 = 50 jobs
>
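The behaviour requested in the description above could be sketched roughly as follows. This is only an illustration in Python, not Airavata code: the `HoldQueue` class, the `HELD` state, and the `submit` callable are hypothetical names, and the two regular expressions are derived solely from the Stampede2 response quoted in [1], so other clusters would likely need different patterns.

```python
import re
from collections import deque
from enum import Enum

# Patterns derived from the Stampede2 response quoted in [1] (assumption:
# other clusters word this rejection differently).
QUEUE_LIMIT_PATTERN = re.compile(r"Too many simultaneous jobs in queue", re.IGNORECASE)
MAX_JOBS_PATTERN = re.compile(r"Max job limits for (\S+) = (\d+) jobs")


class State(Enum):
    SUBMITTED = "SUBMITTED"
    HELD = "HELD"      # hypothetical state: waiting for the HPC queue to free up
    FAILED = "FAILED"


def is_queue_limit_rejection(response: str) -> bool:
    """True if the submission response reports the per-user job limit."""
    return QUEUE_LIMIT_PATTERN.search(response) is not None


def parse_max_jobs(response: str):
    """Return (user, limit) from the response, or None if not present."""
    match = MAX_JOBS_PATTERN.search(response)
    return (match.group(1), int(match.group(2))) if match else None


class HoldQueue:
    """Holds experiments rejected for queue limits instead of failing them."""

    def __init__(self, submit):
        # submit: hypothetical callable(experiment_id) -> (ok, response_text)
        self._submit = submit
        self._held = deque()
        self.states = {}

    def submit(self, experiment_id):
        ok, response = self._submit(experiment_id)
        if ok:
            self.states[experiment_id] = State.SUBMITTED
        elif is_queue_limit_rejection(response):
            self._held.append(experiment_id)
            self.states[experiment_id] = State.HELD
        else:
            # Any other rejection is still treated as a real failure.
            self.states[experiment_id] = State.FAILED
        return self.states[experiment_id]

    def retry_held(self):
        """Retry held experiments in FIFO order; stop once the queue is full again."""
        while self._held:
            experiment_id = self._held[0]
            ok, response = self._submit(experiment_id)
            if ok:
                self._held.popleft()
                self.states[experiment_id] = State.SUBMITTED
            elif is_queue_limit_rejection(response):
                break  # still at the limit; keep remaining experiments held
            else:
                self._held.popleft()
                self.states[experiment_id] = State.FAILED
```

In this sketch a periodic task would call `retry_held()`; only the queue-limit rejection leads to `HELD`, so genuine submission errors still surface as `FAILED`.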
--
This message was sent by Atlassian Jira
(v8.3.4#803005)