Posted to commits@oodt.apache.org by "Cameron Goodale (Updated) (JIRA)" <ji...@apache.org> on 2012/02/24 06:14:48 UTC

[jira] [Updated] (OODT-383) Workflow Manager Client - Add Connection Limit Option

     [ https://issues.apache.org/jira/browse/OODT-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cameron Goodale updated OODT-383:
---------------------------------

    Attachment: modscag-v2-job-runner.py

Example of a Python script that submits jobs to the workflow manager and checks the number of ESTABLISHED connections to the workflow manager running on localhost:9001.
                
> Workflow Manager Client - Add Connection Limit Option
> -----------------------------------------------------
>
>                 Key: OODT-383
>                 URL: https://issues.apache.org/jira/browse/OODT-383
>             Project: OODT
>          Issue Type: New Feature
>          Components: workflow manager
>         Environment: centOS 5/6
>            Reporter: Cameron Goodale
>            Assignee: Cameron Goodale
>            Priority: Minor
>         Attachments: modscag-v2-job-runner.py
>
>
> When using the wmgr-client to run thousands of jobs, it is easy to overwhelm the XML-RPC connection pool to the workflow manager.  I was using a simple Python script to submit 10K jobs; the workflow manager couldn't handle them quickly enough, and many jobs were dropped as a result.
> One fix I implemented in my Python code was to use lsof to check the number of ESTABLISHED connections to the workflow manager.  If the workflow manager had more than, say, 30 connections, my program would go to sleep and try submitting jobs later.
> I would like to enhance the wmgr-client shell script with an option to limit the number of connections to the wmgr.  By default, this limit would not be set.
> If the connection limit is reached, the wmgr-client would sleep for 10 seconds and then re-check the number of connections.  This loop would continue until the number of connections dropped below the specified limit, at which point the wmgr-client would resume submitting jobs to the wmgr.
> On my production server I was using lsof to gather the number of connections to the wmgr.  I am not sure we can rely on lsof being installed on every machine, so we might need a more universal method (maybe in Java).
> Here is the lsof command I used, with some grep and wc sprinkled in:
> {{/usr/sbin/lsof -i :9001 | grep ESTABLISHED | wc}}
> This assumes you are running wmgr on localhost:9001 and that lsof is installed at /usr/sbin/lsof.
> Any other thoughts or ideas to work this out would be appreciated.
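
The check-and-sleep loop described in the issue could look roughly like the Python sketch below. It is not the attached modscag-v2-job-runner.py, whose contents are not shown here. The function names (count_established, current_connections, wait_for_capacity) are hypothetical and chosen for illustration only; the lsof path, the port, the 10-second delay, and the limit of 30 are the values quoted in the issue.

```python
import subprocess
import time

LSOF_PATH = "/usr/sbin/lsof"  # install location assumed in the issue
WMGR_PORT = 9001              # workflow manager port assumed in the issue


def count_established(lsof_output):
    """Count the lines of lsof output that report an ESTABLISHED connection."""
    return sum(1 for line in lsof_output.splitlines() if "ESTABLISHED" in line)


def current_connections(port=WMGR_PORT, lsof_path=LSOF_PATH):
    """Run 'lsof -i :PORT' and return the number of ESTABLISHED lines.

    Raises FileNotFoundError if lsof is not installed at lsof_path.
    """
    result = subprocess.run(
        [lsof_path, "-i", ":%d" % port],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    # No matching sockets simply yields empty output, i.e. a count of zero.
    return count_established(result.stdout)


def wait_for_capacity(limit, get_count=current_connections, delay=10,
                      sleep=time.sleep):
    """Block until the connection count is below limit.

    Returns how many times the caller had to sleep. get_count and sleep
    are parameters so the loop can be exercised without lsof or real delays.
    """
    waits = 0
    while get_count() >= limit:
        sleep(delay)
        waits += 1
    return waits
```

A job runner would call wait_for_capacity(30) before each submission to the wmgr. Passing the counter in as a parameter also leaves room to swap lsof for another method later, which is the portability concern raised in the issue.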
