Posted to infrastructure-issues@apache.org by "Robbie Gemmell (JIRA)" <ji...@apache.org> on 2013/05/25 17:41:23 UTC

[jira] [Commented] (INFRA-6177) Many builds encountering starvation on Jenkins

    [ https://issues.apache.org/jira/browse/INFRA-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667113#comment-13667113 ] 

Robbie Gemmell commented on INFRA-6177:
---------------------------------------

I noticed this issue while searching for something, and thought I should comment on the misconception that the https://builds.apache.org/job/Qpid-Java-Java-MMS-TestMatrix/ job is hogging resources. While the job can indeed take a very long time in total to complete, it doesn't use an executor for anywhere near that time. It was specifically set up in such a way as to *not* hog resources, even though this means it takes longer to complete than it otherwise would.

It is a matrix job which runs four test combinations one after the other, where each takes 15-45mins depending on the particular test and where it actually runs (they usually take 15-25mins on a machine that isn't overloaded at the time). The matrix itself does not take up an executor spot. Once it starts, the tests it runs get scheduled separately (meaning they have to wait further until *they* hit the front of the queue again), with each test run only being scheduled after the previous one finishes (instead of being scheduled in parallel, as we could have chosen to do). As a result it is effectively treated as 4 separate test jobs, each triggered by the previous one completing, and most of the elapsed time for the matrix is simply it waiting for the real tests to be scheduled. If we had wanted to hog resources, we could have set all the tests up in a single job which takes 80-180mins to run in one go, but we felt that it would be nicer to other projects to do it this way (since the jobs that actually do take 10+ hrs of executor time annoy us as much as anyone). This is despite the fact that, as you mentioned, it can mean the job takes an entire day to run to completion from the point it begins, rather than simply 80-180mins.
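
The difference between elapsed time and executor time described above can be illustrated with a minimal sketch. The run durations and queue waits below are hypothetical values chosen for illustration only (the durations fall within the 15-45min range mentioned above); they are not measurements from builds.apache.org, and `matrix_timing` is an illustrative helper, not part of Jenkins.

```python
def matrix_timing(run_minutes, queue_wait_minutes):
    """Return (executor_minutes, elapsed_minutes) for test runs that are
    scheduled strictly one after another, each waiting in the build queue
    before it gets an executor."""
    if len(run_minutes) != len(queue_wait_minutes):
        raise ValueError("need exactly one queue wait per test run")
    # An executor is only occupied while a test run is actually executing.
    executor_minutes = sum(run_minutes)
    # The matrix as a whole is "running" for the queue waits as well.
    elapsed_minutes = executor_minutes + sum(queue_wait_minutes)
    return executor_minutes, elapsed_minutes


# Hypothetical figures: four runs, each preceded by a long wait in the queue.
runs = [15, 25, 45, 20]
waits = [300, 240, 360, 315]

executor, elapsed = matrix_timing(runs, waits)
print(f"executor time: {executor} min, elapsed time: {elapsed / 60:.1f} h")
```

With these assumed numbers the matrix appears to run for most of a day while occupying an executor for under two hours, which is the distinction the comment is drawing.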
                
> Many builds encountering starvation on Jenkins
> ----------------------------------------------
>
>                 Key: INFRA-6177
>                 URL: https://issues.apache.org/jira/browse/INFRA-6177
>             Project: Infrastructure
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>          Components: Jenkins
>            Reporter: Rob Vesse
>
> I've noticed that lately there seems to be an increasing amount of build starvation happening on Jenkins. From what I can glean from what I can see of the build queue, this is down to two main issues:
> 1 - General lack of slaves for generic tags
> 2 - Builds running ridiculously long and monopolizing resources
> For example, for builds tagged ubuntu there should be 6 servers (ubuntu1-ubuntu6), but right now only ubuntu4 and ubuntu6 are running.
> In terms of long-running builds, there is one in particular I always seem to see hogging resources (https://builds.apache.org/job/Qpid-Java-Java-MMS-TestMatrix/), which takes 22 hours to run when it succeeds; that seems excessive.
> While I appreciate the ASF has limited build and infrastructure resources, having the build servers up more reliably and addressing excessively long builds (whether by talking to the responsible projects or by imposing some limit on build time) would go a long way to making things smoother for Apache developers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira