Posted to users@continuum.apache.org by Jared Richardson <Ja...@sas.com> on 2005/04/13 20:40:33 UTC

Continuous Integration at SAS

Hi all,

I spoke with Jason earlier today and he asked me to type up how we are currently using CruiseControl at SAS. The hope is to share the issues that we've run into here so they can be addressed in Continuum.

First, the scope of our code base. We have ~five million lines of code, nearly 300 projects, and currently more than 50 branches. Unlike a SourceForge-style code base, our code is very tightly coupled between projects. We have low-level projects, then our mid-tier, and finally our end-user solutions. Each level contains products that we sell.

The first level of Continuous Integration we rolled out was just for compiles. We were covering 3 branches for all 300 projects. The CVS server couldn't take the load of diffing 5 million lines of code every five minutes. Actually, it could, but the CVS admins noticed us, so we had to find an alternative because we were slowing down the entire company. 

So the biggest issue was CVS load. 

We set up CVS triggers that create a text file (we call them trigger files) for each project. If any file within a given project tree changes, the text file gets touched (I think it writes the date/time stamp). CruiseControl then monitors the trigger file for changes. If a change has occurred, CC then goes to CVS to get the changes.

This keeps the load on the CVS servers to a minimum. It also keeps the CVS commits decoupled from the CI process. If the CI server is down, it will see the trigger files and start processing the appropriate projects when it restarts.
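To make the trigger-file idea concrete, here is a minimal sketch in Python. It is not our actual CVS trigger or the CruiseControl configuration; the "*.trigger" naming, the JSON state file, and the function name changed_projects are all assumptions made for illustration. The poller compares each trigger file's modification time with the last value it recorded, and it keeps those values on disk so that changes made while it was down are still noticed after a restart.

```python
import json
from pathlib import Path


def changed_projects(trigger_dir, state_file):
    """Return names of projects whose trigger file changed since the last poll.

    Hypothetical layout: one "<project>.trigger" file per project in
    trigger_dir, touched by a commit hook whenever the project changes.
    """
    trigger_dir = Path(trigger_dir)
    state_file = Path(state_file)
    # Last-seen modification times are stored on disk, so they survive
    # a restart of the polling process.
    seen = json.loads(state_file.read_text()) if state_file.exists() else {}
    changed = []
    for trigger in sorted(trigger_dir.glob("*.trigger")):
        mtime = trigger.stat().st_mtime_ns
        if seen.get(trigger.stem) != mtime:
            changed.append(trigger.stem)
            seen[trigger.stem] = mtime
    state_file.write_text(json.dumps(seen))
    return changed
```

A CI loop would call this every few minutes and go to the version control server only for the projects it returns, which is what keeps the polling load off CVS.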

We considered "live" notification (via sockets, for instance), but we would have built a much more brittle system. Especially in the first few months, we took the CC box down to redeploy with new options, etc. When the box was down, build notifications would have been missed. Keeping the build notifications loosely coupled turned out to be a very robust way to handle the problem. The CVS triggers pile up regardless of whether or not the CI box is available to consume them. There are now other processes at SAS that use the trigger files as well.

Distributed Builds...

We already have an in-house build system that can cluster builds. You ask the system to perform a build and it'll find a box to run your build on. The parallelism is awesome. CruiseControl was able to drive that system via Ant scripts, so we were able to take advantage of that system. 

In looking at Maven 1, we were hoping to be able to cluster a group of Maven servers and let CC distribute the builds to boxes as needed. A JavaSpaces-based plugin is in development (on the CC mailing list) that we had hoped to use.

When jobs are sent to a JavaSpace, the client machines consume them when they are ready. This type of model is a very elegant form of load balancing. Faster machines consume build requests faster and so are ready to request another build sooner than a slow machine. Over the course of the day, the faster machine will process many more builds than the slow machine. Builds are "slow" enough that you don't need to queue them up on the client box or try to predict who should get more. Just let them consume them when they are ready.
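The load-balancing effect can be illustrated with a small simulation. This is a sketch in plain Python, not JavaSpaces code and not the CC plugin; the function name simulate_pull_queue and the worker speeds are made up for the example. Each worker takes the next pending build the moment it becomes idle, and nothing is assigned in advance.

```python
import heapq
from collections import deque


def simulate_pull_queue(job_count, seconds_per_build):
    """Simulate workers that each take the next job as soon as they are idle.

    seconds_per_build maps a (hypothetical) worker name to how long one
    build takes on that machine. Returns builds completed per worker.
    """
    jobs = deque(range(job_count))
    completed = {name: 0 for name in seconds_per_build}
    # Heap of (time the worker becomes idle, worker name); all idle at t=0.
    idle_at = [(0.0, name) for name in seconds_per_build]
    heapq.heapify(idle_at)
    while jobs:
        now, name = heapq.heappop(idle_at)
        jobs.popleft()  # the idle worker takes the next pending build
        completed[name] += 1
        heapq.heappush(idle_at, (now + seconds_per_build[name], name))
    return completed
```

For example, simulate_pull_queue(30, {"fast": 1.0, "slow": 4.0}) gives the fast machine four times as many builds as the slow one, without any scheduler having to predict who should get more.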

JavaSpaces has concepts like transactions, so it's fairly mature. I also understand that Sun recently released Jini/JavaSpaces under a more acceptable license.

Let me know if you've got any questions about these.

Jared

-----------------------------------------
Jared Richardson
Jared.Richardson@sas.com
919-531-9136
http://www.sas.com
SAS... The Power to Know(r)
-----------------------------------------

 
"The plan is nothing; the planning is everything."
 
Dwight Eisenhower