Posted to dev@maven.apache.org by Dan Fabulich <da...@fabulich.com> on 2009/11/10 22:50:13 UTC

Multi-machine support (was: Proposal after-the-fact: Experimental multithreading support)

Jorg Heymans wrote:

> Would the multithreading feature make it easier or harder to implement 
> something like [1], ie distribute the different modules of a reactor 
> build across different machines ? Or is it completely unrelated you 
> think ?

No harder, but not much easier.  With that said, I do have some ideas for 
you.

Take a look at 
http://svn.apache.org/repos/asf/maven/maven-3/branches/MNG-3004/maven-core/src/main/java/org/apache/maven/lifecycle/DefaultLifecycleExecutor.java

Currently the unit of work is a CallableBuilder.  In principle you could 
imagine serializing it, launching it on a remote machine and gathering the 
artifacts locally.  But its members include some objects that may be very 
tricky to serialize, including MavenSession and MavenProject.  This seems 
like a challenge.

Also, in the thread you linked, Kohsuke points out a related problem that 
will make your job hard.  Today, Maven treats compiled artifacts in a 
multi-module build very differently from artifacts built one at a time.

Suppose you have two projects X and Y where X depends on Y. X --> Y

If you use Maven to run a build in the Y directory, then launch a second 
Maven process to run a build in the X directory, X will depend on Y.jar in 
the local repository.

But if you do a reactor build from the root and build both projects at 
once, X will depend directly on ../Y/target/classes; the artifact will be 
resolved from the reactor.

This behavior is required so you can run "mvn compile" from the root; no 
jar is created when you do this, so there's no way for X to depend on 
Y.jar.  If you're running "mvn install" from the root, this behavior is 
probably unnecessary.
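As a concrete sketch, the X --> Y layout above might look like this (the 
groupId and version are hypothetical placeholders):

```xml
<!-- pom.xml at the root: running "mvn install" here is a reactor build
     of both modules -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>root</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <module>Y</module>
    <module>X</module>
  </modules>
</project>
```

With this layout, X/pom.xml declares an ordinary dependency on 
com.example:Y:1.0-SNAPSHOT.  In a reactor build that dependency resolves 
from ../Y/target/classes; in a standalone build of X it resolves from the 
local repository.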

As Kohsuke points out, you'd have to do something rather tricky for Maven 
to resolve artifacts from the reactor when the artifacts are on another 
machine!

But that leads me to think that the moral of the story is not to run a 
multi-machine reactor build, but to simply build the isolated projects on 
multiple machines.

Clearly you won't be able to run a "mvn compile" on box 1 and have box 2 
consume those classes, so you'll need to do "mvn deploy", deploying those 
artifacts to a repository that all of the boxes can see.  (You might use a 
repository manager like Nexus or Archiva to handle this.)
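For the deploy step, each project's POM (or a shared parent) would need a 
distributionManagement section pointing at the shared repository.  A 
sketch, with placeholder ids and URLs:

```xml
<!-- points "mvn deploy" at a repository all of the build boxes can see;
     repo.example.com stands in for your Nexus/Archiva host -->
<distributionManagement>
  <repository>
    <id>shared-releases</id>
    <url>http://repo.example.com/releases</url>
  </repository>
  <snapshotRepository>
    <id>shared-snapshots</id>
    <url>http://repo.example.com/snapshots</url>
  </snapshotRepository>
</distributionManagement>
```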

It should be possible to orchestrate this by using Maven's 
ProjectDependencyGraph.  You can get a copy of the project dependency 
graph off of the MavenSession object in Maven 3.0.  Any Maven plugin can 
get access to the MavenSession object via the ${session} plugin parameter 
expression; just call session.getProjectDependencyGraph().

The ProjectDependencyGraph will give you a topologically sorted list of 
projects (via getSortedProjects), as well as the getDownstreamProjects and 
getUpstreamProjects methods.  You can start by launching all projects with 
no upstream projects on N machines.  As each project succeeds, launch 
every downstream project whose upstream projects have all finished.  (See 
the code linked above for an example of this.)

(Note that you can't just launch all downstream projects, because they may 
have unfinished upstream projects.  Suppose X depends on Y and Z, and Y 
finishes first.  You can't launch X yet, because Z isn't done, so just 
skip X.  When Z is done, all of X's dependencies are done, so now you can 
launch X.)
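To make the scheduling loop concrete, here's a self-contained toy 
simulation of it, batched into "waves" rather than firing on each 
completion event.  ReactorScheduler and the map-of-sets graph encoding are 
my own sketch, not Maven API:

```java
import java.util.*;

/**
 * Toy simulation of scheduling reactor projects across build machines.
 * Each map entry is project -> the set of upstream projects it needs.
 */
public class ReactorScheduler {
    /**
     * Returns the projects grouped into launch waves: wave 0 is every
     * project with no upstream projects; each later wave is every
     * not-yet-launched project whose upstreams have all finished.
     */
    public static List<Set<String>> schedule(Map<String, Set<String>> upstream) {
        List<Set<String>> waves = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> remaining = new HashSet<>(upstream.keySet());
        while (!remaining.isEmpty()) {
            Set<String> wave = new TreeSet<>();
            for (String project : remaining) {
                // Launchable only once every upstream project is done --
                // this is the "skip X until Z finishes" rule above.
                if (done.containsAll(upstream.get(project))) {
                    wave.add(project);
                }
            }
            if (wave.isEmpty()) {
                throw new IllegalStateException("dependency cycle");
            }
            waves.add(wave);
            done.addAll(wave);
            remaining.removeAll(wave);
        }
        return waves;
    }

    public static void main(String[] args) {
        // X depends on both Y and Z, as in the example above.
        Map<String, Set<String>> upstream = new HashMap<>();
        upstream.put("Y", Set.of());
        upstream.put("Z", Set.of());
        upstream.put("X", Set.of("Y", "Z"));
        System.out.println(schedule(upstream));
        // prints [[Y, Z], [X]]
    }
}
```

Y and Z land in the first wave and can go to two machines at once; X waits 
for the second wave even if Y finishes early.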

Hope that helps!

-Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org