Posted to dev@maven.apache.org by Dan Fabulich <da...@fabulich.com> on 2009/11/10 22:50:13 UTC

Multi-machine support (was: Proposal after-the-fact: Experimental multithreading support)

Jorg Heymans wrote:

> Would the multithreading feature make it easier or harder to implement 
> something like [1], ie distribute the different modules of a reactor 
> build across different machines ? Or is it completely unrelated you 
> think ?

No harder, but not much easier.  With that said, I do have some ideas for 
you.

Take a look at 
http://svn.apache.org/repos/asf/maven/maven-3/branches/MNG-3004/maven-core/src/main/java/org/apache/maven/lifecycle/DefaultLifecycleExecutor.java

Currently the unit of work is a CallableBuilder.  In principle you could 
imagine serializing it, launching it on a remote machine and gathering the 
artifacts locally.  But its members include some objects that may be very 
tricky to serialize, including MavenSession and MavenProject.  This seems 
like a challenge.

Also, in the thread you linked, Kohsuke points out a related problem that 
will make your job hard.  Today, Maven treats compiled artifacts in a 
multi-module build very differently from artifacts built one at a time.

Suppose you have two projects X and Y where X depends on Y. X --> Y

If you use Maven to run a build in the Y directory, then launch a second 
Maven process to run a build in the X directory, X will depend on Y.jar in 
the local repository.

But if you do a reactor build from the root and build both projects at 
once, X will depend directly on ../Y/target/classes; the artifact will be 
resolved from the reactor.

This behavior is required so you can run "mvn compile" from the root; no 
jar is created when you do this, so there's no way for X to depend on 
Y.jar.  If you're running "mvn install" from the root, this behavior is 
probably unnecessary.
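As a concrete sketch, the X --> Y layout above might look like this (the 
groupId and version are hypothetical placeholders):

```xml
<!-- pom.xml at the root: running "mvn install" here is a reactor build
     of both modules -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>root</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <module>Y</module>
    <module>X</module>
  </modules>
</project>
```

With this layout, X/pom.xml declares an ordinary dependency on 
com.example:Y:1.0-SNAPSHOT.  In a reactor build that dependency resolves 
from ../Y/target/classes; in a standalone build of X it resolves from the 
local repository.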

As Kohsuke points out, you'd have to do something rather tricky for Maven 
to resolve artifacts from the reactor when the artifacts are on another 
machine!

But that leads me to think that the moral of the story is not to run a 
multi-machine reactor build, but to simply build the isolated projects on 
multiple machines.

Clearly you won't be able to run a "mvn compile" on box 1 and have box 2 
consume those classes, so you'll need to do "mvn deploy", deploying those 
artifacts to a repository that all of the boxes can see.  (You might use a 
repository manager like Nexus or Archiva to handle this.)
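For the deploy step, each project's POM (or a shared parent) would need a 
distributionManagement section pointing at the shared repository.  A 
sketch, with placeholder ids and URLs:

```xml
<!-- points "mvn deploy" at a repository all of the build boxes can see;
     repo.example.com stands in for your Nexus/Archiva host -->
<distributionManagement>
  <repository>
    <id>shared-releases</id>
    <url>http://repo.example.com/releases</url>
  </repository>
  <snapshotRepository>
    <id>shared-snapshots</id>
    <url>http://repo.example.com/snapshots</url>
  </snapshotRepository>
</distributionManagement>
```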

It should be possible to orchestrate this by using Maven's 
ProjectDependencyGraph.  You can get a copy of the project dependency 
graph off of the MavenSession object in Maven 3.0.  Any Maven plugin can 
get access to the MavenSession object via the ${session} plugin parameter 
expression; just call session.getProjectDependencyGraph().

The ProjectDependencyGraph will give you a topologically sorted list of 
projects (via getSortedProjects), as well as the getDownstreamProjects and 
getUpstreamProjects methods.  You can start by launching all projects with 
no upstream projects on N machines.  As each project succeeds, launch 
every downstream project whose upstream projects have all finished.  (See 
the code linked above for an example of this.)

(Note that you can't just launch all downstream projects, because they may 
have unfinished upstream projects.  Suppose X depends on Y and Z, and Y 
finishes first.  You can't launch X yet, because Z isn't done, so just 
skip X.  When Z is done, all of X's dependencies are done, so now you can 
launch X.)
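To make the scheduling loop concrete, here's a self-contained toy 
simulation of it, batched into "waves" rather than firing on each 
completion event.  ReactorScheduler and the map-of-sets graph encoding are 
my own sketch, not Maven API:

```java
import java.util.*;

/**
 * Toy simulation of scheduling reactor projects across build machines.
 * Each map entry is project -> the set of upstream projects it needs.
 */
public class ReactorScheduler {
    /**
     * Returns the projects grouped into launch waves: wave 0 is every
     * project with no upstream projects; each later wave is every
     * not-yet-launched project whose upstreams have all finished.
     */
    public static List<Set<String>> schedule(Map<String, Set<String>> upstream) {
        List<Set<String>> waves = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> remaining = new HashSet<>(upstream.keySet());
        while (!remaining.isEmpty()) {
            Set<String> wave = new TreeSet<>();
            for (String project : remaining) {
                // Launchable only once every upstream project is done --
                // this is the "skip X until Z finishes" rule above.
                if (done.containsAll(upstream.get(project))) {
                    wave.add(project);
                }
            }
            if (wave.isEmpty()) {
                throw new IllegalStateException("dependency cycle");
            }
            waves.add(wave);
            done.addAll(wave);
            remaining.removeAll(wave);
        }
        return waves;
    }

    public static void main(String[] args) {
        // X depends on both Y and Z, as in the example above.
        Map<String, Set<String>> upstream = new HashMap<>();
        upstream.put("Y", Set.of());
        upstream.put("Z", Set.of());
        upstream.put("X", Set.of("Y", "Z"));
        System.out.println(schedule(upstream));
        // prints [[Y, Z], [X]]
    }
}
```

Y and Z land in the first wave and can go to two machines at once; X waits 
for the second wave even if Y finishes early.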

Hope that helps!

-Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org