Posted to m2-dev@maven.apache.org by jv...@apache.org on 2005/01/22 17:11:57 UTC
cvs commit: maven-components/maven-core/src/site/apt/lifecycle goal-resolution.apt lifecycle-phases.apt lifecycle-phases.xml lifecycle.apt
jvanzyl 2005/01/22 08:11:57
Added: maven-core/src/site/apt/lifecycle goal-resolution.apt
lifecycle-phases.apt lifecycle-phases.xml
lifecycle.apt
Removed: maven-core/src/site/apt goal-resolution.apt
lifecycle-phases.apt lifecycle.apt
Log:
Revision Changes Path
1.1 maven-components/maven-core/src/site/apt/lifecycle/goal-resolution.apt
Index: goal-resolution.apt
===================================================================
---
Goal Resolution Discussion
---
The Maven Team
---
21-Sept-2004
---
This document is intended to be a working memory of sorts for the goal
resolution algorithm used in Maven-2. It is not intended to supplant the
living bible of Maven, the lifecycle.apt document. Instead it is meant to
add depth to the goal resolution step of the lifecycle and provide a place
for discussing the dirty details of its implementation.
*Conceptual Resolution
Conceptually, goal resolution must take place in a particular order so as
to preserve the encapsulation of the goal and its implied requirements as a
single operation. Obviously, since implied requirements are themselves goals,
this is a recursive definition. This model is further complicated by the fact
that such implied requirements can be derived from two sources: a mojo's
declared pre-requisites (non-optional, these are required for correct
operation of the mojo itself), and any goal decorations which may have been
declared in a project-specific manner via the project POM.
In general, the following should be the combined outcome of all implied
requirements.
+-----+
[ main-preGoal* ] [ prereq ]* main-goal [ main-postGoal* ]
+-----+
Note that each of the elements in this formula is a goal in its own right,
and will therefore be subject to resolution (taking the place of <<<main-goal>>>
above) and substitution into the parent (replace <<<main-preGoal>>> with the
list of goals implied by <<<main-preGoal>>> <in correct order>).
*Functional Resolution
The fact that we must merge two sources of implied-goal information in order to
construct a complete execution chain complicates goal resolution beyond simple
use of a directed acyclic graph (DAG). This is especially true because of the
differing lifecycle and scope of the two sources. One is the plugin
descriptor, which has a lifecycle arguably longer than the JVM lifecycle
itself, and system-wide scope (provided plugin versioning is tracked). The
other is the project's POM, which has an extremely short lifecycle (probably
roughly equivalent to the maven-session lifecycle, which is <not> the same as
the JVM lifecycle in some embedded use cases), and only project scope (don't
want one POM's decorations polluting the builds of other POMs). While the
plugin descriptor information may be cached - even to disk - the POM goal
information must be rebuilt for each maven-session.
On the other hand, one very important distinction between information derived
from plugin descriptors and information from POMs is that POMs lack the
ability to declare new goals. This means that in theory we should be able to
clone a DAG that describes all of the conventional requirements for all of the
plugins referenced, and simply add inter-goal relationships to account for any
extra decoration from the POM. When the session is over, we can then discard
the modified DAG, retaining the plugin-derived DAG for future use in memory or
on disk.
<NOTE:> There are other reasons to be very careful when caching the DAG to
disk, notably the handling and updating of -SNAPSHOT plugins.
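The clone-and-decorate idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: <<<GoalGraph>>> and its methods are hypothetical names, not existing Maven classes, the graph is reduced to a map from a goal to the goals it implies, and <<<antlr:generate>>> is a made-up goal name standing in for a POM decoration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical goal graph: maps each goal to the goals it implies. */
final class GoalGraph {
    private final Map<String, List<String>> edges = new LinkedHashMap<>();

    /** Records that running goal implies running impliedGoal as well. */
    void addEdge(String goal, String impliedGoal) {
        edges.computeIfAbsent(goal, k -> new ArrayList<>()).add(impliedGoal);
    }

    /** Returns a copy of the goals implied by the given goal (possibly empty). */
    List<String> impliedBy(String goal) {
        List<String> implied = edges.get(goal);
        return implied == null ? new ArrayList<>() : new ArrayList<>(implied);
    }

    /** Deep copy, so session-level decorations never reach the cached graph. */
    GoalGraph copy() {
        GoalGraph clone = new GoalGraph();
        for (Map.Entry<String, List<String>> entry : edges.entrySet()) {
            clone.edges.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return clone;
    }

    public static void main(String[] args) {
        // Plugin-derived graph: long-lived and cacheable (assumed relationship).
        GoalGraph pluginGraph = new GoalGraph();
        pluginGraph.addEdge("jar:jar", "compiler:compile");

        // Session graph: a copy plus POM decorations, discarded at session end.
        GoalGraph sessionGraph = pluginGraph.copy();
        sessionGraph.addEdge("jar:jar", "antlr:generate");

        System.out.println(pluginGraph.impliedBy("jar:jar"));
        System.out.println(sessionGraph.impliedBy("jar:jar"));
    }
}
```

The point of the copy is that the POM can only add relationships between goals that already exist, so the plugin-derived graph stays untouched and reusable.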
*Algorithm
Here is the current algorithm implemented by the GoalResolutionPhase:
<NOTE:> The separation of the plugin resolution step from the process of
actually building the execution chain is new, and has not yet been implemented.
However, this appears to be a required separation since the DAG cannot
function properly until all plugins - and consequently the prereqs implied by
them - are resolved. So, we'll take a second pass later to actually build the
execution list; for now, we'll just resolve plugins.
<<Resolve Plugins:>>
<NOTE:> Can we re-separate this as a plugin-resolution phase, and provide
some sort of reusable tree-visit logic (akin to the topo sorter) where we
could pass in some sort of visitation command to execute per visited goal?
[[1]] Initialize recursion
[[a]] Instantiate Set for caching resolved goals. <resolved>
[[b]] Set current goal to be resolved. <goal>
[]
[[2]] If <goal> is contained in <resolved>, Return.
[[3]] Verify plugin for <goal>.
[[4]] Add <goal> to <resolved>.
[[5]] Foreach <preGoal> of <goal>,
[[a]] Set <goal> = <preGoal>
[[b]] Call [2]
[]
[[6]] Foreach <prereq> of <goal>,
[[a]] Set <goal> = <prereq>
[[b]] Call [2]
[]
[[7]] Foreach <postGoal> of <goal>,
[[a]] Set <goal> = <postGoal>
[[b]] Call [2]
[]
[[8]] Return.
[]
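A minimal Java sketch of the Resolve Plugins recursion above, for discussion only. The names <<<PluginResolutionSketch>>> and <<<GoalInfo>>> are hypothetical, plugin verification is reduced to a callback, and the sketch assumes all implied goals are known up front in a map, whereas the real process would learn prereqs only after the owning plugin has been verified. The goal relationships in <<<main>>> are invented for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

final class PluginResolutionSketch {

    /** Hypothetical record of the goals implied by one goal. */
    static final class GoalInfo {
        final List<String> preGoals;
        final List<String> prereqs;
        final List<String> postGoals;

        GoalInfo(List<String> preGoals, List<String> prereqs, List<String> postGoals) {
            this.preGoals = preGoals;
            this.prereqs = prereqs;
            this.postGoals = postGoals;
        }
    }

    /** Steps 2 through 8: verify the plugin for goal, then recurse into implied goals. */
    static void resolve(String goal, Map<String, GoalInfo> goals,
                        Set<String> resolved, Consumer<String> verifyPlugin) {
        if (resolved.contains(goal)) {
            return;                                   // step 2
        }
        verifyPlugin.accept(goal);                    // step 3
        resolved.add(goal);                           // step 4
        GoalInfo info = goals.get(goal);
        if (info == null) {
            return;                                   // goal implies nothing
        }
        for (String preGoal : info.preGoals) {        // step 5
            resolve(preGoal, goals, resolved, verifyPlugin);
        }
        for (String prereq : info.prereqs) {          // step 6
            resolve(prereq, goals, resolved, verifyPlugin);
        }
        for (String postGoal : info.postGoals) {      // step 7
            resolve(postGoal, goals, resolved, verifyPlugin);
        }
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        Map<String, GoalInfo> goals = new HashMap<>();
        // Assumed relationships, for illustration only.
        goals.put("jar:jar", new GoalInfo(none, Arrays.asList("compiler:compile"), none));
        goals.put("compiler:compile", new GoalInfo(none, Arrays.asList("resources"), none));

        Set<String> resolved = new LinkedHashSet<>();  // step 1a
        resolve("jar:jar", goals, resolved,
                goal -> System.out.println("verify plugin for " + goal));
        System.out.println(resolved);
    }
}
```

Because a goal is added to <resolved> before its implied goals are visited, each plugin is verified at most once even if several goals imply it.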
<<Build Execution Chain:>>
<NOTE:> Visitation logic is eerily similar to the above recursive process.
Can we create a graph visitation utility and pass in some sort of command
to be executed per visited node?
[[1]] Initialize recursion
[[a]] Instantiate LinkedList to hold execution chain. <chain>
[[b]] Instantiate Set for caching visited goals. <visited>
[[c]] Set current goal to be executed. <goal>
[]
[[2]] If <visited> contains <goal>, Return.
[[3]] Process preGoals of <goal>
[[a]] Retrieve List of preGoals bound to <goal>
[[b]] Foreach <preGoal> in <preGoals>
[[i]] Set <goal> = <preGoal>
[[ii]] Call [2]
[]
[]
[[4]] Process prereqs of <goal>
[[a]] Retrieve List of prereqs bound to <goal>
[[b]] Foreach <prereq> in <prereqs>
[[i]] Set <goal> = <prereq>
[[ii]] Call [2]
[]
[]
[[5]] Add <goal> to <chain>
[[6]] Add <goal> to <visited>
[[7]] Process postGoals of <goal>
[[a]] Retrieve List of postGoals bound to <goal>
[[b]] Foreach <postGoal> in <postGoals>
[[i]] Set <goal> = <postGoal>
[[ii]] Call [2]
[]
[]
[[8]] Return.
[]
<NOTE:> Since the user's intent most likely aligns with separate, serial
execution of all goals listed on the command line <in order>, the above
algorithm must be repeated for each <goal> in <user-goals>, with the execution
chains of each being appended to a single list in order to resolve a complete,
end-to-end picture of the current build session.
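The chain-building pass and the per-user-goal repetition described in the note can be sketched as below. This is a rough illustration, not the GoalResolutionPhase code: <<<ExecutionChainSketch>>> and <<<GoalInfo>>> are hypothetical names, the goal relationships in <<<main>>> are invented, and <<<antlr:generate>>> and <<<example:report>>> are placeholder goal names. It assumes the implied-goal graph has no cycles, since a goal is only marked as visited after its preGoals and prereqs have been processed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ExecutionChainSketch {

    /** Hypothetical record of the goals bound to one goal. */
    static final class GoalInfo {
        final List<String> preGoals;
        final List<String> prereqs;
        final List<String> postGoals;

        GoalInfo(List<String> preGoals, List<String> prereqs, List<String> postGoals) {
            this.preGoals = preGoals;
            this.prereqs = prereqs;
            this.postGoals = postGoals;
        }
    }

    /** Steps 2 through 8 of the chain-building pass for a single goal. */
    static void visit(String goal, Map<String, GoalInfo> goals,
                      Set<String> visited, List<String> chain) {
        if (visited.contains(goal)) {
            return;                                        // step 2
        }
        GoalInfo info = goals.get(goal);
        if (info != null) {
            for (String preGoal : info.preGoals) {         // step 3
                visit(preGoal, goals, visited, chain);
            }
            for (String prereq : info.prereqs) {           // step 4
                visit(prereq, goals, visited, chain);
            }
        }
        chain.add(goal);                                   // step 5
        visited.add(goal);                                 // step 6
        if (info != null) {
            for (String postGoal : info.postGoals) {       // step 7
                visit(postGoal, goals, visited, chain);
            }
        }
    }

    /** Repeats the pass per user goal, with fresh state, and appends the chains. */
    static List<String> buildChain(List<String> userGoals, Map<String, GoalInfo> goals) {
        List<String> fullChain = new ArrayList<>();
        for (String userGoal : userGoals) {
            List<String> chain = new LinkedList<>();       // step 1a
            Set<String> visited = new HashSet<>();         // step 1b
            visit(userGoal, goals, visited, chain);
            fullChain.addAll(chain);
        }
        return fullChain;
    }

    public static void main(String[] args) {
        List<String> none = Collections.emptyList();
        Map<String, GoalInfo> goals = new HashMap<>();
        // Assumed relationships, for illustration only.
        goals.put("compiler:compile", new GoalInfo(
                Arrays.asList("antlr:generate"), none, Arrays.asList("example:report")));
        goals.put("surefire:test", new GoalInfo(
                none, Arrays.asList("compiler:compile"), none));

        System.out.println(buildChain(Arrays.asList("surefire:test"), goals));
        System.out.println(buildChain(Arrays.asList("clean", "foo", "clean"), goals));
    }
}
```

Since <visited> is reset for each user goal, a goal named twice on the command line appears twice in the final list, while a goal implied several times within one chain appears only once.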
1.1 maven-components/maven-core/src/site/apt/lifecycle/lifecycle-phases.apt
Index: lifecycle-phases.apt
===================================================================
-----
Maven Lifecycle Phases
-----
The Maven Team
-----
Maven Lifecycle Phases
* generate sources [modello, antlr, javacc]
* process sources [qdox, xdoclet, jam]
* generate resources [modello -> persistence mappings]
* process resources [process persistence mappings]
* compile [plexus component or something else]
* process classes
* generate test sources [generating junit tests]
* process test sources
* generate test resources [plexus component or something else]
* process test resources
* test compile
* test [surefire, testNG, dbunit ...]
* package [jar or making a dist]
* integration tests, which require the entire app/assembly to be finished
* install
* deploy
o phases are used as join points (before/around/after)
o mojos specify what phase they are contributing to
o use goal names to protect from changes in the goal specifics themselves
o the phases are really placeholders for various goals to be executed and it would be the phases that the user
could specify on the CLI so something like
+-----+
m2 compile
+-----+
This would invoke the lifecycle up to, and including, the compile phase.
This may seem a bit lengthy, but I think it covers anything that could possibly be done, and there is enough flexibility
within the lifecycle to accommodate users i.e. we don't need to have a boundless set of lifecycle phases. What people
do in a project is, in fact, pretty limited.
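As a rough sketch of the "up to, and including" behaviour for something like <<<m2 compile>>>: the class and method names here are hypothetical, and the phase ids are the ones listed in the lifecycle-phases.xml added in this commit (which omits a few entries from the bullet list above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LifecycleSketch {

    /** Phase ids in lifecycle order, as listed in lifecycle-phases.xml. */
    static final List<String> PHASES = Collections.unmodifiableList(Arrays.asList(
            "generate-sources", "process-sources",
            "generate-resources", "process-resources",
            "compile", "process-classes",
            "generate-test-sources", "process-test-sources",
            "process-test-resources", "test-compile",
            "test", "package", "install", "deploy"));

    /** Returns every phase from the start of the lifecycle through the target. */
    static List<String> phasesUpTo(String target) {
        int index = PHASES.indexOf(target);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown phase: " + target);
        }
        return new ArrayList<>(PHASES.subList(0, index + 1));
    }

    public static void main(String[] args) {
        // Corresponds to the phases "m2 compile" would walk through.
        System.out.println(phasesUpTo("compile"));
    }
}
```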
Using the lifecycle also makes it easier to call out to the artifact handlers.
+-----+
<lifecycle>
<phases>
<phase>
<id>generate-sources</id>
</phase>
<phase>
<id>process-sources</id>
</phase>
<phase>
<id>generate-resources</id>
</phase>
<phase>
<id>process-resources</id>
<goal>
<id>resources</id>
</goal>
</phase>
<phase>
<id>compile</id>
<goal>
<id>compiler:compile</id>
</goal>
</phase>
<phase>
<id>process-classes</id>
</phase>
<phase>
<id>generate-test-sources</id>
</phase>
<phase>
<id>process-test-sources</id>
</phase>
<phase>
<id>process-test-resources</id>
</phase>
<phase>
<id>test-compile</id>
<goal>
<id>compiler:testCompile</id>
</goal>
</phase>
<phase>
<id>test</id>
<goal>
<id>surefire:test</id>
</goal>
</phase>
<phase>
<id>package</id>
<goal>
<id>jar:jar</id>
</goal>
</phase>
<phase>
<id>install</id>
<goal>
<id>${type}:install</id>
</goal>
</phase>
<phase>
<id>deploy</id>
<goal>
<id>${type}:deploy</id>
</goal>
</phase>
</phases>
</lifecycle>
+-----+
Here we are using antlr and it is known to maven that this plugin contributes to the generate-sources phase. In
m1 the antlr plugin, in fact, worked by magic i.e. if there was a grammar present antlr would try to fire. This
is a little too tricky and it would make sense to allow users to specify what they want fired as part of their
build process.
+-----+
<project>
...
<plugins>
<plugin>
<id>antlr</id>
<configuration>
<grammars>
<grammar>src/grammars/confluence.g</grammar>
</grammars>
</configuration>
</plugin>
</plugins>
...
</project>
+-----+
So here the user specifies the use of the antlr mojo that takes a grammar and generates sources, and maven knows
that this mojo contributes to the <<<generate-sources>>> phase and so executes the antlr mojo inside
the <<<generate-sources>>> phase. In cases where there is a possible domination problem we can state that the order
in which the configurations are listed is the dominant order. I think in most cases there will not be any
domination ordering problems, but when there could be, you have to be able to explicitly state what the order
would be.
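One way to picture the "listing order is the dominant order" rule is the sketch below. It is only an assumption about how such ordering might be applied: <<<PhaseBindingSketch>>> and <<<Mojo>>> are hypothetical names, each mojo declares exactly one phase, and mojos bound to the same phase keep the order in which they were configured in the POM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PhaseBindingSketch {

    /** Hypothetical pairing of a mojo id with the single phase it contributes to. */
    static final class Mojo {
        final String id;
        final String phase;

        Mojo(String id, String phase) {
            this.id = id;
            this.phase = phase;
        }
    }

    /**
     * Walks the phases in lifecycle order; within a phase, mojos run in the
     * order in which they were listed in the POM.
     */
    static List<String> assemble(List<String> phases, List<Mojo> configuredInPomOrder) {
        List<String> chain = new ArrayList<>();
        for (String phase : phases) {
            for (Mojo mojo : configuredInPomOrder) {
                if (mojo.phase.equals(phase)) {
                    chain.add(mojo.id);
                }
            }
        }
        return chain;
    }

    public static void main(String[] args) {
        List<String> phases = Arrays.asList("generate-sources", "compile");
        List<Mojo> configured = Arrays.asList(
                new Mojo("compiler:compile", "compile"),
                new Mojo("antlr", "generate-sources"),
                new Mojo("modello", "generate-sources"));
        System.out.println(assemble(phases, configured));
    }
}
```

Here antlr runs before modello only because it is listed first; swapping the two configurations in the POM would swap their order within <<<generate-sources>>>.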
notes to finish copying:
-> mojos will contain @tags for parameters and the phase they contribute to, a mojo will be limited to
contributing to one phase in the lifecycle.
-> goal resolution within phases
-> file dependency timestamp checking (mhw)
-> strict use of artifact handlers for things like package/install/deploy
this again would be a mapping so a handler could delegate to another utility for packaging
-> how users decorate or completely override the lifecycle, but most of this should be alleviated by a mojo
having a defined place in the lifecycle by telling maven what phase it contributes to. in this way maven
can probably assemble the entire execution chain by looking at the mojos the user has specified for use
in the build process.
1.1 maven-components/maven-core/src/site/apt/lifecycle/lifecycle-phases.xml
Index: lifecycle-phases.xml
===================================================================
<lifecycle>
<phases>
<phase>
<id>generate-sources</id>
</phase>
<phase>
<id>process-sources</id>
</phase>
<phase>
<id>generate-resources</id>
</phase>
<phase>
<id>process-resources</id>
</phase>
<phase>
<id>compile</id>
</phase>
<phase>
<id>process-classes</id>
</phase>
<phase>
<id>generate-test-sources</id>
</phase>
<phase>
<id>process-test-sources</id>
</phase>
<phase>
<id>process-test-resources</id>
</phase>
<phase>
<id>test-compile</id>
</phase>
<phase>
<id>test</id>
</phase>
<phase>
<id>package</id>
</phase>
<phase>
<id>install</id>
</phase>
<phase>
<id>deploy</id>
</phase>
</phases>
</lifecycle>
1.1 maven-components/maven-core/src/site/apt/lifecycle/lifecycle.apt
Index: lifecycle.apt
===================================================================
-----
Maven Lifecycle
-----
The Maven Team
-----
Maven Lifecycle
*Lifecycle Permutations
* Single goal
- session lifecycle[1]
- dep resolution (if POM present)
- dep download[2]
- each downloaded dep is registered in MavenSession to track
snapshot downloads
- goal lifecycle
- goal resolution (we assume here only one goal is resolved)[3]
- download plugin for goal if necessary[2]
- goal execution
+-----+
+------------> goalX
| |
| |
| is the plugin for
| goalX present?
| |
| |
| yes ---+--- no
| | |
| | v
| | download the plugin which
| | contains code for this goal
| | |
| | |
| | v
| | process the plugin descriptor
| | and cache the results
| | |
| | |
| |<---------+
| |
| |
| |
| v
| does goalX have any preGoals?
| |
| yes ---+----- no
| | |
| v |
/----------------/ | foreach(pregoal) |
|each pregoal is | | (1)| (2)| |
|a goal which | +----+ | |
|may be in | | +------->|
|another plugin | | |
/----------------/ | |
| v
| does goalX have any prereqs?
| |
| yes ---+----- no
| | |
| v |
/---------------/ | foreach(prereq) |
|each prereq is | | (1)| (2)| |
|a goal which | | | | |
|may be in | | | |<-------+
|another plugin | | | |
/---------------/ | | |
+------------+ |
| |
| v
| does goalX have any postGoals?
| |
| yes ---+------ no
| | |
| v |
/---------------/ | foreach(postgoal) |
|each postgoal | | (1)| (2)| |
|is a goal which| +-----------+ | |
|may be in | +-------->|
|another plugin | |
/---------------/ |
v
/-----------------------------------------------------------------------------/
| The general form for goal resolution is: |
| |
| [ main-preGoal* ] [ prereq ]* main-goal [ main-postGoal* ] |
| |
| where each goal (whether it be preGoal, prereq, goal, or postGoal) is |
| subject to the same recursive resolution process. |
/-----------------------------------------------------------------------------/
+-----+
* Multiple goals
- session lifecycle[1]
- dep resolution (if POM present)
- dep download[2]
- goal lifecycle (assume that multiple goals are supplied by user)
- goal resolution (assume multiple goals)
- download plugin for goal(s) as necessary[2]
- goal execution
* Implied goals (prereq, preGoal, postGoal)
* Reactor with a single goal
* Reactor with multiple goals
*Use Cases
* where one goal needs to use and modify another goal
eg jcoverage:report does jcoverage:instrument [compiler:compile],
then calls surefire:test (adding classpath entries and changing the test
classes directory parameter)
then does the report
* where a goal does not require a pom.xml (pom-writer, for example).
Here, the pom should NOT be read.
* is there a case where a goal should in fact be executed twice in a
session? Something like "m2 clean foo clean"
- DAG (or DAG-like process) is used for goal resolution, which
should ensure that only explicit multi-calls execute multiple
times.
*Notes
[[1]] POM reading: we always attempt to read the POM but it may not be
present because some goals don't require the presence of a POM like
stub generation, project setup or whatever. So we can flag this state
and throw an exception down the line if a goal in fact requires a
project, or if there is a POM when there shouldn't be one.
[[2]] Artifact downloading: all artifacts downloaded during session
execution should be registered in a session-scope collection of ids
to avoid multiple downloads of -SNAPSHOT and other artifacts.
[[3]] Goal Resolution: Involves resolving all implied goals (via prereqs and
pre- and post-goal decorations), and resolving/downloading any associated
plugins. Plugin download must be a utility used by this, to ensure that
any goals - specified or resolved - have their associated plugins
downloaded.
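The session-scope registry from note [2] could look roughly like this. It is a sketch under assumptions: <<<SessionDownloads>>> is a hypothetical class rather than part of MavenSession, and the artifact id string format is invented for the example.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical session-scope registry of artifact ids already downloaded. */
final class SessionDownloads {
    private final Set<String> downloaded = new HashSet<>();

    /**
     * Registers the id and reports whether a download is still needed:
     * true the first time an id is seen in this session, false afterwards.
     */
    boolean registerIfFirst(String artifactId) {
        return downloaded.add(artifactId);
    }

    public static void main(String[] args) {
        SessionDownloads session = new SessionDownloads();
        String id = "example:plugin:1.0-SNAPSHOT";      // made-up id format
        System.out.println(session.registerIfFirst(id)); // first request: download
        System.out.println(session.registerIfFirst(id)); // repeat request: skip
    }
}
```

Scoping the set to the session means a -SNAPSHOT is fetched at most once per build, while the next session starts with an empty registry and may fetch it again.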