You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@ant.apache.org by "Frank E. Weiss" <fr...@well.com> on 2001/09/22 23:10:19 UTC

Is there an Ant Processing Theory?

I really think ant is a good thing. I like the fact that is uses XML,
but I suspect that some users don't really understand XML and where they
can go with it. I also like its cross-platform extensibility via Java
plug-ins.

Aside from that, I've been wondering what depth there is to ant, or if
it is falls into the genre of ad-hoc of tools. Looking through the
documentation I'm a bit disappointed. For example, the assertion "Ant
resolves these dependencies" is a bit of hand waving, IMHO. Isn't there
some science to it?

Perhaps somebody has worked this out. I'd like to get pointers to that
work. In the meantime, I've attached a rough draft to give you an idea
of what I'm looking for.

Re: Is there an Ant Processing Theory?

Posted by Peter Donald <do...@apache.org>.

On Sun, 23 Sep 2001 07:10, Frank E. Weiss wrote:
> Aside from that, I've been wondering what depth there is to ant, or if
> it is falls into the genre of ad-hoc of tools. Looking through the
> documentation I'm a bit disappointed. For example, the assertion "Ant
> resolves these dependencies" is a bit of hand waving, IMHO. Isn't there
> some science to it?

Not really at this stage. Unfortunately not only is some of the documentation 
a bit on the "hand waving" side, so is some of the code. A lot of these 
things will be clarified in Ant2 I hope ... 

maybe if you were to help us get the documentation straight that would be 
easier ;)

> Perhaps somebody has worked this out. I'd like to get pointers to that
> work. In the meantime, I've attached a rough draft to give you an idea
> of what I'm looking for.

looks good... here are my comments inline ...

> <h1>Ant Processing Theory - a draft</h1>
> <p>Date: 22 Sept 2001<br>
> Author: Frank Weiss <a href="mailto:frankw@well.com">frankw@well.com</a></p>
> <p>Ant performs a number of operations. Part of the philosophy of Ant is to
> minimize the set of operations it is responsible for.</p>

I would also mention that mostly the philosophy is to push as much complexity 
as possible into Tasks rather than keeping it in runtime.

> <ol>
>   <li>parameterization - the property tag</li>
>   <li>configuration management - selecting targets to be built - also
>     parameterizing builds</li>
>   <li>dependency cache management - deciding which tasks have up-to-date 
> outputs
>     cached - can also be called build process optimization</li>

Currently Ant doesn't really do anything at this level. We have certain tasks 
that do it but no unified model. Eventually Ant2 will have a unified model 
(hopefully) and that will include "coloring" (see below).

>   <li>task scheduling - sequencing and concurrently performing tasks</li>
> </ol>
> <h3>Graph theory - generating the dependency digraph D</h3>
> <p>Use target[@name] and target[@depends] to construct a digraph D. Each 
> node in D is a target, identified by the target's name. The depends 
> attribute of each target defines the edges, where each&nbsp; edge is 
> incident from (for all targets t1)&nbsp; target t1 to the targets named in 
> t1's depends list).
> Ultimately, this possibly cyclical digraph needs to be reduced to a simple 
> tree, the depth-first traversal of which is the basis for task 
> scheduling.</p>
> <p>Questions:</p>
> <ol>
>   <li>Is D rooted at the default or specified target?</li>
>   <li>If multiple targets are given in the command line, does D have 
>    multiple roots?</li>

Ant doesn't really have a root per-se in it's DAG. However we could say that 
the initial node for a specified run is sort-of a root node ;) As each 
separate target on commandline is a separate starting point it could be 
considered to have multiple roots.

However in the current implementation each run in same JVM will carryover 
certain properties and settings. Hopefully in Ant2 this will not be the case.

> <li>Is D constructed for the entire project or only the smallest subset that
>     contains all targets?</li>

Nope the full DAG is currently created. In Ant2 we will hopefully allow lazy 
construction of DAGs (mainly because the DAG will hopefully span multiple 
build files).

>  <li>Are conditions evaluated while constructing D or only before a target 
>  is fired?</li>

only before target is fired.

> <li>Are conditions used to prune D at construction time?</li>

nope - see above ;)

> <li>Can properties be used to parameterize the construction of D? That is, 
> say there is a function D(p) which constructs a digraph using p as a 
> parameter.
>   Then there exists a p1 and p2 such that p1!=p2 and D(p1)!=D(p2).</li>

No - it has been asked for but rejected a few times. Ant2 will kinda allow 
this if someone wants to implement it but such implementations will never 
be kept in Ants CVS.

>   <li>Since the depends attribute of a target defines a subset of edges in 
> D, is  there a way to create this subset of edges dynamically instead of 
> statically by the set of targets listed in the depends attribute? Is that 
> what filesets do?</li>

Filesets are completely different and just dynamically create a set of data. 
While depends specifies a *static* set of operation groups/targets to execute 
(where operations roughly corespond to tasks).

> </ol>
> <h3>Remove cycles from D.</h3>
> <p>Questions:</p>
> <ol>
>   <li>Are cycles removed statically, before dependency reduction, or as a 
> result of dependency reduction?</li>

Currently cycles are detected before anything executes and an exception is 
thrown. Sub graphs that have already been traversed are detected during 
traversal. In Ant2, detection of cycles and identification of common 
sub-graphs will be done during graph traversal. This is mainly to support 
lazy construction of DAGs.

> <li>Result of removing cycles is a meshed forest or a simple forest?</li>

No idea what that means - sorry. I haven't done graph theory for about 5 
years ;)

>   <li>Is the goal really a tree rooted at the project, whose children are 
> subset of the explicitly given targets?</li>

Well as arcs are directed (being a Directed Acyclic Graph) and assuming you 
start at one node, a tree will be formed that has those properties. Not sure 
if that answers your question? ;)

> </ol>
> <h3>Dependency reduction</h3>
> <p>Dependencies can be used for two purposes:</p>
> <ol>
>   <li>To define a caching scheme whereby the output of a task can be stored 
> to avoid having to run the task repeatedly</li>

this is unfortunately completely up to the task. We have talked about having 
a general dependency "service" which tasks could hook into and managing this 
dependency all centrally. This may happen in long run but not in near future. 

We have also talked about the ability to "color" dependencies with specific 
attributes. For instance .class files have static dependency (ie mapping 
between .java and .class file) but they also may have been colored by options 
used to compile file. For instance you may color the dependency with 
"debug=true". This would mean that if you later set debug to false then the 
javac task would detect that .class file is out of date and recompile it.

I prototyped part of this functionality ages ago but it is a large amount of 
hardish programming (at least for graph-theory weenies like myself). It would 
also require extensive refactoring throughout ants code base and a lot of 
work. So unless someone else steps up this wont be done for ages.

>   <li>To declare the order in which tasks need to be performed</li>
> </ol>
> <p>There are really two rationales for this. Primarily, the <i>make</i> 
> pattern is used to speed the incremental building of a project. 
> Intermediate files are basically caches of a task. The trick is to figure 
> out if the cached output can be used instead of performing the task. 

Right. Currently we requre each task to do it's own dependency management. In 
the future we will be able to centralize dependency management (hopefully) so 
you could do something like the following from within a task.

DependencyService ds = ...;

ds.reset();
ds.addColor( "debug", "true" );
ds.addMapping( new FileExtMapping( ".java", ".class" ) );
ds.addInputSet( (ItemSet)myFileSet );
ds.setScanner( new FileScanner( myDest ) );

ResultSet rs = ds.compile();

if( !rs.isUptodate() )
{
  List filesToCompile = rs.getOutofDateItems();
  ...compile files here...
  rs.updateDependencyCache();
}

or something like that.

> However there is also a secondary rationale, which is modularity. Instead 
> of writing a script that procedurally defines the process, a set of 
> declarations are maintained that define the relationships between the 
> subtasks.&nbsp; (There is a parallel with SQL, where a query, such as 
> SELECT, defines <i>what</i> data is requested, but the RDBMS
> creates the query <i>plan</i> depending on what indexes, etc., are 
> available.)

The problem with this is that tasks are extensible and in reality the runtime 
can not guarentee that it knows all the effects of every task and thus can 
not create a Plan as such.

> The second rational also permits using the same document to build 
> subprojects and to parameterize the build.</p>

The problem with defining reuse of operations in this manner is that it 
virtually mandates other significant changes to ants core. Recursive 
resolution of properties, mutable properties, scope mutability and several 
other complexities would need to be managed by ant. 

The way I would prefer to reuse operations is to define rules in another 
document using a mechanism such as XSLT to perform reuse.

> <h3>Task scheduling</h3>
> <p>In general , this is a DFS of the final D, after removing cycles and 
> reducing dependencies</p>
> <p>Questions:</p>
> <ol>
>   <li>Is any implicit concurrency in the final D used to schedule 
>   tasks?</li>

Unfortunately many people assume that the order of targets in depends target 
is the order in which they are implemented and this is how it executes 
*unless* the targets are a common subgraph that has already been executed. So 
in a way there is almost an implicit set of dependencies between targets 
specified in depends attribute ;( We thought about cleaning this so that 
dependendencies are all explicit but currently we are unsure whether this 
should occur. Until it does occur this is not really viable as many build 
files will break.

>   <li>If the final D is a meshed forest, have cross-links been removed and 
> by what rules?</li>
> </ol>

no idea - whats that mean ? ;)

Anyways if you would be able to document these things better I am sure 
ant-dev would be *very* appreciative. It will probably highlight 
issues/faults with our current model aswell ;) 

-- 
Cheers,

Pete

*------------------------------------------------------*
|  Hlade's Law: If you have a difficult task, give it  |
|     to a lazy person -- they will find an easier     |
|                    way to do it.                     |
*------------------------------------------------------*