Posted to dev@ant.apache.org by Vitaly Stulsky <vi...@yahoo.com> on 2000/05/10 16:29:51 UTC

Patch: JAVAC dependency tracking and multiple src paths handling

At a glance.
============
1) Enclosed is a patch that adds dependency tracking to the JAVAC task.
2) The patch also allows multiple source locations in the JAVAC srcdir
option.
3) It implements a new JAVAC option, SOURCEPATH.

Full description
================
1) Dependency tracking.
   To switch this feature on, set the JAVAC option dependence="on".

   The algorithm analyzes the class files and tracks dependencies backward.
   Example:
     a) A.java depends on B.java. With the patch applied, if we change B.java,
     ANT automatically recompiles A.java as well. In the current
     implementation the compile list contains only B.java.
     b) If we change Task.java in the ant sources, the current ant
     implementation recompiles only that one file. In the new implementation
     the compile list contains >= 33 files (the number of files depends on
     your own Task extensions).

   For this functionality you have to modify manifest.mf: lib.jar must be
   included in the Class-path.

   In the next few weeks I'll comment the lib.jar source, fix the indentation
   and send the sources to the community. It is also possible to include the
   lib.jar code in ant.jar. If anybody wants to view the sources immediately,
   I'll send them directly on request.

2) With the patch applied it is possible to use src dirs like these:
		<property   name="src.dir"
					value= "com/company/progr/mainprogr;
							com/company/utils/protocol;
							com/company/utils/db;
							com/company/utils/diag;
							com/company/utils/html;
							com/company/utils/sort;
							com/company/utils/string;
							com/company/utils/thread;
							com/company/utils/test;
							com/company/utils/timer"
		/>
   All paths must be separated by semicolons (I anticipate the indignation of
   Unix users, but the current approach can easily be extended to use
   semicolons on the NT platform and colons on Unix).

3) For every project it is necessary to specify the sourcepath option. This
option is passed directly to javac. This functionality follows from the
multiple src paths: the srcdir option no longer has to point at the root of
all packages, so the root dir has to be specified in the sourcepath option.
For our project this functionality is very useful, because we have many
reusable utilities and want to distribute parts of them with different
projects. All utility classes are stored in one upper-level package,
com.company.utils. For every project we have to select from 3 to 10
lower-level packages and include them in the distribution. The approach with
multiple srcdirs makes the build file simpler and easier to understand.

We'll be glad to answer all of your questions.

Regards,
Vitaly Stulsky
mailto:Vitaly_Stulsky@yahoo.com

RE: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Vitaly Stulsky <vi...@yahoo.com>.

> -----Original Message-----
> From: Roger Vaughn [mailto:rvaughn@seaconinc.com]
> Sent: Thursday, May 11, 2000 5:53 PM
> To: ant-dev@jakarta.apache.org
> Subject: Re: Patch: JAVAC dependency tracking and multiple src paths handling
>
>
> Conor MacNeill wrote:
> >
> > Vitaly,
> >
> > Interestingly I have also been working on dependency tracking and multiple
> > source paths.
> >
> >...
> >
> > I also only follow direct relationships. Say you have a scenario of A
> > depends on B and B depends on C but A does not depend on C. If you change C,
> > my approach will only compile C and B but not A. My feeling is that if A
> > does not depend directly on C, changes to C cannot affect A through B. I'm
> > still thinking about that :-)
> >
> > I can't really say which approach is better. I include my source code (and a
> > set of diffs) here to let people see an alternative approach.
> >
> > Cheers
> >
>
> I was about to start on the same problem when I saw this series of
> messages appear.  My approach to the dependencies was to define a
> separate task, as Conor did.  My approach, however, involves no changes
> to the javac task - the dependency generator/checker would delete any
> out of date .class files, thus forcing javac to recompile them
> naturally.  It could instead touch the affected .java files, but this
> would make cvs do extra work on commits and updates, affect backup
> systems, etc.  As I had planned on using Jikes as the dependency
> generator, this approach was guaranteed to be a bit slower.
>
> I haven't done the work yet, but if we want to see a *third* solution,
> I'm still set to do it.

I think it would be better to discuss all the advantages and disadvantages here
and specify our requirements for dependency tracking. If we identify them,
we'll get the work done much faster than by releasing different approaches.
That is my opinion.

> My comments - I really like the separate element approach of Conor's
> srcdirs.  On the other hand, I kinda like the integrated dependency
> checks in Vitaly's solution.  It's likely to be faster since it seems to
> show less file system access than either Conor's or my approach.
>
> BTW, on the direct/indirect question - I spent quite a bit of time
> thinking this one through last year while using jikes/make.  Direct
> dependencies are going to be the biggest problem, but it is possible to
> show *real* indirect dependencies.  Consider this:
>
> public interface A {
> 	static final int x = 1;
> }
>
> public interface B extends A {
> 	static final int y = x + 2;
> }
>
> public class C implements B {
> 	private int z = y;
> }
>
> Ok, it's unlikely, but possible. :-)  Because of the way constants are
> handled by Java, C is definitely dependent on A.

Hm, I did think about such cases before ...
I have to analyse this code and will send you the results of my investigation
later.

> The only complaint I ever had about the jikes dependency system was
> that, even though I know c.class depends on a.java, b.java, and c.java,
> I can't see that c.class would ever depend on a.class or b.class (which
> dependencies jikes *does* generate).
>
> Anyway, it seems indirect dependencies are unlikely, but possible.  I
> would vote for including an "indirect" switch in any of the solutions so
> it can be turned off or on.  Adding indirect recompiles where they
> aren't needed adds a *lot* of build time.
>
>
> Roger Vaughn



Re: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Roger Vaughn <rv...@seaconinc.com>.
Conor MacNeill wrote:
> 
> Vitaly,
> 
> Interestingly I have also been working on dependency tracking and multiple
> source paths.
> 
>...
>
> I also only follow direct relationships. Say you have a scenario of A
> depends on B and B depends on C but A does not depend on C. If you change C,
> my approach will only compile C and B but not A. My feeling is that if A
> does not depend directly on C, changes to C cannot affect A through B. I'm
> still thinking about that :-)
> 
> I can't really say which approach is better. I include my source code (and a
> set of diffs) here to let people see an alternative approach.
> 
> Cheers
> 

I was about to start on the same problem when I saw this series of
messages appear.  My approach to the dependencies was to define a
separate task, as Conor did.  My approach, however, involves no changes
to the javac task - the dependency generator/checker would delete any
out of date .class files, thus forcing javac to recompile them
naturally.  It could instead touch the affected .java files, but this
would make cvs do extra work on commits and updates, affect backup
systems, etc.  As I had planned on using Jikes as the dependency
generator, this approach was guaranteed to be a bit slower.

I haven't done the work yet, but if we want to see a *third* solution,
I'm still set to do it.

My comments - I really like the separate element approach of Conor's
srcdirs.  On the other hand, I kinda like the integrated dependency
checks in Vitaly's solution.  It's likely to be faster since it seems to
show less file system access than either Conor's or my approach.

BTW, on the direct/indirect question - I spent quite a bit of time
thinking this one through last year while using jikes/make.  Direct
dependencies are going to be the biggest problem, but it is possible to
show *real* indirect dependencies.  Consider this:

public interface A {
	static final int x = 1;
}

public interface B extends A {
	static final int y = x + 2;
}

public class C implements B {
	private int z = y;
}

Ok, it's unlikely, but possible. :-)  Because of the way constants are
handled by Java, C is definitely dependent on A.
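
The effect described above can be reproduced in a single source file. The
following is only an illustrative sketch: the type names BaseConstants,
DerivedConstants and ConstantUser are hypothetical stand-ins for A, B and C.
It relies on the fact that the Java compiler resolves references to
compile-time constants to their values, which is also why such a constant is
allowed as a case label.

```java
// Hypothetical stand-ins for the A, B and C types in the message above.
interface BaseConstants {
    int X = 1;        // interface fields are implicitly public static final
}

interface DerivedConstants extends BaseConstants {
    int Y = X + 2;    // constant expression, evaluated by the compiler
}

final class ConstantUser implements DerivedConstants {
    private final int z = Y;

    int z() {
        return z;
    }

    static String classify(int value) {
        switch (value) {
            case Y:   // legal only because Y is a compile-time constant
                return "equals Y";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(new ConstantUser().z()); // prints 3
        System.out.println(classify(3));            // prints equals Y
    }
}
```

Because the value is copied into the class that uses it, changing X and
recompiling only BaseConstants would leave a stale 3 in the already compiled
ConstantUser, which is the indirect dependency being discussed.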

The only complaint I ever had about the jikes dependency system was
that, even though I know c.class depends on a.java, b.java, and c.java,
I can't see that c.class would ever depend on a.class or b.class (which
dependencies jikes *does* generate).

Anyway, it seems indirect dependencies are unlikely, but possible.  I
would vote for including an "indirect" switch in any of the solutions so
it can be turned off or on.  Adding indirect recompiles where they
aren't needed adds a *lot* of build time.
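
As a rough illustration of such a switch, here is a minimal sketch. It is not
taken from any of the patches in this thread: the class RecompileSetSketch,
the method toRecompile and the shape of the dependents map are all
assumptions, and the code needs Java 9 or later for Map.of and List.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Ant.
final class RecompileSetSketch {

    /**
     * dependents maps a class name to the classes that reference it directly.
     * With indirect == false only the changed classes and their direct
     * dependents are returned; with indirect == true the dependents of
     * dependents are followed as well.
     */
    static Set<String> toRecompile(Map<String, List<String>> dependents,
                                   Set<String> changed,
                                   boolean indirect) {
        Set<String> result = new TreeSet<>(changed);
        Deque<String> queue = new ArrayDeque<>(changed);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String dependent : dependents.getOrDefault(current, List.of())) {
                // add() returns true only the first time, which guards against cycles
                if (result.add(dependent) && indirect) {
                    queue.add(dependent);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A depends on B, B depends on C
        Map<String, List<String>> dependents =
                Map.of("C", List.of("B"), "B", List.of("A"));
        System.out.println(toRecompile(dependents, Set.of("C"), false)); // prints [B, C]
        System.out.println(toRecompile(dependents, Set.of("C"), true));  // prints [A, B, C]
    }
}
```

With the switch off, the result matches the direct-only behaviour Conor
describes (C and B, but not A); with it on, A is added to the compile list.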


Roger Vaughn

RE: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Vitaly Stulsky <vi...@yahoo.com>.

> -----Original Message-----
> From: Eugene Bekker [mailto:ebekker@powervision.com]
> Sent: Thursday, May 11, 2000 5:56 PM
> To: ant-dev@jakarta.apache.org
> Subject: Re: Patch: JAVAC dependency tracking and multiple src paths handling
>
>
>
>
> Conor MacNeill wrote:
> >
> > For multiple source paths, I submitted a patch for multiple source paths a
> > while back. I chose to use a comma to separate the source paths since it
> > seemed likely to offend the Windows folks and the Unix folks equally :-) The
> > patch was not applied because of the use of that comma. There was also some
> > desire to support elements instead of just an attribute, something like
> >   <javac>
> >     <srcdir name="xyz">
> >     <srcdir name="abc">
> >   </javac>
>
> I think something along these lines is probably the cleanest/safest
> approach.  But does Ant support nested elements within tasks?  Perhaps not
> yet...  Another solution is to list the files out in a separate file (like
> java.lst) separated by newlines.  Then simply have some sort of an include
> reference.  Although \n is a legal filename character in Unix it's probably
> unlikely to be big problem.  And if it is, the Properties class provides a
> way of escaping any characters.
>
> >
> > I also only follow direct relationships. Say you have a scenario of A
> > depends on B and B depends on C but A does not depend on C. If you change C,
> > my approach will only compile C and B but not A. My feeling is that if A
> > does not depend directly on C, changes to C cannot affect A through B. I'm
> > still thinking about that :-)
>
> But since modifying C, would recompile B, wouldn't A be recompiled since it
> depends on B?

No. Class A depends directly on B and uses only B's interface (by interface
I mean here its public variables and functions), which didn't change.



Re: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Eugene Bekker <eb...@powervision.com>.

Conor MacNeill wrote:
> 
> For multiple source paths, I submitted a patch for multiple source paths a
> while back. I chose to use a comma to separate the source paths since it
> seemed likely to offend the Windows folks and the Unix folks equally :-) The
> patch was not applied because of the use of that comma. There was also some
> desire to support elements instead of just an attribute, something like
>   <javac>
>     <srcdir name="xyz">
>     <srcdir name="abc">
>   </javac>

I think something along these lines is probably the cleanest/safest
approach.  But does Ant support nested elements within tasks?  Perhaps not
yet...  Another solution is to list the files out in a separate file (like
java.lst) separated by newlines.  Then simply have some sort of an include
reference.  Although \n is a legal filename character in Unix it's probably
unlikely to be big problem.  And if it is, the Properties class provides a
way of escaping any characters.

> 
> I also only follow direct relationships. Say you have a scenario of A
> depends on B and B depends on C but A does not depend on C. If you change C,
> my approach will only compile C and B but not A. My feeling is that if A
> does not depend directly on C, changes to C cannot affect A through B. I'm
> still thinking about that :-)

But since modifying C, would recompile B, wouldn't A be recompiled since it
depends on B?


-- 
Eugene Bekker
Chief Architect
PowerVision Corporation
http://www.powervision.com
tel://410/312.7243 cel://443/838.6330

RE: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Vitaly Stulsky <vi...@yahoo.com>.
> Interestingly I have also been working on dependency tracking and multiple
> source paths.
Yeah, it is interesting.

> For multiple source paths, I submitted a patch for multiple source paths a
> while back. I chose to use a comma to separate the source paths since it
> seemed likely to offend the Windows folks and the Unix folks equally :-) The
> patch was not applied because of the use of that comma. There was also some
> desire to support elements instead of just an attribute, something like
>   <javac>
>     <srcdir name="xyz">
>     <srcdir name="abc">
>   </javac>

This is better than my approach. I agree with you.

> I have changed my approach now to use the more "forgiving" ways of
> Project.translatePath. I have added a method to Project called
> translatePathInternal. Whereas translatePath will translate a path to the
> local platform conventions, this new method translates paths, whether
> specified with ":/" or ";\", to a platform independent format based on ':'
> and '/' but supporting DOS drive style paths (C:\...). This path can then be
> tokenized on the ':' character to build the source paths vector. So you can
> specify the path in either Windows or Unix style and it should work on
> either platform. I'm not sure how other platforms react :-) I still haven't
> added the element approach yet.

> For dependency tracking I took a different approach from you. I separated
> dependency analysis into a new task <depend>. The depend task builds a
> dependency file by analysing the given set of class files (either a
> directory or a jar). Javac uses the dependency file to determine which
> classes are affected by classes being recompiled. In theory the dependency
> file could be built in some other way but still be used by Javac. The
> dependency format is pretty simple and could be changed easily.

It is good, but I tried to avoid intermediate files. Of course, creating
intermediate files significantly increases overall dependency tracking
performance. In theory it is better to analyze the .java files and store in
the intermediate files information not only about the affected classes, but
also about class internals.

For example:
  A.java:
     public class A {

     ...
       public void meth() {
         ....
         B.call();
         ....
       }
     ...
     }
     Assume that only meth() uses class B's methods. If we change something
     other than meth(), we shouldn't recompile B.java.

The approach of analyzing .java files covers compiler optimization issues too.


> Example usage
>  <javac srcdir="${src1.dir}:${src2.dir};${src3.dir}"
>            destdir="${build.classes}"
>            classpath="${build.classpath}"
>            debug="on"
>            deprecation="on"
>            depfile="${build.dir}/depfile.txt">
>  </javac>
>  <depend src="${build.classes}" depfile="${build.dir}/depfile.txt" />
>
> I also only follow direct relationships. Say you have a scenario of A
> depends on B and B depends on C but A does not depend on C. If you change C,
> my approach will only compile C and B but not A. My feeling is that if A
> does not depend directly on C, changes to C cannot affect A through B. I'm
> still thinking about that :-)

I think that you've selected the right approach. It isn't necessary to recompile
A when it has no direct relation to the changed classes.

> I can't really say which approach is better. I include my source code (and a
> set of diffs) here to let people see an alternative approach.

Thanks.

Regards,
Vitaly



RE: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Conor MacNeill <co...@m64.com>.
Vitaly,

Interestingly I have also been working on dependency tracking and multiple
source paths.

For multiple source paths, I submitted a patch for multiple source paths a
while back. I chose to use a comma to separate the source paths since it
seemed likely to offend the Windows folks and the Unix folks equally :-) The
patch was not applied because of the use of that comma. There was also some
desire to support elements instead of just an attribute, something like
  <javac>
    <srcdir name="xyz"/>
    <srcdir name="abc"/>
  </javac>

I have changed my approach now to use the more "forgiving" ways of
Project.translatePath. I have added a method to Project called
translatePathInternal. Whereas translatePath will translate a path to the
local platform conventions, this new method translates paths, whether
specified with ":/" or ";\", to a platform independent format based on ':'
and '/' but supporting DOS drive style paths (C:\...). This path can then be
tokenized on the ':' character to build the source paths vector. So you can
specify the path in either Windows or Unix style and it should work on
either platform. I'm not sure how other platforms react :-) I still haven't
added the element approach yet.
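
To make the idea concrete, here is a minimal sketch of that kind of
"forgiving" path-list parsing. It is not the actual translatePathInternal
code: the class PathListSketch, the method splitPathList and the drive-letter
heuristic are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the Project.translatePathInternal implementation.
final class PathListSketch {

    /** Splits on ';' or ':' and normalizes '\' to '/', keeping "C:\..." intact. */
    static List<String> splitPathList(String value) {
        List<String> elements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            boolean separator =
                    ch == ';' || (ch == ':' && !isDriveColon(value, i, current));
            if (separator) {
                addElement(elements, current);
                current.setLength(0);
            } else {
                current.append(ch);
            }
        }
        addElement(elements, current);
        return elements;
    }

    // Heuristic: a colon after a single letter and before a slash is a drive spec.
    private static boolean isDriveColon(String value, int index, StringBuilder current) {
        String soFar = current.toString().trim();
        return soFar.length() == 1
                && Character.isLetter(soFar.charAt(0))
                && index + 1 < value.length()
                && (value.charAt(index + 1) == '\\' || value.charAt(index + 1) == '/');
    }

    private static void addElement(List<String> elements, StringBuilder current) {
        String element = current.toString().trim().replace('\\', '/');
        if (!element.isEmpty()) {
            elements.add(element);
        }
    }

    public static void main(String[] args) {
        System.out.println(splitPathList("src/main:C:\\work\\src;lib/extra"));
        // prints [src/main, C:/work/src, lib/extra]
    }
}
```

The heuristic is ambiguous by nature: a Unix list such as "a:/b" would be read
as drive a rather than as the two entries "a" and "/b", so single-letter
directory names followed by an absolute path are the known weak spot of this
sketch.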


For dependency tracking I took a different approach from you. I separated
dependency analysis into a new task <depend>. The depend task builds a
dependency file by analysing the given set of class files (either a
directory or a jar). Javac uses the dependency file to determine which
classes are affected by classes being recompiled. In theory the dependency
file could be built in some other way but still be used by Javac. The
dependency format is pretty simple and could be changed easily.

Example usage
 <javac srcdir="${src1.dir}:${src2.dir};${src3.dir}"
           destdir="${build.classes}"
           classpath="${build.classpath}"
           debug="on"
           deprecation="on"
           depfile="${build.dir}/depfile.txt">
 </javac>
 <depend src="${build.classes}" depfile="${build.dir}/depfile.txt" />

I also only follow direct relationships. Say you have a scenario of A
depends on B and B depends on C but A does not depend on C. If you change C,
my approach will only compile C and B but not A. My feeling is that if A
does not depend directly on C, changes to C cannot affect A through B. I'm
still thinking about that :-)

I can't really say which approach is better. I include my source code (and a
set of diffs) here to let people see an alternative approach.


Cheers

--
Conor MacNeill
conor@m64.com


Re: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Tom Cook <tc...@ardec.com.au>.
On Wed, 10 May 2000, Eugene Bekker wrote:

> I think the problem is not so much discovering the path separator for the
> current platform, the problem is that you want to write a generic build.xml
> for all platforms, so a standard separator needs to be used.

Ah... didn't think of that...

> One alternative is to add an attribute that specifies what the separator is
> for tasks that need to parse list of dirs/files.  Or perhaps a global
> property that defines one.

Neither of these approaches is really appealing; having to add an extra
attribute to _every_ task which uses a list is something that writers of
build files don't really want to know about, and I believe the general
move is away from global properties.

So how to do it? AFAIK neither ';' nor ':' is a valid character in paths on
either M$ or U*IX systems, so list code could just check for both. As for
constructing lists programmatically, then you would use the one found in
java.io.File.pathSeparatorChar.

There's probably a flaw in this, also, though.

Just a quick question while I'm here. I've been working with a fairly
archaic version of ant for some time, and have only recently gotten the
new sources out and started to play, so I'm not sure how some of this new
fang-dangled stuff works. This should also be a RTFM, but the FM seems a
bit thin on the ground :-)

We have a source tree which is in a CVS repository. When we do a 'cvs
edit' on a file called file_path/file_name.java it creates a file
file_path/CVS/Base/file_name.java which ant compiles _before_ it compiles
the original, which causes all sorts of problems to do with re-defining
classes and so on. I also tend to believe that at least most minor
revisions of a file should compile before the changes are committed. I
think I should be able to exclude all files below a CVS directory with the
'excludes' attribute, but can't work out how. I think this should work:

...
<javac srcdir="${src.dir}" destdir="${build.dir}" classpath="${classpath}"
	debug="off" deprecation="off" excludes="${src.dir}/CVS"/>
...

but it doesn't seem to.

Cheerio
--
Tom Cook - Software Engineer

"The brain is a wonderful organ. It starts functioning the moment you get
up in the morning, and does not stop until you get into the office."
	- Robert Frost

LISAcorp - www.lisa.com.au

--------------------------------------------------
38 Greenhill Rd.          Level 3, 228 Pitt Street
Wayville, SA, 5034        Sydney, NSW, 2000

Phone:   +61 8 8272 1555  Phone:   +61 2 9283 0877
Fax:     +61 8 8271 1199  Fax:     +61 2 9283 0866
--------------------------------------------------



Re: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Eugene Bekker <eb...@powervision.com>.
I think the problem is not so much discovering the path separator for the
current platform, the problem is that you want to write a generic build.xml
for all platforms, so a standard separator needs to be used.

One alternative is to add an attribute that specifies what the separator is
for tasks that need to parse list of dirs/files.  Or perhaps a global
property that defines one.


Tom Cook wrote:
> 
> On Wed, 10 May 2000, Vitaly Stulsky wrote:
> 
> [snip]
> > 2) With applied patch it is possible to use such src dirs:
> >               <property   name="src.dir"
> >                                       value= "com/company/progr/mainprogr;
> >                                                       com/company/utils/protocol;
> >                                                       com/company/utils/db;
> >                                                       com/company/utils/diag;
> >                                                       com/company/utils/html;
> >                                                       com/company/utils/sort;
> >                                                       com/company/utils/string;
> >                                                       com/company/utils/thread;
> >                                                       com/company/utils/test;
> >                                                       com/company/utils/timer
> >               />
> >    All paths must be separated by semicolon ( I estimate Unix users indignation,
> > but now current
> >    approach easily extended to usage semicolon on NT platform and colon on
> > Unix).
> 
> There is a very standard mechanism in Java for discovering the path
> separator:
> 
> package java.io;
> ...
> public class File
> {
>         public static String pathSeparator;
>         public static char pathSeparatorChar;
>         ...
> }
> 
> This is hardly a hard interface to use, instead of hard-coding...
> 
> --
> Tom Cook - Software Engineer
> 
> "The brain is a wonderful organ. It starts functioning the moment you get
> up in the morning, and does not stop until you get into the office."
>         - Robert Frost
> 
> LISAcorp - www.lisa.com.au
> 
> --------------------------------------------------
> 38 Greenhill Rd.          Level 3, 228 Pitt Street
> Wayville, SA, 5034        Sydney, NSW, 2000
> 
> Phone:   +61 8 8272 1555  Phone:   +61 2 9283 0877
> Fax:     +61 8 8271 1199  Fax:     +61 2 9283 0866
> --------------------------------------------------

-- 
Eugene Bekker
Chief Architect
PowerVision Corporation
http://www.powervision.com
tel://410/312.7243 cel://443/838.6330

Re: Patch: JAVAC dependency tracking and multiple src paths handling

Posted by Tom Cook <tc...@ardec.com.au>.
On Wed, 10 May 2000, Vitaly Stulsky wrote:

[snip]
> 2) With applied patch it is possible to use such src dirs:
> 		<property   name="src.dir"
> 					value= "com/company/progr/mainprogr;
> 							com/company/utils/protocol;
> 							com/company/utils/db;
> 							com/company/utils/diag;
> 							com/company/utils/html;
> 							com/company/utils/sort;
> 							com/company/utils/string;
> 							com/company/utils/thread;
> 							com/company/utils/test;
> 							com/company/utils/timer
> 		/>
>    All paths must be separated by semicolon ( I estimate Unix users indignation,
> but now current
>    approach easily extended to usage semicolon on NT platform and colon on
> Unix).

There is a very standard mechanism in Java for discovering the path
separator:

package java.io;
...
public class File
{
	public static String pathSeparator;
	public static char pathSeparatorChar;
	...
}

This is hardly a hard interface to use, instead of hard-coding...
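
For illustration, a minimal sketch of using that field to split and join a
path list in the local platform's convention. The class PlatformPathSketch and
its two methods are hypothetical names; only java.io.File.pathSeparator is the
real API. Note that this handles the native separator only, so a build file
written with ';' would not be split on Unix.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper built around the real File.pathSeparator constant.
final class PlatformPathSketch {

    /** Splits a path list on the platform separator (';' on Windows, ':' on Unix). */
    static List<String> splitNative(String pathList) {
        List<String> elements = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(pathList, File.pathSeparator);
        while (tokens.hasMoreTokens()) {
            String element = tokens.nextToken().trim();
            if (!element.isEmpty()) {
                elements.add(element);
            }
        }
        return elements;
    }

    /** Joins path elements with the platform separator. */
    static String joinNative(List<String> elements) {
        return String.join(File.pathSeparator, elements);
    }

    public static void main(String[] args) {
        String joined = joinNative(List.of("com/company/utils/db", "com/company/utils/diag"));
        System.out.println(joined);
        System.out.println(splitNative(joined));
    }
}
```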

--
Tom Cook - Software Engineer

"The brain is a wonderful organ. It starts functioning the moment you get
up in the morning, and does not stop until you get into the office."
	- Robert Frost

LISAcorp - www.lisa.com.au

--------------------------------------------------
38 Greenhill Rd.          Level 3, 228 Pitt Street
Wayville, SA, 5034        Sydney, NSW, 2000

Phone:   +61 8 8272 1555  Phone:   +61 2 9283 0877
Fax:     +61 8 8271 1199  Fax:     +61 2 9283 0866
--------------------------------------------------