Posted to user@ant.apache.org by "Marijan (Mario) Madunic" <ha...@imag.net> on 2008/10/16 20:09:02 UTC

Cached results?

(Using ANT 1.7.1 on a Windows XP SP3 machine)

Sorry about the not-too-obvious subject line, but I'll do my best to
explain the problem I've been having.

I'm using ant to pipeline a series of transforms on content extracted 
from a db.

There is one main XSLT that does the bulk of the transforms. It's built 
using called templates. Here it is in ANT:

<echo>fixArtistType.xsl</echo>
<xslt in="../doNotDelete/keep.xml" out="../temp/deleteMe.xml" 
style="../xslChunks/fixArtistType.xsl" />

<echo>fixCountries.xsl</echo>
<xslt in="../doNotDelete/keep.xml" out="../temp/deleteMe.xml" 
style="../xslChunks/fixCountries.xsl" />

<echo>getArtists.xsl</echo>
<xslt in="../doNotDelete/keep.xml" out="../temp/deleteMe.xml" 
style="../xslChunks/getArtists.xsl" />

The last one is the main XSLT transform. I am overwriting the output doc
as I don't need it. I use keep.xml as a dummy doc, since I parse a series
of other XML docs to get my desired output.

What is happening is that even though I make changes to the called
templates that getArtists.xsl uses, I get the following:

G:\XML\cdCollection>step2

G:\XML\cdCollection>ant -f G:\XML\cdCollection\antTasks\getArtists.xml
Buildfile: G:\XML\cdCollection\antTasks\getArtists.xml

main:
     [echo] fixArtistType.xsl
     [xslt] Processing G:\XML\cdCollection\doNotDelete\keep.xml to 
G:\XML\cdCollection\temp\deleteMe.xml
     [xslt] Loading stylesheet 
G:\XML\cdCollection\xslChunks\fixArtistType.xsl
     [xslt] START fixArtistType
     [xslt] END fixArtistType
     [echo] fixCountries.xsl
     [echo] getArtists.xsl
   [delete] Deleting directory G:\XML\cdCollection\temp

BUILD SUCCESSFUL
Total time: 0 seconds

When I make a change to getArtists.xsl directly, the total time is 15
seconds and I usually get the output I expected, but now I don't. The
funny thing was that if I added a space anywhere in getArtists.xsl, I
usually got the output I expected. I used to be able to avoid this by
deleting all temp files and moving the result files to a location on a
server; in other words, having a clean structure every time I ran an ANT
process. This all runs fine when using Saxon from the command line without
ANT, but I don't really want to go that route.

So my questions are: Is ANT checking for a cached result set? Does it 
check the files it is to use for the transform and check if they are 
different from a cached set? How do I stop this behaviour?

Any help would be appreciated.

Thanks

Marijan (Mario) Madunic

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@ant.apache.org
For additional commands, e-mail: user-help@ant.apache.org


Re: Cached results?

Posted by "Marijan (Mario) Madunic" <ha...@imag.net>.
Thanks, Dominique and Robert. I've set force="true" but am going to 
check out Dominique's suggestions as a learning exercise.

Marijan (Mario) Madunic

Dominique Devienne wrote:
> On Thu, Oct 16, 2008 at 1:19 PM, Robert Koberg <ro...@koberg.com> wrote:
>   
>> Ant's XSL only checks the primary XSL file for changes (and the source XML -
>> not xi:includes or file entities). It does not parse that XSL to find
>> xsl:import/includes to check if they have changed.
>>
>> You could put the import/includes in the primary XSL or use force=true on
>> the xslt task.
>>     
>
> Or you explicitly check on the included/imported style sheets via
> <uptodate> or AntContrib's <outofdate> separately from <xslt> itself.
> This forces you to put in the build knowledge about the XSL, which is
> not great, but often an acceptable compromise.
>
> Or you patch <xslt> to parse the XSL to discover its includes/imports
> (recursively in those too), to check on the timestamps of those as
> well ;-) That's not too difficult, and since they must appear in the
> "prolog" before the templates themselves, you can stop parsing fairly
> early too. But that involves some programming ;-)  --DD


Re: Cached results?

Posted by Dominique Devienne <dd...@gmail.com>.
On Thu, Oct 16, 2008 at 4:57 PM, Robert Koberg <ro...@koberg.com> wrote:
>> While you're at it, why not find out which result document(s) were
>> generated, especially in XSLT 2.0 where it's no longer an extension
>> and part of the language?
>
> That is really not a problem either, at least for saxon (and we really don't
> have any other choice...).

Indeed, that was my parser of choice in Java land. Now in C++ land,
it's Qt's Patternist, which is inspired in no small part by Saxon, I
believe.

> Saxon exposes a similar resolver interface for output :)
> http://saxonica.com/documentation/javadoc/net/sf/saxon/OutputURIResolver.html

Can't use that in Ant since it's Saxon-specific. But Dr. Kay has been
working on his own <xslt>-extending task recently, so we should at least
have an extension point such that his <saxon> task could record output
documents as well.

Yet I suspect it's not going to be a priority for Dr. Kay ;) --D



Re: Cached results?

Posted by Robert Koberg <ro...@koberg.com>.
On Oct 16, 2008, at 5:47 PM, Dominique Devienne wrote:

> On Thu, Oct 16, 2008 at 1:48 PM, Robert Koberg <ro...@koberg.com> wrote:
>>
>> Actually, there is more. You will want to check all document functions to
>> see what else is being used and if it has changed. I actually have
>> something like that, but it works off a hierarchical config file (something
>> like what apache forrest uses but more recursive in orientation) that is not
>> really.
>
> Ah ah, indeed, I didn't think about doc() calls. But aren't these
> dependent on variables, and the primary XML document's content
> possibly, while the import/include must use URLs only, possibly
> dependent on params only?
>
> I'm no expert at XSLs, but just dealing with import/include would
> already be an improvement.
>
>> Instead of parsing the XSL, I use 2 custom URIResolvers to gather the
>> dependent files and put that into a cache entry object.
>
> This is smart! You're intercepting the calls to discover which
> resources are used. You must then keep a cache in the filesystem,
> though. (There's already a cache impl linked to <uptodate>, I think.)
>
> While you're at it, why not find out which result document(s) were
> generated, especially in XSLT 2.0 where it's no longer an extension
> and part of the language?

That is really not a problem either, at least for saxon (and we really
don't have any other choice...). Saxon exposes a similar resolver
interface for output :)

http://saxonica.com/documentation/javadoc/net/sf/saxon/OutputURIResolver.html

best,
-Rob


> Just kidding, <javac> doesn't compile a file
> if the generated .class for an inner class has been removed, that's
> the same kind of limitation.
>
> Thanks for the precision Rob. --DD


Re: Cached results?

Posted by Dominique Devienne <dd...@gmail.com>.
On Thu, Oct 16, 2008 at 1:48 PM, Robert Koberg <ro...@koberg.com> wrote:
>
> Actually, there is more. You will want to check all document functions to
> see what else is being used and if it has changed. I actually have
> something like that, but it works off a hierarchical config file (something
> like what apache forrest uses but more recursive in orientation) that is not
> really.

Ah ah, indeed, I didn't think about doc() calls. But aren't these
dependent on variables, and the primary XML document's content
possibly, while the import/include must use URLs only, possibly
dependent on params only?

I'm no expert at XSLs, but just dealing with import/include would
already be an improvement.

> Instead of parsing the XSL, I use 2 custom URIResolvers to gather the
> dependent files and put that into a cache entry object.

This is smart! You're intercepting the calls to discover which
resources are used. You must then keep a cache in the filesystem,
though. (There's already a cache impl linked to <uptodate>, I think.)

While you're at it, why not find out which result document(s) were
generated, especially in XSLT 2.0 where it's no longer an extension
and part of the language? Just kidding, <javac> doesn't compile a file
if the generated .class for an inner class has been removed; that's
the same kind of limitation.

Thanks for the precision Rob. --DD



Re: Cached results?

Posted by Robert Koberg <ro...@koberg.com>.
On Oct 16, 2008, at 2:26 PM, Dominique Devienne wrote:

> On Thu, Oct 16, 2008 at 1:19 PM, Robert Koberg <ro...@koberg.com> wrote:
>> Ant's XSL only checks the primary XSL file for changes (and the source XML -
>> not xi:includes or file entities). It does not parse that XSL to find
>> xsl:import/includes to check if they have changed.
>>
>> You could put the import/includes in the primary XSL or use force=true on
>> the xslt task.
>
> Or you explicitly check on the included/imported style sheets via
> <uptodate> or AntContrib's <outofdate> separately from <xslt> itself.
> This forces you to put in the build knowledge about the XSL, which is
> not great, but often an acceptable compromise.
>
> Or you patch <xslt> to parse the XSL to discover its includes/imports
> (recursively in those too), to check on the timestamps of those as
> well ;-) That's not too difficult, and since they must appear in the
> "prolog" before the templates themselves, you can stop parsing fairly
> early too. But that involves some programming ;-)  --DD

Actually, there is more. You will want to check all document functions
to see what else is being used and if it has changed. I actually
have something like that, but it works off a hierarchical config file
(something like what apache forrest uses but more recursive in
orientation) that is not really.

Instead of parsing the XSL, I use 2 custom URIResolvers to gather the  
dependent files and put that into a cache entry object. One resolver  
is set on the factory to resolve import/includes and one is set on the  
transformer to catch all the document() resolves. So at task init the  
factory's cache entry gets checked to see if the actual stylesheet has  
changed. Then for each transform, the relevant entry is checked to see  
if the transform should proceed.

I suppose it could be set up to allow for user URIResolvers as long as
they implement some yet-to-be-determined interface that extends
URIResolver.

Make sense and/or sound good?

I might have some time in the next few weeks to make it more generic  
for Ant.

best,
-Rob 



Re: Cached results?

Posted by Dominique Devienne <dd...@gmail.com>.
On Thu, Oct 16, 2008 at 1:19 PM, Robert Koberg <ro...@koberg.com> wrote:
> Ant's XSL only checks the primary XSL file for changes (and the source XML -
> not xi:includes or file entities). It does not parse that XSL to find
> xsl:import/includes to check if they have changed.
>
> You could put the import/includes in the primary XSL or use force=true on
> the xslt task.

Or you explicitly check on the included/imported style sheets via
<uptodate> or AntContrib's <outofdate> separately from <xslt> itself.
This forces you to put in the build knowledge about the XSL, which is
not great, but often an acceptable compromise.

Or you patch <xslt> to parse the XSL to discover its includes/imports
(recursively in those too), to check on the timestamps of those as
well ;-) That's not too difficult, and since they must appear in the
"prolog" before the templates themselves, you can stop parsing fairly
early too. But that involves some programming ;-)  --DD
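
The <uptodate> check suggested above could look roughly like the sketch
below. It assumes every stylesheet that getArtists.xsl imports or includes
lives under ../xslChunks. The property name artists.uptodate, the target
names, the project name, and the ../results/artists.xml output path are
made up for illustration and are not from the original build file.

```xml
<project name="xslt-deps-sketch" default="transform" basedir=".">

  <!-- Sets artists.uptodate only if the output file exists and is newer
       than the input document and every .xsl file under ../xslChunks. -->
  <target name="check-artists">
    <uptodate property="artists.uptodate"
              targetfile="../results/artists.xml">
      <srcfiles dir="../doNotDelete" includes="keep.xml"/>
      <srcfiles dir="../xslChunks" includes="**/*.xsl"/>
    </uptodate>
  </target>

  <!-- Skipped entirely when the property was set by the check above. -->
  <target name="transform" depends="check-artists" unless="artists.uptodate">
    <!-- force="true" is needed here: <xslt> on its own compares only
         keep.xml and getArtists.xsl against the output, so it would
         still skip the run when only an included stylesheet changed. -->
    <xslt in="../doNotDelete/keep.xml" out="../results/artists.xml"
          style="../xslChunks/getArtists.xsl" force="true"/>
  </target>

</project>
```

The trade-off is the one already noted: the build file has to know where
the dependent stylesheets live, and any file in that directory counts as a
dependency whether or not the main stylesheet actually uses it.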



Re: Cached results?

Posted by Robert Koberg <ro...@koberg.com>.
Ant's XSL only checks the primary XSL file for changes (and the source  
XML - not xi:includes or file entities). It does not parse that XSL to  
find xsl:import/includes to check if they have changed.

You could put the import/includes in the primary XSL or use force=true  
on the xslt task.

best,
-Rob
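
A minimal sketch of the force="true" option applied to the last transform
from the question. The paths and the "main" target name come from the
original post and its log; the project name is assumed.

```xml
<project name="getArtists" default="main" basedir=".">
  <target name="main">
    <echo>getArtists.xsl</echo>
    <!-- force="true" makes <xslt> rerun the transform even when the
         output file is newer than both keep.xml and getArtists.xsl. -->
    <xslt in="../doNotDelete/keep.xml" out="../temp/deleteMe.xml"
          style="../xslChunks/getArtists.xsl" force="true"/>
  </target>
</project>
```

With force set, the transform runs on every build regardless of
timestamps, so changes in imported or included stylesheets are picked up,
at the cost of always paying the full transform time.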




Re: Cached results?

Posted by "Marijan (Mario) Madunic" <ha...@imag.net>.
Never mind, found the solution.

 force="true" on the xslt task

Marijan (Mario) Madunic
