Posted to xml-commons-dev@xerces.apache.org by Brian Smith <br...@uiowa.edu> on 2002/11/13 08:52:43 UTC

Resolver Library and Ant Task

Hi,

I created an Ant task that is very similar to the current <style> task 
except that it uses the XML-Commons resolver library instead of the 
<xmlcatalog> Ant type (among other things). You use it like this:

<catalogset>
     <catalog file="doc/smt.cat"/>
     <catalog file="lib/xhtml.cat"/>
</catalogset>

where doc/smt.cat is an OASIS XML Catalog and xhtml.cat is the TR??? 
catalog that comes with the XHTML 1.1 spec download. There is no support 
for specifying any part of the catalog inside the build file; all 
catalogs must be in external files. The rationale is that if you put 
your catalogs inside the build script, you cannot share them with other 
tools like NetBeans that can make use of them.

It seems to work well for my (simple) use cases. I am using 
CatalogResolver as the entity/URI resolver for the transformation. I 
have noticed the following issues and I would like to know how I can 
solve them:

(1) The CatalogResolver repeatedly prints out "Cannot find 
CatalogManager.properties" even when I am using a private catalog. It 
seems that the only way to shut this warning off is to set a system 
property or include a catalog.properties file in the classpath. I don't 
think that either of these things is a good thing for an Ant task to do. 
  I need some way to tell the Catalog library not to output any 
messages, not to use any system catalogs, etc. without setting any 
system properties, etc.

(2) In general, it would be very good if the library worked in such a 
way that the system properties/catalogs were only used if they were 
explicitly designated to be used. For example, it seems like there could 
be a single instance of Catalog that represents all the system catalog 
files, and any application that wants to use the system catalog files 
would create the proper entry in their own catalog that delegates to 
this (shared) system catalog. For example (see below too):
      ResolvingXMLReader reader = new ResolvingXMLReader();
      // The reader's catalog is empty
      reader.getCatalog().addCatalog(Catalog.getSystemCatalog());
      // Now the reader has access to the system catalog

(3) I have found that I would like to have an instance of Catalog 
delegate to another instance of Catalog. Currently, this doesn't seem to 
be possible without subclassing. The idea is to group sets of catalogs 
together in different configurations, such that each catalog file is 
only read once, like this:
     Catalog a = new Catalog();
     a.parseCatalog(someFile);
     a.parseCatalog(someOtherFile);
     Catalog b = new Catalog();
     b.parseCatalog(yetAnotherFile);
     b.addCatalog(a);  // b will now delegate to a as necessary

The current mechanism seems to require that the catalog files must be 
parsed for each catalog they belong to (i.e. "someFile" and 
"someOtherFile" would have to be parsed twice in the above example).

(4) CatalogResolver works fine, but you have to be careful when using 
the document() function in XSLT. The XSLT engine isn't guaranteed to use 
the same entity resolver that it uses for the main document. I had to 
subclass it like this:

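// A single XMLReader shared by every document the processor loads.
final XMLReader reader = XMLReaderFactory.createXMLReader();
CatalogResolver resolver = new CatalogResolver(true) {
     // URIResolver.resolve takes the href first, then the base URI.
     public Source resolve(String href, String base)
         throws TransformerException {
         SAXSource result = (SAXSource) super.resolve(href, base);
         // Attach the shared reader to the resolved source.
         result.setXMLReader(reader);
         return result;
     }
};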

In other words, I make sure that all documents use the same XMLReader so 
that they will use the same EntityResolver.
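
The same idea can be sketched with plain JAXP and SAX classes only, without the resolver library. This is a minimal sketch under assumptions, not the resolver library's code: the class name SharedResolverUriResolver is hypothetical, and it creates a fresh XMLReader per document (rather than sharing one reader) and attaches the same EntityResolver to each.

```java
import java.net.URI;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

/** Hypothetical helper: every resolved document gets the same EntityResolver. */
final class SharedResolverUriResolver implements URIResolver {
    private final EntityResolver entityResolver;

    SharedResolverUriResolver(EntityResolver entityResolver) {
        this.entityResolver = entityResolver;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            // New namespace-aware reader for each document, same resolver on all.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setEntityResolver(entityResolver);

            // Resolve a relative href against the base URI when one is given.
            String systemId = (base == null || base.isEmpty())
                    ? href
                    : URI.create(base).resolve(href).toString();
            return new SAXSource(reader, new InputSource(systemId));
        } catch (ParserConfigurationException | SAXException
                 | IllegalArgumentException e) {
            throw new TransformerException(e);
        }
    }
}
```

An instance would be passed to Transformer.setURIResolver (or TransformerFactory.setURIResolver) with a catalog-backed EntityResolver as its argument. Creating a reader per document avoids asking one XMLReader to start a second parse while it is still busy with another.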

- Brian






Re: Resolver Library and Ant Task

Posted by Craeg Strong <vz...@verizon.net>.
Brian Smith wrote:

>  My XSLT task doesn't work the same way as the current <style>
> task; I created it to handle a specialized set of documentation. I
> happened to use a different (simpler) way of handling catalogs, but also
> there are other small differences. For example, it uses a mapper to map
> from the source file name to the destination file name (instead of just
> changing extensions).

This is IMO a useful extension.  Would you consider making a patch
to the current Ant for this?
You could refactor the current <xslt> task to use an implicit mapper,
then include the ability to specify a mapper if desired.
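
A rough sketch of that refactoring, in plain Java rather than against the Ant API: all names here (NameMapper, ExtensionMapper, TransformTaskSketch) are hypothetical stand-ins, and the ".html" default is an assumption for illustration. The task holds an implicit extension-changing mapper and lets the build file replace it.

```java
import java.util.Objects;

/** Hypothetical stand-in for a file-name mapper; not Ant's own interface. */
interface NameMapper {
    String map(String sourceFileName);
}

/** Implicit default: replace the source extension with a fixed one. */
final class ExtensionMapper implements NameMapper {
    private final String targetExtension;

    ExtensionMapper(String targetExtension) {
        this.targetExtension = Objects.requireNonNull(targetExtension);
    }

    @Override
    public String map(String sourceFileName) {
        int slash = Math.max(sourceFileName.lastIndexOf('/'),
                             sourceFileName.lastIndexOf('\\'));
        int dot = sourceFileName.lastIndexOf('.');
        // Only strip a dot that belongs to the file name, not to a directory.
        String stem = dot > slash ? sourceFileName.substring(0, dot)
                                  : sourceFileName;
        return stem + targetExtension;
    }
}

/** Hypothetical task skeleton showing only the mapper handling. */
final class TransformTaskSketch {
    // Used when the build file does not specify a mapper (assumed default).
    private NameMapper mapper = new ExtensionMapper(".html");

    void setMapper(NameMapper mapper) {
        this.mapper = Objects.requireNonNull(mapper);
    }

    String destinationFor(String sourceFileName) {
        return mapper.map(sourceFileName);
    }
}
```

With this shape, existing build files keep the extension-changing behavior, and only builds that set a mapper see anything different.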


> It passes some extra parameters to the stylesheet
> that give the stylesheet path information for the source/destination
> files that is helpful for my particular application.

<xslt> today supports passing arbitrary parameters to the stylesheet.
Why hardcode those into your ant task?  It seems like the default
Ant would work fine for this....

> I might also add
> the ability to do chained transformations if I ever have the need for them.

Work has been done here, check out: http://www.langdale.com.au/styler/
or other ant "external tasks" at http://jakarta.apache.org/ant/external.html

I think there is room for <xslt> to be enhanced to handle such things.
The question is: do you want the output of one transformation to be automatically
handed to another, without getting written to disk first?
That could be useful.
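
For what it is worth, plain JAXP can already chain two stylesheets in memory. The following is a minimal sketch, not code from <xslt> or from the styler task: the class and method names are hypothetical, and it assumes the TransformerFactory in use supports SAXTransformerFactory. The first transformation writes SAX events straight into the second, so nothing is written to disk in between.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper that pipes one transformation into another. */
final class ChainedTransform {

    static String transformTwice(String inputXml, String firstXslt,
                                 String secondXslt) throws TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new TransformerException("SAX chaining is not supported");
        }
        SAXTransformerFactory stf = (SAXTransformerFactory) tf;

        // Second stage: consumes SAX events and serializes its result.
        TransformerHandler second = stf.newTransformerHandler(
                new StreamSource(new StringReader(secondXslt)));
        StringWriter out = new StringWriter();
        second.setResult(new StreamResult(out));

        // First stage: its output goes to the second stage as SAX events.
        Transformer first = stf.newTransformer(
                new StreamSource(new StringReader(firstXslt)));
        first.transform(new StreamSource(new StringReader(inputXml)),
                        new SAXResult(second));
        return out.toString();
    }
}
```

A task built this way would only need the list of stylesheets in order; the intermediate documents never exist as files.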

Of course it is your prerogative to build your own custom Ant tasks.
However, in the case where you require only minor enhancements/adjustments
to an existing task, I urge you to consider working with the Jakarta team
to make them happen in the Ant code base.
I would definitely recommend that you float your ideas on the ant-dev
mailing list.   Every Ant enhancement has started out as a developer's itch
that needed to be scratched....

Regards,

--Craeg


Re: Resolver Library and Ant Task

Posted by Brian Smith <br...@uiowa.edu>.
Craeg Strong wrote:
> Hmm.  I am not sure if you are aware of it, but my patch that supports
> entity catalogs using the resolver library was just accepted into the Ant
> code base:

Yes, I noticed that when I was reading over the archives for this 
mailing list. My XSLT task doesn't work the same way as the current <style> 
task; I created it to handle a specialized set of documentation. I 
happened to use a different (simpler) way of handling catalogs, but also 
there are other small differences. For example, it uses a mapper to map 
from the source file name to the destination file name (instead of just 
changing extensions). It passes some extra parameters to the stylesheet 
that give the stylesheet path information for the source/destination 
files that is helpful for my particular application. I might also add 
the ability to do chained transformations if I ever have the need for them.

>>The current mechanism seems to require that the catalog files must be
>>parsed for each catalog they belong to (i.e. "someFile" and
>>"someOtherFile" would have to be parsed twice in the above example).
> 
> Not sure about this one.  Are you talking about "nextCatalog" entries in the
> catalog files?

Similar to nextCatalog, but without requiring the creation of a "master" 
catalog file for every set of catalogs that are to be used. Although 
maybe it would be better to create these master catalog files, now that 
I think about it...

- Brian




Re: Resolver Library and Ant Task

Posted by Craeg Strong <vz...@verizon.net>.
Hello:

(see comments below)

Brian Smith wrote:

> Hi,
>
> I created an Ant task that is very similar to the current <style> task
> except that it uses the XML-Commons resolver library instead of the
> <xmlcatalog> Ant type (among other things).

Hmm.  I am not sure if you are aware of it, but my patch that supports
entity catalogs using the resolver library was just accepted into the Ant
code base:

http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-ant/docs/manual/CoreTypes/xmlcatalog.html?rev=1.3&content-type=text/plain

As you can see from the above, XMLCatalog now accepts both in-line and
external file entries.
I originally developed this code more than 6 months ago, just slightly too
late for the 1.5 final release (by about two days :-(  ), so it had to wait
until 1.6.   It is now in the 1.6 (alpha, not yet released) Ant codebase,
obtainable via CVS snapshot or nightly build...

> It seems to work well for my (simple) use cases. I am using
> CatalogResolver as the entity/URI resolver for the transformation. I
> have noticed the following issues and I would like to know how I can
> solve them:
>
> (1) The CatalogResolver repeatedly prints out "Cannot find
> CatalogManager.properties" even when I am using a private catalog. It
> seems that the only way to shut this warning off is to set a system
> property or include a catalog.properties file in the classpath. I don't
> think that either of these things is a good thing for an Ant task to do.
>   I need some way to tell the Catalog library not to output any
> messages, not to use any system catalogs, etc. without setting any
> system properties, etc.

Yes.  I dealt with this problem in the same way Norm indicates:

http://cvs.apache.org/viewcvs/jakarta-ant/src/main/org/apache/tools/ant/types/resolver/ApacheCatalogResolver.java?rev=1.3&content-type=text/vnd.viewcvs-markup

public class ApacheCatalogResolver extends CatalogResolver {

    /** The XMLCatalog object to callback. */
    private XMLCatalog xmlCatalog = null;

    static
    {
        //
        // If you don't do this, you get all sorts of annoying
        // warnings about a missing properties file.  However, it
        // seems to work just fine with default values.  Ultimately,
        // we should probably include a "CatalogManager.properties"
        // file in the ant jarfile with some default property
        // settings.  See CatalogManager.java for more details.
        //
        CatalogManager.ignoreMissingProperties(true);

        //
        // Make sure CatalogResolver instantiates ApacheCatalog,
        // rather than a plain Catalog
        //
        System.setProperty("xml.catalog.className",
                           ApacheCatalog.class.getName());

        // debug
        // System.setProperty("xml.catalog.verbosity", "4");
    }
...etc...

>
>
> (2) In general, it would be very good if the library worked in such a
> way that the system properties/catalogs were only used if they were
> explicitly designated to be used. For example, it seems like there could
> be a single instance of Catalog that represents all the system catalog
> files, and any application that wants to use the system catalog files
> would create the proper entry in their own catalog that delegates to
> this (shared) system catalog. For example (see below too):
>       ResolvingXMLReader reader = new ResolvingXMLReader();
>       // The reader's catalog is empty
>       reader.getCatalog().addCatalog(Catalog.getSystemCatalog());
>       // Now the reader has access to the system catalog

See Norm's reply.. I used the default catalog, as per his message...

> (3) I have found that I would like to have an instance of Catalog
> delegate to another instance of Catalog. Currently, this doesn't seem to
> be possible without subclassing. The idea is to group sets of catalogs
> together in different configurations, such that each catalog file is
> only read once, like this:
>      Catalog a = new Catalog();
>      a.parseCatalog(someFile);
>      a.parseCatalog(someOtherFile);
>      Catalog b = new Catalog();
>      b.parseCatalog(yetAnotherFile);
>      b.addCatalog(a);  // b will now delegate to a as necessary
>
> The current mechanism seems to require that the catalog files must be
> parsed for each catalog they belong to (i.e. "someFile" and
> "someOtherFile" would have to be parsed twice in the above example).

Not sure about this one.  Are you talking about "nextCatalog" entries in the
catalog files?

> (4) CatalogResolver works fine, but you have to be careful when using
> the document() function in XSLT. The XSLT engine isn't guaranteed to use
> the same entity resolver that it uses for the main document. I had to
> subclass it like this:
>
> final XMLReader reader = XMLReaderFactory.createXMLReader();
> CatalogResolver resolver = new CatalogResolver(true) {
>      public Source resolve(String href, String base)
>          throws TransformerException {
>          SAXSource result = (SAXSource) super.resolve(href, base);
>          result.setXMLReader(reader);
>          return result;
>      }
> };
>
> In other words, I make sure that all documents use the same XMLReader so
> that they will use the same EntityResolver.

Yes.  I went through exactly the same learnings, made the appropriate fixes,
and documented them in the code:

http://cvs.apache.org/viewcvs.cgi/jakarta-ant/src/main/org/apache/tools/ant/types/XMLCatalog.java?rev=1.17&content-type=text/vnd.viewcvs-markup

/**
     * <p>This is called from the URIResolver to set an EntityResolver
     * on the SAX parser to be used for new XML documents that are
     * encountered as a result of the document() function, xsl:import,
     * or xsl:include.  This is done because the XSLT processor calls
     * out to the SAXParserFactory itself to create a new SAXParser to
     * parse the new document.  The new parser does not automatically
     * inherit the EntityResolver of the original (although arguably
     * it should).  See below:</p>
     *
     * <tt>"If an application wants to set the ErrorHandler or
     * EntityResolver for an XMLReader used during a transformation,
     * it should use a URIResolver to return the SAXSource which
     * provides (with getXMLReader) a reference to the XMLReader"</tt>
     *
     * <p>...quoted from page 118 of the Java API for XML
     * Processing 1.1 specification</p>
     *
     */
    private void setEntityResolver(SAXSource source)
            throws TransformerException {

        XMLReader reader = source.getXMLReader();
        if (reader == null) {
            SAXParserFactory spFactory = SAXParserFactory.newInstance();
            spFactory.setNamespaceAware(true);
            try {
                reader = spFactory.newSAXParser().getXMLReader();
            }
            catch (ParserConfigurationException ex) {
                throw new TransformerException(ex);
            }
            catch (SAXException ex) {
                throw new TransformerException(ex);
            }
        }
        reader.setEntityResolver(this);
        source.setXMLReader(reader);
    }

--Craeg

>
>
> - Brian


Re: Resolver Library and Ant Task

Posted by Norman Walsh <nd...@nwalsh.com>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

/ Brian Smith <br...@uiowa.edu> was heard to say:
| (1) The CatalogResolver repeatedly prints out "Cannot find
| CatalogManager.properties" even when I am using a private catalog. It
| seems that the only way to shut this warning off is to set a system
| property or include a catalog.properties file in the classpath. I
| don't think that either of these things is a good thing for an Ant
| task to do. I need some way to tell the Catalog library not to output
| any messages, not to use any system catalogs, etc. without setting any
| system properties, etc.

The 1.1 release will have a more flexible catalog manager system.
In the meantime, if you set the system property xml.catalog.ignoreMissing
to any non-null value, the messages should stop.
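
A small sketch of that workaround, scoped so the property does not stay set for the rest of the build. The class and method names are hypothetical; only the property name xml.catalog.ignoreMissing comes from the message above. When the resolver library actually reads the property is an assumption here, so the resolver should be created inside the action, and the setting is still process-wide while the action runs.

```java
/** Hypothetical helper: set xml.catalog.ignoreMissing only for one action. */
final class IgnoreMissingCatalogProperties {
    static final String PROPERTY = "xml.catalog.ignoreMissing";

    /** Runs the action with the property set, then restores the old value. */
    static void runQuietly(Runnable action) {
        String previous = System.getProperty(PROPERTY);
        // Any non-null value is said to be enough; "true" is an arbitrary choice.
        System.setProperty(PROPERTY, "true");
        try {
            action.run();
        } finally {
            if (previous == null) {
                System.clearProperty(PROPERTY);
            } else {
                System.setProperty(PROPERTY, previous);
            }
        }
    }
}
```

This does not remove the objection to touching system properties from an Ant task; it only limits how long the change is visible.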

| (2) In general, it would be very good if the library worked in such a
| way that the system properties/catalogs were only used if they were
| explicitly designated to be used. For example, it seems like there
| could be a single instance of Catalog that represents all the system
| catalog files, and any application that wants to use the system
| catalog files would create the proper entry in their own catalog that
| delegates to this (shared) system catalog. For example (see below too):
|       ResolvingXMLReader reader = new ResolvingXMLReader();
|       // The reader's catalog is empty
|       reader.getCatalog().addCatalog(Catalog.getSystemCatalog());
|       // Now the reader has access to the system catalog

There is already a static catalog that uses all the system properties.
If you don't go to some lengths to avoid it, that's what you get.
As for "falling back" to the system catalogs,

| (3) I have found that I would like to have an instance of Catalog
| delegate to another instance of Catalog. Currently, this doesn't seem
| to be possible without subclassing. The idea is to group sets of
| catalogs together in different configurations, such that each catalog
| file is only read once, like this:
|      Catalog a = new Catalog();
|      a.parseCatalog(someFile);
|      a.parseCatalog(someOtherFile);
|      Catalog b = new Catalog();
|      b.parseCatalog(yetAnotherFile);
|      b.addCatalog(a);  // b will now delegate to a as necessary

I think something like this is probably the right answer. I'll look into it.
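
To make the delegation idea in (3) concrete, here is a toy model in plain Java. It is a hypothetical stand-in, not org.apache.xml.resolver.Catalog: DelegatingCatalog and its methods are invented for illustration, and addSystemEntry stands in for parsing a catalog file. Each catalog is populated once and can then be shared by any number of other catalogs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy catalog; NOT the resolver library's Catalog class. */
final class DelegatingCatalog {
    private final Map<String, String> systemEntries = new HashMap<>();
    private final List<DelegatingCatalog> delegates = new ArrayList<>();

    /** Stands in for parsing a catalog file: records one system mapping. */
    void addSystemEntry(String systemId, String uri) {
        systemEntries.put(systemId, uri);
    }

    /** Shares an already-populated catalog instead of re-reading its files. */
    void addCatalog(DelegatingCatalog other) {
        delegates.add(other);
    }

    /** Local entries win; otherwise ask each delegate in order. */
    String resolveSystem(String systemId) {
        String local = systemEntries.get(systemId);
        if (local != null) {
            return local;
        }
        for (DelegatingCatalog delegate : delegates) {
            String hit = delegate.resolveSystem(systemId);
            if (hit != null) {
                return hit;
            }
        }
        return null; // unmapped
    }
}
```

A real implementation would also have to guard against delegation cycles (a delegating to b and b back to a), which this sketch does not do.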

| (4) CatalogResolver works fine, but you have to be careful when using
| the document() function in XSLT. The XSLT engine isn't guaranteed to
| use the same entity resolver that it uses for the main document. I had
| to subclass it like this:
|
| final XMLReader reader = XMLReaderFactory.createXMLReader();
| CatalogResolver resolver = new CatalogResolver(true) {
|      public Source resolve(String href, String base)
|          throws TransformerException {
|          SAXSource result = (SAXSource) super.resolve(href, base);
|          result.setXMLReader(reader);
|          return result;
|      }
| };
|
| In other words, I make sure that all documents use the same XMLReader
| so that they will use the same EntityResolver.

I seem to avoid this problem by using org.apache.xml.resolver.tools.ResolvingXMLReader
as my XMLReader class. Is that solution inappropriate for your application?

                                        Be seeing you,
                                          norm

- -- 
Norman.Walsh@Sun.COM    | All professional men are handicapped by not
XML Standards Architect | being allowed to ignore things which are
Web Tech. and Standards | useless.--Goethe
Sun Microsystems, Inc.  | 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: Processed by Mailcrypt 3.5.7 <http://mailcrypt.sourceforge.net/>

iD8DBQE90lhIOyltUcwYWjsRAgt7AJsG87HEvCI3HmAU65I+ltBba94lrgCfWj/7
0BYDRaouL6i6Ox2ajeogS3w=
=j00g
-----END PGP SIGNATURE-----