Posted to j-users@xerces.apache.org by "Mark H. Wood" <mw...@IUPUI.Edu> on 2006/08/15 22:32:27 UTC

How to get an LSParser to use an XMLCatalogResolver?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm trying to parse some statistical data from a device that sends me back 
a document starting like this:

   <?xml version="1.0" ?>
   <!DOCTYPE statistics SYSTEM "statistics-1.0.dtd" >

Rather than have to have a copy of the DTD in the current working 
directory whenever I run my app., I'd like to slap a copy of it somewhere, 
point a catalog entry to it, and let a resolver find it.

After banging my head against the doco. for two days, I tracked down 
XMLCatalogResolver, which seems to be the sort of thing that an LSParser 
would want:

 	DOMImplementationLS impl;
 	LSParser builder;
 	LSInput input;

 	System.setProperty(DOMImplementationRegistry.PROPERTY,
 	"org.apache.xerces.dom.DOMImplementationSourceImpl");

 	try {
 		DOMImplementationRegistry registry =
 			DOMImplementationRegistry.newInstance();

 		impl = (DOMImplementationLS)registry.getDOMImplementation("LS");

 		builder = impl.createLSParser(
 			DOMImplementationLS.MODE_SYNCHRONOUS, null);
 	} catch (Exception e) {
 		System.err.println(e.getMessage());
 		return;
 	}

 	input = impl.createLSInput();
 	input.setByteStream(in);

 	builder.getDomConfig().setParameter("resource-resolver",
 						new XMLCatalogResolver());

 	Document document = builder.parse(input);

But it acts as though I'd never set a resource-resolver:  it still 
requires that the DTD be present in `cwd`.  I 'strace'd it and saw no 
attempt to open the catalog.  (I started java with:

   -Dxml.catalog.files=file:///etc/xml/catalog \
   -Dxml.catalog.verbosity=99

)

What have I missed?

- -- 
Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
Typically when a software vendor says that a product is "intuitive" he
means the exact opposite.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.4 (GNU/Linux)
Comment: pgpenvelope 2.10.2 - http://pgpenvelope.sourceforge.net/

iD8DBQFE4i9is/NR4JuTKG8RAq1HAJ0T0DajedBo9Y3xT5RwA3TwPCvAbACcCwGj
HmorNk8+QBhpoDx2Cnhra5k=
=MD/W
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org


Re: How to get an LSParser to use an XMLCatalogResolver?

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
"Eric J. Schwarzenbach" <Er...@wrycan.com> wrote on 
08/16/2006 11:13:09 AM:

> In my experience, Catalogs is a standard that hardly anyone actually uses.

I wouldn't go that far (I know of many individual users as well as 
applications which provide support for it; Ant [1], Cocoon [2] and 
<oXygen/> [3] to name a few), but I'd agree that the lack of a catalog 
resolution API in JAXP or elsewhere in Java SE has probably limited its 
usage.

> To do this in a JAXP compatible way, without using Xerces-specific
> functionality, I would stick to the JAXP parser interfaces
> 
> http://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/parsers/package-summary.html
> 
> and instead of a Catalog, implement a custom EntityResolver  (see
> DocumentBuilder.setEntityResolver()) to recognize your DTD's system
> identifier and load it from a locally configured location.

Or implement an LSResourceResolver:

http://xerces.apache.org/xerces2-j/javadocs/api/org/w3c/dom/ls/LSResourceResolver.html

since Mark is using an LSParser and not the DocumentBuilder.
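
A minimal sketch of that approach, using only the org.w3c.dom.ls interfaces. The class name, the sample document, and the inlined DTD text (including its `source` default attribute) are illustrative assumptions, not the device's real DTD; a real resolver would read the DTD from a locally configured location instead of a string.

```java
import java.io.StringReader;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSResourceResolver;

class LsResolverSketch {

    // Hypothetical stand-in for the real statistics-1.0.dtd.
    static final String DTD =
        "<!ELEMENT statistics (#PCDATA)>\n"
        + "<!ATTLIST statistics source CDATA \"device\">\n";

    // Sample input shaped like the document described in the question.
    static final String SAMPLE =
        "<?xml version=\"1.0\" ?>\n"
        + "<!DOCTYPE statistics SYSTEM \"statistics-1.0.dtd\" >\n"
        + "<statistics>42</statistics>\n";

    static Document parse(String xml) {
        try {
            DOMImplementationLS impl = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");

            // Answer requests for the statistics DTD ourselves; returning null
            // for anything else leaves resolution to the parser's default.
            LSResourceResolver resolver =
                (type, namespaceURI, publicId, systemId, baseURI) -> {
                    if (systemId != null && systemId.endsWith("statistics-1.0.dtd")) {
                        LSInput dtd = impl.createLSInput();
                        dtd.setCharacterStream(new StringReader(DTD));
                        return dtd;
                    }
                    return null;
                };

            LSParser parser =
                impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
            parser.getDomConfig().setParameter("resource-resolver", resolver);

            LSInput input = impl.createLSInput();
            input.setCharacterStream(new StringReader(xml));
            return parser.parse(input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM Load/Save implementation found", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        // The default attribute only appears if the DTD was actually loaded.
        System.out.println(doc.getDocumentElement().getAttribute("source"));
    }
}
```

The suffix test on the system identifier is a deliberate simplification: it matches whether the parser reports the identifier as written in the DOCTYPE or expanded against a base URI.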

> Eric

[1] http://ant.apache.org/manual/CoreTypes/xmlcatalog.html
[2] http://cocoon.apache.org/2.1/userdocs/concepts/catalog.html
[3] http://www.oxygenxml.com/validation.html#xml_catalog

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org




Re: How to get an LSParser to use an XMLCatalogResolver?

Posted by "Eric J. Schwarzenbach" <Er...@wrycan.com>.
In my experience, Catalogs is a standard that hardly anyone actually
uses. To do this in a JAXP compatible way, without using Xerces-specific
functionality, I would stick to the JAXP parser interfaces

http://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/parsers/package-summary.html

and instead of a Catalog, implement a custom EntityResolver  (see
DocumentBuilder.setEntityResolver()) to recognize your DTD's system
identifier and load it from a locally configured location.
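
A rough sketch of what that could look like with plain JAXP. The class name, the sample document, and the stand-in DTD text are assumptions made for illustration; `dtdLocation` is a hypothetical parameter that would normally come from a configuration file or system property rather than a temporary file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class EntityResolverSketch {

    // Hypothetical stand-in for the real statistics-1.0.dtd.
    static final String DTD =
        "<!ELEMENT statistics (#PCDATA)>\n"
        + "<!ATTLIST statistics source CDATA \"device\">\n";

    static final String SAMPLE =
        "<?xml version=\"1.0\" ?>\n"
        + "<!DOCTYPE statistics SYSTEM \"statistics-1.0.dtd\" >\n"
        + "<statistics>42</statistics>\n";

    /** Parses xml, serving the statistics DTD from dtdLocation (a URI string). */
    static Document parse(String xml, String dtdLocation)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // The system identifier may arrive expanded to an absolute URI,
        // so match on its suffix rather than on the exact string.
        EntityResolver resolver = (publicId, systemId) -> {
            if (systemId != null && systemId.endsWith("statistics-1.0.dtd")) {
                return new InputSource(dtdLocation);
            }
            return null; // let the parser resolve everything else itself
        };
        builder.setEntityResolver(resolver);

        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Writes the stand-in DTD to a temporary file and parses SAMPLE against it. */
    static String demo() {
        try {
            Path dtd = Files.createTempFile("statistics", ".dtd");
            Files.write(dtd, DTD.getBytes(StandardCharsets.UTF_8));
            Document doc = parse(SAMPLE, dtd.toUri().toString());
            return doc.getDocumentElement().getAttribute("source");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The default attribute only appears if the DTD was actually loaded.
        System.out.println(demo());
    }
}
```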

Eric





Re: How to get an LSParser to use an XMLCatalogResolver?

Posted by "Mark H. Wood" <mw...@IUPUI.Edu>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

So, because I'm a *stubborn* idiot:

 	builder.getDomConfig().setParameter("resource-resolver",
 		new XMLCatalogResolver(
 			System.getProperty("xml.catalog.files").split("\\s")));

This seems to work and keeps unnecessary magic out of the code.  Thanks!

Follow-up question:  is there a better way?  Given that:

o  I want a DOM Document from which to pluck things randomly;
o  I want to avoid unnecessary use of methods specific to Xerces (and
 	the Commons Resolver);
o  I want to keep local knowledge (like where to find a DTD) out of the
 	code;
o  I have no control over the unhelpful composition of the input document

is there another sequence of methods that someone would consider better 
choices?  I'm brand-new to XML processing in Java (and very nearly new to 
XML altogether) so I'd like to learn more of how to think about such 
problems.
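
One possibility for later readers, offered as a sketch rather than an answer from the thread: Java 9 and newer include a catalog API in `javax.xml.catalog`, which did not exist when this was written. Its `CatalogResolver` implements `LSResourceResolver`, so it can be handed to an LSParser without any Xerces-specific classes. The class name, the stand-in DTD, and the generated catalog file below are all illustrative assumptions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

class CatalogSketch {

    // Hypothetical stand-in for the real statistics-1.0.dtd.
    static final String DTD =
        "<!ELEMENT statistics (#PCDATA)>\n"
        + "<!ATTLIST statistics source CDATA \"device\">\n";

    static final String SAMPLE =
        "<?xml version=\"1.0\" ?>\n"
        + "<!DOCTYPE statistics SYSTEM \"statistics-1.0.dtd\" >\n"
        + "<statistics>42</statistics>\n";

    /** Parses xml with an LSParser whose resource-resolver is catalog-backed. */
    static Document parse(String xml, URI catalogUri) {
        try {
            DOMImplementationLS impl = (DOMImplementationLS)
                DOMImplementationRegistry.newInstance().getDOMImplementation("LS");

            CatalogResolver resolver =
                CatalogManager.catalogResolver(CatalogFeatures.defaults(), catalogUri);

            LSParser parser =
                impl.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
            parser.getDomConfig().setParameter("resource-resolver", resolver);

            LSInput input = impl.createLSInput();
            input.setCharacterStream(new StringReader(xml));
            return parser.parse(input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No DOM Load/Save implementation found", e);
        }
    }

    /** Builds a throwaway DTD and catalog in a temp directory, then parses SAMPLE. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("xmlcatalog");
            Path dtd = dir.resolve("statistics-1.0.dtd");
            Path catalog = dir.resolve("catalog.xml");
            Files.write(dtd, DTD.getBytes(StandardCharsets.UTF_8));

            // systemSuffix matches the identifier whether the parser reports it
            // as written in the DOCTYPE or expanded to an absolute URI.
            String catalogXml =
                "<?xml version=\"1.0\"?>\n"
                + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
                + "  <systemSuffix systemIdSuffix=\"statistics-1.0.dtd\" uri=\""
                + dtd.toUri() + "\"/>\n"
                + "</catalog>\n";
            Files.write(catalog, catalogXml.getBytes(StandardCharsets.UTF_8));

            Document doc = parse(SAMPLE, catalog.toUri());
            return doc.getDocumentElement().getAttribute("source");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The default attribute only appears if the DTD was actually loaded.
        System.out.println(demo());
    }
}
```

When no catalog URIs are passed to `CatalogManager.catalogResolver`, the resolver falls back to the `javax.xml.catalog.files` system property, which would keep the catalog location out of the code entirely.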

- -- 
Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
Typically when a software vendor says that a product is "intuitive" he
means the exact opposite.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.4 (GNU/Linux)
Comment: pgpenvelope 2.10.2 - http://pgpenvelope.sourceforge.net/

iD8DBQFE4xwes/NR4JuTKG8RApxgAJwKdPySviFz9/Du8/DJgHjqiy+wDgCfVGTt
S6mV4P4zTncecNYmF8IjClY=
=ReoL
-----END PGP SIGNATURE-----



Re: How to get an LSParser to use an XMLCatalogResolver?

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Mark,

The settings of system properties for the xml-commons resolver have no 
effect on an XMLCatalogResolver. This is by design [1]. You need to set 
the catalog list [2] on each instance of XMLCatalogResolver you create.

Thanks.

[1] 
http://mail-archives.apache.org/mod_mbox/xerces-j-users/200401.mbox/%3cOF327AC41A.F5FB57C0-ON85256E21.007D52CE-85256E21.007FE173@ca.ibm.com%3e
[2] 
http://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/XMLCatalogResolver.html#setCatalogList(java.lang.String[])

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

"Mark H. Wood" <mw...@IUPUI.Edu> wrote on 08/15/2006 04:32:27 PM:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> I'm trying to parse some statistical data from a device that sends me 
back 
> a document starting like this:
> 
>    <?xml version="1.0" ?>
>    <!DOCTYPE statistics SYSTEM "statistics-1.0.dtd" >
> 
> Rather than have to have a copy of the DTD in the current working 
> directory whenever I run my app., I'd like to slap a copy of it 
somewhere, 
> point a catalog entry to it, and let a resolver find it.
> 
> After banging my head against the doco. for two days, I tracked down 
> XMLCatalogResolver, which seems to be the sort of thing that an LSParser 

> would want:
> 
>     DOMImplementationLS impl;
>     LSParser builder;
>     LSInput input;
> 
>     System.setProperty(DOMImplementationRegistry.PROPERTY,
>     "org.apache.xerces.dom.DOMImplementationSourceImpl");
> 
>     try {
>        DOMImplementationRegistry registry =
>           DOMImplementationRegistry.newInstance();
> 
>        impl = (DOMImplementationLS)registry.getDOMImplementation("LS");
> 
>        builder = impl.createLSParser(
>           DOMImplementationLS.MODE_SYNCHRONOUS, null);
>     } catch (Exception e) {
>        System.err.println(e.getMessage());
>        return;
>     }
> 
>     input = impl.createLSInput();
>     input.setByteStream(in);
> 
>     builder.getDomConfig().setParameter("resource-resolver",
>                    new XMLCatalogResolver());
> 
>     Document document = builder.parse(input);
> 
> But it acts as though I'd never set a resource-resolver:  it still 
> requires that the DTD be present in `cwd`.  I 'strace'd it and saw no 
> attempt to open the catalog.  (I started java with:
> 
>    -Dxml.catalog.files=file:///etc/xml/catalog \
>    -Dxml.catalog.verbosity=99
> 
> )
> 
> What have I missed?
> 
> - -- 
> Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
> Typically when a software vendor says that a product is "intuitive" he
> means the exact opposite.
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.4 (GNU/Linux)
> Comment: pgpenvelope 2.10.2 - http://pgpenvelope.sourceforge.net/
> 
> iD8DBQFE4i9is/NR4JuTKG8RAq1HAJ0T0DajedBo9Y3xT5RwA3TwPCvAbACcCwGj
> HmorNk8+QBhpoDx2Cnhra5k=
> =MD/W
> -----END PGP SIGNATURE-----
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
> For additional commands, e-mail: j-users-help@xerces.apache.org
