Posted to dev@xalan.apache.org by bu...@apache.org on 2002/10/23 20:11:08 UTC
DO NOT REPLY [Bug 13897] New: - Reuse parser and cache XML schema in XalanC
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=13897
Summary: Reuse parser and cache XML schema in XalanC
Product: XalanC
Version: 1.4.x
Platform: All
OS/Version: All
Status: NEW
Severity: Enhancement
Priority: Other
Component: XalanC
AssignedTo: xalan-dev@xml.apache.org
ReportedBy: thomas.cherel@ascentialsoftware.com
It would be nice for XalanC to expose the latest Xerces features for caching
an analyzed schema so that it can be reused across multiple parsing/validation
passes. This would also mean reusing the same parser instance across multiple
XSLT transformations in XalanC, and even within specific XSLT functions such
as document().
Here is a short version of an email exchange on the mailing list that
describes the issue in more detail and provides a "workaround".
-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:david_n_bertoni@us.ibm.com]
Sent: Tuesday, October 22, 2002 4:35 PM
To: xalan-c-users@xml.apache.org
Subject: RE: Schema validation performance
Hi Thomas,
You can use Xerces to parse a document without switching to the internal
interfaces. Here's some pseudo-code, which I haven't tested, but which
should give you an idea of what you need to do:
void
parse(
    const InputSource&        theInputSource,
    XalanCompiledStylesheet*  theStylesheet,
    const XSLTResultTarget&   theResultTarget)
{
    // Create a SAX2 parser directly through the Xerces factory.
    SAX2XMLReader* const  theReader = XMLReaderFactory::createXMLReader();

    XalanTransformer  theTransformer;

    // The builder receives the SAX events and builds Xalan's source tree.
    XalanDocumentBuilder* const  theBuilder =
        theTransformer.createDocumentBuilder();

    // theBuilder is a pointer, so its handlers are reached with ->.
    theReader->setContentHandler(theBuilder->getContentHandler());
    theReader->setLexicalHandler(theBuilder->getLexicalHandler());
    theReader->setDTDHandler(theBuilder->getDTDHandler());

    const XalanDOMString
        reuseGrammar("http://apache.org/xml/features/validation/reuse-grammar");
    const XalanDOMString
        namespacePrefixes("http://xml.org/sax/features/namespace-prefixes");

    theReader->setFeature(reuseGrammar.c_str(), true);
    theReader->setFeature(namespacePrefixes.c_str(), true);

    theReader->parse(theInputSource);

    delete theReader;

    // Hand the already-parsed document to Xalan for the XSLT step only.
    theTransformer.transform(*theBuilder, theStylesheet, theResultTarget);
}
Of course, since I'm not really re-using the parser, it doesn't use the
cached grammar, but it gives you an idea of how you can do this. The only
drawback is that documents brought into the transformation through the
document() function will not use this parser instance, and so will not use
the cached grammar.
Dave
-----Original Message-----
From: Thomas Cherel
Until it gets added to Xalan, is there any way I can use the Xerces
interface directly? For example, today, I can provide to Xalan an already
parsed document (a DOM tree). Can I use the new Xerces API to generate such
a DOM tree (and reuse schema/grammar for the validation that will be done at
that time), and then pass it to Xalan (that will take care of the XSLT
processing only)?
Thomas
-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:david_n_bertoni@us.ibm.com]
Sent: Tuesday, October 22, 2002 1:19 PM
To: xalan-c-users@xml.apache.org
Subject: Re: Schema validation performance
Hi Thomas,
With the latest Xerces, you can prime a parser instance with a particular
schema, then have it re-use that schema over and over again. You can also
have it re-use a grammar for every document it parses. However, these
interfaces are new and still experimental, so I don't have much experience
using them.
We don't expose lots of the Xerces parser interfaces because it gets very
burdensome to do so. However, this one is probably worth doing, so you
might want to enter a Bugzilla request for an enhancement.
Dave
-----Original Message-----
From: Thomas Cherel
When processing an XML document (applying a style sheet), I can turn on
validation of the XML document against its schema. Is there any way (or
maybe this is already done under the covers) to cache the XML schema for
validation of other XML documents?
What I mean is that if I process a bunch of XML documents in sequence, and
all of them use the same XML schema, it would be nice if the schema were
downloaded and analyzed only once instead of once for each document.
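The caching behaviour requested above can be illustrated with a small
standalone sketch. It is not Xerces or Xalan code: CompiledSchema,
SchemaCache and validateBatch are hypothetical names made up for this
illustration, the "loader" stands in for whatever downloads and analyzes a
schema, and it assumes a C++11 or later compiler. It only shows the idea of
analyzing each schema once and handing the same result to every later
document.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an analyzed schema (a "grammar" in Xerces terms).
struct CompiledSchema {
    std::string uri;
};

// Hypothetical cache: analyzes each schema URI at most once.
class SchemaCache {
public:
    using Loader = std::function<CompiledSchema(const std::string&)>;

    explicit SchemaCache(Loader loader) : loader_(std::move(loader)) {}

    // Returns the cached schema; the loader runs only on the first request.
    std::shared_ptr<const CompiledSchema> get(const std::string& uri) {
        const auto it = cache_.find(uri);
        if (it != cache_.end()) {
            return it->second;
        }
        std::shared_ptr<const CompiledSchema> schema =
            std::make_shared<CompiledSchema>(loader_(uri));
        ++loads_;
        cache_.emplace(uri, schema);
        return schema;
    }

    // Number of times the (expensive) loader has actually been invoked.
    std::size_t loadCount() const { return loads_; }

private:
    Loader loader_;
    std::map<std::string, std::shared_ptr<const CompiledSchema>> cache_;
    std::size_t loads_ = 0;
};

// Simulates processing a sequence of documents, each naming its schema URI.
// Returns the cache's total load count after the batch.
inline std::size_t validateBatch(SchemaCache& cache,
                                 const std::vector<std::string>& schemaUris) {
    for (const std::string& uri : schemaUris) {
        const auto schema = cache.get(uri);
        // A real validator would use the schema here; the sketch only checks it.
        assert(schema && schema->uri == uri);
    }
    return cache.loadCount();
}
```

With a loader that simply wraps the URI, passing a batch of four documents
that reference two distinct schemas to validateBatch should trigger only two
loads. In the real libraries the cache would live inside the parser
instance, which is why re-using the same parser matters.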