Posted to j-users@xerces.apache.org by Aleksandar Milanovic <am...@galdosinc.com> on 2002/08/13 05:49:06 UTC

DocumentBuilderFactory + grammar caching

Hi All,

This question is partly about Apache SOAP, but since the answer lies in
Xerces (and the SOAP mailing list keeps rejecting my messages???), I am
posting it here.

My application needs to validate all incoming requests against XML schemas
stored in a database (most importantly they're NOT available in the file
system or on-line). I figured that the new grammar pool would be useful in
this context. I played with it a bit, and got some promising results.
However, my troubles started when I tried to incorporate validation in the
application and its messaging subsystem. The app uses SOAP for messaging.
Although I haven't tried it yet, I think I should have no problems
performing validation on a message that has already been parsed. However,
since this would require reparsing the message, and some of these messages
can be quite bulky (many megabytes), I decided to try to incorporate
validation in Apache SOAP so that the message would be validated as it is
parsed. Apache SOAP has a class org.apache.soap.util.xml.XMLParserUtil which
lets one set the DocumentBuilderFactory that will be then used throughout
this API. In the Xerces FAQ I found that to enable grammar caching with JAXP
one has to write:

System.setProperty("org.apache.xerces.xni.parser.Configuration",
        "org.apache.xerces.parsers.XMLGrammarCachingConfiguration");
DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder(); // JAXP factory invocation
// parse documents and store grammars

This is supposed to enable "passive" grammar caching. Since it sets a
system-level property, I am afraid that it will affect all subsequently
instantiated parsers, even those outside messaging that need neither
validation nor grammar caching. I would rather not change the Apache SOAP
code to make this work.

Is there anything I can do to create a DocumentBuilderFactory that creates
DocumentBuilders that are associated with a grammar cache/pool, so that this
influences only the client of this DocumentBuilderFactory (in this case
Apache SOAP) and not the rest of the application?
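
For illustration, here is a minimal sketch of one way to keep validation
scoped to a single factory. It assumes a JAXP 1.3 or later runtime
(javax.xml.validation and DocumentBuilderFactory.setSchema), which postdates
this thread, so it is not the grammar-pool mechanism from the Xerces FAQ. The
class name, the helper names and the inline schema text are hypothetical; the
schema string stands in for text read from the database.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class ScopedFactoryDemo {
    // Hypothetical schema text; in the scenario above it would come from the database.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='qty' type='xs:int'/>"
      + "</xs:schema>";

    /** Builds a factory whose builders validate against the given schema text. */
    static DocumentBuilderFactory validatingFactory(String xsd) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(xsd)));
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setSchema(schema); // applies only to builders created by this factory
        return dbf;
    }

    /** Parses xml with a builder from dbf and returns the validation errors reported. */
    static List<String> parse(DocumentBuilderFactory dbf, String xml) throws Exception {
        List<String> errors = new ArrayList<>();
        DocumentBuilder builder = dbf.newDocumentBuilder();
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage()); // collect instead of aborting the parse
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory scoped = validatingFactory(XSD);
        DocumentBuilderFactory plain = DocumentBuilderFactory.newInstance();
        String badMessage = "<qty>abc</qty>";
        System.out.println("scoped factory errors: " + parse(scoped, badMessage).size());
        System.out.println("plain factory errors:  " + parse(plain, badMessage).size());
    }
}
```

A factory built this way could be the one handed to XMLParserUtil, while
factories created elsewhere in the application stay non-validating. No system
property is involved.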

An equally important, related question is whether I can somehow instruct
Xerces to:
1) ignore the schemaLocation attribute
2) always take the schema from a schema cache
3) do this in combination with the solution for the problem described above
(supposedly involving passive caching)
This is important because incoming messages may or may not have
schemaLocation set, and even if set it could contain wrong information.
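
The three points above can be sketched roughly as follows, again assuming the
later javax.xml.validation API rather than the Xerces grammar pool. Compiled
Schema objects are kept in a map and reused. To my understanding, a Schema
compiled from explicit sources is a fixed set of grammars, so a location hint
in the message is not used to pick the schema, but I have only checked that
against the parser bundled with the JDK. The table contents, the "order" key
and all names are hypothetical.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class SchemaCacheDemo {
    // Hypothetical stand-in for the database table of schemas, keyed by message type.
    static final Map<String, String> SCHEMA_TABLE = Map.of(
        "order",
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='order'><xs:complexType><xs:sequence>"
      + "<xs:element name='qty' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>");

    // Compiled schemas; a Schema object is thread safe and can be shared.
    static final Map<String, Schema> CACHE = new ConcurrentHashMap<>();

    /** Returns the compiled schema for key, compiling it on first use. */
    static Schema schemaFor(String key) {
        return CACHE.computeIfAbsent(key, k -> {
            try {
                SchemaFactory sf =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
                return sf.newSchema(new StreamSource(new StringReader(SCHEMA_TABLE.get(k))));
            } catch (SAXException e) {
                throw new IllegalArgumentException("cannot compile schema " + k, e);
            }
        });
    }

    /** Returns true if xml is valid against the cached schema for key. */
    static boolean isValid(String key, String xml) throws Exception {
        try {
            // With no ErrorHandler set, the validator throws on the first error.
            schemaFor(key).newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The location hint below points at a file that does not exist.
        String prefix = "<order xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
                      + " xsi:noNamespaceSchemaLocation='wrong-or-missing.xsd'>";
        System.out.println(isValid("order", prefix + "<qty>3</qty></order>"));
        System.out.println(isValid("order", prefix + "<qty>three</qty></order>"));
    }
}
```

Choosing the cache key (here a message type) is left to the application; in
practice it might be derived from the SOAP action or the root element's
namespace.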


Thanks,
Alex



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


RE: DocumentBuilderFactory + grammar caching

Posted by Aleksandar Milanovic <am...@galdosinc.com>.
Hi Loz,

Thanks for your help, but I was actually able to do what you're
suggesting. The problems started when I tried to combine this with a
solution to the main problem.

Alex

> -----Original Message-----
> From: Loz [mailto:loz@flower.powernet.co.uk]
> Sent: Tuesday, August 13, 2002 3:38 AM
> To: xerces-j-user@xml.apache.org
> Subject: Re: DocumentBuilderFactory + grammar caching
>
>
> Aleksandar Milanovic wrote:
> > Hi All,
> >
> >
> > An equally important question related to this is if I can
> somehow instruct
> > Xerces to:
> > 1) ignore schemaLocation attribute
> > 2) always take the schema from a schema cache
>
> Most of that went over my head. But for this bit, the schema document is
> treated as an external entity (from the point of view of an instance
> document to be validated). This means you could use a custom
> EntityResolver with your DocumentBuilder to look the schema document up
> on your database (and cache it). I think the EntityResolver will get
> passed the schemaLocation attribute as the systemid.
>
> regards
>
> Loz
>




Re: DocumentBuilderFactory + grammar caching

Posted by Loz <lo...@flower.powernet.co.uk>.
Aleksandar Milanovic wrote:
> Hi All,
> 
> 
> An equally important question related to this is if I can somehow instruct
> Xerces to:
> 1) ignore schemaLocation attribute
> 2) always take the schema from a schema cache

Most of that went over my head. But for this bit, the schema document is 
treated as an external entity (from the point of view of an instance 
document to be validated). This means you could use a custom 
EntityResolver with your DocumentBuilder to look the schema document up 
on your database (and cache it). I think the EntityResolver will get 
passed the schemaLocation attribute as the systemid.
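
A rough sketch of that idea, assuming the JAXP 1.2 schemaLanguage property to
switch on schema validation and an in-memory map in place of the database.
The class, map and file names are hypothetical. As far as I can tell the
resolver receives the location hint expanded to an absolute URI, so the
sketch matches on the last path segment only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class DbEntityResolverDemo {
    // Hypothetical stand-in for the schema table in the database.
    static final Map<String, String> SCHEMA_TABLE = Map.of(
        "note.xsd",
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='to' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>");

    // System ids the parser asked the resolver for, recorded for inspection.
    static final List<String> requested = new ArrayList<>();

    /** Parses and validates xml, returning the validation errors reported. */
    static List<String> parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setValidating(true);
        dbf.setAttribute("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                         "http://www.w3.org/2001/XMLSchema");
        DocumentBuilder builder = dbf.newDocumentBuilder();

        builder.setEntityResolver((publicId, systemId) -> {
            requested.add(systemId);
            String name = systemId.substring(systemId.lastIndexOf('/') + 1);
            String xsd = SCHEMA_TABLE.get(name);
            if (xsd == null) {
                return null; // unknown schema: let the parser resolve it as usual
            }
            InputSource source = new InputSource(new StringReader(xsd));
            source.setSystemId(systemId);
            return source;
        });

        List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String prefix = "<note xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'"
                      + " xsi:noNamespaceSchemaLocation='note.xsd'>";
        System.out.println(parse(prefix + "<to>Alex</to></note>"));
        System.out.println(parse(prefix + "<from>Loz</from></note>"));
        System.out.println(requested);
    }
}
```

Note that this approach still depends on the message carrying a usable
location hint; it redirects the lookup rather than ignoring the attribute.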

regards

Loz

