Posted to j-users@xerces.apache.org by Chris Rathjen <ch...@infinitetechnology.com> on 2002/02/19 17:26:01 UTC

High-speed XML processing from an external source

This is my first post to the Xerces Java User list, so please forgive me if
this email is misdirected (i.e. should be directed to the dev list) or
otherwise insufficient.


From the Xerces-Java Performance FAQ:
"If you don't need validation, avoid using a DOCTYPE line in your XML
document. The current version of the parser will always read the DTD if the
DOCTYPE line is specified even when validation feature is set to false."


Unfortunately for me, the application we're developing is NOT the source of
some of the XML documents it will be processing -- and the source MAY elect
to include the DOCTYPE tag, referencing a DTD on the Internet. This is
problematic for two reasons: mainly, fetching and parsing the DTD for each
data document would be prohibitively expensive; also, the application's host
is not guaranteed to have access to the URI specified in the DOCTYPE tag.

Is there a workaround for this dilemma, or is a forthcoming release of
Xerces going to allow complete disabling of the DTD fetch-and-parse section
of validation?

Thanks in advance,
Chris Rathjen
Infinite Technology, Inc.



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: High-speed XML processing from an external source

Posted by Jason Rizer <ja...@yahoo.com>.
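Give this a whirl:

// Needs java.io.IOException, java.io.Reader, java.io.StringReader,
// org.xml.sax.EntityResolver, org.xml.sax.InputSource and
// org.xml.sax.SAXException imported.
parser.setEntityResolver(new EntityResolver() {
  public InputSource resolveEntity(String pubId, String sysId)
    throws SAXException, IOException {
    // Hand the parser an empty DTD instead of letting it fetch the real one.
    if (sysId != null && sysId.endsWith(".dtd")) {
      Reader reader = new StringReader("");
      InputSource source = new InputSource(reader);
      source.setPublicId(pubId);
      source.setSystemId(sysId);
      return source;
    }
    // Returning null leaves every other entity to the default lookup.
    return null;
  }
});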
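
For anyone who wants to try the resolver outside an existing application,
here is a minimal self-contained sketch. It assumes the JAXP parser that ships
with the JDK rather than a standalone Xerces build, and the class name
SkipDtdDemo, the helper countElements, and the example.invalid URL are all
made up for illustration. The document's DOCTYPE points at a host that cannot
be reached, so the parse should only succeed if the resolver really does keep
the parser from fetching the DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Xerces or of the original post.
class SkipDtdDemo {

    /** Parses xml with external DTDs stubbed out; returns the element count. */
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Count start tags so the caller can see that parsing completed.
        final AtomicInteger elements = new AtomicInteger();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                elements.incrementAndGet();
            }
        });

        // Same idea as the resolver above: answer any *.dtd request with
        // an empty stream so no network or file access takes place.
        reader.setEntityResolver(new EntityResolver() {
            @Override
            public InputSource resolveEntity(String pubId, String sysId)
                    throws SAXException, IOException {
                if (sysId != null && sysId.endsWith(".dtd")) {
                    InputSource source = new InputSource(new StringReader(""));
                    source.setPublicId(pubId);
                    source.setSystemId(sysId);
                    return source;
                }
                return null; // default resolution for everything else
            }
        });

        reader.parse(new InputSource(new StringReader(xml)));
        return elements.get();
    }

    public static void main(String[] args) throws Exception {
        // example.invalid is a placeholder host that never resolves.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE root SYSTEM \"http://example.invalid/missing.dtd\">\n"
                + "<root><child/></root>";
        System.out.println("elements parsed: " + countElements(xml));
    }
}
```

One caveat with this approach: because the parser sees an empty DTD, any
entities or default attribute values that the real DTD would have declared are
not available, so documents that depend on them will not parse the same way.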



