Posted to c-users@xerces.apache.org by pps <i-...@yandex.ru> on 2005/10/17 22:41:21 UTC

Re: [OBORONA-SPAM] Ignoring DTDs

Robert William Vesterman wrote:
> I am using Xerces-C SAX2 to parse some documents that I write.  When I write
> them, I would like to include a reference to an appropriate DTD in a
> doctypedecl.  However, I want to do so basically for the purposes of people
> looking at the documents, or trying to build their own.  I do *not* want
> Xerces to pay attention to the DTD while parsing the XML, and I *especially*
> do not want Xerces to try to fetch the DTD for any purpose (this application
> will be running on non-internet enabled machines, and speed is of the
> essence).
> 
> So, essentially, I would like to know if it's possible to set up Xerces-C
> SAX2 to ignore a well-formed doctypedecl, and if so, how.
> 
> Again, please note that by "ignore", I mean more than just (for example) "If
> I don't set a DTD handler, Xerces won't pass me notification of DTD events".
> Whether or not I get notified of DTD events is not the issue - it's only a
> side effect.  The issue is that I don't want Xerces going and trying to
> fetch the DTD in the first place.
> 
> Thanks in advance for any help.
> 
> Bob Vesterman.
>


As far as I understand, you want some control over how external DTDs are
processed. I needed similar control, so I did it this way: I overrode
resolveEntity of XercesDOMParser like this:

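InputSource*
dom_parser::resolveEntity(XMLResourceIdentifier* rId){
   const XMLCh *pub_id = rId->getPublicId();   // null if no public id
   if(pub_id){
     // dtds is keyed by narrow strings, so transcode the XMLCh id first
     char *key = XMLString::transcode(pub_id);
     dtd_map_t::const_iterator i=dtds.find(key);
     XMLString::release(&key);
     if(i!=dtds.end()){
       // LocalFileInputSource expects an XMLCh path and copies it
       XMLCh *path = XMLString::transcode(i->second.c_str());
       InputSource *src = new LocalFileInputSource(path);
       XMLString::release(&path);
       return src;
     }
   }
   return 0;   // fall back to the parser's default resolution
}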

where dtds is a dtd_map_t (a map<string,string>) that maps certain public
ids to local files, something like this:
dtds["-//W3C//DTD XHTML 1.0 Strict//EN"] = "dtd/Strict1_0.dtd";
dtds["-//W3C//DTD XHTML 1.0 Frameset//EN"] = "dtd/Frameset1_0.dtd";
dtds["-//W3C//DTD XHTML 1.0 Transitional//EN"] = "dtd/Transitional1_0.dtd";


resolveEntity returns 0 to request default processing (fetching the file).
If you want to ignore a remote DTD instead, you can return a
LocalFileInputSource for a local placeholder file such as dummy.dtd.
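
The lookup-with-fallback part of this can be sketched on its own, without Xerces. This is only an illustration: local_dtd_path and make_xhtml_dtds are hypothetical helper names, not part of Xerces or of the code above, and the empty-string convention for "use default resolution" is an assumption made for the sketch.

```cpp
#include <map>
#include <string>

// Hypothetical helpers (not part of Xerces): a table from DOCTYPE public ids
// to local DTD paths, mirroring the dtds map described above.
typedef std::map<std::string, std::string> dtd_map_t;

// Returns the local path registered for pub_id, or `fallback` if the id is
// unknown. An empty fallback stands for "let the parser resolve it itself";
// a path such as "dummy.dtd" stands for "substitute a placeholder DTD".
std::string local_dtd_path(const dtd_map_t& dtds,
                           const std::string& pub_id,
                           const std::string& fallback = "")
{
    dtd_map_t::const_iterator i = dtds.find(pub_id);
    return i != dtds.end() ? i->second : fallback;
}

// Builds the same three-entry table as in the message above.
dtd_map_t make_xhtml_dtds()
{
    dtd_map_t dtds;
    dtds["-//W3C//DTD XHTML 1.0 Strict//EN"] = "dtd/Strict1_0.dtd";
    dtds["-//W3C//DTD XHTML 1.0 Frameset//EN"] = "dtd/Frameset1_0.dtd";
    dtds["-//W3C//DTD XHTML 1.0 Transitional//EN"] = "dtd/Transitional1_0.dtd";
    return dtds;
}
```

Inside resolveEntity, a non-empty result would be turned into a LocalFileInputSource, and an empty one would mean returning 0. Passing "dummy.dtd" as the fallback gives the "never fetch anything remote" behaviour.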

By the way, there may be better ways to do this. I asked on this list, but
nobody replied, so I worked out on my own how to do what I needed (after
wasting a lot of time: no docs, no tutorials, etc.).