Posted to general@xml.apache.org by Jacob Kjome <ho...@visi.com> on 2006/04/07 06:33:53 UTC
Re: how do I detect internal subset when part of external subset?
Hi Michael,
I just figured that out shortly after I sent the
email. Just didn't get a chance to reply before
you sent yours. Sorry about that. It always
seems that I figure it out right after I hit the
"send" button. Thanks for the references.
later,
Jake
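
For anyone who finds this thread later, here is a minimal sketch of the
two tips quoted below combined in plain SAX: the "[dtd]" flag decides
whether a declaration came from the internal subset, and both
internalEntityDecl() and externalEntityDecl() are recorded. It assumes
the parser bundled with the JDK. The class name, the list of strings,
the sample base URI and the in-memory stand-in for document.dtd are
all hypothetical and only there to keep the example self-contained; it
is not the code XMLC or XOM actually uses.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: collects the entity declarations of the internal subset.
class InternalSubsetSketch extends DefaultHandler2 {

    // Sample document modelled on the one in the thread.
    static final String SAMPLE =
        "<?xml version=\"1.0\" standalone=\"no\"?>\n"
        + "<!DOCTYPE document SYSTEM \"document.dtd\" [\n"
        + "<!ENTITY head SYSTEM \"header.xml\">\n"
        + "<!ENTITY foot SYSTEM \"footer.xml\">\n"
        + "<!ENTITY torso SYSTEM \"body.xml\">\n"
        + "<!ENTITY erh \"Elliotte Rusty Harold\">\n"
        + "]>\n"
        + "<document>&erh;</document>\n";

    private boolean inExternalSubset = false;
    final List<String> internalSubsetDecls = new ArrayList<>();

    // LexicalHandler: the external subset is reported as the entity "[dtd]".
    @Override
    public void startEntity(String name) {
        if ("[dtd]".equals(name)) inExternalSubset = true;
    }

    @Override
    public void endEntity(String name) {
        if ("[dtd]".equals(name)) inExternalSubset = false;
    }

    // DeclHandler: <!ENTITY erh "..."> arrives here.
    @Override
    public void internalEntityDecl(String name, String value) {
        if (!inExternalSubset) internalSubsetDecls.add("internal:" + name);
    }

    // DeclHandler: <!ENTITY head SYSTEM "header.xml"> arrives here.
    @Override
    public void externalEntityDecl(String name, String publicId, String systemId) {
        if (!inExternalSubset) internalSubsetDecls.add("external:" + name);
    }

    // Assumption for the sketch: serve document.dtd from memory so that
    // no files are needed; everything else resolves to empty content.
    @Override
    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId) {
        boolean isDtd = "[dtd]".equals(name)
            || (systemId != null && systemId.endsWith("document.dtd"));
        String text = isDtd
            ? "<!ENTITY fromDtd \"declared in the external subset\">"
            : "";
        return new InputSource(new StringReader(text));
    }

    static List<String> collect(String xml) throws Exception {
        InternalSubsetSketch handler = new InternalSubsetSketch();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.setEntityResolver(handler);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
        InputSource source = new InputSource(new StringReader(xml));
        source.setSystemId("file:///example/document.xml"); // hypothetical base URI
        reader.parse(source);
        return handler.internalSubsetDecls;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(collect(SAMPLE));
    }
}
```

The entity declared in the stand-in DTD is skipped because it is
reported between startEntity("[dtd]") and endEntity("[dtd]"), while
head, foot, torso and erh are all kept, which is what is needed to
write the internal subset back out on serialization.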
At 10:32 PM 4/6/2006, you wrote:
>Hi Jacob,
>
><!ENTITY head SYSTEM "header.xml">
><!ENTITY foot SYSTEM "footer.xml">
><!ENTITY torso SYSTEM "body.xml">
>
>are external entity declarations [1][2]. They are reported by
>XMLDTDHandler.externalEntityDecl() in XNI and
>DeclHandler.externalEntityDecl() in SAX.
>
>Thanks.
>
>[1] http://www.w3.org/TR/2004/REC-xml-20040204/#sec-entity-decl
>[2] http://www.w3.org/TR/2004/REC-xml-20040204/#sec-external-ent
>
>Michael Glavassevich
>XML Parser Development
>IBM Toronto Lab
>E-mail: mrglavas@ca.ibm.com
>E-mail: mrglavas@apache.org
>
>Jacob Kjome <ho...@visi.com> wrote on 04/06/2006 11:07:57 PM:
>
>>
>> Thanks for the tip, Elliotte. I'll remember it
>> when I use SAX. I'm using XNI in this case. I
>> suppose I could use SAX, but I'm really just
>> trying to migrate from Xerces1 to Xerces2 for
>> XMLC. XMLC already depends directly on Xerces
>> because of the custom DOMs XMLC implements. I
>> also wanted to change as little as possible. I
>> may make more radical changes once I've proven
>> that I can make things work properly with minimal changes.
>>
>> In any case, I think I've got the internal subset
>> stuff working, except for one thing. Take the following document...
>>
>> <?xml version="1.0" standalone="no"?>
>> <!DOCTYPE document SYSTEM "document.dtd" [
>> <!ENTITY head SYSTEM "header.xml">
>> <!ENTITY foot SYSTEM "footer.xml">
>> <!ENTITY torso SYSTEM "body.xml">
>> <!ENTITY erh "Elliotte Rusty Harold">
>> ]>
>> <document>
>> &head; &torso; &foot;
>> </document>
>>
>> The only part of this that ends up in the
>> internal subset is the "erh" entity. That is,
>> the internalEntityDecl() method gets called only
>> for the "erh" entity and is not notified at all
>> for the other entities. Then, as I build up the
>> DOM, I create EntityReference nodes for "&head;
>> &torso; &foot;" in the <document>. Upon
>> serialization, they end up being there in the
>> document, but since I was never notified to
>> create the corresponding <!ENTITY> elements in
>> the internal subset, re-parsing of the serialized
>> document fails. So, how do I get notified about
>> these so I can get them into the DOM unparsed? I
>> want the serialized DOM to look as close as
>> possible to the above. I must be missing something.
>>
>>
>> Jake
>>
>>
>> At 06:41 AM 4/4/2006, you wrote:
>> >The trick is to look for the entity name "[dtd]". XOM accomplishes this
>> >thusly using pure SAX:
>> >
>> >
>> >    protected boolean inExternalSubset = false;
>> >
>> >    // We have a problem here. Xerces gets this right,
>> >    // but Crimson and possibly other parsers don't properly
>> >    // report these entities, or perhaps just don't tag them
>> >    // with [dtd] like they're supposed to.
>> >    public void startEntity(String name) {
>> >        if (name.equals("[dtd]")) inExternalSubset = true;
>> >    }
>> >
>> >    public void endEntity(String name) {
>> >        if (name.equals("[dtd]")) inExternalSubset = false;
>> >    }
>> >
>> >You can just reverse the logic if you prefer inInternalSubset.
>> >
>> >--
>> >Elliotte Rusty Harold elharo@metalab.unc.edu
>> >XML in a Nutshell 3rd Edition Just Published!
>> >http://www.cafeconleche.org/books/xian3/
>> >http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
>> >
>> >---------------------------------------------------------------------
>> >To unsubscribe, e-mail: general-unsubscribe@xml.apache.org
>> >For additional commands, e-mail: general-help@xml.apache.org
>> >