Posted to j-users@xerces.apache.org by "Dr. Stefan Bettner" <St...@gmx.de> on 2021/09/21 17:48:58 UTC
PE Reference splitting in INCLUDE-section
Hi everybody,
I have a question about the Java XML parser. I am seeing behavior that
baffles me, concerning the replacement text of parameter entity
references within a conditional INCLUDE section, and I would be
grateful if someone could explain it to me.
When I look into the XML specification (fifth edition), it states:
"Well-formedness constraint: PE Between Declarations
The replacement text of a parameter entity reference in a DeclSep MUST
match the production extSubsetDecl."
That prevents, for example, splitting a markup declaration across the
replacement texts of two separate parameter entities. For instance,
the entity declaration
<!ENTITY copyright '(C)'>
cannot be split across two parameter entity references like this:
<!ENTITY % A "<!ENTITY ">
<!ENTITY % B "copyright '(C)'>">
%A;%B;
because in that case the replacement text of %A; would not match the
production extSubsetDecl (as required): it is an incomplete declaration.
And the Java XML parser reports the above as a fatal error, in
validating and non-validating mode alike. So far so good.
But strangely, the situation changes when the expression %A;%B; is put
into a conditional INCLUDE section, like this:
<!ENTITY % A "<!ENTITY ">
<!ENTITY % B "copyright '(C)'>">
<![INCLUDE[%A;%B;]]>
In that case the Java XML parser has no problem at all with the above in
non-validating mode, and in validating mode it reports only two
validation errors, but no fatal error. In both cases the entity
"copyright" is declared and can be used within the document.
How can that be? I would really be grateful if someone could explain
that to me.
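
A minimal, self-contained way to compare the two cases might look like the sketch below. It is an illustration only, not the attached test program: the class name SplitPeDemo, the Recorder handler and the in-memory DTD strings are made up for this example, and the external subset is served from memory through resolveEntity instead of being read from a file. It prints whatever the parser reports rather than assuming a result, since the outcome may depend on the parser implementation in use.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SplitPeDemo {
    // Same parameter entities as in the post; only the last DTD line differs.
    static final String ENTITIES =
        "<!ENTITY % A \"<!ENTITY \">\n"
        + "<!ENTITY % B \"copyright '(C)'>\">\n";
    static final String PLAIN_DTD = ENTITIES + "%A;%B;\n";
    static final String INCLUDE_DTD = ENTITIES + "<![INCLUDE[%A;%B;]]>\n";

    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE Root SYSTEM \"splitEntity.dtd\" [\n"
        + "<!ELEMENT Root ANY>\n"
        + "]>\n"
        + "<Root>&copyright;</Root>";

    /** Records what the parser reports instead of printing it. */
    static class Recorder extends DefaultHandler {
        final String dtd;
        final StringBuilder text = new StringBuilder();
        final List<String> errors = new ArrayList<>();
        final List<String> fatalErrors = new ArrayList<>();

        Recorder(String dtd) { this.dtd = dtd; }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // Serve the external DTD from memory, whatever system ID is requested.
            return new InputSource(new StringReader(dtd));
        }
        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
        @Override
        public void error(SAXParseException e) { errors.add(e.getMessage()); }
        @Override
        public void fatalError(SAXParseException e) { fatalErrors.add(e.getMessage()); }
    }

    /** Parses XML against the given external DTD text and returns the recorded events. */
    static Recorder parse(String dtd, boolean validating) {
        Recorder recorder = new Recorder(dtd);
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validating);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(XML)), recorder);
        } catch (SAXParseException e) {
            // A fatal error aborts the parse. Record it here only if the
            // fatalError() callback has not already done so.
            if (recorder.fatalErrors.isEmpty()) {
                recorder.fatalErrors.add(e.getMessage());
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return recorder;
    }

    public static void main(String[] args) {
        String[][] cases = {{"plain", PLAIN_DTD}, {"INCLUDE", INCLUDE_DTD}};
        for (boolean validating : new boolean[] {false, true}) {
            for (String[] c : cases) {
                Recorder r = parse(c[1], validating);
                System.out.println(c[0] + ", validating=" + validating
                    + ": fatal=" + r.fatalErrors.size()
                    + ", errors=" + r.errors.size()
                    + ", text=\"" + r.text + "\"");
            }
        }
    }
}
```

Keeping both DTD variants in one program makes it easy to see that the only difference between the runs is the surrounding INCLUDE section.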
When I look at the grammar production for a conditional include section
[62] includeSect ::= '<![' S? 'INCLUDE' S? '[' extSubsetDecl ']]>'
the inner part should be an extSubsetDecl. So, looking at
[31] extSubsetDecl ::= ( markupdecl | conditionalSect | DeclSep)*
in our case, when processing the INCLUDE section, %A; can only match a
DeclSep, and with the well-formedness constraint mentioned above, I
would have assumed that the replacement text of %A; "MUST match the
production extSubsetDecl". But it does not, since the replacement text
of %A; is incomplete.
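
The reasoning above can be sketched as a rough check on a replacement text. This is only a heuristic under simplifying assumptions, not a matcher for the real extSubsetDecl production: it accepts whitespace and complete '<!...>' declarations containing quoted literals, and it ignores comments, processing instructions, conditional sections and nested parameter entity references. The class and method names are hypothetical.

```java
class DeclSepCheck {
    /**
     * Returns true if the text consists only of whitespace and complete
     * "<!...>" declarations. A '>' inside a quoted literal does not close
     * a declaration.
     */
    static boolean looksLikeCompleteDecls(String text) {
        int i = 0;
        int n = text.length();
        while (i < n) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            if (!text.startsWith("<!", i)) {
                return false; // stray text between declarations
            }
            i += 2;
            char quote = 0;       // 0 means "not inside a literal"
            boolean closed = false;
            while (i < n && !closed) {
                char c = text.charAt(i++);
                if (quote != 0) {
                    if (c == quote) quote = 0;
                } else if (c == '"' || c == '\'') {
                    quote = c;
                } else if (c == '>') {
                    closed = true;
                }
            }
            if (!closed) {
                return false; // text ended inside an open declaration
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {
            "<!ENTITY ",                 // replacement text of %A;
            "copyright '(C)'>",          // replacement text of %B;
            "<!ENTITY copyright '(C)'>"  // the unsplit declaration
        };
        for (String s : samples) {
            System.out.println("[" + s + "] -> " + looksLikeCompleteDecls(s));
        }
    }
}
```

Under this check, neither half of the split declaration passes on its own, while the unsplit declaration does, which is the expectation described above for %A; taken as a DeclSep.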
I would be grateful for any hint.
Thank you so much for your work.
Apache is such a great project.
Bye everybody.
Stay healthy you all.
Stefan Bettner.
PS: I have appended my XML files and Java file for comparison. I was
using the following Java version on Windows 10:
java version "12.0.2" 2019-07-16
Java(TM) SE Runtime Environment (build 12.0.2+10)
splitEntity.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Root SYSTEM "splitEntity.dtd" [
<!ELEMENT Root ANY>
] >
<Root>&copyright;</Root>
splitEntity.dtd
<!ENTITY % A "<!ENTITY ">
<!ENTITY % B "copyright '(C)'>">
<![INCLUDE[%A;%B;]]>
XMLSplitEntity.java
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.File;
import java.io.IOException;
public class XMLSplitEntity
{
    /*
     * custom DocumentHandler
     */
    public static class MyDocumentHandler extends DefaultHandler
    {
        @Override
        public void characters (char[] ch, int start, int length) throws SAXException
        {
            System.out.println ("characters: " + new String (ch, start, length));
        }
        @Override
        public void warning (SAXParseException e) throws SAXException
        {
            System.out.println ("warning: " + e.getMessage ());
        }
        @Override
        public void error (SAXParseException e) throws SAXException
        {
            System.out.println ("error: " + e.getMessage ());
        }
        @Override
        public void fatalError (SAXParseException e) throws SAXException
        {
            System.out.println ("fatalError: " + e.getMessage ());
        }
    }
    /*
     * parse splitEntity.xml
     * (with external splitEntity.dtd)
     */
    public static void main (String[] args)
    {
        // create parser
        SAXParserFactory factory = SAXParserFactory.newInstance ();
        // factory.setValidating (true);
        SAXParser saxParser;
        try
        {
            saxParser = factory.newSAXParser ();
        }
        catch (ParserConfigurationException | SAXException e)
        {
            e.printStackTrace ();
            return;
        }
        // parse
        File file = new File ("C:\\splitEntity.xml");
        try
        {
            saxParser.parse (file, new MyDocumentHandler ());
        }
        catch (SAXException | IOException e)
        {
            e.printStackTrace ();
        }
    }
}