Posted to c-dev@xerces.apache.org by "Erik Wright (JIRA)" <xe...@xml.apache.org> on 2008/08/13 16:36:44 UTC
[jira] Created: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
-----------------------------------------------------------------------------------
Key: XERCESC-1828
URL: https://issues.apache.org/jira/browse/XERCESC-1828
Project: Xerces-C++
Issue Type: Bug
Components: SAX/SAX2
Affects Versions: 2.8.0
Environment: OS/X, Win32
Reporter: Erik Wright
Attachments: test.xml
It appears that the LexicalHandler events startEntity and endEntity are not sent correctly when parsing a document with a DTD that itself references external entities.
(Note: I will attach sample XML, repro code, and the full output of the code. The following is a summary.)
For example, I have been parsing a valid XHTML document. The strict XHTML DTD includes 4 other files with entity declarations. I see the following events on my LexicalHandler (ignoring elements, characters, whitespace, external entity declarations, and comments):
startDocument
...
startDTD: html, -//W3C//DTD XHTML 1.0 Strict//EN, http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
...
startEntity: [dtd]
...
startEntity: [dtd]
...
startEntity: [dtd]
...
startEntity: [dtd]
...
endEntity: [dtd]
...
endDTD
...
endDocument
I expected something more like the following (as generated by the standard SAX parser in Java 6):
startDocument
startDTD: 'html', '-//W3C//DTD XHTML 1.0 Strict//EN', 'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
startEntity: '[dtd]'
startEntity: '%HTMLlat1'
endEntity: '%HTMLlat1'
startEntity: '%HTMLsymbol'
endEntity: '%HTMLsymbol'
startEntity: '%HTMLspecial'
endEntity: '%HTMLspecial'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%head.misc'
endEntity: '%head.misc'
startEntity: '%block'
endEntity: '%block'
startEntity: '%inline'
endEntity: '%inline'
startEntity: '%misc'
endEntity: '%misc'
startEntity: '%block'
endEntity: '%block'
startEntity: '%misc'
endEntity: '%misc'
startEntity: '%block'
endEntity: '%block'
startEntity: '%inline'
endEntity: '%inline'
startEntity: '%misc'
endEntity: '%misc'
endEntity: '[dtd]'
endDTD
startPrefixMapping: '', 'http://www.w3.org/1999/xhtml'
endPrefixMapping: ''
endDocument
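The pairing requirement can be checked mechanically. The following is a minimal sketch, not part of the attached repro code: it assumes the events have already been recorded as (kind, name) pairs, and the names `Event`, `entitiesArePaired`, and `reportedSequence` are hypothetical.

```cpp
#include <stack>
#include <string>
#include <utility>
#include <vector>

// One recorded LexicalHandler event: first is "startEntity" or "endEntity",
// second is the entity name passed to the callback.
using Event = std::pair<std::string, std::string>;

// Returns true only if every startEntity is closed by an endEntity carrying
// the same name, in properly nested order. Other event kinds are ignored.
bool entitiesArePaired(const std::vector<Event>& events) {
    std::stack<std::string> open;
    for (const Event& e : events) {
        if (e.first == "startEntity") {
            open.push(e.second);
        } else if (e.first == "endEntity") {
            if (open.empty() || open.top() != e.second) {
                return false;  // end without a matching start
            }
            open.pop();
        }
    }
    return open.empty();  // leftover starts mean unpaired events
}

// The entity events from the Xerces-C++ 2.8.0 summary above.
std::vector<Event> reportedSequence() {
    return {{"startEntity", "[dtd]"},
            {"startEntity", "[dtd]"},
            {"startEntity", "[dtd]"},
            {"startEntity", "[dtd]"},
            {"endEntity", "[dtd]"}};
}
```

Calling `entitiesArePaired(reportedSequence())` returns false because three `[dtd]` starts are never closed, whereas a sequence shaped like the Java output (each start followed by its own end, all nested inside `[dtd]`) returns true.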
At a minimum, the mismatch of startEntity/endEntity events appears to be caused by the following code from DTDScanner::scanExtSubsetDecl (notice that the conditions are not the same):
    if (fDocTypeHandler && !inIncludeSect)
        fDocTypeHandler->startExtSubset();
    ...
    ...
    ...
    if (fDocTypeHandler && isDTD)
        fDocTypeHandler->endExtSubset();
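To see why different guards can produce unpaired events, here is a small stand-alone model of the two conditions. It is only a sketch: `notificationsFor` and `isBalanced` are hypothetical names, `handlerSet` stands in for a non-null fDocTypeHandler, and which flag combinations the scanner actually passes is an assumption not confirmed by this report.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Records which notifications the two guards would emit for one call,
// mirroring the conditions quoted from DTDScanner::scanExtSubsetDecl.
std::vector<std::string> notificationsFor(bool handlerSet,
                                          bool inIncludeSect,
                                          bool isDTD) {
    std::vector<std::string> out;
    if (handlerSet && !inIncludeSect) {
        out.push_back("startExtSubset");
    }
    // ... the declarations themselves would be scanned here ...
    if (handlerSet && isDTD) {
        out.push_back("endExtSubset");
    }
    return out;
}

// True when a call emits as many start notifications as end notifications.
bool isBalanced(bool inIncludeSect, bool isDTD) {
    const std::vector<std::string> n =
        notificationsFor(true, inIncludeSect, isDTD);
    return std::count(n.begin(), n.end(), "startExtSubset") ==
           std::count(n.begin(), n.end(), "endExtSubset");
}
```

In this model the call is balanced only when `inIncludeSect` and `isDTD` have opposite values. With both false it emits a start and no end, which would be consistent with the extra `startEntity: [dtd]` events in the output above if external entities are scanned with both flags false.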
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org
[jira] Updated: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
Posted by "Erik Wright (JIRA)" <xe...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XERCESC-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Wright updated XERCESC-1828:
---------------------------------
Attachment: test.xml
A sample XHTML file used to reproduce the error.
[jira] Resolved: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
Posted by "Alberto Massari (JIRA)" <xe...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XERCESC-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alberto Massari resolved XERCESC-1828.
--------------------------------------
Resolution: Fixed
Fix Version/s: 3.1.0
Assignee: Alberto Massari
I have fixed the notifications for the start and end of an external DTD. The missing notifications for entity expansions inside a DTD are a separate matter: Xerces-C does not support them. If you feel that this (optional) feature should be exposed by Xerces-C, please open a different bug and mark it as a request for improvement.
Thanks,
Alberto
[jira] Updated: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
Posted by "Erik Wright (JIRA)" <xe...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XERCESC-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Wright updated XERCESC-1828:
---------------------------------
Attachment: test.output
The events generated by parsing the previously attached sample XHTML file.
[jira] Updated: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
Posted by "Erik Wright (JIRA)" <xe...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XERCESC-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Wright updated XERCESC-1828:
---------------------------------
Attachment: java.output
Here is the output generated by parsing the same file with the standard SAX parser in Java 6. A diff of the two files is quite illuminating.
[jira] Updated: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
Posted by "Erik Wright (JIRA)" <xe...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XERCESC-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Wright updated XERCESC-1828:
---------------------------------
Attachment: Test.java
Here is the Java file used to generate the Java comparison output.
[jira] Updated: (XERCESC-1828) LexicalHandler startEntity/endEntity events not paired and have incorrect arguments
Posted by "Erik Wright (JIRA)" <xe...@xml.apache.org>.
[ https://issues.apache.org/jira/browse/XERCESC-1828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Erik Wright updated XERCESC-1828:
---------------------------------
Attachment: SAX2EventsSample.tgz
This is a modification of the SAX2Print sample shipped with xerces-c 2.8.0. Instead of printing a formatted XML file, it prints all of the events sent on the LexicalHandler, DeclHandler, and ContentHandler interfaces.
It should compile correctly if you expand it into the xerces-c samples directory, although you may need to add it to your samples/configure.in file in order to generate an appropriate Makefile for your build environment.
It might be worth adding this sample to the distribution, since it illustrates what the SAX parser is doing.