Posted to j-dev@xerces.apache.org by bu...@apache.org on 2002/11/19 12:21:00 UTC

DO NOT REPLY [Bug 14672] New: - SAXParserException while parsing valid comments

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14672>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14672

           Summary: SAXParserException while parsing valid comments
           Product: Xerces2-J
           Version: 2.2.0
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: SAX
        AssignedTo: xerces-j-dev@xml.apache.org
        ReportedBy: tim@xrefer.com
                CC: tim@xrefer.com


Xerces 2.2.1 can throw a SAXParseException while parsing valid
comments. This can cause severe problems with JSP deployment, where the
exception is thrown while parsing the standard JSTL DTDs. In Tomcat 4.1.12
the error is reported but deployment succeeds; in Jetty under JBoss 3.0.3,
deployment fails consistently.


The test system is running Red Hat Linux 7.3.


Procedure to reproduce problem in Tomcat
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Install j2sdk1.4.1.

2. Install jakarta-tomcat-4.1.12-LE-jdk14.

3. Run Tomcat's startup.sh script and verify that pages are served
   correctly from http://localhost:8080/

4. Shut down Tomcat.

5. Obtain and build Xerces 2.2.1 from CVS. You need to build the
   jar and apijar Ant targets.

6. To switch the JDK over to Xerces instead of the built-in Crimson
   parser, create a jaxp.properties file in JAVA_HOME/jre/lib/ with the
   following contents:

javax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl
javax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
javax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl


7. Copy your newly-built xercesImpl.jar and xmlParserAPIs.jar into
   TOMCAT_HOME/common/lib

8. Restart Tomcat. You should observe output similar to the following
   in your log directory:

==> localhost_admin_log.2002-11-19.txt <==
2002-11-19 10:31:23 action: null
org.xml.sax.SAXParseException: The string "--" is not permitted within comments.
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:314)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:89)
        at org.apache.struts.digester.Digester.parse(Digester.java:755)
        at org.apache.struts.action.ActionServlet.initServlet(ActionServlet.java:1434)
        at org.apache.struts.action.ActionServlet.init(ActionServlet.java:474)
        at org.apache.webapp.admin.ApplicationServlet.init(ApplicationServlet.java:152)
        at javax.servlet.GenericServlet.init(GenericServlet.java:256)
        at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:924)
        at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:813)
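
As a rough check that the jaxp.properties file from step 6 is taking
effect, a small program along these lines can print which
SAXParserFactory implementation JAXP resolves. This is only a sketch:
the class name WhichSaxParser is made up for illustration, and it
assumes it is run with the same JDK and with xercesImpl.jar and
xmlParserAPIs.jar on the classpath.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper, not part of Xerces or Tomcat.
public class WhichSaxParser {

    /** Returns the class name of the SAXParserFactory that JAXP resolves. */
    static String factoryClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(factoryClassName());
    }
}
```

If jaxp.properties is being picked up, the output should be the
org.apache.xerces.jaxp.SAXParserFactoryImpl class named in step 6; any
other name suggests the JDK is still using a different parser.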


Cause
~~~~~
The problem lies in the org.apache.xerces.impl.XMLEntityManager class,
when we are scanning through a buffer in search of a specified
delimiter.

Within the scanData method, a check is made to see how close we are
to the end of the currently-loaded data. If we are within a
delimiter's length of the end of the buffer, we move the remaining
data down to the start of the buffer and read another chunk.

There is a subsequent check to ensure we now have at least a delimiter's
length-worth of data to scan. If we do not, we assume that the
remaining data must match the delimiter and return.

This is not necessarily the case, however: a call to read() may return
fewer than the requested number of characters without the end of the
stream having been reached.

The supplied patch replaces the single "if" clause with a "while"
loop. This continues to call read() until we have acquired more than
a delimiter's length-worth of data, or until we have definitely
reached the end of the current entity.
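
The idea behind the patch can be illustrated with a minimal,
self-contained sketch. This is not the Xerces code or the attached
patch: the names FillBufferSketch, fill, trickle and demo are invented
for the example, and it assumes a plain java.io.Reader as the data
source. The trickle() reader hands back at most one character per
read() call, which is legal behaviour for a Reader and is the
situation a single "if" check mishandles.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative only; names are hypothetical and not taken from Xerces.
public class FillBufferSketch {

    /**
     * Keeps calling read() until at least 'needed' chars are buffered
     * or read() reports end of stream (-1). Returns the new char count.
     */
    static int fill(Reader in, char[] buf, int count, int needed) throws IOException {
        if (needed > buf.length) {
            throw new IllegalArgumentException("buffer too small");
        }
        while (count < needed) {
            int n = in.read(buf, count, buf.length - count);
            if (n == -1) {
                break; // genuine end of stream
            }
            count += n; // a short read just means "try again"
        }
        return count;
    }

    /** A reader that returns at most one char per read() call. */
    static Reader trickle(String data) {
        return new StringReader(data) {
            @Override
            public int read(char[] cbuf, int off, int len) throws IOException {
                return super.read(cbuf, off, Math.min(len, 1));
            }
        };
    }

    /** Fills from a trickling reader and returns what ended up buffered. */
    static String demo(String data, int needed) {
        char[] buf = new char[16];
        try {
            int count = fill(trickle(data), buf, 0, needed);
            return new String(buf, 0, count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("-->rest", 3));
        System.out.println(demo("-", 3));
    }
}
```

With a three-character delimiter such as "-->", a single read() against
the trickling reader would buffer only one character, while the loop
keeps reading until three are available or the input is exhausted.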
