Posted to j-dev@xerces.apache.org by ji...@apache.org on 2004/04/12 23:28:08 UTC

[jira] Updated: (XERCERJ-520) SAXParserException while parsing valid comments

The following issue has been updated:

    Updater: Serge Knystautas (mailto:sergek@lokitech.com)
       Date: Mon, 12 Apr 2004 2:27 PM
    Changes:
             Attachment changed from XMLEntityManager_scanData.patch.gz
    ---------------------------------------------------------------------
For a full history of the issue, see:

  http://issues.apache.org/jira/browse/XERCERJ-520?page=history

---------------------------------------------------------------------
View the issue:
  http://issues.apache.org/jira/browse/XERCERJ-520

Here is an overview of the issue:
---------------------------------------------------------------------
        Key: XERCERJ-520
    Summary: SAXParserException while parsing valid comments
       Type: Bug

     Status: Resolved
 Resolution: DUPLICATE

    Project: Xerces2-J

   Assignee: Xerces-J Developers Mailing List
   Reporter: Tim Bruce

    Created: Tue, 19 Nov 2002 11:20 AM
    Updated: Mon, 12 Apr 2004 2:27 PM
Environment: Operating System: Linux
Platform: PC

Description:
Xerces 2.2.1 can throw a SAXParseException while parsing valid
comments. This can lead to severe problems with JSP deployment, where the
exception is thrown while parsing standard JSTL DTDs. In Tomcat 4.1.12, the
error is reported but deployment succeeds. In Jetty under JBoss 3.0.3,
deployment fails consistently.


The test system is running RedHat Linux version 7.3.


Procedure to reproduce problem in Tomcat
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Install j2sdk1.4.1.

2. Install jakarta-tomcat-4.1.12-LE-jdk14.

3. Run tomcat's startup.sh script and verify that pages are served
   correctly from http://localhost:8080/

4. Shut down tomcat.

5. Obtain and build xerces 2.2.1 from cvs. You need to build the java
   jar and apijar ant targets.

6. To switch the jdk over to xerces rather than the inbuilt crimson
   parser, create a jaxp.properties file in JAVA_HOME/jre/lib/ with the
   following contents:

javax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl
javax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
javax.xml.transform.TransformerFactory=org.apache.xalan.processor.TransformerFactoryImpl


7. Copy your newly-built xercesImpl.jar and xmlParserAPIs.jar into
   TOMCAT_HOME/common/lib

8. Restart tomcat. You should observe output similar to the following
   in your log directory:

==> localhost_admin_log.2002-11-19.txt <==
2002-11-19 10:31:23 action: null
org.xml.sax.SAXParseException: The string "--" is not permitted within comments.
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:314)
        at javax.xml.parsers.SAXParser.parse(SAXParser.java:89)
        at org.apache.struts.digester.Digester.parse(Digester.java:755)
        at org.apache.struts.action.ActionServlet.initServlet(ActionServlet.java:1434)
        at org.apache.struts.action.ActionServlet.init(ActionServlet.java:474)
        at org.apache.webapp.admin.ApplicationServlet.init(ApplicationServlet.java:152)
        at javax.servlet.GenericServlet.init(GenericServlet.java:256)
        at org.apache.catalina.core.StandardWrapper.loadServlet(StandardWrapper.java:924)
        at org.apache.catalina.core.StandardWrapper.load(StandardWrapper.java:813)
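
If the error does not appear, it may be worth confirming that the
jaxp.properties file from step 6 was actually picked up. The following is a
minimal sketch for that check, not part of the original report. The class
name WhichParser is hypothetical; it only uses the standard JAXP factory
lookup and prints whichever implementation classes were resolved.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper, not part of Xerces or Tomcat.
public class WhichParser {

    // Name of the SAXParserFactory implementation that JAXP resolved.
    static String saxFactoryName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    // Name of the DocumentBuilderFactory implementation that JAXP resolved.
    static String domFactoryName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("SAX factory: " + saxFactoryName());
        System.out.println("DOM factory: " + domFactoryName());
    }
}
```

Run it with the same JDK and with the xerces jars on the classpath. With the
properties from step 6 in place, the SAX factory would be expected to be
org.apache.xerces.jaxp.SAXParserFactoryImpl. If the configured class cannot
be loaded, newInstance() throws a FactoryConfigurationError instead.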


Cause
~~~~~
The problem lies in the org.apache.xerces.impl.XMLEntityManager class,
when we are scanning through a buffer in search of a specified
delimiter.

Within the scanData method, a check is made to see how close we are
to the end of the currently-loaded data. If we are within a
delimiter's length of the end of the buffer, we move the remaining
data down to the start of the buffer and read another chunk.

There is a subsequent check to ensure we now have at least a delimiter's
length-worth of data to scan. If we do not, we assume that the
remaining data must match the delimiter and return.

This is not necessarily the case, however: a call to read() may return
fewer than the requested number of characters without this indicating
that the end of the stream has been reached.

The supplied patch replaces the single "if" clause with a "while"
loop. This continues to call read() until we have acquired more than
a delimiter's length-worth of data, or until we have definitely
reached the end of the current entity.
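
The difference between the "if" and the "while" can be illustrated with a
small standalone sketch. This is not the Xerces code and not the attached
patch: the names ShortReadDemo, TrickleReader, fillOnce and fillUntil are
hypothetical, and the sketch stops once at least the requested number of
characters is buffered. TrickleReader simulates a stream that returns one
character per read() call, which java.io.Reader permits.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class; it only illustrates the short-read behaviour.
public class ShortReadDemo {

    // Reader that hands back at most one character per read() call.
    static class TrickleReader extends Reader {
        private final Reader in;

        TrickleReader(Reader in) {
            this.in = in;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            return in.read(buf, off, Math.min(len, 1));
        }

        @Override
        public void close() throws IOException {
            in.close();
        }
    }

    // Single attempt ("if"): one read() call, which may come back short.
    static int fillOnce(Reader in, char[] buf, int count, int needed)
            throws IOException {
        if (count < needed) {
            int n = in.read(buf, count, buf.length - count);
            if (n > 0) {
                count += n;
            }
        }
        return count;
    }

    // Loop ("while"): read until 'needed' chars are buffered or read()
    // returns -1. Assumes needed <= buf.length, so the requested length
    // is never zero inside the loop.
    static int fillUntil(Reader in, char[] buf, int count, int needed)
            throws IOException {
        while (count < needed) {
            int n = in.read(buf, count, buf.length - count);
            if (n == -1) {
                break; // genuine end of stream
            }
            count += n;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        char[] buf = new char[16];
        int once = fillOnce(new TrickleReader(new StringReader("-->rest")), buf, 0, 3);
        int loop = fillUntil(new TrickleReader(new StringReader("-->rest")), buf, 0, 3);
        System.out.println("single read: " + once + " chars, loop: " + loop + " chars");
    }
}
```

With the single attempt only one character is buffered although more data
is available, so a caller that treats "fewer than a delimiter's length" as
end of input would draw the wrong conclusion. The loop only gives up when
read() returns -1.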


---------------------------------------------------------------------
JIRA INFORMATION:
This message is automatically generated by JIRA.

If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa

If you want more information on JIRA, or have a bug to report see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-dev-help@xml.apache.org