Posted to dev@tomcat.apache.org by bu...@apache.org on 2003/05/09 17:34:36 UTC
DO NOT REPLY [Bug 19778] - utf-8-encoded source gets mangled in xml-mode

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19778>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19778

utf-8-encoded source gets mangled in xml-mode

marcello@perathoner.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|Normal                      |Major
          Component|Jasper 2                    |Jasper
         OS/Version|Other                       |All
           Priority|Other                       |Medium
           Platform|Other                       |All



------- Additional Comments From marcello@perathoner.de  2003-05-09 15:34 -------
ParserController.java:245

    // Figure out the encoding of the page
    // FIXME: We assume xml parser will take care of
    // encoding for page in XML syntax. Correct?

Wrong. The XML parser is handed an InputSource that wraps an 
InputStreamReader. The SAX docs for InputSource say:

http://xml.apache.org/xerces2-j/javadocs/api/org/xml/sax/InputSource.html

    The SAX parser will use the InputSource object to determine how to
    read XML input. If there is a character stream available, the
    parser will read that stream directly, disregarding any text
    encoding declaration found in that stream. If there is no
    character stream, but there is a byte stream, the parser will use
    that byte stream, using the encoding specified in the InputSource
    or else (if no encoding is specified) autodetecting the character
    encoding using an algorithm such as the one in the XML
    specification. If neither a character stream nor a byte stream is
    available, the parser will attempt to open a URI connection to the
    resource identified by the system identifier.

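An InputStreamReader provides a character stream, so the XML parser 
does no encoding detection whatsoever and ignores the encoding 
declaration in the page. The bytes have already been decoded with 
whatever encoding was passed to getReader().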
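
The difference can be reproduced outside of Jasper. The following is 
only a minimal sketch, not Jasper code: the class name EncodingDemo 
and the helper rootText() are made up for illustration, and it assumes 
a current JDK with its bundled JAXP parser. It parses the same UTF-8 
bytes twice, once as a byte stream and once through an 
InputStreamReader that was given the wrong charset.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EncodingDemo {

    // Hypothetical helper: parse the source and return the text
    // content of the root element.
    static String rootText(InputSource source) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // A UTF-8 document containing one non-ASCII character (a-umlaut).
        byte[] xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a>\u00e4</a>"
                .getBytes(StandardCharsets.UTF_8);

        // Byte stream: the parser sees the raw bytes and can honour
        // the encoding declaration.
        String fromBytes = rootText(
            new InputSource(new ByteArrayInputStream(xml)));

        // Character stream: the bytes are decoded as ISO-8859-1 before
        // the parser sees them; the declaration is disregarded.
        String fromReader = rootText(
            new InputSource(new InputStreamReader(
                new ByteArrayInputStream(xml),
                StandardCharsets.ISO_8859_1)));

        System.out.println("byte stream intact:      "
            + fromBytes.equals("\u00e4"));
        System.out.println("character stream intact: "
            + fromReader.equals("\u00e4"));
    }
}
```

With the byte stream the a-umlaut should survive; with the reader it 
should come back as two characters, which is the kind of mangling 
reported in this bug.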

This patch works for me:

--- ParserController-original.java	Wed Mar 19 10:20:58 2003
+++ ParserController.java	Fri May  9 17:17:29 2003
@@ -188,11 +188,16 @@
 
             // dispatch to the proper parser
 	    
-            reader = getReader(absFileName, encoding);
             if (isXml) {
+		InputStream in;
+		in = ctxt.getResourceAsStream(absFileName);
+		if (in == null) {
+		    throw new FileNotFoundException(absFileName);
+		}
                 parsedPage = JspDocumentParser.parse(this, absFileName,
-						     reader, parent);
+						     in, parent);
             } else {
+		reader = getReader(absFileName, encoding);
 		JspReader r = new JspReader(ctxt, absFileName, encoding,
 					    reader,
 					    compiler.getErrorDispatcher());
--- JspDocumentParser-original.java	Wed Mar 19 10:20:58 2003
+++ JspDocumentParser.java	Fri May  9 17:17:44 2003
@@ -120,13 +120,13 @@
      */
     public JspDocumentParser(ParserController pc,
 			     String path,
-			     InputStreamReader reader) {
+			     InputStream in) {
 	this.parserController = pc;
 	this.ctxt = pc.getJspCompilationContext();
 	this.taglibs = pc.getCompiler().getPageInfo().getTagLibraries();
 	this.err = pc.getCompiler().getErrorDispatcher();
 	this.path = path;
-	this.inputSource = new InputSource(reader);
+	this.inputSource = new InputSource(in);
     }
 
     /*
@@ -136,9 +136,9 @@
      */
     public static Node.Nodes parse(ParserController pc,
 				   String path,
-				   InputStreamReader reader,
+				   InputStream in,
 				   Node parent) throws JasperException {
-	JspDocumentParser handler = new JspDocumentParser(pc, path, reader);
+	JspDocumentParser handler = new JspDocumentParser(pc, path, in);
 	handler.current = parent;
 	Node.Nodes pageNodes = null;

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org