Posted to dev@xalan.apache.org by David N Bertoni/Cambridge/IBM <da...@us.ibm.com> on 2003/01/22 22:20:21 UTC

Re: Bug or Feature: unescaping &amp; in href attribute while in HTML output method




Hi Robert,

For a long time, Xalan-C++ followed the behavior of Xalan-J, which is the
behavior you're seeing with Xalan-C++ 1.4.  The latest CVS code escapes the
&, breaking with the Xalan-J behavior.

Although emitting the unescaped & is definitely a bug, I'm guessing Xalan-J
does this for compatibility with broken browsers.  The next version of
Xalan-C++ will include the fix.
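
To illustrate the kind of escaping involved, here is a minimal sketch.  It
is not the Xalan-C++ implementation: the function name escapeHtmlAttribute
is hypothetical, and it only handles & and the double quote in a
double-quoted attribute value.

```cpp
#include <string>

// Hypothetical helper, not part of Xalan-C++: escapes the characters that
// must not appear literally inside a double-quoted HTML attribute value.
std::string escapeHtmlAttribute(const std::string& value)
{
    std::string out;
    out.reserve(value.size());

    for (char c : value) {
        switch (c) {
        case '&': out += "&amp;";  break;  // a bare & would start an entity reference
        case '"': out += "&quot;"; break;  // a bare " would end the attribute value
        default:  out += c;        break;
        }
    }
    return out;
}
```

Under these assumptions, the href value in the stylesheet from the original
report would be written out as href="&amp;" rather than href="&".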

Dave



                                                                                                                            
Robert Schiele <rschiele@uni-mannheim.de>
01/22/2003 12:22 PM
Please respond to xalan-dev

To:      xalan-dev <xa...@xml.apache.org>
cc:      (bcc: David N Bertoni/Cambridge/IBM)
Subject: Bug or Feature: unescaping &amp; in href attribute while in HTML output method
                                                                                                                            



Hi.

I am not sure whether the following is correct, intended behaviour or
a bug in Xalan-c 1.4:

Take the following XSLT script:

---
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <a href="&amp;"/>
  </xsl:template>
</xsl:stylesheet>
---

Take any XML file and run Xalan on it.

You get:

# Xalan xalanbug.xml xalanbug.xsl
<a href="&"></a>
#

As c/src/XMLSupport/FormatterToHTML.cpp says:

        // http://www.ietf.org/rfc/rfc2396.txt says:
        // A URI is always in an "escaped" form, since escaping or unescaping a
        // completed URI might change its semantics.  Normally, the only time
        // escape encodings can safely be made is when the URI is being created
        // from its component parts; each component may have its own set of
        // characters that are reserved, so only the mechanism responsible for
        // generating or interpreting that component can determine whether or
        // not escaping a character will change its semantics. Likewise, a URI
        // must be separated into its components before the escaped characters
        // within those components can be safely decoded.
        //
        // ...So we do our best to do limited escaping of the URL, without
        // causing damage.  If the URL is already properly escaped, in theory, this
        // function should not change the string value.
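
The "limited escaping" idea in that comment can be sketched roughly as
follows.  This is only an illustration under my own assumptions, not the
code in FormatterToHTML.cpp: the name escapeUrlLimited is hypothetical, and
the rule used here (percent-encode spaces, control characters and non-ASCII
bytes, leave everything else alone, including existing % escapes) is one
possible choice.

```cpp
#include <string>

// Hypothetical sketch, not the Xalan-C++ code: percent-encodes only bytes
// that can never appear literally in a URI (space, control characters and
// non-ASCII bytes).  Reserved characters and existing %XX escapes are left
// untouched, so an already escaped URL comes back unchanged.
std::string escapeUrlLimited(const std::string& url)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(url.size());

    for (unsigned char c : url) {
        if (c <= 0x20 || c >= 0x7F) {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        } else {
            out += static_cast<char>(c);
        }
    }
    return out;
}
```

Note that this step is separate from HTML attribute escaping: a & in the
URL passes through here unchanged, and whether it is then written as &amp;
is decided by the serializer.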

I would have expected "&amp;" to stay escaped, but this is not
the case.

So my question is: which behaviour is correct?  The one Xalan-c 1.4
implements, or the one that leaves the URI untouched?

Robert

--
Robert Schiele                                   Tel.: +49-621-181-2517
Dipl.-Wirtsch.informatiker           mailto:rschiele@uni-mannheim.de