Posted to dev@xalan.apache.org by matthew denner <ma...@evtechnology.com> on 2000/12/13 14:03:57 UTC

problems with Xalan & Xerces

Dear all,

[ sorry for the mail to both xerces and xalan but this may be a bug in      ]
[ xalan (both versions) or a bug / feature request for xerces.              ]

i've got a simple XSL transform that changes all of the <a>...</a> elements
in an XHTML document to <h1>...</h1>.  however, it would appear to fail if
the XHTML document references a DTD as the DTD is included in the final 
output.  can i turn this inclusion off (i notice that the feature
"http://xml.org/sax/features/external-general-entities" is unsupported!)?

the XSL transform, and two example input files are included here:

============================= transform.xsl =================================
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
                version="1.0">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="a">
    <xsl:message>changed an anchor</xsl:message>
    <h1><xsl:value-of select="."/></h1>
  </xsl:template>
</xsl:stylesheet>
=============================================================================
============================ withadtd.xhtml =================================
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
  <body>
    <a href="link">link to somewhere</a>
  </body>
</html>
=============================================================================
=========================== withoutadtd.xhtml ===============================
<html>
  <body>
    <a href="link">link to somewhere</a>
  </body>
</html>
=============================================================================

running:

java org.apache.xalan.xslt.Process -xsl transform.xsl -in withoutadtd.xhtml

generates a correct output (all <a>...</a> replaced correctly and a message
seen on the display).

running:

java org.apache.xalan.xslt.Process -xsl transform.xsl -in withadtd.xhtml

generates incorrect output.  no <a>...</a> are replaced, no message is seen,
and the output contains the referenced DTD.

i believe the inclusion of the DTD in the HTML file is causing a problem and
removing this (from the real world files) is impossible as you get parser
errors when the XHTML contains "&nbsp;" (missing referenced entities).

i've tried Xalan-J 1.2.1 and Xalan-J 2.0D01 with Xerces-J 1.2.1 (and i've
checked the Xerces-J 1.2.3 source) both with the same results.  i'm running
on a linux system with 1.3rc1 JDK.

i'd appreciate it if this could be resolved in some manner as this is
delaying a project i'm working on, and any help is gratefully received.

Cheers,
Matt

-----------------------------------------------------------------------------
Sessami is a trademark of Escape Velocity Technology Mobile Services Limited.
All information contained in this e-mail is confidential and for the use of
the addressee only.  If you receive this message in error please notify.

user error! [Was: problems with Xalan & Xerces]

Posted by matthew denner <ma...@evtechnology.com>.
matthew denner wrote:
> 
> matthew denner wrote:
> >
> > i believe the inclusion of the DTD in the HTML file is causing a problem and
> > removing this (from the real world files) is impossible as you get parser
> > errors when the XHTML contains "&nbsp;" (missing referenced entities).
> 
> further info:
> 
> it looks like the stuff that appears in the Xalan output are comments from
> the DTD, not the DTD itself.  also, the <!DOCTYPE...> element at the top of
> the documents is removed which may explain why the XSL doesn't work (straw
> clutching there i think!).

in trying to track down the "bug" i turned on ResultTreeHandler.DEBUG in the
source and found that the namespace for the <a> needs specifying on the
match ... i did warn you that i was new to all this XSL!  now everything
works, apologies to everyone.
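
A minimal sketch of the fix described above, written against the standard JAXP API rather than the Xalan command line. It assumes the XHTML DTD is what puts the elements into the http://www.w3.org/1999/xhtml namespace (the DTD declares a fixed xmlns default on <html>); to stay self-contained the sample input declares that namespace explicitly instead of fetching the DTD. The class name, the "x" prefix, and the sample strings are illustrative, not taken from the original files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NamespaceMatchDemo {
    // Identity transform plus a template matching <a> in the XHTML namespace.
    // XSLT 1.0 match patterns ignore the default namespace, so the pattern
    // needs a prefix ("x" here, an arbitrary choice) bound to that namespace.
    static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:x='http://www.w3.org/1999/xhtml'"
      + " xmlns='http://www.w3.org/1999/xhtml'"
      + " exclude-result-prefixes='x' version='1.0'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='x:a'>"
      + "<h1><xsl:value-of select='.'/></h1>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Stand-in for withadtd.xhtml: the namespace is written out explicitly
    // here (assumption: the DTD would otherwise supply it as a default).
    static final String INPUT =
        "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
      + "<a href='link'>link to somewhere</a>"
      + "</body></html>";

    // Applies the given stylesheet text to the given XML text and
    // returns the serialized result.
    static String transform(String xslt, String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(STYLESHEET, INPUT));
    }
}
```

With match='a' instead of match='x:a' the second template never fires on this input, and the identity template copies the anchor through unchanged, which matches the symptom in the original report.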

side note though: would it not be better to separate out the ContentHandler
and DTDHandler stuff?  i would assume that comments in the DTD are not
relevant to the XSL and therefore should not pass through.  by separating
(or hacking like i did!) you could remove the comments from the DTD in the
final output.
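
One way the side note could be approached, as a sketch only: in SAX2, comments are reported through org.xml.sax.ext.LexicalHandler, and its startDTD/endDTD callbacks bracket everything that comes from the DTD. A small wrapper can therefore drop comments seen inside the DTD and forward everything else. The class name DtdCommentFilter is hypothetical, and how such a filter would be wired into Xalan is not shown here and is not something the thread specifies.

```java
import org.xml.sax.SAXException;
import org.xml.sax.ext.LexicalHandler;

// Hypothetical wrapper: forwards all lexical events to "next", except
// comments that are reported between startDTD() and endDTD().
public class DtdCommentFilter implements LexicalHandler {
    private final LexicalHandler next;
    private boolean inDtd = false;

    public DtdCommentFilter(LexicalHandler next) {
        this.next = next;
    }

    @Override
    public void startDTD(String name, String publicId, String systemId)
            throws SAXException {
        inDtd = true;
        next.startDTD(name, publicId, systemId);
    }

    @Override
    public void endDTD() throws SAXException {
        inDtd = false;
        next.endDTD();
    }

    @Override
    public void comment(char[] ch, int start, int length) throws SAXException {
        // Comments from the DTD are swallowed; document comments pass through.
        if (!inDtd) {
            next.comment(ch, start, length);
        }
    }

    @Override
    public void startEntity(String name) throws SAXException {
        next.startEntity(name);
    }

    @Override
    public void endEntity(String name) throws SAXException {
        next.endEntity(name);
    }

    @Override
    public void startCDATA() throws SAXException {
        next.startCDATA();
    }

    @Override
    public void endCDATA() throws SAXException {
        next.endCDATA();
    }
}
```

With a plain SAX parser, a handler like this would typically be registered through the "http://xml.org/sax/properties/lexical-handler" property, wrapping whatever lexical handler the serializer already provides.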

once again, sorry for assuming this was a bug and not user error, next time
i'll be a little more careful!

Cheers,
Matt

Re: problems with Xalan & Xerces

Posted by matthew denner <ma...@evtechnology.com>.
matthew denner wrote:
> 
> i believe the inclusion of the DTD in the HTML file is causing a problem and
> removing this (from the real world files) is impossible as you get parser
> errors when the XHTML contains "&nbsp;" (missing referenced entities).

further info:

it looks like the stuff that appears in the Xalan output are comments from
the DTD, not the DTD itself.  also, the <!DOCTYPE...> element at the top of
the documents is removed which may explain why the XSL doesn't work (straw
clutching there i think!).

hope this helps,
Matt
