Posted to j-users@xerces.apache.org by Zhaohua Meng <me...@hotlens.com> on 2001/05/31 23:37:11 UTC

XHTMLSerializer

I have a source file with the following content:

============= the source file content =================
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>DOM Tree Testing</title>
<script>
//<!--
function test2() {
  var a="<this is a test>";
  return;
}
//-->
</script>
</head>
<body>
</body>
</html>
============= the source file content =================

I used an HTML parser to parse the source, constructed a DOM using
Xerces 1.4.0, and serialized it to a dest file using XHTMLSerializer.
The content of <script> is constructed as a Text node.

However, the <script> part is serialized as follows:

============= the dest <script> content =================
<script language="javascript"><![CDATA[
//<!--
function test2() {
  var a="<this is a test>";
  return;
}
//-->
]]></script>
============= the dest <script> content =================
Please note the "<![CDATA[" and "]]>" right after <script> and
right before </script>.

Is this the expected behaviour? HTMLSerializer does output a
valid HTML file. If XHTML is intended for "correcting" lousy
HTML, isn't it reasonable to output valid HTML?

Thanks,
Zhaohua Meng

Hotlens.com Inc.
http://www.hotlens.com

350 Fifth AVE
Suite 3113
New York, NY 10118
Phone: 212-465-1700
Fax:   212-465-1710
email: mengzh@hotlens.com



---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: XHTMLSerializer

Posted by arkin <ar...@intalio.com>.
> Is this the expected behaviour? HTMLSerializer does output a
> valid HTML file. If XHTML is intended for "correcting" lousy
> HTML, isn't it reasonable to output valid HTML?

An XHTML document is an XML document that uses elements originally
defined for HTML. Therefore it must first of all be a well-formed,
valid XML document.

Given the way some (most?) browsers work, a valid XHTML document may
not be a valid HTML document. This is mostly a problem with script and
style, which get special handling in HTMLSerializer. XHTMLSerializer
forces them to conform to XML's escaping rules, which is where the
CDATA section comes from.
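
To make the difference concrete, here is a rough sketch that does not
use the Xerces serializers at all. It assumes only the JDK's
javax.xml.transform API, whose "xml" and "html" output methods behave
analogously for script content. The class name ScriptSerializationDemo
and its helper methods are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; an analogy built on the JDK, not on Xerces' XHTMLSerializer.
final class ScriptSerializationDemo {

    // Build <html><head><script>...</script></head></html> with the
    // script body stored as a plain Text node, as in the original post.
    static Document buildDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element head = doc.createElement("head");
            Element script = doc.createElement("script");
            script.appendChild(doc.createTextNode("var a=\"<this is a test>\";"));
            head.appendChild(script);
            html.appendChild(head);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize with the given output method ("xml" or "html").
    // If cdataForScript is true, text inside <script> is wrapped in CDATA.
    static String serialize(Document doc, String method, boolean cdataForScript) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, method);
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            if (cdataForScript) {
                t.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "script");
            }
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildDocument();
        System.out.println(serialize(doc, "xml", false));  // XML rules, entity escaping
        System.out.println(serialize(doc, "xml", true));   // XML rules, CDATA for script
        System.out.println(serialize(doc, "html", false)); // HTML rules for script
    }
}
```

With the "xml" method the "<" in the script has to be protected somehow,
either as "&lt;" or inside a CDATA section; with the "html" method the
script text should come out unescaped, which is what an HTML browser
expects but is no longer well-formed XML.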

arkin

-- 
----------------------------------------------------------------------
Assaf Arkin                                          arkin@intalio.com
CTO,  Intalio Inc.                                     www.intalio.com
The Business Process Management Company                 (650) 345 2777
