Posted to j-users@xalan.apache.org by kr...@mmm.com on 2003/05/30 02:33:16 UTC

special character handling problem in xslt / xsl:fo

I am having a problem displaying special Unicode (UTF-8) characters in Acrobat Reader.
An &nbsp; character (a non-breaking space, U+00A0) is inserted into an XML document by
JavaScript code that is then transformed by XSLT. The JavaScript sets the character:


// Append a non-breaking space (char code 160, U+00A0) as a text node.
oCell.appendChild(oXMLDoc.createTextNode(String.fromCharCode(160)));

I print the XML document to System.out in Java and see that a bogus character is
printed. I'm thinking this is just because the command prompt doesn't know how
to interpret or display the Unicode character.
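
To rule the console out, a check along these lines should confirm whether the text
node really holds U+00A0, by printing numeric code points instead of raw characters
(the class and method names here are made up, just to illustrate):

import org.w3c.dom.Node;

public final class CharCheck {
    // Print each character of a text node as a hex code point, so the
    // console's own encoding cannot disguise what is actually stored.
    static void dumpCodePoints(Node textNode) {
        String text = textNode.getNodeValue();
        for (int i = 0; i < text.length(); i++) {
            System.out.println("char[" + i + "] = U+"
                    + Integer.toHexString(text.charAt(i)).toUpperCase());
        }
    }
}

If this reports U+A0, the document itself is fine and the bogus output is only a
display issue on the command prompt.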

Now the document that contains this character is transformed with an XSL-FO
stylesheet and rendered using FOP 2.30.5rc. I suspect that either the resulting PDF
file contains a bogus interpretation of the character (i.e. the Unicode character is
not handled properly) or that Acrobat Reader is not rendering it correctly. From
what I've read, I think the JavaScript code should have inserted the character
correctly.

The stylesheet that contains the JavaScript is declared as follows:

<?xml version="1.0"  encoding="UTF-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- import stylesheets with common variables, templates and css styles -->
  <xsl:import href="common_parameters.xsl" />
  <xsl:import href="common_variables.xsl" />
  <xsl:import href="common_templates.xsl" />
  <xsl:import href="common_css.xsl" />

  <!-- import common code for Alert Module -->
  <xsl:import href="alert_templates.xsl"/>

  <xsl:output method="html" indent="no" doctype-public="-//W3C//DTD HTML 4.0 Transitional//EN" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

.....

So the stylesheet that sets the special character in the XML document is itself
declared as UTF-8, and its output encoding is UTF-8 as well.

Next, a servlet is called to run FOP. The XML document created by the JavaScript is
passed in as an encoded request parameter.
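
Roughly, the parameter could be read like this (the parameter name "xml" is only a
placeholder; calling setCharacterEncoding before the first getParameter matters,
since the container may otherwise decode the request as ISO-8859-1 and mangle
U+00A0 before the transform ever runs):

import java.io.StringReader;
import javax.servlet.http.HttpServletRequest;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;

final class RequestXml {
    // Read the posted XML; "xml" is an assumed parameter name.
    static Source xmlFromRequest(HttpServletRequest req) throws Exception {
        // Must run before the first getParameter() call, otherwise the
        // container's default encoding (often ISO-8859-1) is used.
        req.setCharacterEncoding("UTF-8");
        return new StreamSource(new StringReader(req.getParameter("xml")));
    }
}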

An XSLT transformation is then performed on the XML document with the XSL-FO
stylesheet. That stylesheet is declared as follows:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:fo="http://www.w3.org/1999/XSL/Format">
...

There is no xsl:output element; I'm assuming it is not needed. The character set is
declared as UTF-8.
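
If an explicit output declaration turns out to matter, it could also be forced from
the Java side; a minimal sketch assuming a plain JAXP Transformer is (or can be)
used for this step, rather than the wrapper in the servlet below:

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

final class FoTransformerFactory {
    // Build the FO transformer with explicit output settings, the
    // equivalent of <xsl:output method="xml" encoding="UTF-8"/>.
    static Transformer newFoTransformer(StreamSource foStylesheet) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer(foStylesheet);
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        return t;
    }
}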

The servlet performs the transformation, runs FOP, and writes the resulting PDF as
follows:


StringWriter strWriter = new StringWriter();

// Perform the XSLT transformation with the FO stylesheet; the result is
// held in memory as characters, so no byte encoding is involved yet.
foxsl.process(processor, xmlDoc, new StreamResult(strWriter));
InputSource driverSource = new InputSource(new StringReader(strWriter.toString()));

// Render the FO to PDF with FOP.
ByteArrayOutputStream out = new ByteArrayOutputStream();
Driver driver = new Driver(driverSource, out);
driver.run();

// Stream the PDF back to the browser.
byte[] content = out.toByteArray();
resp.setContentType("application/pdf");
resp.setContentLength(content.length);
resp.getOutputStream().write(content);
resp.getOutputStream().flush();

This displays the PDF in a browser / Acrobat window.
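
One way to narrow down where it fails would be to inspect the intermediate FO
string (strWriter above) before FOP sees it; if U+00A0 is still present there, the
transforms are fine and the problem is on the FOP or Acrobat side. Something like:

final class FoCheck {
    // Report whether the intermediate FO output still carries the
    // non-breaking space before it is handed to FOP.
    static boolean containsNbsp(String fo) {
        return fo.indexOf('\u00A0') >= 0;
    }
}

Called on strWriter.toString() right after the transform, this would tell me whether
the character survives into the FO.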

I'm not sure where the character interpretation is failing. My first suspicion is
Acrobat. Is there a way to specify the character set in Acrobat Reader? Does XSLT in
Xalan-2 process special Unicode characters differently?

Any ideas would be appreciated.

Thanks
keith....