Posted to users@cocoon.apache.org by Dingjun Jia <di...@gmail.com> on 2006/07/10 20:10:03 UTC

How to avoid the encoded text &#220 in pdf generation

Hello!

Two days ago I posted about a problem with PDF generation from an XML file
using the fo2pdf serializer. Before posting I searched the mailing list and,
naturally, Google, but didn't find a solution. So I am posting again to ask
for help.

My Cocoon 2.1.9 runs on a SUSE Linux machine with Java version "1.4.2_11".
I use the fo2pdf serializer to generate PDF files from XML, which I generate
with XSP from a MySQL (version 4.1.13) database. PDF generation runs without
errors, but the generated PDF files contain text like “&#220;bung”, which
should be “Übung”.

For the sake of simplicity, here is a short example that describes my
situation; I intentionally leave out the step of generating the XML from XSP.

Suppose we already have an XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<doc>
      <text>&amp;#220;bung</text>
</doc>

A snippet of my XSL file (I have successfully configured the fo2pdf serializer
to find the font “Arial Unicode MS”):

<fo:block font-family="Arial Unicode MS" text-align="left" font-size="14pt">
     <fo:inline font-family="Arial Unicode MS">
           <xsl:value-of select="doc/text" disable-output-escaping="yes"/>
     </fo:inline>
</fo:block>
 
And a snippet from my sitemap:

<map:match pattern="test.pdf">
  <map:generate src="test/test2.xml" type="file"/>
  <map:transform src="test/test2pdf.xsl" type="xslt"/>
  <map:serialize type="fo2pdf"/>
</map:match>

This gives me a PDF file containing the text &#220;bung instead of the
expected Übung.

I changed the serializer to xml like this:

<map:match pattern="test.fo">
  <map:generate src="test/test2.xml" type="file"/>
  <map:transform src="test/test2pdf.xsl" type="xslt"/>
  <map:serialize type="xml"/>
</map:match>

This gives me an FO file; here is a snippet from it:

<fo:block font-size="14pt" text-align="left" font-family="Arial Unicode MS">
<fo:inline font-family="Arial Unicode MS">&#220;bung</fo:inline>
</fo:block> 

I have tested that if I save the FO file to the local file system and use the
following match, I get a correct PDF file in which the text is Übung.
 
<map:match pattern="testfo2pdf.pdf">
    <map:generate src="test/test2.fo" type="file"/>
    <map:serialize type="fo2pdf"/>
</map:match>

Now the question: why can't I generate the correct PDF file in a single
pipeline, when I do get the right PDF file from the saved FO file? I think
that saving the FO file and then generating the PDF in a second step is a
poor solution when the pipeline concept is available. Can anyone tell me what
I have done wrong?

Many thanks in advance; any advice is appreciated!

Dingjun Jia


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


Re: How to avoid the encoded text &#220 in pdf generation

Posted by Dingjun Jia <di...@gmail.com>.
I have this in my sitemap:

<map:match pattern="test.pdf">
  <map:generate src="test/test.xsp" type="serverpages"/>
  <!-- in this XSL file I have used disable-output-escaping="yes" -->
  <map:transform src="test/pdf.xsl" type="xslt"/>
  <map:serialize type="fo2pdf"/>
</map:match>

In pdf.xsl I have used disable-output-escaping="yes", and in the generated FO
file I saw only &#220;, not &amp;#220;. The database contains &#220;, but the
XSP page outputs &amp;#220; by default, and I don't know how to tweak it to
output just &#220;. I have tested your second solution; unfortunately, it
doesn't work.

-----Original Message-----
From: Toby [mailto:tobia.conforto@linux.it]
Sent: Tuesday, 11 July 2006 15:43
To: users@cocoon.apache.org
Subject: Re: How to avoid the encoded text &#220 in pdf generation

Dingjun Jia wrote:
> <?xml version="1.0" encoding="UTF-8"?>
> <doc>
>       <text>&amp;#220;bung</text>
> </doc>

This won't work.  It should read &#220; instead of &amp;#220;

You have 2 options.  The right one: understand what generates it that way
and fix the problem earlier in the pipeline; or the quick one: give it a run
of disable-output-escaping, using an xsl transformer just before
serialization, with a stylesheet like this:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*">
    <xsl:attribute name="{name()}">
      <xsl:value-of select="." disable-output-escaping="yes"/>
    </xsl:attribute>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>
  <!-- select="@*|node()" so that attributes are processed too, not dropped -->
  <xsl:template match="node()" priority="-1">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>

(adjust to your needs)


Toby



Re: How to avoid the encoded text &#220 in pdf generation

Posted by Toby <to...@linux.it>.
Dingjun Jia wrote:
> <?xml version="1.0" encoding="UTF-8"?>
> <doc>
>       <text>&amp;#220;bung</text>
> </doc>

This won't work.  It should read &#220; instead of &amp;#220;

You have 2 options.  The right one: understand what generates it that
way and fix the problem earlier in the pipeline; or the quick one: give
it a run of disable-output-escaping, using an xsl transformer just
before serialization, with a stylesheet like this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*">
    <xsl:attribute name="{name()}">
      <xsl:value-of select="." disable-output-escaping="yes"/>
    </xsl:attribute>
  </xsl:template>
  <xsl:template match="text()">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>
  <!-- select="@*|node()" so that attributes are processed too, not dropped -->
  <xsl:template match="node()" priority="-1">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>

(adjust to your needs)
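
One caveat about the quick option: per the XSLT 1.0 recommendation,
disable-output-escaping only takes effect when the XSLT processor itself
controls how the result tree is written out. In a Cocoon pipeline the
transformer passes SAX events on to the next component, so the fo2pdf
serializer may well never act on that hint. That would be consistent with a
saved and re-parsed FO file giving the right PDF. Below is a minimal sketch of
an alternative that does not rely on disable-output-escaping at all: it
replaces the literal string "&#220;" in text nodes with the real character.
The template name replace-all is made up for this example, only this one
character reference is handled, and it has not been tested against Cocoon
2.1.9:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy elements, attributes and other nodes unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>

  <!-- Text nodes: turn the literal string "&#220;" into the character itself.
       In the "from" value the XML parser reads &amp;#220; as the six characters
       & # 2 2 0 ; and in the "to" value it reads &#220; as the single character. -->
  <xsl:template match="text()" priority="1">
    <xsl:call-template name="replace-all">
      <xsl:with-param name="text" select="."/>
      <xsl:with-param name="from" select="'&amp;#220;'"/>
      <xsl:with-param name="to" select="'&#220;'"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: recursive search and replace,
       since XSLT 1.0 has no built-in replace function. -->
  <xsl:template name="replace-all">
    <xsl:param name="text"/>
    <xsl:param name="from"/>
    <xsl:param name="to"/>
    <xsl:choose>
      <xsl:when test="contains($text, $from)">
        <xsl:value-of select="substring-before($text, $from)"/>
        <xsl:value-of select="$to"/>
        <xsl:call-template name="replace-all">
          <xsl:with-param name="text" select="substring-after($text, $from)"/>
          <xsl:with-param name="from" select="$from"/>
          <xsl:with-param name="to" select="$to"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Such a stylesheet would go in as one more <map:transform> step between the
existing transform and <map:serialize type="fo2pdf"/>. Every further character
reference needs its own replacement pass, so this only suits a small, known
set; fixing the double escaping at the source remains the cleaner route.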


Toby
