Posted to fop-users@xmlgraphics.apache.org by Theresa Jayne Forster <th...@inbrand.co.uk> on 2011/09/08 13:48:23 UTC
Problem with foreign characters,
I have a minor issue and would appreciate some help if possible.
Before I start, a few pointers:
1) I cannot change the Java code or the version of FOP (modified 0.23).
2) I have a partial resolution already in place.
3) I am just looking for a way to get the information I need.
I have code that scrapes a web page and extracts the text, turning it into a
downloadable PDF.
Some characters, like é, do not display correctly, so I am doing a replace in a
template.
I need to find out what the characters are coming in as so that I can convert
them in the replace.
For instance, the é character comes in as the character codes é.
How can I find the character codes coming in for all the other characters
(or convert them on the fly within XSL)?
My template currently is as follows:
<xsl:template name="loose_nasty_entities">
  <xsl:param name="thisstring" select="."/>
  <xsl:variable name="thisstring1">
    <xsl:call-template name="replace">
      <xsl:with-param name="str" select="$thisstring"/>
      <xsl:with-param name="search-for" select="'–'"/>
      <xsl:with-param name="replace-with" select="'-'"/>
    </xsl:call-template>
  </xsl:variable>
  <xsl:variable name="thisstring2">
    <xsl:call-template name="replace">
      <xsl:with-param name="str" select="$thisstring1"/>
      <xsl:with-param name="search-for" select="''"/>
      <xsl:with-param name="replace-with" select="''"/>
    </xsl:call-template>
  </xsl:variable>
  <xsl:variable name="thisstring3">
    <xsl:call-template name="replace">
      <xsl:with-param name="str" select="$thisstring2"/>
      <xsl:with-param name="search-for" select="'Â'"/>
      <xsl:with-param name="replace-with" select="''"/>
    </xsl:call-template>
  </xsl:variable>
  <xsl:variable name="thisstring4">
    <xsl:call-template name="replace">
      <xsl:with-param name="str" select="$thisstring3"/>
      <xsl:with-param name="search-for" select="'é'"/>
      <xsl:with-param name="replace-with" select="'é'"/>
    </xsl:call-template>
  </xsl:variable>
  <xsl:variable name="thisstring5">
    <xsl:call-template name="replace">
      <xsl:with-param name="str" select="$thisstring4"/>
      <xsl:with-param name="search-for" select="'Ö'"/>
      <xsl:with-param name="replace-with" select="'Ö'"/>
    </xsl:call-template>
  </xsl:variable>
  <xsl:value-of select="$thisstring5"/>
</xsl:template>
Kindest regards
Theresa Forster
Senior Software Developer
RE: Problem with foreign characters,
Posted by Theresa Jayne Forster <th...@inbrand.co.uk>.
Well, what happens is that my XSLT pulls in an HTML web page via TagSoup,
so I have no visibility of it until it reaches me in the XSL...
Kindest regards
Theresa Forster
Senior Software Developer
-----Original Message-----
From: Pascal Sancho [mailto:pascal.sancho@takoma.fr]
Sent: 08 September 2011 14:02
To: fop-users@xmlgraphics.apache.org
Subject: Re: Problem with foreign characters,
Re: Problem with foreign characters,
Posted by Pascal Sancho <pa...@takoma.fr>.
Hi Theresa,
é is the UTF-8 sequence (0xC3 0xA9) that encodes é (e acute), with each
byte shown as a separate single-byte character;
the UTF-8 BOM (the UTF-8 signature) is the sequence 0xEF 0xBB 0xBF, which
shows up as ï»¿ when it is misread in the same way.
You should have a look at how character encoding is handled in your app;
that seems to be where the issue is.
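One possible way to see which character codes actually reach the stylesheet, without touching the Java side, is a small diagnostic stylesheet that writes its output as US-ASCII. With the xml output method, an XSLT 1.0 processor should then emit every non-ASCII character as a numeric character reference, so the two characters of é would appear as something like &#195;&#169;. This is only a sketch: the dump and t element names are made up for illustration, and it assumes that the processor supports the US-ASCII output encoding and that the result is serialized to a file or stream rather than handed straight to FOP.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- US-ASCII cannot represent characters above 127 directly, so the
       serializer has to write them as numeric character references. -->
  <xsl:output method="xml" encoding="US-ASCII" indent="yes"/>

  <!-- Hypothetical wrapper elements (dump, t), used only to make the
       output easy to read. -->
  <xsl:template match="/">
    <dump>
      <!-- One t element per text node that is not whitespace-only. -->
      <xsl:for-each select="//text()[normalize-space(.) != '']">
        <t><xsl:value-of select="."/></t>
      </xsl:for-each>
    </dump>
  </xsl:template>

</xsl:stylesheet>
```

Each reference in the output gives the code of one incoming character, which can then be used as a search string in the replace.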
That said, I can imagine two ways to convert a string in XSLT:
either in pure XSLT, using a recursive template (see below),
or using an embedded script (see [1] for Xalan).
<xsl:template match="text()">
  <xsl:call-template name="text"/>
</xsl:template>
<xsl:template name="text">
  <xsl:param name="str" select="."/>
  <xsl:param name="find" select="' '"/>
  <xsl:param name="replace" select="' '"/>
  <xsl:choose>
    <xsl:when test="contains($str,$find)">
      <xsl:value-of select="substring-before($str,$find)"/>
      <xsl:value-of select="$replace"/>
      <!-- pass find and replace along; otherwise the recursive call
           falls back to the default values -->
      <xsl:call-template name="text">
        <xsl:with-param name="str" select="substring-after($str,$find)"/>
        <xsl:with-param name="find" select="$find"/>
        <xsl:with-param name="replace" select="$replace"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$str"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
[1] http://xml.apache.org/xalan-j/extensions.html
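As a rough sketch of how a recursive template like this could be called with explicit search and replacement strings, the stylesheet below chains two calls: the first turns é into é, the second strips a stray Â. The strings are written as numeric character references (&#xC3;&#xA9; is é, &#xE9; is é, &#xC2; is Â) so that the stylesheet does not depend on its own file encoding. The step1 variable name and the choice of pairs are illustrative only, and the block carries its own copy of the recursive template, with find and replace passed on in the recursive call so that non-default values survive the recursion.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Copy elements and attributes unchanged. -->
  <xsl:template match="@*|*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Run every text node through two chained replacements. -->
  <xsl:template match="text()">
    <!-- Step 1 (illustrative pair): the two characters C3 A9 become E9. -->
    <xsl:variable name="step1">
      <xsl:call-template name="text">
        <xsl:with-param name="str" select="."/>
        <xsl:with-param name="find" select="'&#xC3;&#xA9;'"/>
        <xsl:with-param name="replace" select="'&#xE9;'"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Step 2 (illustrative pair): remove a stray C2 character. -->
    <xsl:call-template name="text">
      <xsl:with-param name="str" select="$step1"/>
      <xsl:with-param name="find" select="'&#xC2;'"/>
      <xsl:with-param name="replace" select="''"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursive replace: find must be a non-empty string, because
       contains() is always true for an empty search string. -->
  <xsl:template name="text">
    <xsl:param name="str" select="."/>
    <xsl:param name="find" select="' '"/>
    <xsl:param name="replace" select="' '"/>
    <xsl:choose>
      <xsl:when test="contains($str,$find)">
        <xsl:value-of select="substring-before($str,$find)"/>
        <xsl:value-of select="$replace"/>
        <xsl:call-template name="text">
          <xsl:with-param name="str" select="substring-after($str,$find)"/>
          <xsl:with-param name="find" select="$find"/>
          <xsl:with-param name="replace" select="$replace"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$str"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Stripping Â on its own is enough for characters in the range U+00A0 to U+00BF (for example, Â£ becomes £), because their second UTF-8 byte equals the code point. It would also remove any genuine Â from the text, so it remains a workaround rather than a fix for the underlying encoding problem.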
--
Pascal
---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org