Posted to j-users@xalan.apache.org by Claus Kick <cl...@googlemail.com> on 2009/09/21 10:01:30 UTC
Catch a Character ...
Hello everyone,
I am trying to catch a special character with the following style sheet:
<xsl:param name="specChar" select="'\u201C'" />
<xsl:output indent="yes" method="xml"/>
<xsl:strip-space elements="*"/>
<xsl:template
match="CATALOOM-OPENENGINE/PRODUCTS/PRODUCT/PRODUCTREVISION">
<xsl:variable name="primKey2">
<xsl:value-of select="substring-before(@primarykey, '/')"/>
</xsl:variable>
<xsl:for-each select="FEATURE/VALUE">
<xsl:variable name="cdata">
<xsl:value-of select="FEATURE/VALUE/text()"/>
</xsl:variable>
<xsl:if test="contains($cdata, $specChar)">
<xsl:text>Found: </xsl:text>
<xsl:value-of select="$primKey2" />
<xsl:text> </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:template>
I am not getting any hits, though there should be a couple of thousand.
A few questions:
Is there anything overly wrong with this stylesheet?
The XML is like
<PRODUCTREVISION>
<FEATURE><VALUE>...</VALUE></FEATURE>
</PRODUCTREVISION>
How do I have to mask a unicode char inside a string inside a stylesheet?
Re: Catch a Character ...
Posted by Claus Kick <cl...@googlemail.com>.
2009/9/22 Michael Ludwig <ml...@as-guides.com>
> Claus Kick schrieb:
>
>> 2009/9/21 Michael Ludwig <ml...@as-guides.com>
>>
>>> Claus Kick schrieb:
>>>
>>> <xsl:param name="specChar" select="'\u201C'" />
>>>>
>>>> That's the Java syntax. Doesn't work in XML. Use a numerical
>>> character reference as per the XML spec.
>>>
>>> <xsl:param name="specChar" select="'&#x201C;'" /> in hex, or
>>> <xsl:param name="specChar" select="'&#8220;'" /> in decimal
>>>
>>
>> OK, I completely forgot about that. That actually was the issue ...
>>
>
> Good! (BTW, this list doesn't set the Reply-To header to the list,
> which I think it should really do.)
>
> Ok, thank you so much for your pointers, I have actually quite a few
>> transformations to work on, so this will indeed help me deepen my
>> knowledge!
>>
>
> Okay then, here are some more pointers :-) It helps to get familiarized
> with the weird XML and XSLT terminology. As for XML:
>
> * numerical character reference - as above
> * entity reference - &lt; (built-in), &myEnt; (user-defined) - same
> syntax, but not exactly the same thing
> * entities (XML/DTD)
> * general entity
> * external [general] parsed entity (EGPE)
> * external [general] unparsed entity
> * parameter entity
> * internal subset (DTD)
> * external subset (DTD)
>
> You can read up on those in the XML recommendation (specification). The
> terminology is a bit weird. The thing to keep in mind is that the stuff
> is easier than the terminology. As for XSLT:
>
> * attribute value template (AVT)
> * result tree fragment (RTF)
> * node set
> * literal result element
> * match pattern
> * node test
>
> See this page [1] on Dave Pawson's site, which is a great resource for
> XSLT. Also, see Jeni Tennison's site [2], which has very nice tutorials.
> Also, see a recent thread on XSL-List [3] for more pointers.
>
> Finally, Xalan is a 1.0 processor. XSLT 2.0 is much more powerful than
> 1.0. Personally, I find it quite okay to get started with 1.0, which is
> a much smaller language, and therefore easier to learn. But 1.0 has its
> limits, and when reaching those, it's good to know about (a) EXSLT [4],
> (b) extension functions (for example, JavaScript in Xalan), (c) the
> possibility to upgrade to 2.0 by switching to Saxon.
>
> [1] http://www.dpawson.co.uk/xsl/xslvocab.html
> [2] http://www.jenitennison.com/xslt/
> [3] http://markmail.org/thread/myu2h7quwbh4rjdi - How did you learn XSL?
> [4] http://exslt.org/
>
>
Hello Michael,
thanks for reminding me (yet again - sigh) to include the group.
Thanks for your help. Regarding Xalan or not: we use Xalan in a huge
number of different places (a data storage/exchange platform), and I
dread even thinking about switching.
At the moment, there is simply no way I could ensure that nothing breaks.
Re: Catch a Character ...
Posted by Michael Ludwig <ml...@as-guides.com>.
Claus Kick schrieb:
> 2009/9/21 Michael Ludwig <ml...@as-guides.com>
>> Claus Kick schrieb:
>>
>>> <xsl:param name="specChar" select="'\u201C'" />
>>>
>> That's the Java syntax. Doesn't work in XML. Use a numerical
>> character reference as per the XML spec.
>>
>> <xsl:param name="specChar" select="'&#x201C;'" /> in hex, or
>> <xsl:param name="specChar" select="'&#8220;'" /> in decimal
>
> OK, I completely forgot about that. That actually was the issue ...
Good! (BTW, this list doesn't set the Reply-To header to the list,
which I think it should really do.)
> Ok, thank you so much for your pointers, I have actually quite a few
> transformations to work on, so this will indeed help me deepen my
> knowledge!
Okay then, here are some more pointers :-) It helps to get familiarized
with the weird XML and XSLT terminology. As for XML:
* numerical character reference - as above
* entity reference - &lt; (built-in), &myEnt; (user-defined) - same
syntax, but not exactly the same thing
* entities (XML/DTD)
* general entity
* external [general] parsed entity (EGPE)
* external [general] unparsed entity
* parameter entity
* internal subset (DTD)
* external subset (DTD)
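Several of the XML terms listed above can be seen together in one small document. This is only an illustrative sketch: the element name note and the parameter entity name noteDecl are made up for the example (myEnt is the name used in the list), and external entities are left out so that the file stands on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!-- Everything between [ and ] is the internal subset of the DTD. -->

  <!-- Parameter entity: declared and referenced with a percent sign,
       and usable only inside the DTD. Here it expands to a complete
       element declaration. -->
  <!ENTITY % noteDecl "<!ELEMENT note (#PCDATA)>">
  %noteDecl;

  <!-- Internal general entity (user-defined), for use in content. -->
  <!ENTITY myEnt "replacement text">
]>
<note>Numerical character references: &#x201C; (hex) and &#8220; (decimal).
Built-in entity reference: &lt;
User-defined entity reference: &myEnt;</note>
```

After parsing, the text of note contains the left double quotation mark twice, a literal less-than sign, and the words "replacement text".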
You can read up on those in the XML recommendation (specification). The
terminology is a bit weird. The thing to keep in mind is that the stuff
is easier than the terminology. As for XSLT:
* attribute value template (AVT)
* result tree fragment (RTF)
* node set
* literal result element
* match pattern
* node test
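As a rough illustration of the XSLT terms above, the following sketch labels each one in a comment. The element and attribute names are borrowed from the stylesheet earlier in this thread; the output elements item and greeting and the variable names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- match pattern: "PRODUCTREVISION" decides which nodes this
       template applies to. -->
  <xsl:template match="PRODUCTREVISION">
    <!-- literal result element: item is copied to the output as is.
         attribute value template: the part in curly braces is
         evaluated as an XPath expression. -->
    <item key="{@primarykey}">
      <!-- node test: FEATURE and VALUE are name tests, one per step. -->
      <xsl:value-of select="FEATURE/VALUE"/>
    </item>
  </xsl:template>

  <!-- result tree fragment: a variable defined by its content
       instead of a select attribute. -->
  <xsl:variable name="rtf"><greeting>hello</greeting></xsl:variable>

  <!-- node set: a variable whose select expression returns nodes
       from the input document. -->
  <xsl:variable name="revisions" select="//PRODUCTREVISION"/>

</xsl:stylesheet>
```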
See this page [1] on Dave Pawson's site, which is a great resource for
XSLT. Also, see Jeni Tennison's site [2], which has very nice tutorials.
Also, see a recent thread on XSL-List [3] for more pointers.
Finally, Xalan is a 1.0 processor. XSLT 2.0 is much more powerful than
1.0. Personally, I find it quite okay to get started with 1.0, which is
a much smaller language, and therefore easier to learn. But 1.0 has its
limits, and when reaching those, it's good to know about (a) EXSLT [4],
(b) extension functions (for example, JavaScript in Xalan), (c) the
possibility to upgrade to 2.0 by switching to Saxon.
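To give an idea of option (a), here is a minimal sketch using the EXSLT node-set() function from the http://exslt.org/common namespace, which Xalan-J implements. XSLT 1.0 does not allow path steps on a result tree fragment, so the function is used to turn one into a navigable node set. The variable names and the c elements are hypothetical, and the VALUE elements are assumed to look like those in the original post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common">

  <xsl:output method="text"/>

  <!-- A result tree fragment listing the characters to look for. -->
  <xsl:variable name="chars">
    <c>&#x201C;</c>
    <c>&#x201D;</c>
  </xsl:variable>

  <xsl:template match="/">
    <!-- Keep a handle on the input document, because the context
         changes inside the loop below. -->
    <xsl:variable name="doc" select="."/>
    <!-- exsl:node-set() makes the fragment navigable with a path. -->
    <xsl:for-each select="exsl:node-set($chars)/c">
      <!-- For each character, count the VALUE elements containing it. -->
      <xsl:value-of select="count($doc//VALUE[contains(., current())])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The output is one count per listed character, each on its own line.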
[1] http://www.dpawson.co.uk/xsl/xslvocab.html
[2] http://www.jenitennison.com/xslt/
[3] http://markmail.org/thread/myu2h7quwbh4rjdi - How did you learn XSL?
[4] http://exslt.org/
Cheers,
--
Michael Ludwig
Re: Catch a Character ...
Posted by Michael Ludwig <ml...@as-guides.com>.
Claus Kick schrieb:
>
> I am trying to catch a special character with the following style sheet:
>
> <xsl:param name="specChar" select="'\u201C'" />
That's the Java syntax. Doesn't work in XML. Use a numerical character
reference as per the XML spec.
<xsl:param name="specChar" select="'&#x201C;'" /> in hex, or
<xsl:param name="specChar" select="'&#8220;'" /> in decimal
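A minimal, self-contained sketch of the character-reference approach follows. The root template and its "found" / "not found" output are hypothetical and serve only to show the parameter in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The reference below denotes U+201C LEFT DOUBLE QUOTATION MARK.
       The XML parser resolves it before the XPath expression is
       evaluated, so XPath sees a one-character string literal. -->
  <xsl:param name="specChar" select="'&#x201C;'"/>

  <xsl:output method="text"/>

  <!-- Hypothetical check: does the text of the whole input document
       contain the character? -->
  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="contains(., $specChar)">found</xsl:when>
      <xsl:otherwise>not found</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

By contrast, the string '\u201C' is passed to XPath unchanged as six ordinary characters, which is why the original test never matched.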
> <xsl:output indent="yes" method="xml"/>
> <xsl:strip-space elements="*"/>
>
> <xsl:template
> match="CATALOOM-OPENENGINE/PRODUCTS/PRODUCT/PRODUCTREVISION">
Not knowing your input, I can't be sure, but simply doing
match="PRODUCTREVISION" would probably be specific enough.
> <xsl:variable name="primKey2">
> <xsl:value-of select="substring-before(@primarykey, '/')"/>
> </xsl:variable>
That's a very bad way of getting the value. Instead, use:
<xsl:variable name="primKey2"
select="substring-before(@primarykey, '/')"/>
Your version creates a so-called "result tree fragment" (RTF), which is
inefficient and cumbersome.
> <xsl:for-each select="FEATURE/VALUE">
> <xsl:variable name="cdata">
> <xsl:value-of select="FEATURE/VALUE/text()"/>
> </xsl:variable>
Same story here. In addition, avoid using the text() node test to get
the string value:
<xsl:value-of select="FEATURE/VALUE"/>
But are you sure your input is FEATURE/VALUE/FEATURE/VALUE?
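If the input really is PRODUCTREVISION/FEATURE/VALUE, as the sample in the original post suggests, the loop could be sketched as follows. Inside xsl:for-each the context node is already a VALUE element, so "." gives its string value. This is only a sketch under that assumption: it switches to text output, writes a line feed after each hit, and adds an empty text() template to suppress the built-in text copying, none of which is in the original stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="specChar" select="'&#x201C;'"/>
  <xsl:output method="text"/>

  <xsl:template match="PRODUCTREVISION">
    <!-- A select attribute yields a plain string, not a result tree
         fragment. -->
    <xsl:variable name="primKey2"
        select="substring-before(@primarykey, '/')"/>
    <xsl:for-each select="FEATURE/VALUE">
      <!-- The context node is the current VALUE element, so no
           further path step is needed. -->
      <xsl:if test="contains(., $specChar)">
        <xsl:text>Found: </xsl:text>
        <xsl:value-of select="$primKey2"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

  <!-- Suppress the built-in rule that would copy all other text. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

With this shape, a PRODUCTREVISION is reported once for every VALUE that contains the character.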
> How do I have to mask a unicode char inside a string inside a
> stylesheet?
As shown above. Or simply as a literal, if you're using a Unicode
encoding and your input device and display support that character.
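A sketch of the literal variant, assuming the stylesheet file is really saved as UTF-8 so that the encoding declaration is true:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The character U+201C is typed directly into the string literal.
       If the file is saved in a different encoding than the one
       declared, the parser reads a different character or fails. -->
  <xsl:param name="specChar" select="'“'"/>

</xsl:stylesheet>
```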
You can learn XSLT by reading XSL-List at Mulberrytech or any good book
on XSLT, like, for starters, the Pocket Guide by Evan Lenz.
--
Michael Ludwig