Posted to users@cocoon.apache.org by ad...@free.fr on 2002/07/14 13:35:25 UTC
xmldb generator and utf-8 problems
Hello,
I am using Xindice as a data source in Cocoon. The problem is that the XML
documents in Xindice are encoded in UTF-8, and Cocoon seems to alter the
"special" (non-ASCII) characters before returning them.
So, while a simple Xindice query that returns a piece of a document, for
instance:
xindice xpath -c /db/carti/biblia -q /div1[@id=\'Fac\'] \
> tmp-facere-from-xindice.xml
returns a nice UTF-8-encoded XML file, which I can open in Emacs to confirm
that there is no problem, the equivalent Cocoon sitemap code, which goes like
this:
<map:match pattern="carti/biblia">
  <map:match type="request-parameter" pattern="carte">
    <map:generate
      src="xmldb:xindice://localhost:4080/db/carti/biblia/#/div1[@id='{1}']"/>
    <map:serialize type="xml" />
  </map:match>
</map:match>
returns an incorrectly encoded XML file.
==
I will illustrate this by dumping the hex codes for a string containing a short
Romanian phrase, the opening words of the book of Genesis:
"La început a făcut"
1. Using the xindice client, the correct codes are returned:
L  a  _  î     n  c  e  p  u  t  _  a  _
4C 61 20 C3 AE 6E 63 65 70 75 74 20 61 20
f  ă     c  u  t
66 C4 83 63 75 74
2. The result from Cocoon goes like this:
L  a  _  ?  ?  ?  ?  n  c  e  p  u  t  _  a  _
4C 61 20 C3 83 C2 AE 6E 63 65 70 75 74 20 61 20
f  ?  ?  &  #  1  3  1  ;  c  u  t
66 C3 84 26 23 31 33 31 3B 63 75 74
So it actually seems that Cocoon is interpreting my UTF-8 as if it were
some other encoding (look at the ƒ, i.e. &#131;, that it adds by itself).
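The byte pattern above can be checked against one possible explanation. What follows is only a minimal Python sketch, not a description of what Cocoon actually does internally. It assumes that the UTF-8 bytes are being decoded as ISO-8859-1 somewhere along the way, then re-encoded as UTF-8, with C1 control characters (U+0080 to U+009F) written out as numeric character references. The function names `hex_dump` and `serialize_like_observed` are made up for this illustration.

```python
# Sketch: reproduce the observed bytes under the ASSUMPTION that UTF-8 input
# is misread as ISO-8859-1 and then serialized again as UTF-8.

# "La început a făcut", written with escapes so the source file's own
# encoding cannot interfere.
original = "La \u00eenceput a f\u0103cut"


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def serialize_like_observed(text: str) -> bytes:
    """Encode text as UTF-8, writing C1 control characters as &#N; references.

    This escaping rule is a guess made to match the hex dump in this post.
    """
    parts = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            parts.append(f"&#{ord(ch)};")
        else:
            parts.append(ch)
    return "".join(parts).encode("utf-8")


# Step 1: the correct UTF-8 bytes, as returned by the xindice client.
utf8_bytes = original.encode("utf-8")

# Step 2 (assumed misstep): each byte is read as one ISO-8859-1 character.
misread = utf8_bytes.decode("iso-8859-1")

# Step 3: the misread text is serialized as UTF-8 again.
double_encoded = serialize_like_observed(misread)

print(hex_dump(utf8_bytes))
print(hex_dump(double_encoded))
```

Under these assumptions the first printed line matches dump 1 and the second matches dump 2, including the &#131; that stands for byte 83 of the original "ă". That is consistent with a wrong-charset decode, though it does not show where in the pipeline it happens.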
thank you in advance for your ideas,
adrian.