Posted to users@cocoon.apache.org by ad...@free.fr on 2002/07/14 13:35:25 UTC

xmldb generator and utf-8 problems

Hello,

I am using Xindice as a data source in Cocoon. The problem is that the XML 
documents in Xindice are encoded in UTF-8, and Cocoon seems to alter the 
"special" (non-ASCII) characters before returning them.

So, while a simple Xindice query that returns part of a document, for 
instance:

xindice xpath -c /db/carti/biblia -q /div1[@id=\'Fac\']  \
> tmp-facere-from-xindice.xml

returns a correctly UTF-8-encoded XML file, which I can open in Emacs and see 
that there is no problem, the equivalent Cocoon sitemap code, which goes like 
this:

   <map:match pattern="carti/biblia">
     <map:match type="request-parameter" pattern="carte">
       <map:generate
           src="xmldb:xindice://localhost:4080/db/carti/biblia/#/div1[@id='{1}']"/>
       <map:serialize type="xml"/>
     </map:match>
   </map:match>

returns an incorrectly encoded XML file.


==
I will illustrate this by dumping the hex codes for a string containing a 
short Romanian phrase, the opening words of the Book of Genesis:

"La început a făcut"

1. Using the Xindice client, the correct codes are returned:

L  a  _  î     n  c  e  p  u  t  _  a  _
4C 61 20 C3 AE 6E 63 65 70 75 74 20 61 20

f  ă     c  u  t
66 C4 83 63 75 74
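
As a cross-check that does not depend on Xindice, the expected UTF-8 bytes for 
the phrase can be computed directly. This is a minimal Python sketch; the 
names PHRASE and hex_dump are made up for illustration and come from neither 
Cocoon nor Xindice.

```python
# The phrase from the post, written with escapes so the script itself is
# pure ASCII: \u00ee is "î" and \u0103 is "ă".
PHRASE = "La \u00eenceput a f\u0103cut"


def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated, uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


# Encode the phrase as UTF-8 and print one hex pair per byte.
print(hex_dump(PHRASE.encode("utf-8")))
```

The printed values can be compared byte for byte with the dump from the 
Xindice client.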


2. The result from Cocoon looks like this:

L  a  _  ?  ?  ?  ?  n  c  e  p  u  t  _  a  _
4C 61 20 C3 83 C2 AE 6E 63 65 70 75 74 20 61 20

f  ?  ?  &  #  1  3  1  ;  c  u  t
66 C3 84 26 23 31 33 31 3B 63 75 74


So it seems that Cocoon is interpreting my UTF-8 as if it were some other 
encoding (look at the &#131; that it adds by itself).
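
One hypothesis that fits the dump: somewhere between Xindice and the 
serializer, the UTF-8 bytes are decoded as ISO-8859-1 (Latin-1) and then 
written out again as UTF-8, with C1 control characters (U+0080 to U+009F) 
emitted as numeric character references. The Python sketch below only 
reproduces that transformation. It is not Cocoon code, and the function name 
double_encode and the escaping rule are assumptions chosen to match the 
observed output.

```python
# The phrase from the post (\u00ee is "î", \u0103 is "ă").
PHRASE = "La \u00eenceput a f\u0103cut"

# Hex values taken from the Cocoon dump in the post.
OBSERVED = bytes.fromhex(
    "4C 61 20 C3 83 C2 AE 6E 63 65 70 75 74 20 61 20"
    "66 C3 84 26 23 31 33 31 3B 63 75 74"
)


def double_encode(text: str) -> bytes:
    """Simulate the suspected bug (an assumption, not Cocoon's real code)."""
    # Step 1: the correct UTF-8 bytes, as stored in the database.
    raw = text.encode("utf-8")
    # Step 2: those bytes are misread as Latin-1, one character per byte.
    misread = raw.decode("latin-1")
    # Step 3: the output stage writes UTF-8 but escapes C1 control
    # characters (U+0080..U+009F) as decimal character references.
    escaped = "".join(
        f"&#{ord(ch)};" if 0x80 <= ord(ch) <= 0x9F else ch
        for ch in misread
    )
    return escaped.encode("utf-8")


# Report whether the simulated bytes equal the bytes in the dump.
print(double_encode(PHRASE) == OBSERVED)
```

If the simulated bytes match the dump, the stored data is probably fine and 
the likely culprit is a wrong charset assumption where the query result is 
read.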


Thank you in advance for your ideas,
adrian.
