Posted to users@cocoon.apache.org by Sanne de Roever <sa...@newfoundland.nl> on 2002/11/25 18:47:26 UTC

Don't decode (or transform) my entities!

Hi,

I've got an XML file with entities like &eacute; (declared in a DTD) and others.
If I output this file after an XSL identity transformation using the XML serializer, Cocoon gives me either a decoded character such as î
or a numeric translation of the character such as &#4567;

But I don't want that. I want it the way it came in: &eacute;!

Does anybody know how to accomplish this?

Sanne


++++++++++++ http://www.newfoundland.nl ++++++++++++
 
Sanne de Roever 
Newfoundland Interactive Technology
Jacob van Lennepkade 187
1054 ZN Amsterdam
Telefoon: +31 (0)20 4 700 623 / +31 (0)6 24 510 562
Fax: +31 (0)20 4 700 624
Email: sanne@newfoundland.nl
 
+++++++++++++++++++++++++++++++++++++++++++++++

Re: Don't decode (or transform) ... -> thanks!

Posted by Sanne de Roever <sa...@newfoundland.nl>.
Hi Joerg,

Thanks for your reply. It gave me a clearer understanding of special character
encoding.

My problem was that I have to deliver a large set of automatically generated XML
files to a client with very shaky (knowledge of) DTDs, basically just like me...

So I thought that an encoding like &eacute; would be the right way to go.
I already understood that the numeric reference should be OK from a DTD point
of view, but I still expect problems with my client.

I didn't know that a literal é was valid in UTF-8, but that explains a lot.
I've changed the encoding to ASCII, so now Cocoon always writes special
characters as numeric references. That should be a defensible position.
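
As an illustration of that effect, here is a minimal sketch using Python's
standard xml.etree.ElementTree rather than Cocoon's own serializer (an
assumption made only to keep the example self-contained; the sample string and
the serialize helper are hypothetical). A serializer can write é as raw bytes
only if the output encoding has a code for it, so with US-ASCII it has to fall
back to a numeric reference:

```python
import xml.etree.ElementTree as ET


def serialize(xml_source: str, encoding: str) -> bytes:
    """Parse a document and write it back out in the given encoding."""
    return ET.tostring(ET.fromstring(xml_source), encoding=encoding)


# Hypothetical input containing a literal e-acute (U+00E9).
SAMPLE = "<p>caf\u00e9</p>"

# US-ASCII cannot represent U+00E9, so it is written as &#233;.
ascii_bytes = serialize(SAMPLE, "us-ascii")

# UTF-8 can represent U+00E9, so it is written as the bytes C3 A9.
utf8_bytes = serialize(SAMPLE, "utf-8")

print(ascii_bytes)
print(utf8_bytes)
```

In neither case is a named entity such as &eacute; written; the choice is only
between the raw character and its numeric reference.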

Kind regards,

Sanne

----- Original Message -----
From: "Joerg Heinicke" <jo...@gmx.de>
To: <co...@xml.apache.org>
Sent: Monday, November 25, 2002 11:49 PM
Subject: Re: Don't decode (or transform) my entities!


> Hello Sanne,
>
> You have no influence on how the output is serialized, but that's not a
> problem in general. &eacute; is only another representation of the
> corresponding numeric character reference, so it should be serialized as
> &#xe9; or &#233;. The advantage of a numeric reference is its independence
> from any DTD, so I think the numeric reference is preferable (somebody can
> correct me here). If the encoding supports the character directly (as UTF-8
> does), it's also allowed to write é directly, without using any reference.
>
> What exactly is the problem you have with it? For general questions you can
> also read more at http://www.unicode.org/.
>
> Joerg
>
> Sanne de Roever wrote:
> > Hi,
> >
> > I've got an XML file with entities like &eacute; (declared in a DTD) and
> > others.
> > If I output this file after an XSL identity transformation using the XML
> > serializer, Cocoon gives me either a decoded character such as î
> > or a numeric translation of the character such as &#4567;
> >
> > But I don't want that. I want it the way it came in: &eacute;!
> >
> > Does anybody know how to accomplish this?
> >
> > Sanne
>
>
> ---------------------------------------------------------------------
> Please check that your question  has not already been answered in the
> FAQ before posting.     <http://xml.apache.org/cocoon/faq/index.html>
>
> To unsubscribe, e-mail:     <co...@xml.apache.org>
> For additional commands, e-mail:   <co...@xml.apache.org>
>




Re: Don't decode (or transform) my entities!

Posted by Joerg Heinicke <jo...@gmx.de>.
Hello Sanne,

You have no influence on how the output is serialized, but that's not a
problem in general. &eacute; is only another representation of the
corresponding numeric character reference, so it should be serialized as
&#xe9; or &#233;. The advantage of a numeric reference is its independence
from any DTD, so I think the numeric reference is preferable (somebody can
correct me here). If the encoding supports the character directly (as UTF-8
does), it's also allowed to write é directly, without using any reference.
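
To make that equivalence concrete, here is a small sketch in Python using the
standard xml.etree.ElementTree parser, not Cocoon (the sample document and the
parsed_text helper are made up for illustration, and the internal DTD subset
stands in for whatever DTD declares &eacute; in the real files). It parses
four spellings of the same character and shows what the parser hands on:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the internal DTD subset declares &eacute;,
# standing in for the declaration a shared DTD would provide.
DOC = (
    '<!DOCTYPE p [<!ENTITY eacute "&#233;">]>'
    "<p>&eacute; &#xe9; &#233; \u00e9</p>"
)


def parsed_text(xml_source: str) -> str:
    """Parse the document and return the text content the parser reports."""
    return ET.fromstring(xml_source).text


text = parsed_text(DOC)

# Named entity, hex reference, decimal reference and literal character
# all arrive as the same code point.
print([f"U+{ord(ch):04X}" for ch in text.split()])
```

Because the parser reports only the character itself, a serializer further
down the pipeline has no record of which spelling the source document used.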

What exactly is the problem you have with it? For general questions you can
also read more at http://www.unicode.org/.

Joerg

Sanne de Roever wrote:
> Hi,
>  
> I've got an XML file with entities like &eacute; (declared in a DTD) and
> others.
> If I output this file after an XSL identity transformation using the XML
> serializer, Cocoon gives me either a decoded character such as î
> or a numeric translation of the character such as &#4567;
>
> But I don't want that. I want it the way it came in: &eacute;!
>
> Does anybody know how to accomplish this?
>  
> Sanne

