Posted to users@cocoon.apache.org by Sergey Bredikhin <Se...@oemk.ru> on 2002/08/28 06:35:40 UTC

&#xxxx in resulting HTML source

Hello

I am using the windows-1251 charset for my XML and the resulting HTML documents.
When I get an XML or HTML document from Cocoon 2.1, it displays correctly in the
browser window. But when I open the HTML source of the document, I see &#xxxx
symbols instead of characters in the windows-1251 codepage. This is inconvenient
and increases the traffic.

Could you please help me?
You have probably had this problem with another charset, and the solution should be the same.

Sergey.


RE: &#xxxx in resulting HTML source

Posted by Koen Pellegrims <ko...@pandora.be>.
The &#xxxx; numeric notation is the standard way for XML to handle less
common characters (meaning: non-English).
I'm assuming you are including some Russian characters in your document, but
not everyone has these in their codepage, which is why they are encoded.
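
To make the notation concrete, here is a minimal sketch in Python (not Cocoon's own serializer) of how the same text comes out when the target encoding cannot represent Cyrillic versus when it can. The sample word and the `serialize` helper are hypothetical, chosen only for illustration.

```python
# Hypothetical sample text; it does not come from the original message.
text = "Привет"

def serialize(s: str, encoding: str) -> bytes:
    """Encode s, writing any character the target encoding lacks as &#NNNN;."""
    return s.encode(encoding, errors="xmlcharrefreplace")

# ASCII has no Cyrillic letters, so every character becomes a numeric reference.
as_ascii = serialize(text, "ascii")

# windows-1251 contains these letters, so each one is written as a single byte.
as_cp1251 = serialize(text, "windows-1251")

print(as_ascii)
print(len(as_ascii), "bytes with references,", len(as_cp1251), "bytes in windows-1251")
```

In this sketch the references appear only when the chosen output encoding has no slot for the character. A browser shows both forms identically, which matches the observation that the page looks normal while its source does not.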

AFAIK there is no way to disable this encoding on XML parsers (which is a
good thing, because it would make your documents less portable).

While I can agree that it may not be very convenient to debug your source, I
wouldn't worry that much about the rise in traffic. If bandwidth is an
issue, you might want to take a look at Apache's mod_gzip.
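
As a rough illustration of the bandwidth point, the sketch below uses Python's `gzip` module (standing in for mod_gzip, which works at the web-server level) to compare payload sizes before and after compression. The page text is a hypothetical, highly repetitive sample, so the compression ratio is optimistic compared with a real page.

```python
import gzip

# Hypothetical page body: one Russian sentence repeated to mimic a longer page.
text = "Пример русского текста для страницы. " * 200

# The same content with numeric character references and as raw windows-1251.
escaped = text.encode("ascii", errors="xmlcharrefreplace")
raw = text.encode("windows-1251")

packed_escaped = gzip.compress(escaped)
packed_raw = gzip.compress(raw)

for label, before, after in (
    ("numeric references", escaped, packed_escaped),
    ("windows-1251", raw, packed_raw),
):
    print(f"{label}: {len(before)} bytes, {len(after)} bytes gzipped")
```

The escaped form is several times larger before compression, but the references are so regular that most of the difference disappears once the response is gzipped.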

Koen