Posted to java-dev@axis.apache.org by bu...@apache.org on 2003/04/25 19:29:54 UTC
DO NOT REPLY [Bug 19327] New: - Character entities are escaped too aggressively
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19327>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19327
Summary: Character entities are escaped too aggressively
Product: Axis
Version: 1.0
Platform: All
OS/Version: All
Status: NEW
Severity: Normal
Priority: Other
Component: Serialization/Deserialization
AssignedTo: axis-dev@ws.apache.org
ReportedBy: sander@x-hive.com
We are using SOAP to send XML documents from client to server and back. The
documents contain a lot of non-ASCII data, which we encode as UTF-8.
However, when the documents are retrieved from an Axis server, Axis escapes
almost all of our non-ASCII characters into numeric character references (of
the form &#...;). This means messages for 'international' documents become
about three times as big as they need to be, which for us is a large
performance problem. I narrowed the problem down to
XMLUtils::xmlEncodeString,
which has the code:
    if (((int)chars[i]) > 127) {
        strBuf.append("&#");
        strBuf.append((int)chars[i]);
        strBuf.append(";");
    }
This seems unnecessary to me, as Axis sends all messages in UTF-8 anyway,
for which no such escaping is necessary (and should the encoding become
configurable, I feel the escaping should be done elsewhere).
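To make the size difference concrete, here is a minimal standalone sketch. It is not the Axis source: the class EscapeSizeDemo and the helpers escapeNonAscii and utf8Length are hypothetical names, and the loop only imitates the condition quoted above. It compares the UTF-8 byte count of a short non-ASCII string before and after that kind of escaping.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only imitates the "> 127" branch quoted above.
class EscapeSizeDemo {

    // Replace every char above 127 with a decimal numeric character reference.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c > 127) {
                sb.append("&#").append((int) c).append(';');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Number of bytes the string occupies when serialized as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "\u65e5\u672c\u8a9e"; // three CJK characters
        System.out.println("raw UTF-8 bytes:     " + utf8Length(text));
        System.out.println("escaped UTF-8 bytes: " + utf8Length(escapeNonAscii(text)));
    }
}
```

For this three-character sample the raw form takes 9 bytes and the escaped form 24, which is in line with the roughly threefold growth described above. The exact ratio depends on the characters involved.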
Is there any reason for this code? I commented it out and it seemed to have no
adverse effect on our application (apart from reduced network traffic).
Tested with 1.0; I also looked this up in the sources of 1.1-rc2.
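For comparison, the following is a rough sketch of escaping that touches only the characters significant to XML markup and passes everything else through unchanged. It assumes the output is written as UTF-8. The class MinimalXmlEscape and its escape method are hypothetical names for illustration and are not part of Axis or a proposed patch.

```java
// Hypothetical helper: escapes only markup-significant characters and
// assumes the surrounding writer emits UTF-8, so non-ASCII passes through.
class MinimalXmlEscape {

    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                default:   sb.append(c);        // includes all non-ASCII chars
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("caf\u00e9 <b> & \"tea\""));
    }
}
```

If the output encoding were ever made configurable, characters that the chosen encoding cannot represent would still need numeric character references, which is a decision better made where the encoding is known than in a general string-escaping routine.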