Posted to java-dev@axis.apache.org by bu...@apache.org on 2003/04/25 19:29:54 UTC

DO NOT REPLY [Bug 19327] New: - Character entities are escaped too aggressively

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19327>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19327

Character entities are escaped too aggressively

           Summary: Character entities are escaped too aggressively
           Product: Axis
           Version: 1.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Serialization/Deserialization
        AssignedTo: axis-dev@ws.apache.org
        ReportedBy: sander@x-hive.com


We are using SOAP to send XML documents from client to server and back. The 
documents contain a lot of non-ASCII data, which we encode as UTF-8. 
However, when a document is retrieved from an Axis server, Axis escapes almost 
all of our characters into character entities (of the form &#...;). This makes 
messages for 'international' documents about three times as big as they have 
to be, which for us is a large performance problem. I narrowed the problem 
down to
  XMLUtils::xmlEncodeString
which contains the code:
                if (((int)chars[i]) > 127) {
                        strBuf.append("&#");
                        strBuf.append((int)chars[i]);
                        strBuf.append(";");
This seems unnecessary to me, as Axis sends all messages in UTF-8 anyway, 
for which no such escaping is necessary (and if the encoding were to become 
configurable, I feel the escaping should be done elsewhere).
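
To illustrate the size difference, here is a minimal sketch and not the actual 
Axis code. The class and method names (EscapeSketch, escapeAggressive, 
escapeMinimal, utf8Length) are hypothetical, and the handling of &, <, > and " 
is an assumption about what the rest of the encoder does. The only part taken 
from the Axis source is the "greater than 127" branch quoted above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and structure are assumptions, not Axis code.
final class EscapeSketch {

    // Escapes markup characters and, like the quoted branch, turns every
    // char above 127 into a numeric character reference.
    static String escapeAggressive(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (!appendMarkupEscape(sb, c)) {
                if (c > 127) {
                    sb.append("&#").append((int) c).append(';');
                } else {
                    sb.append(c);
                }
            }
        }
        return sb.toString();
    }

    // Escapes only markup characters and leaves non-ASCII chars untouched,
    // relying on the UTF-8 encoding of the message to carry them.
    static String escapeMinimal(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (!appendMarkupEscape(sb, c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Appends the predefined XML entity for c, if it has one.
    private static boolean appendMarkupEscape(StringBuilder sb, char c) {
        switch (c) {
            case '&': sb.append("&amp;"); return true;
            case '<': sb.append("&lt;"); return true;
            case '>': sb.append("&gt;"); return true;
            case '"': sb.append("&quot;"); return true;
            default: return false;
        }
    }

    // Number of bytes the string occupies on the wire when sent as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "caf\u00e9 <\u4e2d>";
        System.out.println(utf8Length(escapeAggressive(text)));
        System.out.println(utf8Length(escapeMinimal(text)));
    }
}
```

In this sketch a two-byte UTF-8 character such as U+00E9 becomes the six-byte 
reference &#233;, and a three-byte character such as U+4E2D becomes the 
eight-byte reference &#20013;, which is consistent with the roughly threefold 
growth we see.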

Is there any reason for this code? I commented it out and it seemed to have no 
adverse effect on our application (apart from reduced network traffic).

Tested with 1.0; I also checked the sources of 1.1-rc2.