Posted to xmlrpc-dev@ws.apache.org by bu...@apache.org on 2003/07/11 17:36:10 UTC

DO NOT REPLY [Bug 21515] New: - International Characters cannot be used as Method Parameters

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=21515>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

           Summary: International Characters cannot be used as Method
                    Parameters
           Product: XML-RPC
           Version: 1.2
          Platform: All
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Blocker
          Priority: Other
         Component: Source
        AssignedTo: rpc-dev@xml.apache.org
        ReportedBy: vmichalitsis@velti.net


The switch statement in method "protected void chardata(String text)throws 
XmlRpcException, IOException" (line 310) in class org.apache.xmlrpc.XmlWriter 
does not allow character codes outside the 0x20 - 0xFF range prohibiting any 
serious use of the library other than for demonstrations. 

I've fixed that by allowing all valid XML characters, but I would prefer you to 
provide a perhaps more conservative-safe solution plus integrate it in a next 
version of the library.
 
TIA.

Re: DO NOT REPLY [Bug 21515] New: - International Characters cannot be used as Method Parameters

Posted by John Wilson <tu...@wilson.co.uk>.
bugzilla@apache.org wrote:
[snip]

> The switch statement in method "protected void chardata(String
> text)throws XmlRpcException, IOException" (line 310) in class
> org.apache.xmlrpc.XmlWriter does not allow character codes outside
> the 0x20 - 0xFF range prohibiting any serious use of the library
> other than for demonstrations.
>
> I've fixed that by allowing all valid XML characters, but I would
> prefer you to provide a perhaps more conservative-safe solution plus
> integrate it in a next version of the library.


Until very recently it was believed that the XML-RPC specification required
that strings contain only US-ASCII characters. The Apache implementation is
more liberal than this and allows all ISO 8859-1 characters which are also
valid XML characters (i.e. of the control characters, only \r, \n and \t are
allowed). It also sets the encoding attribute of the document to 8859-1.

If you have modified the code to allow Unicode characters with values > 255
to be written you must ensure that these are written as numeric character
references (e.g. &#350;).
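
As a rough illustration of the point above, here is a minimal sketch of how
character data could be escaped so that anything above 255 is emitted as a
numeric character reference. It is not the actual org.apache.xmlrpc.XmlWriter
code: the class name NcrCharData and its methods are hypothetical, and the
validity check follows the Char production of XML 1.0.

```java
/**
 * Hypothetical helper (not part of Apache XML-RPC): escapes character data
 * for a document declared as ISO 8859-1.
 */
final class NcrCharData {

    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Returns text with markup characters escaped and every code point
     * above 0xFF replaced by a decimal numeric character reference.
     */
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            // Work on code points so surrogate pairs become one reference.
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                default:
                    if (!isXmlChar(cp)) {
                        throw new IllegalArgumentException(
                                "Invalid XML character 0x" + Integer.toHexString(cp));
                    }
                    if (cp > 0xFF) {
                        // Not representable in ISO 8859-1: emit a reference.
                        sb.append("&#").append(cp).append(';');
                    } else {
                        sb.append((char) cp);
                    }
            }
        }
        return sb.toString();
    }
}
```

With this sketch, NcrCharData.escape("A\u015E<") yields A&#350;&lt; while
ISO 8859-1 characters such as U+00E9 pass through unchanged. An unpaired
surrogate or a control character other than \t, \n and \r is rejected.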

The "bug" you report is expected behaviour of the system as it stands and is
not a bug.

In the light of the recent "clarification" of the spec I think it may well
be appropriate to modify the behaviour of the software. There seem to me to
be two options:

1/ Keep the 8859-1 encoding and emit numeric character references for
characters > 255
2/ Remove the 8859-1 encoding attribute and emit UTF-8 (or, as a user
option, UTF-16).

Compliant XML parsers will handle either of the above.
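
To make the two options concrete, the following sketch builds one small
document each way and feeds both to the JAXP parser that ships with the JDK.
The class EncodingOptionsDemo, its method names and the bare <string>
element are assumptions made for illustration only; this is not how the
library itself serialises requests.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

/** Hypothetical demo class, not part of Apache XML-RPC. */
final class EncodingOptionsDemo {

    /** Option 1: declare ISO 8859-1 and use a reference for U+015E. */
    static byte[] option1() {
        String doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<string>\u00E9&#350;</string>";
        return doc.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Option 2: no encoding declaration, raw characters written as UTF-8. */
    static byte[] option2() {
        String doc = "<?xml version=\"1.0\"?>"
                + "<string>\u00E9\u015E</string>";
        return doc.getBytes(StandardCharsets.UTF_8);
    }

    /** Parses the bytes and returns the text of the root element. */
    static String parseText(byte[] xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }
}
```

Both parseText(option1()) and parseText(option2()) return the same
two-character string (U+00E9 followed by U+015E), even though the bytes on
the wire differ.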

John Wilson
The Wilson Partnership
http://www.wilson.co.uk