Posted to users@ws.apache.org by me...@my.com on 2006/05/03 11:39:46 UTC

Date and unicode handling in 3.0b1

My work on the current project has made me really appreciate the simplicity and
power of this protocol, and for this I'm really grateful to Jochen and all
of you who took the time to develop this library. Thank you, and keep
up the good work!

I've been developing a Java client application which has to make RPC calls to
a Delphi-based server. This is a multi-language project, so all
communication has to be in UTF-8. I do understand that my requirements are
far from common, but because of them I came across some
issues (most of them traced to ws.commons.util) which could be resolved in
the next version.

Environment:
Client: Windows XP
	Java 1.5
	xmlrpc-3.0b1-SNAPSHOT

Server: Delphi 7.0 on Windows XP or Kylix 3.0 on Linux
	xml-rpc library from http://sourceforge.net/projects/delphixml-rpc/
	

1. org.apache.xmlrpc.parser.DateParser throws SAXParseException("Failed to
parse integer value: "), which is a bit misleading since it is a date
value that it tries to parse. The message should probably be changed to
"Failed to parse date value".

2. org.apache.ws.commons.util.XsDateTimeFormat simply does not work with
dates in the 19980717T14:08:55 format described in the XML-RPC specification.
I had to change the source to make it understand this compact ISO 8601 form.
Would it be possible to add a configuration parameter to ws.commons.util, so
that it can handle the expected date format?
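
For illustration, here is a minimal sketch of parsing and formatting the compact form quoted above with plain java.text.SimpleDateFormat. It is not the change made to XsDateTimeFormat and not part of ws.commons.util; the class name XmlRpcDateSketch and its methods are hypothetical. Since the string carries no time zone, the sketch assumes the caller supplies one.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, for illustration only.
final class XmlRpcDateSketch {
    // Matches the example from the post: 19980717T14:08:55
    private static final String PATTERN = "yyyyMMdd'T'HH:mm:ss";

    // SimpleDateFormat is not thread-safe, so build a fresh one per call.
    private static SimpleDateFormat newFormat(TimeZone tz) {
        SimpleDateFormat f = new SimpleDateFormat(PATTERN);
        f.setLenient(false); // reject out-of-range fields such as month 13
        f.setTimeZone(tz);   // assumption: the caller knows the server's zone
        return f;
    }

    static Date parse(String value, TimeZone tz) {
        try {
            return newFormat(tz).parse(value);
        } catch (ParseException e) {
            // The message names the type that actually failed (see issue 1).
            throw new IllegalArgumentException(
                "Failed to parse date value: " + value, e);
        }
    }

    static String format(Date date, TimeZone tz) {
        return newFormat(tz).format(date);
    }

    public static void main(String[] args) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        Date d = parse("19980717T14:08:55", utc);
        System.out.println(format(d, utc)); // round-trips the input string
    }
}
```

A round trip like the one in main is also the kind of check a date test (see issue 3) could start from.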

3. org.apache.xmlrpc.test.BaseTest does not have any tests that demonstrate
the handling of date objects, which is why issues 1 and 2 weren't caught at
the testing stage. I think it would be useful to include date tests in future
versions.

4. org.apache.ws.commons.serialize.XMLWriterImpl encodes non-ASCII characters
as &#code;&#code;... which causes trouble on the server side. I do not think
that there should be any special handling of Unicode characters at all,
since
	- xmlrpc uses UTF-8 encoding of the request by default
	- encoding bloats the size of XML requests (7 encoded characters
for each non-encoded one)
	- encoding takes up additional resources
	- encoding causes incompatibilities on the server side (i.e.
delphixmlrpc)
One way to get rid of the encoding would be to modify the canEncode function
in XMLWriterImpl to always return true.
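
To make the size argument concrete, here is a small sketch that compares a string written as raw UTF-8 with the same string written using decimal character references. It does not reproduce XMLWriterImpl; the class EscapeSizeSketch and its methods are hypothetical, and XML-special characters such as & and < are deliberately ignored.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class EscapeSizeSketch {
    // Replaces every code point above 0x7F with a decimal character
    // reference such as &#1078; and leaves ASCII untouched.
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp > 0x7F) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Number of bytes the string occupies when sent as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String zhe = "\u0436"; // Cyrillic small letter zhe
        String escaped = escapeNonAscii(zhe);
        System.out.println(escaped);             // &#1078;
        System.out.println(utf8Length(zhe));     // 2 bytes as raw UTF-8
        System.out.println(utf8Length(escaped)); // 7 bytes when escaped
    }
}
```

For Cyrillic text this is 7 bytes on the wire instead of 2 for each letter; the ratio differs for other scripts.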

All this leads me to question the benefits of building future versions of
xmlrpc on top of ws.commons.util. Is the functionality delegated there really
large enough to justify the additional resources needed to test and
integrate this library?

I would appreciate your comments.


Re: Date and unicode handling in 3.0b1

Posted by Jochen Wiedmann <jo...@gmail.com>.
On 5/3/06, me@my.com <me...@my.com> wrote:

Hi,

first of all, a hint: please do not crosspost to
multiple lists. Use the list which you find most appropriate. In this
case I have chosen xmlrpc-dev for the reply.


> 1. org.apache.xmlrpc.parser.DateParser throws SAXParseException("Failed to
> parse integer value: " ), which is a little bit misleading since it is DATE
> value it tries to parse. This probably should be changed to "Failed to parse
> date value"

Patches welcome. :-)


> 2. org.apache.ws.commons.util.XsDateTimeFormat simply does not work with
> dates in 19980717T14:08:55 format, as described in XMLRPC specification. I
> had to change the source to make it understand iso8601. Would it be possible
> to add some configuration parameter to ws.commons.util, so it could handle
> the expected date format ?

Again, a patch would be welcome, at least a patch to the test suite of
ws-commons-util, which demonstrates the problem.


> 3. org.apache.xmlrpc.test.BaseTest does not have any tests to demonstrate
> handling of date objects, that's why issues 1 and 2 weren't caught at the
> testing stage. I think it would be useful to include date tests in future
> versions.

Is that so? That is, of course, not good. How about yet another patch? :-)))


> 4. org.apache.ws.commons.serialize.XMLWriterImpl encodes utf-8 characters as
> &#code;&#code;.... which causes trouble on the server side. I do not think
> that there should be any special handling of unicode characters at all,
> since
>         - xmlrpc uses utf-8 encoding of the request by default
>         - encoding bloats up the size of xml requests ( 7 encoded characters
> for each non-encoded one )
>         - encoding takes up additional resources
>         - encoding causes incompatibilities on the server side ( i.e.
> delphixmlrpc )
> The way to get rid of the encoding would be modify canEncode function in
> XMLWriterImpl to always return true.
>
> All this leads me to question the benefits of building future versions of
> xmlrpc based on ws.commons.util . Is the functionality delegated there really
> so big, as to justify the need for additional resources to test and
> integrate this library ?

First of all, your requirements seem to be quite unusual: I have met
XML parsers which are unable to deal with Unicode characters, but I
have never met an XML parser which is unable to deal with character
escape sequences.

The particular XMLWriter implementation was chosen because it is (in the
light of the above paragraph) the best choice in terms of portability.
I agree with you that this is possibly not optimal in terms of
performance. However, there are multiple XMLWriter implementations (in
particular the RawXMLWriter, which should do exactly what you want),
and the XML-RPC framework should allow you to use a different
implementation by supplying another XMLWriterFactory. Did you consider
using that?


Jochen

--
Whenever you find yourself on the side of the
majority, it is time to pause and reflect.
(Mark Twain)