Posted to general@jakarta.apache.org by Rick Beaubien <rb...@library.berkeley.edu> on 2000/10/26 18:05:11 UTC

Unicode and Classpath problems in newest release of Tomcat 3.1

We recently tried upgrading our version of Tomcat 3.1. We had actually been
running 3.1 for a while, but discovered that changes had been made to it
since we first installed it (the modified date for the class files in the
servlet.jar file for the earlier issue of 3.1 is 3/08/00; the modified date
for the more recent issue is 4/18/00).  Unfortunately, the newer issue has
introduced two big problems for us, and we have had to roll back to the
earlier issue of 3.1.

The more serious of the two problems involves the handling of multibyte
UTF-8 encoded Unicode values. My servlet produces UTF-8 encoded HTML pages
that include Unicode characters in the CJK range.  Under the
earlier issue of Tomcat 3.1, these UTF-8 encoded values were sent to
the browser properly; IE5 with the proper fonts installed was able to
display them just fine.  Under the newest issue of Tomcat 3.1, however, the
CJK characters are replaced with "?"s before they are sent to the
browser!

To see how the older version of Tomcat 3.1 treats the UTF-8 encodings of
characters in the CJK range, specify the following location in IE5: 

http://sunsite.berkeley.edu/xdlib/servlet/archobj?DOCCHOICE=misc/sprintatest2.xml

If you have the proper fonts installed, Chinese characters will appear in
the two left-hand frames.  If you view the source, you can see that these
are being transmitted as the 3-byte UTF-8 encodings of the corresponding
Unicode values.  When we switch to the newer issue of Tomcat 3.1, the 4
Chinese characters get translated to question marks ("????") before they
are sent to the browser.  This is definitely happening somewhere in
Tomcat, and only under the newest issue of 3.1.
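
For anyone who wants to compare the two byte streams, here is a minimal
Java sketch of what a correct 3-byte UTF-8 encoding looks like next to the
"?" substitution. It is an illustration only, not a diagnosis of what Tomcat
does internally: the sample characters (U+4E2D, U+6587) and the choice of
ISO-8859-1 as the lossy charset are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: shows what the two byte streams look like.
// It does not reproduce or diagnose Tomcat's internal behavior.
class CjkEncodingDemo {

    // Encode text as UTF-8; CJK characters in the BMP take 3 bytes each.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Encode text as ISO-8859-1; String.getBytes substitutes '?' (0x3F)
    // for every character the charset cannot represent.
    static byte[] toLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Render bytes as space-separated uppercase hex for inspection.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two arbitrary CJK characters, chosen as an assumed sample.
        String sample = "\u4E2D\u6587";
        System.out.println(hex(toUtf8(sample)));   // E4 B8 AD E6 96 87
        System.out.println(hex(toLatin1(sample))); // 3F 3F
    }
}
```

Seeing 3F bytes in the response where three-byte sequences are expected
would suggest that the text passed through a charset that cannot represent
CJK characters somewhere before reaching the browser.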

The second problem introduced by the newer issue of 3.1 is easier for us to
work around. Under the earlier issue of Tomcat 3.1, I had been able to run
parallel versions of a servlet off the same running copy of the servlet
engine: a production version and a development version. The two servlet
versions use classes with identical names that reside, of course, in
different CLASSPATH locations. (The production version is accessed via
http://sunsite.berkeley.edu/xdlib/servlet/... and the development version
via http://sunsite.berkeley.edu/xdlibdev/servlet/...) The previous release
of Tomcat 3.1 had no problem keeping these two versions of my servlet
sorted out; it activated the proper classes from the proper classpath for
the version of the servlet which the URL indicated. But under the newer
issue of Tomcat 3.1, this has changed. If a user invokes the xdlibdev
servlet (my development servlet), Tomcat will now use classes from the
xdlib/servlet classpath (the production servlet classpath) if these are
already loaded! In other words, when loading classes it now seems to pay
attention only to the package and class names, not to the classpath that is
associated with the servlet.
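
To make the symptom concrete, here is a toy Java model of the difference
between a class cache keyed by class name alone and one keyed by context
plus class name. This is a sketch of the observed behavior only; it is not
Tomcat's loader code, and the class name org.example.Viewer and both cache
types are hypothetical names invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the symptom; NOT Tomcat's actual class loading code.
class ClassCacheSketch {

    // Cache keyed by class name only: whichever context loads a name
    // first supplies the class for every later request.
    static class NameOnlyCache {
        private final Map<String, String> loaded = new HashMap<>();

        String resolve(String context, String className) {
            return loaded.computeIfAbsent(className, n -> context + ":" + n);
        }
    }

    // Cache keyed by context plus class name: each context keeps its own
    // copy of a class, even when the names are identical.
    static class PerContextCache {
        private final Map<String, String> loaded = new HashMap<>();

        String resolve(String context, String className) {
            String key = context + "|" + className;
            return loaded.computeIfAbsent(key, k -> context + ":" + className);
        }
    }

    public static void main(String[] args) {
        String name = "org.example.Viewer"; // hypothetical class name

        NameOnlyCache shared = new NameOnlyCache();
        shared.resolve("xdlib", name);                        // production loads first
        System.out.println(shared.resolve("xdlibdev", name)); // xdlib:org.example.Viewer

        PerContextCache isolated = new PerContextCache();
        isolated.resolve("xdlib", name);
        System.out.println(isolated.resolve("xdlibdev", name)); // xdlibdev:org.example.Viewer
    }
}
```

The first cache matches what we see under the newer issue (the development
request gets the production class); the second matches what the earlier
issue appeared to do.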

I have reported the second problem above as bug report 170.  However, all of
my attempts to report the first problem as a bug have timed out; it no
longer seems to be possible to submit a bug report!

Thanks in advance for any insights anyone might have into either of these
matters.


Rick Beaubien

-----------------------------------------------------
Rick Beaubien 

Software Engineer: Research and Development
Library Systems Office
Rm 386 Doe Library
University of California
Berkeley, CA 94720-6000
510-643-9776