Posted to users@tomcat.apache.org by Nikola Milutinovic <Ni...@ev.co.yu> on 2001/12/05 12:36:21 UTC
Character Encoding problems 2
Hi all.
It's me again and troubles are not resolved. I've created a simple test servlet:
----------------------------------------
import javax.servlet.*;
import java.io.*;

public class TestServlet extends GenericServlet {
    private static final String testText = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
    PrintWriter out;

    public void service( ServletRequest req, ServletResponse res )
        throws javax.servlet.ServletException, java.io.IOException
    {
        res.setContentType("text/html; charset=ISO-8859-2");
        out = res.getWriter();
        out.print( "<html>\r\n<head><title>Test servlet</title>\r\n" );
        out.print( "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=iso-8859-2\">\r\n</head>\r\n" );
        out.print( "<body>\r\n<h1>Test</h1>\r\n<p>Let us see how this gets out</p>\r\n<p>\r\n<p>" );
        out.print( testText );
        out.print( "</p>\r\n</body>\r\n</html>" );
    }
}
----------------------------------------
This prints "?" instead of characters. The string in question prints desired characters in an ordinary Java application.
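One thing that may be worth checking here (an observation about the escapes, not something confirmed anywhere in this thread): in Java source, \uC5A0 denotes the single code point U+C5A0, which lies in the Hangul syllables block, whereas the two bytes 0xC5 0xA0 are the UTF-8 encoding of U+0160 (S with caron). ISO-8859-2 has no mapping for U+C5A0, and String.getBytes substitutes the charset's replacement byte, "?", for anything it cannot map. A minimal sketch using only java.nio.charset; the class name EscapeCheck and its helper methods are made up for illustration:

```java
import java.nio.charset.Charset;

class EscapeCheck {
    static final Charset LATIN2 = Charset.forName("ISO-8859-2");

    // True if every character of s has a mapping in ISO-8859-2.
    static boolean fitsLatin2(String s) {
        return LATIN2.newEncoder().canEncode(s);
    }

    // Encodes s to ISO-8859-2 bytes and decodes them again, so that any
    // substituted '?' becomes visible in the returned string.
    static String roundTrip(String s) {
        return new String(s.getBytes(LATIN2), LATIN2);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin2("\uC5A0")); // false: Hangul syllable
        System.out.println(fitsLatin2("\u0160")); // true: S with caron
        System.out.println(roundTrip("\uC5A0"));  // ?
    }
}
```

If that is what is happening, the escapes \u0160 \u0161 \u0106 \u0107 \u010C \u010D \u0110 \u0111 \u017D \u017E would be the code points corresponding to the byte pairs in the test string.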
QUESTION 1
---------------
How can I get Tomcat to honour "charset=ISO-8859-2"?
QUESTION 2
---------------
What about static HTML? Suppose I should enter a part of static HTML data in Latin-2 encoding. That translates to a string. A string is supposed to be Unicode. Do those strings get translated from "pageEncoding" to Unicode?
Nix.
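On question 2, a sketch of the general Java behaviour, not of what Tomcat's JSP compiler does internally: a String always holds Unicode characters, so bytes can only become a String through a decoding step that names a charset. If the charset named is the one the bytes were written in, the String is correct; if not, the bytes are silently misread. The class name Latin2Decode is hypothetical:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin2Decode {
    static final Charset LATIN2 = Charset.forName("ISO-8859-2");

    // Turns raw bytes into a Unicode String using the given charset.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        // 0xA9 and 0xB9 are S-caron and s-caron in ISO-8859-2.
        byte[] raw = { (byte) 0xA9, (byte) 0xB9 };

        String right = decode(raw, LATIN2);
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("\u0160\u0161")); // true
        System.out.println(wrong.equals("\u00A9\u00B9")); // true: copyright sign, superscript one
        // Encoding back with the same charset restores the original bytes.
        System.out.println(Arrays.equals(right.getBytes(LATIN2), raw)); // true
    }
}
```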
Re: Character Encoding problems 2
Posted by Martin Fekete <fe...@zoznam.sk>.
I had some problems with encoding too, but mine appeared when I
submitted data from forms (the page was in cp1250, the submitted data were
iso-8859-?). When I submitted the data and wrote it to the DB, the
characters were wrong. The solution was to add a filter which sets the
encoding of each request.
More here:
http://marc.theaimsgroup.com/?l=tomcat-user&m=100679292919360&w=2
feky
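To illustrate why the request encoding matters, here is a rough sketch of the decoding step only. It is not the filter from the linked message and does not use the servlet API; in a container, the corresponding call would be ServletRequest.setCharacterEncoding, made before any parameter is read. The class name FormDecode and the sample value are assumptions for illustration:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class FormDecode {
    // Decodes one percent-encoded form value using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("charset not available: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // S-caron as a browser would submit it from a cp1250 page.
        String submitted = "%8A";

        // Decoded with the page's charset, the character survives.
        System.out.println(decode(submitted, "windows-1250").equals("\u0160")); // true
        // Decoded with a different charset, the same byte means something else.
        System.out.println(decode(submitted, "ISO-8859-2").equals("\u0160"));   // false
    }
}
```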
----- Original Message -----
From: "Nikola Milutinovic" <Ni...@ev.co.yu>
To: "Tomcat Users List" <to...@jakarta.apache.org>
Sent: Wednesday, December 05, 2001 2:19 PM
Subject: Re: Character Encoding problems 2
> > I had a similar problem with Cp1250 encoding (Tomcat and MySQL). Keep
> > in mind that this was done on Tomcat 3.x, not 4.x.
> > This is what I have done:
> > - <%@page contentType="text/html; charset=windows-1250"%> on top of
> > every JSP file
>
> I don't think that is a correct character encoding as far as Java is
> concerned. I think Java supports only ISO-8859-* and UTF-*. Please correct
> me if I'm wrong
>
> > - default_character_set=latin2 in my.cnf
>
> Is there a way to set the default character encoding for Tomcat? Setting
> LOCALE on Unix?
>
> > - created new database so it gets created in latin2 character set
>
> Done that with PostgreSQL.
>
> > - when I connected to MySQL I was using mm.mysql driver and the database
> > URL was
> > jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true
>
> I've never used MySQL, just PostgreSQL. So, the database is ISO-8859-2 and
> this converts it to CP-1250, which goes by as Latin-1, as far as Tomcat is
> concerned.
>
> I have had a similar "success" with my setup: the database was Latin-1,
> the data in it was win-1250 and when I forced JDBC connection to Latin-1
> charset, it would pass through JSP. But that is such a hack...
>
> > Then all characters were correctly displayed on JSP pages.
>
> What I'm looking for is a "politically correct" solution. I have so far:
>
> - PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the
> same data in correct form.
> - JDBC driver which is acting OK.
> - JSP pages with correctly set pageEncoding
> - Java Servlet with correctly set contentType/encoding
>
> Still, Tomcat goes for the default charset encoding and screws up Latin-2
> characters.
>
> Any help?
>
> Nix.
>
--
To unsubscribe: <ma...@jakarta.apache.org>
For additional commands: <ma...@jakarta.apache.org>
Troubles with the list: <ma...@jakarta.apache.org>
Re: Character Encoding problems 2
Posted by Gregor <gregor.kovac@mikropis.si>.
Hi!
Nikola Milutinovic wrote:
>>>>I had a similar problem with Cp1250 encoding (Tomcat and MySQL). Keep
>>>>in mind that this was done on Tomcat 3.x, not 4.x.
>>>>This is what I have done:
>>>>- <%@page contentType="text/html; charset=windows-1250"%> on top of
>>>>every JSP file
>>>>
>>>>
>>>I don't think that is a correct character encoding as far as Java is concerned. I think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong
>>>
>>>
>>
>>I'm sorry, but you are wrong. You can convert between numerous
>>encodings, but you have to have i18n.jar in your classpath.
>>
>
> Hmm, I thought that the Java community loathed anything but ISO. Where can I find i18n.jar? I'll look for it on Sun's site, but if it is not there, drop me a line.
>
You can find it in the jre/lib directory of your JDK install directory.
>
>>>What I'm looking for is a "politically correct" solution. I have so far:
>>>
>>>- PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same data in correct form.
>>>- JDBC driver which is acting OK.
>>>- JSP pages with correctly set pageEncoding
>>>- Java Servlet with correctly set contentType/encoding
>>>
>>>Still, Tomcat goes for the default charset encoding and screws up Latin-2 characters.
>>>
>>>
>>Have you tried putting <%@page contentType="text/html;
>>charset=iso8859-2"%> on top of your JSPs?
>>
>
> Always. And that is what is driving me crazy. I have even checked the character encoding of the ServletResponse object, and it was OK: ISO-8859-2. The truth is I'm running 4.0.1 and I have been looking at the sources for 4.0. I'll test 4.0 and if it displays the characters correctly, there's gonna be a bug report.
>
> Nix.
>
Best regards,
Kovi
Re: Character Encoding problems 2
Posted by Nikola Milutinovic <Ni...@ev.co.yu>.
> >>I had a similar problem with Cp1250 encoding (Tomcat and MySQL). Keep
> >>in mind that this was done on Tomcat 3.x, not 4.x.
> >>This is what I have done:
> >>- <%@page contentType="text/html; charset=windows-1250"%> on top of
> >>every JSP file
> >>
> >
> > I don't think that is a correct character encoding as far as Java is concerned. I think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong
> >
>
>
> I'm sorry, but you are wrong. You can convert between numerous
> encodings, but you have to have i18n.jar in your classpath.
Hmm, I thought that the Java community loathed anything but ISO. Where can I find i18n.jar? I'll look for it on Sun's site, but if it is not there, drop me a line.
> > What I'm looking for is a "politically correct" solution. I have so far:
> >
> > - PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same data in correct form.
> > - JDBC driver which is acting OK.
> > - JSP pages with correctly set pageEncoding
> > - Java Servlet with correctly set contentType/encoding
> >
> > Still, Tomcat goes for the default charset encoding and screws up Latin-2 characters.
> >
>
> Have you tried putting <%@page contentType="text/html;
> charset=iso8859-2"%> on top of your JSPs?
Always. And that is what is driving me crazy. I have even checked the character encoding of the ServletResponse object, and it was OK: ISO-8859-2. The truth is I'm running 4.0.1 and I have been looking at the sources for 4.0. I'll test 4.0 and if it displays the characters correctly, there's gonna be a bug report.
Nix.
Re: Character Encoding problems 2
Posted by Gregor <gregor.kovac@mikropis.si>.
Hi!
Nikola Milutinovic wrote:
>>I had a similar problem with Cp1250 encoding (Tomcat and MySQL). Keep
>>in mind that this was done on Tomcat 3.x, not 4.x.
>>This is what I have done:
>>- <%@page contentType="text/html; charset=windows-1250"%> on top of
>>every JSP file
>>
>
> I don't think that is a correct character encoding as far as Java is concerned. I think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong
>
I'm sorry, but you are wrong. You can convert between numerous
encodings, but you have to have i18n.jar in your classpath.
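Which encodings are actually available depends on the Java version and on what is installed, so it may be simplest to ask the runtime. A small sketch, assuming a JDK that has the java.nio.charset package (it is newer than the JDKs discussed in this thread); the class name CharsetProbe is made up:

```java
import java.nio.charset.Charset;

class CharsetProbe {
    // True if this JVM can convert to and from the named charset.
    static boolean supported(String name) {
        return Charset.isSupported(name);
    }

    public static void main(String[] args) {
        String[] names = { "ISO-8859-2", "windows-1250", "UTF-8" };
        for (String name : names) {
            System.out.println(name + " supported: " + supported(name));
        }
    }
}
```

The output for windows-1250 is left unstated on purpose, since it depends on the installation being probed.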
>
>>- default_character_set=latin2 in my.cnf
>>
>
> Is there a way to set the default character encoding for Tomcat? Setting LOCALE on Unix?
>
Hmm, I wouldn't know.... Sorry.
>
>>- created new database so it gets created in latin2 character set
>>
>
> Done that with PostgreSQL.
>
>
>>- when I connected to MySQL I was using mm.mysql driver and the database
>>URL was
>>jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true
>>
>
> I've never used MySQL, just PostgreSQL. So, the database is ISO-8859-2 and this converts it to CP-1250, which goes by as Latin-1, as far as Tomcat is concerned.
>
> I have had a similar "success" with my setup: the database was Latin-1, the data in it was win-1250 and when I forced JDBC connection to Latin-1 charset, it would pass through JSP. But that is such a hack...
>
>
>>Then all characters were correctly displayed on JSP pages.
>>
>
> What I'm looking for is a "politically correct" solution. I have so far:
>
> - PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same data in correct form.
> - JDBC driver which is acting OK.
> - JSP pages with correctly set pageEncoding
> - Java Servlet with correctly set contentType/encoding
>
> Still, Tomcat goes for the default charset encoding and screws up Latin-2 characters.
>
Have you tried putting <%@page contentType="text/html;
charset=iso8859-2"%> on top of your JSPs?
> Any help?
>
> Nix.
>
Best regards,
Kovi
Re: Character Encoding problems 2
Posted by Nikola Milutinovic <Ni...@ev.co.yu>.
> I had a similar problem with Cp1250 encoding (Tomcat and MySQL). Keep
> in mind that this was done on Tomcat 3.x, not 4.x.
> This is what I have done:
> - <%@page contentType="text/html; charset=windows-1250"%> on top of
> every JSP file
I don't think that is a correct character encoding as far as Java is concerned. I think Java supports only ISO-8859-* and UTF-*. Please correct me if I'm wrong
> - default_character_set=latin2 in my.cnf
Is there a way to set the default character encoding for Tomcat? Setting LOCALE on Unix?
> - created new database so it gets created in latin2 character set
Done that with PostgreSQL.
> - when I connected to MySQL I was using mm.mysql driver and the database
> URL was
> jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true
I've never used MySQL, just PostgreSQL. So, the database is ISO-8859-2 and this converts it to CP-1250, which goes by as Latin-1, as far as Tomcat is concerned.
I have had a similar "success" with my setup: the database was Latin-1, the data in it was win-1250 and when I forced JDBC connection to Latin-1 charset, it would pass through JSP. But that is such a hack...
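A sketch of why that kind of hack can appear to work, assuming the only thing that happens to the data is a decode and a re-encode with the same Latin-1 charset: ISO-8859-1 maps every byte value 0x00 to 0xFF to the code point with the same number, so the bytes come out exactly as they went in and a browser told to read them as win-1250 shows the right glyphs. The String in between is wrong, though, which is why it is a hack. The class name PassThrough is hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class PassThrough {
    // Reads bytes as Latin-1, whatever they really are.
    static String misread(byte[] stored) {
        return new String(stored, StandardCharsets.ISO_8859_1);
    }

    // Decodes as Latin-1 and encodes as Latin-1 again.
    static byte[] viaLatin1(byte[] stored) {
        return misread(stored).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0x8A and 0x9A are S-caron and s-caron in windows-1250.
        byte[] stored = { (byte) 0x8A, (byte) 0x9A };

        // The bytes survive the trip unchanged.
        System.out.println(Arrays.equals(viaLatin1(stored), stored)); // true
        // Inside Java the text is not the intended characters.
        System.out.println(misread(stored).equals("\u0160\u0161"));   // false
    }
}
```

Anything that looks at the characters on the Java side (sorting, case conversion, comparison with string literals) would see the misread values.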
> Then all characters were correctly displayed on JSP pages.
What I'm looking for is a "politically correct" solution. I have so far:
- PostgreSQL with one Unicode and one ISO-8859-2 databases, both with the same data in correct form.
- JDBC driver which is acting OK.
- JSP pages with correctly set pageEncoding
- Java Servlet with correctly set contentType/encoding
Still, Tomcat goes for the default charset encoding and screws up Latin-2 characters.
Any help?
Nix.
Re: Character Encoding problems 2
Posted by Gregor <gregor.kovac@mikropis.si>.
Hi!
I had a similar problem with Cp1250 encoding (Tomcat and MySQL). Keep
in mind that this was done on Tomcat 3.x, not 4.x.
This is what I have done:
- <%@page contentType="text/html; charset=windows-1250"%> on top of
every JSP file
- default_character_set=latin2 in my.cnf
- created new database so it gets created in latin2 character set
- when I connected to MySQL I was using mm.mysql driver and the database
URL was
jdbc:mysql://hostname:port/database?characterEncoding=Cp1250&useUnicode=true
Then all characters were correctly displayed on JSP pages.
I hope this helps.
Best regards,
Kovi
Nikola Milutinovic wrote:
> Hi all.
>
> It's me again and troubles are not resolved. I've created a simple test servlet:
>
> ----------------------------------------
> import javax.servlet.*;
> import java.io.*;
>
> public class TestServlet extends GenericServlet {
> private static final String testText = "\uC5A0 \uC5A1 \uC486 \uC487 \uC48C \uC48D \uC490 \uC491 \uC5BD \uC5BE";
> PrintWriter out;
>
> public void service( ServletRequest req, ServletResponse res )
> throws javax.servlet.ServletException, java.io.IOException
> {
> res.setContentType("text/html; charset=ISO-8859-2");
> out = res.getWriter();
> out.print( "<html>\r\n<head><title>Test servlet</title>\r\n" );
> out.print( "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=iso-8859-2\">\r\n</head>\r\n" );
> out.print( "<body>\r\n<h1>Test</h1>\r\n<p>Let us see how this gets out</p>\r\n<p>\r\n<p>" );
> out.print( testText );
> out.print( "</p>\r\n</body>\r\n</html>" );
> }
> }
> ----------------------------------------
>
> This prints "?" instead of characters. The string in question prints desired characters in an ordinary Java application.
>
> QUESTION 1
> ---------------
> How can I get Tomcat to honour "charset=ISO-8859-2"?
>
> QUESTION 2
> ---------------
> What about static HTML? Suppose I should enter a part of static HTML data in Latin-2 encoding. That translates to a string. A string is supposed to be Unicode. Do those strings get translated from "pageEncoding" to Unicode?
>
> Nix.
>