Posted to users@tomcat.apache.org by Nikola Milutinovic <Ni...@ev.co.yu> on 2002/02/18 14:44:45 UTC

Input from a FORM - encoding problem

Hi all.

I have an HTML FORM that I'd like to use to update data in my database. The DB (PostgreSQL + Unicode) is configured and correctly loaded with Unicode data. Translation from UTF-8 -> Win-1250 works like a charm (and so does UTF-8 -> ISO-8859-2).

In other words, displaying the data is OK.

Now I want to update fields and there is a problem. If I enter some Win-1250 characters in a text field, they get translated to "?".

A simple investigation shows that the loathed Win1250 -> '?' conversion occurs while the HTTPRequest object is being created.

How do I specify that the data coming from a FORM is Win1250 encoded?
Do I do that in HTML FORM that submits the data (most likely)?
Or do I do that in the JSP/Servlet accepting the data (highly unlikely)?

I'm looking at the HTML 4.01 specification, but so far I'm unlucky - nothing seems to work.

Nix.

Re: Input from a FORM - encoding problem

Posted by Nikola Milutinovic <Ni...@ev.co.yu>.

> try this ...
> 
> <quote>
> FORM attribute
> 
> accept-charset = charset list [CI]
>     This attribute specifies the list of character encodings for input data that is accepted by the server processing this form. The value is a space- and/or comma-delimited list of charset values. The client must interpret this list as an
> exclusive-or list, i.e., the server is able to accept any single character encoding per entity received.

Nothing yet. I've tried it. I'll try ISO-8859-2 tomorrow.

How does it work anyway? What does the HTTP request carry in its headers for this encoding?

Nix.

Re: Input from a FORM - encoding problem

Posted by Attila Szegedi <sz...@freemail.hu>.

----- Original Message -----
From: "Nikola Milutinovic" <Ni...@ev.co.yu>
To: "Tomcat Users List" <to...@jakarta.apache.org>
Sent: February 18, 2002 18:19
Subject: Re: Input from a FORM - encoding problem


> Attila Szegedi wrote:
> > Don't bother fiddling with <FORM> attributes. I've done this before to
> > no avail.
> > Right now, no matter what you specify as an encoding in an HTML page,
> > most browsers (all favorite IE and NN flavors) ignore it altogether and
> > encode the form data using the encoding in which the page containing the
> > form was sent to them. Worse yet, they *don't* specify the encoding of
> > characters in the form data when sending them back via a POST request, so
> > you must know on the server side what was the encoding of the page that
> > contained the form. The Servlet 2.3 spec is meant to contain a solution
> > for this, but I don't know how it is (or isn't) implemented in Tomcat 4.x.
>
> And how is it supposed to be specified? HTTP headers? Which ones?
>

request.setCharacterEncoding(String encoding)

See http://www.servlets.com/soapbox/servlet23.html (Jason Hunter's article)
for more info.
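
For reference, here is a minimal JDK-only sketch of what that call changes. In a servlet you would call request.setCharacterEncoding("windows-1250") before the first getParameter(). The hypothetical FormDecodeDemo class below only imitates the container's decoding step with java.net.URLDecoder, and the sample value %9A is an assumption: it is the windows-1250 byte for the letter s with caron.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; it is not part of Tomcat or the Servlet API.
final class FormDecodeDemo {

    // Decodes one percent-encoded form value using the given charset name.
    static String decodeValue(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String raw = "%9A"; // one byte, 0x9A, as a browser would send it

        // Decoded with the charset the page was served in: U+0161 (s with caron).
        String right = decodeValue(raw, "windows-1250");
        // Decoded with the ISO-8859-1 default: U+009A, an invisible control character.
        String wrong = decodeValue(raw, "ISO-8859-1");

        System.out.println(String.format("windows-1250 -> U+%04X", (int) right.charAt(0)));
        System.out.println(String.format("ISO-8859-1   -> U+%04X", (int) wrong.charAt(0)));
    }
}
```

The same bytes give different characters depending only on the charset name handed to the decoder, which is why the server has to be told which one the form page used.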

> > As if all of the above weren't enough, Tomcat 3.x gives yet another stab
> > to internationalization efforts: it will blindly interpret all form data
> > as being iso-8859-1 (~ Cp1252), so your iso-8859-2 (~Cp1250) characters
> > are lost. Again, I don't know how the Tomcat 4.x line handles this.
>
> I guess I'll have to dig into the code. (sigh) Oh well, at least I HAVE
> access to the source code.
>
>
> > Being Hungarian, I'm just as interested in entering 8859-2 characters in
> > my pages, and not seeing ? marks on the server side, so I'm transcoding
> > all form data strings on the fly. The off-the-wall solution looks like
> > this:
> > param = new String(param.getBytes("8859_1"), "8859_2");
>
> Where do you place this? Is it like:
>
> param = request.getParameter( "name" );
> param = new String(param.getBytes("8859_1"), "8859_2");
>
> Basically, my question would be: once inside the JSP page, can I get
> parameters and re-code them some way or are they "destroyed" (transfigured
> to those pesky "?"s) upon construction of the HTTPRequest object?
>

There's a good chance they are not destroyed. I guess the question marks are
an artifact of a later transformation of the string to bytes (like when
generating a response). In the request, the byte values of your characters
should be preserved, and thus transcoding should be possible.
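
A small JDK-only sketch of that reasoning, assuming the container decoded the form bytes as ISO-8859-1 and the browser actually sent ISO-8859-2. The class and method names are made up for illustration. It shows that the mis-decoded string still carries the original byte values, and that a "?" only appears when a Central European character is later encoded into a charset that lacks it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating where the "?" comes from.
final class QuestionMarkDemo {
    static final Charset LATIN2 = Charset.forName("ISO-8859-2");

    // What a container does when it assumes ISO-8859-1:
    // every byte becomes the character with the same numeric value.
    static String misdecodeAsLatin1(byte[] wireBytes) {
        return new String(wireBytes, StandardCharsets.ISO_8859_1);
    }

    // Recovers the original bytes and decodes them with the intended charset.
    static String recode(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), LATIN2);
    }

    // Encoding text into ISO-8859-1 replaces unmappable characters with '?'.
    static byte[] encodeAsLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] wire = { (byte) 0xF5 };          // o with double acute in ISO-8859-2
        String wrong = misdecodeAsLatin1(wire); // U+00F5, byte value preserved
        String right = recode(wrong);           // U+0151, the intended letter
        byte[] lossy = encodeAsLatin1(right);   // 0x3F, i.e. '?'

        System.out.println(String.format("misdecoded: U+%04X", (int) wrong.charAt(0)));
        System.out.println(String.format("recoded:    U+%04X", (int) right.charAt(0)));
        System.out.println("re-encoded as Latin-1: " + (char) lossy[0]);
    }
}
```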

>
> > although this tends to be slow (running through Java char-to-byte, then
> > through byte-to-char machinery). I have developed a fast 8859-1 to 8859-2
> > transcoder that addresses speed issues; contact me in private mail and I
> > can send it to you.
>
> Sure. Send it, please.
>
>
> BTW, I'm using Tomcat 4.01, so, if need be, I could employ some sort of
> filter, but I'd like a proper solution. Tomcat 4 is supposed to be a
> reference Servlet container, after all.

Then try the request.setCharacterEncoding(String encoding) method before
you jump the gun and start coding the filter.

--
Attila Szegedi
home: http://www.szegedi.org


>
> Nix.
>
>
> --
> To unsubscribe:   <ma...@jakarta.apache.org>
> For additional commands: <ma...@jakarta.apache.org>
> Troubles with the list: <ma...@jakarta.apache.org>

Re: Input from a FORM - encoding problem

Posted by Nikola Milutinovic <Ni...@ev.co.yu>.

Attila Szegedi wrote:

> Don't bother fiddling with <FORM> attributes. I've done this before to no avail.
>
> Right now, no matter what you specify as an encoding in an HTML page, most
> browsers (all favorite IE and NN flavors) ignore it altogether and encode
> the form data using the encoding in which the page containing the form was
> sent to them. Worse yet, they *don't* specify the encoding of characters
> in the form data when sending them back via a POST request, so you must
> know on the server side what was the encoding of the page that contained
> the form. The Servlet 2.3 spec is meant to contain a solution for this, but
> I don't know how it is (or isn't) implemented in Tomcat 4.x.

And how is it supposed to be specified? HTTP headers? Which ones?

> As if all of the above weren't enough, Tomcat 3.x gives yet another stab to
> internationalization efforts: it will blindly interpret all form data as
> being iso-8859-1 (~ Cp1252), so your iso-8859-2 (~Cp1250) characters are
> lost. Again, I don't know how the Tomcat 4.x line handles this.

I guess I'll have to dig into the code. (sigh) Oh well, at least I HAVE access
to the source code.

> Being Hungarian, I'm just as interested in entering 8859-2 characters in my
> pages, and not seeing ? marks on the server side, so I'm transcoding all form
> data strings on the fly. The off-the-wall solution looks like this:
>
> param = new String(param.getBytes("8859_1"), "8859_2");

Where do you place this? Is it like:

param = request.getParameter( "name" );
param = new String(param.getBytes("8859_1"), "8859_2");

Basically, my question would be: once inside the JSP page, can I get parameters
and re-code them some way or are they "destroyed" (transfigured to those pesky
"?"s) upon construction of the HTTPRequest object?

> although this tends to be slow (running through Java char-to-byte, then through
> byte-to-char machinery). I have developed a fast 8859-1 to 8859-2 transcoder
> that addresses speed issues; contact me in private mail and I can send it to you.

Sure. Send it, please.

BTW, I'm using Tomcat 4.01, so, if need be, I could employ some sort of filter,
but I'd like a proper solution. Tomcat 4 is supposed to be a reference Servlet
container, after all.

Nix.



RE: Input from a FORM - encoding problem

Posted by Satoshi Okamoto <sa...@mizuho-sc.com>.

If it's a servlet, try this:

response.setContentType("text/html;charset=UR ENCODING TYPE");
PrintWriter out = new PrintWriter(
    new OutputStreamWriter(response.getOutputStream(), "UR ENCODING TYPE"));
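
Note that this snippet controls how the response is written, not how request parameters are decoded. Below is a minimal sketch of its effect, using a ByteArrayOutputStream as a stand-in for response.getOutputStream(). The EncodedWriterDemo class and its write method are made up for illustration, and "UR ENCODING TYPE" is taken to mean a real charset name such as ISO-8859-2.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class; the byte array stands in for the servlet response stream.
final class EncodedWriterDemo {

    // Writes text through a PrintWriter bound to the given charset
    // and returns the bytes that would go out on the wire.
    static byte[] write(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, charsetName));
            out.print(text);
            out.flush(); // push buffered characters through the encoder
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u0151"; // o with double acute
        System.out.println("ISO-8859-2 bytes: " + write(text, "ISO-8859-2").length);
        System.out.println("UTF-8 bytes:      " + write(text, "UTF-8").length);
    }
}
```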

-----Original Message-----
From: Attila Szegedi [mailto:szegedia@freemail.hu]
Sent: Tuesday, February 19, 2002 5:16 PM
To: Tomcat Users List
Subject: Re: Input from a FORM - encoding problem



Re: Input from a FORM - encoding problem

Posted by Attila Szegedi <sz...@freemail.hu>.

OK: he might try. I admit I've not used IE6, only IEs up to 5.5 and NN up to
4.72, but it's a fact that:

- these browsers never appended a charset declaration to the Content-Type
header (i.e. "Content-Type: application/x-www-form-urlencoded" and not
"Content-Type: application/x-www-form-urlencoded; charset=iso-8859-2"), so it
was up to the server side to figure out what the charset was.

- Tomcat 3.2.x blindly decoded form data as ISO-8859-1 (in fact, it is the
code in the javax.servlet.http.HttpUtils#parsePostData() method, which
contains the following rather revealing comment):
<quote>
        // XXX we shouldn't assume that the only kind of POST body
        // is FORM data encoded using ASCII or ISO Latin/1 ... or
        // that the body should always be treated as FORM data.

</quote>
So, even if your browser follows the spec, Tomcat 3.2.x certainly does not.
I must stress that I don't know whether the 3.3.x or 4.x Tomcats rely on this
(flawed) code or not. Tomcat 4.x definitely should not, since it is supposed
to implement request.setCharacterEncoding()...
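
To illustrate the first point, here is a rough sketch of how a server might look for a charset parameter in the Content-Type header and fall back to a default when the browser omits it. The ContentTypeCharset class is hypothetical and deliberately simple; it is not the parser Tomcat uses.

```java
import java.util.Locale;

// Hypothetical helper; a simplified header parser for illustration only.
final class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type value,
    // or the fallback when the header carries none.
    static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters follow.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // What the browsers described above send: no charset, so the default wins.
        System.out.println(charsetOf("application/x-www-form-urlencoded", "ISO-8859-1"));
        // What a fully labelled request would look like.
        System.out.println(
            charsetOf("application/x-www-form-urlencoded; charset=iso-8859-2", "ISO-8859-1"));
    }
}
```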

Cheers,
  Attila.

--
Attila Szegedi
home: http://www.szegedi.org


----- Original Message -----
From: "Arnold Shore" <as...@dgs.dgsys.com>
To: "Tomcat Users List" <to...@jakarta.apache.org>
Sent: February 18, 2002 16:58
Subject: RE: Input from a FORM - encoding problem



RE: Input from a FORM - encoding problem

Posted by Arnold Shore <as...@dgs.dgsys.com>.

Re "Don't bother fiddling with <FORM> attributes. I've done this before to
no avail":

I'm accepting Arabic, Hebrew, Russian, and Chinese doing exactly that, with
IE 6 and using Unicode encodings. (Will be trying NN and Opera shortly.) And
yes, I'm also using that encoding on the page.

It's going into a database, with subsequent retrieval and display.  Works
correctly for the stuff I've tried.

Arnold Shore
Annapolis, MD USA

-----Original Message-----
From: Attila Szegedi [mailto:szegedia@freemail.hu]
Sent: Monday, February 18, 2002 9:39 AM
To: Tomcat Users List
Subject: Re: Input from a FORM - encoding problem



Re: Input from a FORM - encoding problem

Posted by Attila Szegedi <sz...@freemail.hu>.

Don't bother fiddling with <FORM> attributes. I've done this before to no avail.

Right now, no matter what you specify as an encoding in an HTML page, most browsers (all favorite IE and NN flavors) ignore it altogether and encode the form data using the encoding in which the page containing the form was sent to them. Worse yet, they *don't* specify the encoding of characters in the form data when sending them back via a POST request, so you must know on the server side what was the encoding of the page that contained the form. The Servlet 2.3 spec is meant to contain a solution for this, but I don't know how it is (or isn't) implemented in Tomcat 4.x.

As if all of the above weren't enough, Tomcat 3.x gives yet another stab to internationalization efforts: it will blindly interpret all form data as being iso-8859-1 (~ Cp1252), so your iso-8859-2 (~Cp1250) characters are lost. Again, I don't know how the Tomcat 4.x line handles this.

Being Hungarian, I'm just as interested in entering 8859-2 characters in my pages, and not seeing ? marks on the server side, so I'm transcoding all form data strings on the fly. The off-the-wall solution looks like this:

param = new String(param.getBytes("8859_1"), "8859_2");

although this tends to be slow (running through Java char-to-byte, then through byte-to-char machinery). I have developed a fast 8859-1 to 8859-2 transcoder that addresses speed issues; contact me in private mail and I can send it to you.
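
The transcoder mentioned above is not shown in this thread. As one possible approach only, a table lookup can replace the two charset conversions: build a 256-entry table once by decoding every byte value as ISO-8859-2, then map each character through it. The Latin1ToLatin2 class below is a hypothetical sketch of that idea, not the author's code.

```java
import java.nio.charset.Charset;

// Hypothetical table-based transcoder; not the implementation referred to in the thread.
final class Latin1ToLatin2 {
    // TABLE[b] is the character that byte value b denotes in ISO-8859-2.
    private static final char[] TABLE = buildTable();

    private static char[] buildTable() {
        byte[] all = new byte[256];
        for (int i = 0; i < 256; i++) {
            all[i] = (byte) i;
        }
        // ISO-8859-2 is a single-byte charset, so 256 bytes decode to 256 chars.
        return new String(all, Charset.forName("ISO-8859-2")).toCharArray();
    }

    // Maps a string that was wrongly decoded as ISO-8859-1 to its ISO-8859-2 reading.
    // Characters above U+00FF cannot come from a Latin-1 decode and are left untouched.
    static String transcode(String misdecoded) {
        char[] chars = misdecoded.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (chars[i] < 256) {
                chars[i] = TABLE[chars[i]];
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String fixed = transcode("\u00F5\u00FB"); // o-tilde, u-circumflex
        for (int i = 0; i < fixed.length(); i++) {
            System.out.println(String.format("U+%04X", (int) fixed.charAt(i)));
        }
    }
}
```

For characters below U+0100 this gives the same result as the getBytes/new String pair while skipping the intermediate byte array.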

Cheers,
  Attila.
--
Attila Szegedi
home: http://www.szegedi.org

----- Original Message ----- 
From: "Nikola Milutinovic" <Ni...@ev.co.yu>
To: "Tomcat Users List" <to...@jakarta.apache.org>
Sent: February 18, 2002 15:17
Subject: Re: Input from a FORM - encoding problem



Re: Input from a FORM - encoding problem

Posted by Nikola Milutinovic <Ni...@ev.co.yu>.

> <quote>
> FORM attribute
> 
> accept-charset = charset list [CI]
>     This attribute specifies the list of character encodings for input data that is accepted by the server processing this form. The value is a space- and/or comma-delimited list of charset values. The client must interpret this list as an
> exclusive-or list, i.e., the server is able to accept any single character encoding per entity received.

This bit is a "bit unclear" to me. If I specify several encodings, how will the browser know which one was actually used? How will the server know which one was used?

Nix.

Re: Input from a FORM - encoding problem

Posted by David Cassidy <dc...@hotgen.com>.

try this ...

<quote>
FORM attribute

accept-charset = charset list [CI]
    This attribute specifies the list of character encodings for input data that is accepted by the server processing this form. The value is a space- and/or comma-delimited list of charset values. The client must interpret this list as an
exclusive-or list, i.e., the server is able to accept any single character encoding per entity received.

    The default value for this attribute is the reserved string "UNKNOWN". User agents may interpret this value as the character encoding that was used to transmit the document containing this FORM element.
</quote>

<URL>http://www.w3.org/TR/html401/interact/forms.html#h-17.3


Let us know ...

Thanks

D





Re: Input from a FORM - encoding problem SOLVED

Posted by Nikola Milutinovic <Ni...@ev.co.yu>.

The solution was to set the character encoding on the request (not on the response) object. Apparently, the request parameters are parsed only when they are first fetched by a method call, which is a nice thing :-)
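
A toy model of that lazy behaviour, for anyone finding this thread later. LazyFormRequest is a hypothetical stand-in, not the Servlet API: it keeps the raw body and decodes it only on the first getParameter() call, so the encoding has to be set before that call. With the real API the equivalent is calling request.setCharacterEncoding(...) before the first request.getParameter(...).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a request object that parses its form body lazily.
final class LazyFormRequest {
    private final String rawBody;
    private String encoding = "ISO-8859-1"; // assumed container default
    private Map<String, String> params;     // stays null until first lookup

    LazyFormRequest(String rawBody) {
        this.rawBody = rawBody;
    }

    // Takes effect only if called before the parameters are first read.
    void setCharacterEncoding(String encoding) {
        if (params == null) {
            this.encoding = encoding;
        }
    }

    String getParameter(String name) {
        if (params == null) {
            params = parse(); // decoding happens here, on first access
        }
        return params.get(name);
    }

    private Map<String, String> parse() {
        Map<String, String> result = new HashMap<String, String>();
        if (rawBody.isEmpty()) {
            return result;
        }
        for (String pair : rawBody.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            try {
                result.put(URLDecoder.decode(key, encoding), URLDecoder.decode(value, encoding));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unknown charset: " + encoding, e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LazyFormRequest request = new LazyFormRequest("name=%9A");
        request.setCharacterEncoding("windows-1250"); // before the first getParameter()
        String name = request.getParameter("name");
        System.out.println(String.format("U+%04X", (int) name.charAt(0)));
    }
}
```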

Thanks to all who helped.

And, by the way, IE6 doesn't honour "enctype" of the FORM, just splashes its default, which doesn't include encoding info.

Nix.

RE: Input from a FORM - encoding problem

Posted by Arnold Shore <as...@dgs.dgsys.com>.

I'm using something like the following, which works for me with IE6 and IIS:
<FORM ACCEPT-CHARSET="UTF-8" METHOD= ..."
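
For comparison, a small sketch of what ends up on the wire for the same character under different form encodings. It uses java.net.URLEncoder to imitate a browser's percent-encoding; the WireFormatDemo class is hypothetical and the sample letter is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class imitating a browser's form encoding step.
final class WireFormatDemo {

    // Percent-encodes a form value using the given charset name.
    static String encode(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String letter = "\u0161"; // s with caron
        System.out.println("UTF-8:        " + encode(letter, "UTF-8"));
        System.out.println("windows-1250: " + encode(letter, "windows-1250"));
        System.out.println("ISO-8859-2:   " + encode(letter, "ISO-8859-2"));
    }
}
```

Nothing in the encoded value says which charset produced it, so the receiving side still has to decode with the matching one.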

Arnold Shore
Annapolis, MD USA

-----Original Message-----
From: Nikola Milutinovic [mailto:Nikola.Milutinovic@ev.co.yu]
Sent: Monday, February 18, 2002 8:45 AM
To: Tomcat Users List
Subject: Input from a FORM - encoding problem

... Do I do that in HTML FORM that submits the data (most likely)?
Or do I do that in the JSP/Servlet accepting the data (highly unlikely)?

I'm looking at HTML 4.01 specification, but so far I'm unlucky - nothing
seems to work.

Nix.
