Posted to users@tomcat.apache.org by Garret Wilson <ga...@globalmentor.com> on 2019/05/21 17:03:03 UTC

Re: distinction between resource charset and format octet decoding

Sorry to bring up the non-UTF-8 escaped octets form POST problem again, 
but …

On 1/8/2019 3:57 PM, Mark Thomas wrote:
> …
> As of Servlet 4.0 there is a specification compliant configuration 
> option to change this default to any encoding of your choice. 
> Obviously, UTF-8 is one of the options. You can do this by adding the 
> following to your web.xml:
>
> <request-character-encoding>UTF-8</request-character-encoding>
>
> If you add it to conf/web.xml it applies to every web application 
> deployed to Tomcat.
>
> Tomcat 9 uses this in the examples, manager and host-manager 
> applications in place of the SetCharacterEncodingFilter.


As you know, I've already updated the Tomcat FAQ with the options for 
forcing Tomcat to decode the escaped octets in form POSTs as UTF-8 
sequences (as modern browsers send them, and as HTML5 requires) 
instead of as ISO-8859-1 (as the Servlet 4 spec says).
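
To make the difference concrete, here is a minimal sketch that uses only 
`java.net.URLDecoder` from the JDK. It does not reproduce Tomcat's own 
parameter parsing; it only shows what happens when the same escaped 
octets are decoded with each charset. The class name `FormOctetDecoding` 
and the sample value are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormOctetDecoding {

    // Decodes an application/x-www-form-urlencoded value, interpreting
    // the escaped octets in the named charset.
    public static String decode(String escaped, String charsetName) {
        try {
            return URLDecoder.decode(escaped, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the final e, escaped as UTF-8
        // octets (0xC3 0xA9), which is what a modern browser submits.
        String escaped = "caf%C3%A9";

        // Read as UTF-8, the two octets form one character (U+00E9).
        System.out.println(decode(escaped, "UTF-8"));

        // Read as ISO-8859-1, each octet becomes its own character
        // (U+00C3 followed by U+00A9), so the value comes out garbled.
        System.out.println(decode(escaped, "ISO-8859-1"));
    }
}
```

The second call is the kind of result a web app sees when the container 
falls back to ISO-8859-1 for a form body that the browser encoded as 
UTF-8.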

But the problem is worse for the Spring community. If someone uses 
Spring Boot to create an executable JAR/WAR with embedded Tomcat, 
Spring Boot does something to configure Tomcat to decode the POSTs 
correctly (that is, as the modern web expects, not as the Servlet 4 
spec says). Unfortunately, if I use Spring Boot to make a WAR which is 
both a self-contained executable WAR /and/ a WAR deployable on Tomcat, 
then when I deploy the WAR on Tomcat the escaped octets are decoded as 
ISO-8859-1, so my web app breaks. Yes, the same WAR behaves differently 
depending on whether it runs on Spring Boot's embedded Tomcat or is 
deployed on standalone Tomcat.

Spring Boot ignores any `web.xml` file. I guess I could create a 
`web.xml` file only for standalone Tomcat, but then Eclipse freezes 
(as I posted elsewhere) because it doesn't understand 
`<request-character-encoding>`. So like so many things on the web, this 
is a mess.

This is a serious issue, in my opinion. The Servlet 4 specification is 
out of step with everything else in the ecosystem!

> Whether Tomcat should ship with this setting present in conf/web.xml 
> by default is something that should probably be discussed for Tomcat 
> 10. Given the current state of the web, there is a reasonable case for 
> doing so. I'll add that to the TOMCAT-NEXT discussion list.

Yes, can I just re-second (third?) that motion, and underscore the need 
for this to be changed in Tomcat 10?

Thanks,

Garret