Posted to dev@tomcat.apache.org by Gr...@pfizer.com on 2003/12/18 17:44:18 UTC

RE: Intro/question possible buglet with Content-Type and Charsets - now more of an RFC

Hi All,

(We may be barking up the wrong tree here; if so, please point me in the
right direction.)

This is still causing us issues - as IE fails to parse a charset when it is
tacked on to Content-Type: application/vnd.ms-excel

It would appear that the charset is tacked onto the Content-Type in the
setContentType method of
catalina/src/share/org/apache/catalina/connector/ResponseBase.java whenever
one is not supplied in the Content-Type (it looks for a ';').

The encoding can never be null as it is extracted from the locale in the
setLocale method below it.

I understand this to mean that the charset will always be tacked on
irrespective of the Type.
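
To make that concrete, here is a small standalone sketch of the behaviour as
I read it. It is not the Tomcat source: the class name CurrentBehaviourSketch
and the method appendCharset are made up for illustration, and the encoding
passed in stands for the locale-derived default.

```java
// Illustrative only: mimics the behaviour described above, not the actual ResponseBase code.
class CurrentBehaviourSketch {

    // Hypothetical helper: appends the charset whenever the type carries no ';'.
    static String appendCharset(String type, String encoding) {
        if (type.indexOf(';') >= 0) {
            return type;              // parameters already supplied, left alone
        }
        return type + ";charset=" + encoding;
    }

    public static void main(String[] args) {
        // A binary type still picks up a charset parameter.
        System.out.println(appendCharset("application/vnd.ms-excel", "ISO-8859-1"));
        // A text type gets the same treatment.
        System.out.println(appendCharset("text/html", "ISO-8859-1"));
    }
}
```

The first call yields application/vnd.ms-excel;charset=ISO-8859-1, which is
the header value IE trips over.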

However;

I cannot find an explicit statement that a charset should not be defined for
binary Types, but I cannot see why you would want to define one.

HTTP 1.1 (RFC 2068, http://www.w3.org/Protocols/rfc2068/rfc2068) states that
there is a default charset for text Types, which makes sense:

'When no explicit charset parameter is provided by the sender, media
subtypes of the "text" type are defined to have a default charset value of
"ISO-8859-1"' 

So I understand that it is fair enough to add it to text/* Types.

RFC 1341 (http://www.faqs.org/rfcs/rfc1341.html) states that:

'2.a.  A "text" Content-Type value, which can be used to represent  textual
information  in  a  number  of character  sets  and  formatted  text
description languages in a standardized manner.'

But there is no mention of charsets for application types:

'2.c.  An "application" Content-Type value, which can be used  to transmit
application data or binary data, and hence,  among  other  uses,  to
implement  an electronic mail file transfer service.'

What I would suggest is a small 'if' wrapper that only adds the default
charset if the Content-Type starts with text/

Pseudocode below (not tested):

###########
catalina/src/share/org/apache/catalina/connector/ResponseBase.java

 public void setContentType(String type) {

        if (isCommitted())
            return;

        if (included)
            return;     // Ignore any call from an included servlet

        this.contentType = type;
        if (type.indexOf(';') >= 0) {
            // Parameters were supplied: take the charset from them if present
            encoding = RequestUtil.parseCharacterEncoding(type);
            if (encoding == null)
                encoding = "ISO-8859-1";
        } else {
            // No charset supplied: only add the default for text/* types
            if (encoding != null && type.startsWith("text/"))
                this.contentType = type + ";charset=" + encoding;
        }

    }
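
For anyone who wants to try the idea outside Tomcat, the same rule can be
sketched as a standalone class. This is only a sketch under assumptions: the
class name TextOnlyCharsetSketch and the method withDefaultCharset are
hypothetical, the null check and the case-insensitive comparison are my own
additions, and none of it has been run against a real ResponseBase.

```java
import java.util.Locale;

// Illustrative only: a standalone version of the proposed rule, not Tomcat code.
class TextOnlyCharsetSketch {

    // Hypothetical helper: returns the Content-Type value the response would carry.
    static String withDefaultCharset(String type, String encoding) {
        if (type == null || type.indexOf(';') >= 0) {
            return type;              // nothing to do, or parameters already supplied
        }
        // Assumption: media type names are compared case-insensitively.
        boolean isText = type.trim().toLowerCase(Locale.ENGLISH).startsWith("text/");
        if (encoding != null && isText) {
            return type + ";charset=" + encoding;
        }
        return type;                  // non-text types are left untouched
    }

    public static void main(String[] args) {
        System.out.println(withDefaultCharset("text/html", "ISO-8859-1"));
        System.out.println(withDefaultCharset("application/vnd.ms-excel", "ISO-8859-1"));
    }
}
```

With this rule text/html still gets ;charset=ISO-8859-1 appended, while
application/vnd.ms-excel is passed through unchanged.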

Regards,

Greg


> -----Original Message-----
> From: Tim Funk [mailto:funkman@joedog.org]
> Sent: 16 December 2003 18:09
> To: Tomcat Developers List
> Subject: Re: Intro/question possible buglet with Content-Type and
> Charsets .
> 
> 
> Yeah, nagoya.apache.org seems down. Hopefully it will be back soon. The bug
> has good detail of what and how to fix.
> 
> -Tim
> 
> Greg.Cope@pfizer.com wrote:
> 
> > Thanks Tim,
> > 
> > Having a little trouble getting anything from bugzilla, nagoya.apache.org
> > seems to be having a little trouble!
> > 
> > Looking in the archives for this id, I see that someone has a 4.1.29 patch
> > and a compiled class, but cannot see either email address or content via the
> > archive.
> > 
> > Ho hum....
> > 
> > Thanks for the pointer.
> > 
> > Greg
> > 
> > 
> > 
> > 
> >>-----Original Message-----
> >>From: Tim Funk [mailto:funkman@joedog.org]
> >>Sent: 16 December 2003 12:31
> >>To: Tomcat Developers List
> >>Subject: Re: Intro/question possible buglet with Content-Type and
> >>Charsets.
> >>
> >>
> >>http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24970
> >>
> >>Greg.Cope@pfizer.com wrote:
> >>
> >>>Hi All,
> >>>
> >>>Quick intro, and then a question;
> >>>
> >>>We use tomcat to host java web applications at our location.  My client
> >>>requires us to follow very strict rules for deploying software, which means
> >>>it can be a documentation intensive process (evidence gathering/ IQP's etc
> >>>....).  So we rarely upgrade as it is quite a lot of work..... Luckily
> >>>tomcat is excellent and rarely needs upgrading or patching.
> >>>
> >>>Now the question;
> >>>
> >>>Tomcat 4.1.29 seems to insist on adding a charset to the content type, even if
> >>>a Content-Type has been set using response.setContentType or similar
> >>>(without a charset).  Tomcat 5 seems to do something similar judging from
> >>>http://www.mail-archive.com/tomcat-dev@jakarta.apache.org/msg49015.html but
> >>>I think it fails to check if the Content type is a text one (HTML) and adds
> >>>it for any content type, which would appear not to be right IMHO.
> >>>
> >>>Without wishing to appear rude :-) I need to change this behaviour and
> >>>remove the insertion of the charset for non text based Content-Types  eg:
> >>>application/vnd.ms-excel
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org
> 
