Posted to dev@httpd.apache.org by Dean Gaudet <dg...@arctic.org> on 1998/02/17 02:57:57 UTC
Accept-Encoding - the saga continues (fwd)
I WANT TO SHOOT WHOEVER CREATED THIS STUPID X- "CONVENTION"!! It is
utterly and completely broken. It creates this sort of utter and complete
chaos. Read on for yet another broken thing that we just can't work
around.
I asked Ronald to submit a PR for the BrowserMatch brokenness.
Dean
---------- Forwarded message ----------
Date: Tue, 17 Feb 1998 02:31:21 +0200
From: "Life is hard... and then you die." <Ro...@psi.ch>
To: DGAUDET@arctic.org
Subject: Accept-Encoding - the saga continues
X-VMS-To: DGAUDET@ARCTIC.ORG
Hi Dean,
well, it seems M$ have managed to completely, utterly, and thoroughly screw
up again. Here are a couple of excerpts from a few mails from Paul, with my comments.
[Paul (ginsparg@qfwfq.lanl.gov) wrote:]
> one of the recent problems we've noted is that
> MSIE 4.something no longer understands x-gzip and prefers gzip
> (standard MS practice to change standards without leaving compatibility path)
> -- presuming they're sending the correct A-E would be nice to give them what
> they want without turning on the MultiViews
My previous patch has the problem that the Content-Encoding header is
only fixed up for negotiated documents. Asking for something like
"article.ps.gz" will return a Content-Encoding with a x-gzip or gzip
depending only on what is in the AddEncoding directive, not on what was
sent in an Accept-Encoding header. Therefore I've moved the fixup code
to the "fixup" phase in the module handling. I should've done that in
the first place... This means that all responses are fixed up, no matter
how they were generated. I think this is the correct solution.
The algorithm is still the same: if an accept-encoding header was sent
and it contains the non x- token then use that token in the
content-encoding; else if the accept-encoding header contains the x-
token then use that one; else use whatever was given in the AddEncoding
directive.
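A minimal sketch of the selection rule just described, separate from the patch itself. It assumes the Accept-Encoding header has already been split into tokens and ignores q-values; the names choose_ce_token and skip_x are made up for illustration and do not appear in Apache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the token with a leading "x-" skipped. */
static const char *skip_x(const char *tok)
{
    return (tok[0] == 'x' && tok[1] == '-') ? tok + 2 : tok;
}

/* Hypothetical illustration of the rule in the text:
 *   1. if the client listed the plain token, send the plain token;
 *   2. else if it listed the x- token, send the x- token;
 *   3. else send whatever AddEncoding configured.
 * accepted:   tokens from Accept-Encoding, n entries (n may be 0)
 * configured: the token given in the AddEncoding directive
 */
static const char *choose_ce_token(const char *const *accepted, size_t n,
                                   const char *configured)
{
    const char *plain;
    const char *x_match = NULL;
    size_t i;

    assert(configured != NULL);
    plain = skip_x(configured);

    for (i = 0; i < n; ++i) {
        const char *name = accepted[i];

        if (strcmp(name, plain) == 0) {
            return name;            /* plain token always wins */
        }
        if (name != skip_x(name) && strcmp(skip_x(name), plain) == 0) {
            x_match = name;         /* remember it, keep looking for plain */
        }
    }
    return x_match ? x_match : configured;
}
```

The loop does not stop at the first x- match, so a client sending "x-gzip, gzip" still gets the plain token regardless of order.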
So far so good. Then comes this:
[Paul (ginsparg@qfwfq.lanl.gov) wrote:]
> things worked from our point of view, but then we had to turn it off due
> to another ridiculous bug report -- it seems that people who use MSIE
> together with gsview to look at ps.gz files had the following problem:
> MSIE downloaded the file and gunzipped it (since it was getting the C-E: gzip)
> but leaves it named .ps.gz so gsview assumes it's still gzipped and reports
> an error. so for these people it's better to send C-E: x-gzip so that MSIE
> *doesn't* gunzip and then gsview correctly reads it.
What the f*%@?
Anyway, this to me sounds like a case for a BrowserMatch, so I've added
such a beast to the patch. Usage:
BrowserMatch "MSIE 4\." strip-ce-header
If strip-ce-header is set, the fixup handler will remove any Content-Encoding
header that might have been set. Note that I'm not sure exactly which
versions of MSIE exhibit this problem. Also, I think this should be put in
the known_client_problems page, and not as a part of the default srm.conf .
I'm sorry to have to be sending you yet another patch. Btw., should I be
submitting this to the apache bugs database instead?
Cheers,
Ronald
P.S. I just found a bug in mod_setenvif - the BrowserMatch directive
handling function add_browser() forgot to quote the regex before
passing it to add_setenvif(). This led to a
BrowserMatch "MSIE 4\." strip-ce-header
being changed to a
SetEnvIf User-Agent MSIE 4\. strip-ce-header
causing a match for all MSIE browsers because the 4\. part was
being interpreted as an environment variable to set (this affected
all other BrowserMatch directives which used a regex containing
white space). I've attached the patch for this problem too. Shall I
report this to the bug database?
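To make the quoting problem concrete, here is a rough sketch of how the rebuilt command line splits into words with and without the added quotes. The tokenizer next_word and the wrapper regex_seen are hypothetical stand-ins written for this illustration. They are not the functions Apache uses to parse directives, and they only handle spaces and plain double quotes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical tokenizer: copy the next word of *line into out, treating
 * a double-quoted run as a single word, and advance *line past it. */
static void next_word(const char **line, char *out, size_t cap)
{
    const char *p = *line;
    size_t n = 0;
    int quoted = 0;

    assert(cap > 0);
    while (*p == ' ') {
        ++p;
    }
    if (*p == '"') {
        quoted = 1;
        ++p;
    }
    while (*p && (quoted ? *p != '"' : *p != ' ')) {
        if (n + 1 < cap) {
            out[n++] = *p;
        }
        ++p;
    }
    if (quoted && *p == '"') {
        ++p;
    }
    out[n] = '\0';
    *line = p;
}

/* Build "User-Agent <pattern> strip-ce-header" the old way (quote == 0)
 * or the patched way (quote != 0), then report which word a parser
 * would take as the regex (the second word of the command). */
static void regex_seen(const char *pattern, int quote, char *regex, size_t cap)
{
    char cmd[256];
    char header[64];
    const char *p = cmd;

    snprintf(cmd, sizeof cmd,
             quote ? "User-Agent \"%s\" %s" : "User-Agent %s %s",
             pattern, "strip-ce-header");
    next_word(&p, header, sizeof header);   /* "User-Agent" */
    next_word(&p, regex, cap);              /* what gets used as the regex */
}
```

With the pattern MSIE 4\. the unquoted form yields just MSIE as the regex, which is why every MSIE version matched. The quoted form keeps the whole pattern as one word.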
All patches are against the latest CVS build: apache_19980216200022.tar.gz
-------------------------------------------------------------------------
*** mod_negotiation.c.orig Thu Feb 12 09:00:14 1998
--- mod_negotiation.c Tue Feb 17 00:42:03 1998
***************
*** 1463,1469 ****
int i;
accept_rec *accept_recs = (accept_rec *) neg->accept_encodings->elts;
char *enc = variant->content_encoding;
- char *x_enc = NULL;
if (!enc || is_identity_encoding(enc)) {
return;
--- 1463,1468 ----
***************
*** 1479,1486 ****
}
/* Go through each of the encodings on the Accept-Encoding: header,
! * looking for a match with our encoding
! * Prefer non- 'x-' prefixed token (e.g. gzip over x-gzip) */
if (enc[0] == 'x' && enc[1] == '-') {
enc += 2;
}
--- 1478,1485 ----
}
/* Go through each of the encodings on the Accept-Encoding: header,
! * looking for a match with our encoding. x- prefixes are ignored.
! */
if (enc[0] == 'x' && enc[1] == '-') {
enc += 2;
}
***************
*** 1487,1509 ****
for (i = 0; i < neg->accept_encodings->nelts; ++i) {
char *name = accept_recs[i].type_name;
if (!strcmp(name, enc)) {
variant->encoding_quality = 1;
- variant->content_encoding = name;
return;
}
-
- if (name[0] == 'x' && name[1] == '-' && !strcmp(name+2, enc)) {
- x_enc = name;
- }
}
- if (x_enc != NULL) {
- variant->encoding_quality = 1;
- variant->content_encoding = x_enc;
- return;
- }
-
/* Encoding not found on Accept-Encoding: header, so it is
* _not_ acceptable */
variant->encoding_quality = 0;
--- 1486,1501 ----
for (i = 0; i < neg->accept_encodings->nelts; ++i) {
char *name = accept_recs[i].type_name;
+ if (name[0] == 'x' && name[1] == '-') {
+ name += 2;
+ }
+
if (!strcmp(name, enc)) {
variant->encoding_quality = 1;
return;
}
}
/* Encoding not found on Accept-Encoding: header, so it is
* _not_ acceptable */
variant->encoding_quality = 0;
***************
*** 2206,2214 ****
r->filename = sub_req->filename;
r->handler = sub_req->handler;
r->content_type = sub_req->content_type;
! /* it may have been modified, so that it would match the exact encoding
! * requested by the client (i.e. x-gzip vs. gzip) */
! r->content_encoding = best->content_encoding;
r->content_languages = sub_req->content_languages;
r->content_language = sub_req->content_language;
r->finfo = sub_req->finfo;
--- 2198,2204 ----
r->filename = sub_req->filename;
r->handler = sub_req->handler;
r->content_type = sub_req->content_type;
! r->content_encoding = sub_req->content_encoding;
r->content_languages = sub_req->content_languages;
r->content_language = sub_req->content_language;
r->finfo = sub_req->finfo;
***************
*** 2231,2236 ****
--- 2221,2286 ----
return OK;
}
+ /* There is a problem with content-encoding, as some clients send and
+ * expect an x- token (e.g. x-gzip) while others expect the plain token
+ * (i.e. gzip). To try and deal with this as best as possible, we do
+ * the following: if the client sent an Accept-Encoding header and it
+ * contains a plain token corresponding to the content encoding of the
+ * response, then set content encoding using the plain token. Else if
+ * the A-E header contains the x- token use the x- token in the C-E
+ * header. Else don't do anything.
+ *
+ * Note that if no A-E header was sent, or it does not contain a token
+ * compatible with the final content encoding, then the token in the
+ * C-E header will be whatever was specified in the AddEncoding
+ * directive.
+ */
+ static int fix_encoding(request_rec *r)
+ {
+ char *enc = r->content_encoding;
+ char *x_enc = NULL;
+ array_header *accept_encodings;
+ accept_rec *accept_recs;
+ int i;
+
+ if (!enc || !*enc) {
+ return DECLINED;
+ }
+
+ if (table_get(r->subprocess_env, "strip-ce-header")) {
+ r->content_encoding = NULL;
+ return OK;
+ }
+
+ if (enc[0] == 'x' && enc[1] == '-') {
+ enc += 2;
+ }
+
+ accept_encodings = do_header_line(r->pool,
+ table_get(r->headers_in, "Accept-encoding"));
+ accept_recs = (accept_rec *) accept_encodings->elts;
+
+ for (i = 0; i < accept_encodings->nelts; ++i) {
+ char *name = accept_recs[i].type_name;
+
+ if (!strcmp(name, enc)) {
+ r->content_encoding = name;
+ return OK;
+ }
+
+ if (name[0] == 'x' && name[1] == '-' && !strcmp(name+2, enc)) {
+ x_enc = name;
+ }
+ }
+
+ if (x_enc) {
+ r->content_encoding = x_enc;
+ return OK;
+ }
+
+ return DECLINED;
+ }
+
static handler_rec negotiation_handlers[] =
{
{MAP_FILE_MAGIC_TYPE, handle_map_file},
***************
*** 2253,2259 ****
NULL, /* check auth */
NULL, /* check access */
handle_multi, /* type_checker */
! NULL, /* fixups */
NULL, /* logger */
NULL, /* header parser */
NULL, /* child_init */
--- 2303,2309 ----
NULL, /* check auth */
NULL, /* check access */
handle_multi, /* type_checker */
! fix_encoding, /* fixups */
NULL, /* logger */
NULL, /* header parser */
NULL, /* child_init */
-------------------------------------------------------------------------
*** mod_setenvif.c.orig Sat Jan 31 21:00:12 1998
--- mod_setenvif.c Tue Feb 17 02:12:19 1998
***************
*** 241,247 ****
{
const char *match_command;
! match_command = pstrcat(cmd->pool, "User-Agent ", word1, " ", word2, NULL);
return add_setenvif(cmd, mconfig, match_command);
}
--- 241,247 ----
{
const char *match_command;
! match_command = pstrcat(cmd->pool, "User-Agent \"", word1, "\" ", word2, NULL);
return add_setenvif(cmd, mconfig, match_command);
}
-------------------------------------------------------------------------