Posted to dev@httpd.apache.org by Dean Gaudet <dg...@arctic.org> on 1998/02/17 02:57:57 UTC

Accept-Encoding - the saga continues (fwd)

I WANT TO SHOOT WHOEVER CREATED THIS STUPID X- "CONVENTION"!!  It is
utterly and completely broken.  It creates this sort of utter and complete
chaos.  Read on for yet another broken thing that we just can't work
around. 

I asked Ronald to submit a PR for the BrowserMatch brokenness.

Dean

---------- Forwarded message ----------
Date: Tue, 17 Feb 1998 02:31:21 +0200
From: "Life is hard... and then you die." <Ro...@psi.ch>
To: DGAUDET@arctic.org
Subject: Accept-Encoding - the saga continues
X-VMS-To: DGAUDET@ARCTIC.ORG


  Hi Dean,

well, it seems M$ have managed to completely, utterly, and thoroughly screw
up again. Here are a couple of excerpts from a few mails from Paul, with my comments.

[Paul (ginsparg@qfwfq.lanl.gov) wrote:]
> one of the recent problems we've noted is that
> MSIE 4.something no longer understands x-gzip and prefers gzip
> (standard MS practice to change standards without leaving compatibility path)
> -- presuming they're sending the correct A-E would be nice to give them what  
> they want without turning on the MultiViews

My previous patch has the problem that the Content-Encoding header is
only fixed up for negotiated documents. Asking for something like
"article.ps.gz" will return a Content-Encoding of x-gzip or gzip
depending only on what is in the AddEncoding directive, not on what was
sent in an Accept-Encoding header. Therefore I've moved the fixup code
to the "fixup" phase in the module handling. I should've done that in
the first place... This means that all responses are fixed up, no matter
how they were generated. I think this is the correct solution.

The algorithm is still the same: if an accept-encoding header was sent
and it contains the non x- token then use that token in the
content-encoding; else if the accept-encoding header contains the x-
token then use that one; else use whatever was given in the AddEncoding
directive.
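
For illustration only, the selection rule above can be sketched as a
standalone C function. This is not the patch itself: choose_ce_token()
and skip_x_prefix() are hypothetical names, and the Accept-Encoding
header is assumed to have been split into an array of tokens already.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skip a leading "x-" so that "x-gzip" and "gzip" compare as equal. */
static const char *skip_x_prefix(const char *token)
{
    return (token[0] == 'x' && token[1] == '-') ? token + 2 : token;
}

/* Hypothetical helper: pick the token to send in Content-Encoding.
 * accepted   - tokens from the Accept-Encoding header (may be empty)
 * configured - the token given in the AddEncoding directive
 */
static const char *choose_ce_token(const char *const *accepted, size_t n,
                                   const char *configured)
{
    const char *plain;
    const char *x_match = NULL;
    size_t i;

    assert(configured != NULL);
    plain = skip_x_prefix(configured);

    for (i = 0; i < n; ++i) {
        const char *name = accepted[i];

        /* 1. The plain (non x-) token wins as soon as it is seen. */
        if (strcmp(name, plain) == 0) {
            return name;
        }

        /* 2. Remember a matching x- token as the fallback. */
        if (name[0] == 'x' && name[1] == '-'
            && strcmp(name + 2, plain) == 0) {
            x_match = name;
        }
    }

    /* 3. Otherwise keep whatever AddEncoding specified. */
    return x_match != NULL ? x_match : configured;
}
```

With this sketch, a client sending both "x-gzip" and "gzip" gets "gzip",
a client sending only "x-gzip" gets "x-gzip", and a client sending
neither gets the configured token unchanged.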

So far so good. Then comes this:

[Paul (ginsparg@qfwfq.lanl.gov) wrote:]
> things worked from our point of view, but then we had to turn it off due
> to another ridiculous bug report -- it seems that people who use MSIE
> together with gsview to look at ps.gz files had the following problem:
> MSIE downloaded the file and gunzipped it (since it was getting the C-E: gzip)
> but leaves it named .ps.gz so gsview assumes it's still gzipped and reports
> an error. so for these people it's better to send C-E: x-gzip so that MSIE
> *doesn't* gunzip and then gsview correctly reads it.

What the f*%@?

Anyway, this to me sounds like a case for a BrowserMatch, so I've added
such a beast to the patch. Usage:

BrowserMatch "MSIE 4\." strip-ce-header

If strip-ce-header is set, the fixup handler will remove any Content-Encoding
header that might have been set. Note that I'm not sure exactly which
versions of MSIE exhibit this problem. Also, I think this should be put in
the known_client_problems page, and not as a part of the default srm.conf .

I'm sorry to have to be sending you yet another patch. Btw., should I be
submitting this to the apache bugs database instead?


  Cheers,

  Ronald


P.S. I just found a bug in mod_setenvif - the BrowserMatch directive
     handling function add_browser() forgot to quote the regex before
     passing it to add_setenvif(). This led to a

     BrowserMatch "MSIE 4\." strip-ce-header

     being changed to a

     SetEnvIf User-Agent MSIE 4\. strip-ce-header

     causing a match for all MSIE browsers because the 4\. part was
     being interpreted as an environment variable to set (this affected
     all other BrowserMatch directives which used a regex containing
     white space). I've attached the patch for this problem too. Shall I
     report this to the bug database?
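
To make the effect of the missing quotes concrete, here is a rough
sketch in C. next_word() is a hypothetical, much simplified stand-in
for the server's config-line word splitter (it is not the real
function, and it does no escape handling inside quotes); it only shows
how an unquoted regex containing a space falls apart into two words.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified word splitter: copies the next word from
 * *line into out and advances *line past it. A word is either a run of
 * non-blank characters or everything between a pair of double quotes.
 * Words longer than outsz - 1 characters are truncated.
 */
static void next_word(const char **line, char *out, size_t outsz)
{
    const char *p = *line;
    size_t n = 0;
    int quoted = 0;

    assert(outsz > 0);

    /* Skip leading blanks. */
    while (*p == ' ' || *p == '\t') {
        ++p;
    }

    if (*p == '"') {
        quoted = 1;
        ++p;
    }

    /* Copy until the closing quote (quoted) or the next blank. */
    while (*p != '\0'
           && (quoted ? *p != '"' : (*p != ' ' && *p != '\t'))) {
        if (n + 1 < outsz) {
            out[n++] = *p;
        }
        ++p;
    }

    if (quoted && *p == '"') {
        ++p;                    /* consume the closing quote */
    }

    out[n] = '\0';
    *line = p;
}
```

Fed the unquoted line, the splitter yields "User-Agent", "MSIE" and
"4\." as three separate words, so "MSIE" becomes the regex. Fed the
quoted line that the fixed add_browser() builds, it yields
"User-Agent", "MSIE 4\." and "strip-ce-header".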


All Patches against the latest cvs build: apache_19980216200022.tar.gz

-------------------------------------------------------------------------
*** mod_negotiation.c.orig	Thu Feb 12 09:00:14 1998
--- mod_negotiation.c	Tue Feb 17 00:42:03 1998
***************
*** 1463,1469 ****
      int i;
      accept_rec *accept_recs = (accept_rec *) neg->accept_encodings->elts;
      char *enc = variant->content_encoding;
-     char *x_enc = NULL;
  
      if (!enc || is_identity_encoding(enc)) {
          return;
--- 1463,1468 ----
***************
*** 1479,1486 ****
      }
  
      /* Go through each of the encodings on the Accept-Encoding: header,
!      * looking for a match with our encoding
!      * Prefer non- 'x-' prefixed token (e.g. gzip over x-gzip) */
      if (enc[0] == 'x' && enc[1] == '-') {
          enc += 2;
      }
--- 1478,1485 ----
      }
  
      /* Go through each of the encodings on the Accept-Encoding: header,
!      * looking for a match with our encoding. x- prefixes are ignored.
!      */
      if (enc[0] == 'x' && enc[1] == '-') {
          enc += 2;
      }
***************
*** 1487,1509 ****
      for (i = 0; i < neg->accept_encodings->nelts; ++i) {
          char *name = accept_recs[i].type_name;
  
          if (!strcmp(name, enc)) {
              variant->encoding_quality = 1;
-             variant->content_encoding = name;
              return;
          }
- 
-         if (name[0] == 'x' && name[1] == '-' && !strcmp(name+2, enc)) {
-             x_enc = name;
-         }
      }
  
-     if (x_enc != NULL) {
-         variant->encoding_quality = 1;
-         variant->content_encoding = x_enc;
-         return;
-     }
- 
      /* Encoding not found on Accept-Encoding: header, so it is
       * _not_ acceptable */
      variant->encoding_quality = 0;
--- 1486,1501 ----
      for (i = 0; i < neg->accept_encodings->nelts; ++i) {
          char *name = accept_recs[i].type_name;
  
+         if (name[0] == 'x' && name[1] == '-') {
+             name += 2;
+         }
+ 
          if (!strcmp(name, enc)) {
              variant->encoding_quality = 1;
              return;
          }
      }
  
      /* Encoding not found on Accept-Encoding: header, so it is
       * _not_ acceptable */
      variant->encoding_quality = 0;
***************
*** 2206,2214 ****
      r->filename = sub_req->filename;
      r->handler = sub_req->handler;
      r->content_type = sub_req->content_type;
!     /* it may have been modified, so that it would match the exact encoding
!      * requested by the client (i.e. x-gzip vs. gzip) */
!     r->content_encoding = best->content_encoding;
      r->content_languages = sub_req->content_languages;
      r->content_language = sub_req->content_language;
      r->finfo = sub_req->finfo;
--- 2198,2204 ----
      r->filename = sub_req->filename;
      r->handler = sub_req->handler;
      r->content_type = sub_req->content_type;
!     r->content_encoding = sub_req->content_encoding;
      r->content_languages = sub_req->content_languages;
      r->content_language = sub_req->content_language;
      r->finfo = sub_req->finfo;
***************
*** 2231,2236 ****
--- 2221,2286 ----
      return OK;
  }
  
+ /* There is a problem with content-encoding, as some clients send and
+  * expect an x- token (e.g. x-gzip) while others expect the plain token
+  * (i.e. gzip). To try and deal with this as best as possible, we do
+  * the following: if the client sent an Accept-Encoding header and it
+  * contains a plain token corresponding to the content encoding of the
+  * response, then set content encoding using the plain token. Else if
+  * the A-E header contains the x- token use the x- token in the C-E
+  * header. Else don't do anything.
+  *
+  * Note that if no A-E header was sent, or it does not contain a token
+  * compatible with the final content encoding, then the token in the
+  * C-E header will be whatever was specified in the AddEncoding
+  * directive.
+  */
+ static int fix_encoding(request_rec *r)
+ {
+     char *enc = r->content_encoding;
+     char *x_enc = NULL;
+     array_header *accept_encodings;
+     accept_rec *accept_recs;
+     int i;
+ 
+     if (!enc || !*enc) {
+         return DECLINED;
+     }
+ 
+     if (table_get(r->subprocess_env, "strip-ce-header")) {
+         r->content_encoding = NULL;
+         return OK;
+     }
+ 
+     if (enc[0] == 'x' && enc[1] == '-') {
+         enc += 2;
+     }
+ 
+     accept_encodings = do_header_line(r->pool,
+                                 table_get(r->headers_in, "Accept-encoding"));
+     accept_recs = (accept_rec *) accept_encodings->elts;
+ 
+     for (i = 0; i < accept_encodings->nelts; ++i) {
+         char *name = accept_recs[i].type_name;
+ 
+         if (!strcmp(name, enc)) {
+             r->content_encoding = name;
+             return OK;
+         }
+ 
+         if (name[0] == 'x' && name[1] == '-' && !strcmp(name+2, enc)) {
+             x_enc = name;
+         }
+     }
+ 
+     if (x_enc) {
+         r->content_encoding = x_enc;
+         return OK;
+     }
+ 
+     return DECLINED;
+ }
+ 
  static handler_rec negotiation_handlers[] =
  {
      {MAP_FILE_MAGIC_TYPE, handle_map_file},
***************
*** 2253,2259 ****
      NULL,                       /* check auth */
      NULL,                       /* check access */
      handle_multi,               /* type_checker */
!     NULL,                       /* fixups */
      NULL,                       /* logger */
      NULL,                       /* header parser */
      NULL,                       /* child_init */
--- 2303,2309 ----
      NULL,                       /* check auth */
      NULL,                       /* check access */
      handle_multi,               /* type_checker */
!     fix_encoding,               /* fixups */
      NULL,                       /* logger */
      NULL,                       /* header parser */
      NULL,                       /* child_init */
-------------------------------------------------------------------------
*** mod_setenvif.c.orig	Sat Jan 31 21:00:12 1998
--- mod_setenvif.c	Tue Feb 17 02:12:19 1998
***************
*** 241,247 ****
  {
      const char *match_command;
  
!     match_command = pstrcat(cmd->pool, "User-Agent ", word1, " ", word2, NULL);
      return add_setenvif(cmd, mconfig, match_command);
  }
  
--- 241,247 ----
  {
      const char *match_command;
  
!     match_command = pstrcat(cmd->pool, "User-Agent \"", word1, "\" ", word2, NULL);
      return add_setenvif(cmd, mconfig, match_command);
  }
  
-------------------------------------------------------------------------