You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafficserver.apache.org by "Dimitry Andric (JIRA)" <ji...@apache.org> on 2014/03/05 14:41:43 UTC

[jira] [Created] (TS-2616) Multiple Transfer-Encoding: chunked headers are not sanitized

Dimitry Andric created TS-2616:
----------------------------------

             Summary: Multiple Transfer-Encoding: chunked headers are not sanitized
                 Key: TS-2616
                 URL: https://issues.apache.org/jira/browse/TS-2616
             Project: Traffic Server
          Issue Type: Bug
          Components: HTTP
            Reporter: Dimitry Andric


I encountered an origin server which sometimes sends multiple copies of the same header.  Obviously, this origin server is buggy, but I cannot touch it, or modify it in any way.  An example of such a request:

{noformat}
CLIENT> GET /example/request?token=12345678 HTTP/1.1
CLIENT> Host: 192.168.42.42:8080
CLIENT> Accept-Encoding: deflate, gzip
CLIENT> User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0
CLIENT> Accept: application/json
CLIENT> Referer: http://example.com/
CLIENT> Client-ip: 127.0.0.1
CLIENT> X-Forwarded-For: 127.0.0.1
CLIENT> Via: http/1.1 trafficserver.localdomain[FF000000000000000000000000000011] (ApacheTrafficServer/4.0.2)
CLIENT>
SERVER> HTTP/1.1 200 OK
SERVER> Access-Control-Allow-Origin: *
SERVER> Access-Control-Allow-Headers : Content-Type
SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE
SERVER> Server: Apache-Coyote/1.1
SERVER> Transfer-Encoding: chunked
SERVER> Content-Encoding: gzip
SERVER> Vary: Accept-Encoding
SERVER> Date: Tue, 04 Feb 2014 13:55:56 GMT
SERVER> Transfer-Encoding: chunked
SERVER>
SERVER> 14
SERVER> ....................
SERVER> 0
{noformat}

When this request is initially passed through by TrafficServer, the client sees at least one "Transfer-Encoding: chunked" header, and assumes a chunked transfer.  Everything works fine.

After TrafficServer has cached the request, subsequent requests from the client will receive a non-chunked transfer.  However, while TrafficServer has coalesced multiple headers into one, it removes only *one* instance of the "chunked" value:

{noformat}
CLIENT> GET http://192.168.42.42:8080/example/request?token=12345678 HTTP/1.1
CLIENT> Host: 192.168.42.42
CLIENT> Accept-Encoding: deflate, gzip
CLIENT> Proxy-Connection: Keep-Alive
CLIENT> User-Agent: User-Agent: Mozilla/5.0 (Mac OS X; en-us) AppleWebKit/535.7.0 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/535.7.0
CLIENT> Accept: application/json
CLIENT> Referer: http://example.com/
CLIENT>
SERVER> HTTP/1.1 200 OK
SERVER> Access-Control-Allow-Origin: *
SERVER> Access-Control-Allow-Headers : Content-Type
SERVER> Access-Control-Allow-Methods : GET, PUT, POST, DELETE
SERVER> Server: ATS/4.0.2
SERVER> Content-Encoding: gzip
SERVER> Vary: Accept-Encoding
SERVER> Date: Tue, 04 Feb 2014 13:57:36 GMT
SERVER> Transfer-Encoding: chunked
SERVER> Age: 1546
SERVER> Content-Length: 20
SERVER> Proxy-Connection: keep-alive
SERVER>
SERVER> ....................
{noformat}

Now the client sees both a "Content-Length: 20" and "Transfer-Encoding: chunked" header, which should not happen.  When e.g. curl is used as a client, it will assume the transfer is chunked, and it attempts to parse the initial bytes as a hexadecimal size line.  Of course this fails, since there is no real chunked transfer, and curl will abort the transfer with an error, after reading approximately 256 kiB.

The problem is due to a piece of code in HttpTransact::initialize_state_variables_from_response(), which scans the Transfer-Encoding: headers, coalesces them, and removes any "chunked" value from them.  However, it does not handle multiple "chunked" values, as explicitly noticed in a comment on line 5795:

{noformat}
  5720  void
  5721  HttpTransact::initialize_state_variables_from_response(State* s, HTTPHdr* incoming_response)
  5722  {
...
  5785    if (incoming_response->presence(MIME_PRESENCE_TRANSFER_ENCODING)) {
  5786      MIMEField *field = incoming_response->field_find(MIME_FIELD_TRANSFER_ENCODING, MIME_LEN_TRANSFER_ENCODING);
...
  5792      while (enc_value) {
  5793        const char *wks_value = hdrtoken_string_to_wks(enc_value, enc_val_len);
  5794
  5795        //   FIX ME: What is chunked appears more than once?  Old
  5796        //     code didn't deal with this so I don't either
  5797        if (wks_value == HTTP_VALUE_CHUNKED) {
...
  5813          // Loop over the all the values in existing Trans-enc header and
  5814          //   copy the ones that aren't our chunked value to a new field
  5815          while (new_enc_val) {
  5816            if (new_enc_val != enc_value) {
  5817              if (new_enc_field) {
  5818                new_enc_field->value_append(incoming_response->m_heap, incoming_response->m_mime, new_enc_val, new_enc_len, true);
  5819              } else {
  5820                new_enc_field = incoming_response->field_create();
  5821                incoming_response->field_value_set(new_enc_field, new_enc_val, new_enc_len);
  5822              }
  5823            }
  5824
  5825            new_enc_val = new_enc_iter.get_next(&new_enc_len);
  5826          }
{noformat}

The comparison on line 5816 compares the char pointers new_enc_val and enc_value, to pluck out any non-"chunked" values, while it should really compare the *contents* of the strings.  Although it is probably better to just re-use hdrtoken_string_to_wks(), and compare the result with the special HTTP_VALUE_CHUNKED value, e.g.:

{noformat}
  5813          // Loop over the all the values in existing Trans-enc header and
  5814          //   copy the ones that aren't our chunked value to a new field
  5815          while (new_enc_val) {
  5816            const char *new_wks_val = hdrtoken_string_to_wks(new_enc_val, new_enc_len);
  5817            if (new_wks_val != HTTP_VALUE_CHUNKED) {
  5818              if (new_enc_field) {
  5819                new_enc_field->value_append(incoming_response->m_heap, incoming_response->m_mime, new_enc_val, new_enc_len, true);
  5820              } else {
  5821                new_enc_field = incoming_response->field_create();
  5822                incoming_response->field_value_set(new_enc_field, new_enc_val, new_enc_len);
  5823              }
  5824            }
  5825
  5826            new_enc_val = new_enc_iter.get_next(&new_enc_len);
  5827          }
{noformat}

After locally modifying my copy of TrafficServer 4.0.2 in the above fashion, I have successfully used it to sanitize the multiple Transfer-Encoding: headers from the origin server.

Please review the patch I will attach to this issue, which modifies proxy/http/HttpTransact.cc as shown above.



--
This message was sent by Atlassian JIRA
(v6.2#6252)