Posted to dev@httpd.apache.org by Dean Gaudet <dg...@arctic.org> on 1999/01/21 01:54:09 UTC

find_token

Here are two messages from me talking about the find_token problem. 
find_token has since been rewritten (by me)... which fixes some of these
problems, but it doesn't fix its use for etags.

Dean

---------- Forwarded message ----------
Date: Fri, 13 Mar 1998 17:23:59 -0800 (PST)
From: Dean Gaudet <dg...@arctic.org>
To: new-httpd@apache.org
Subject: find_token bug
X-Comment: Visit http://www.arctic.org/~dgaudet/legal for information regarding copyright and disclaimer.
Reply-To: new-httpd@apache.org

RFC2068 defines a token like so:

          token          = 1*<any CHAR except CTLs or tspecials>

          tspecials      = "(" | ")" | "<" | ">" | "@"
                         | "," | ";" | ":" | "\" | <">
                         | "/" | "[" | "]" | "?" | "="
                         | "{" | "}" | SP | HT
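
As a rough illustration of that grammar, here is a minimal sketch of a
per-character test.  The name is_rfc2068_token_char() is hypothetical (it
is not an Apache function), and it assumes CHAR means US-ASCII octets
0 - 127 with CTLs being 0 - 31 and DEL (127):

```c
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if c may appear inside an RFC2068 token, 0 otherwise. */
static int is_rfc2068_token_char(int c)
{
    /* Reject CTLs (0 - 31), DEL (127) and anything outside US-ASCII. */
    if (c <= 31 || c >= 127)
        return 0;

    /* Reject tspecials.  Note that the double quote is one of them.
     * HT is listed for completeness; it is already rejected above. */
    return strchr("()<>@,;:\\\"/[]?={} \t", c) == NULL;
}
```

With this sketch, letters, digits and characters such as '-' or '*' count
as token characters, while '"', ',' and SP do not.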

find_token() is used when we want to know (boolean) if a particular token
appears in an HTTP/1.1 header.  Some headers allow for token |
quoted-string... and find_token attempts to be accommodating, by skipping
quoted-strings.  next_token is actually the function involved, and it
doesn't do things right: 

    static char *tspecials = " \t()<>@,;:\\/[]?={}";

    /* Next HTTP token from a header line.  Warning --- destructive!
    * Use only with a copy!
    */

    static char *next_token(char **toks)
    {
	char *cp = *toks;
	char *ret;

	while (*cp && (iscntrl(*cp) || strchr(tspecials, *cp))) {
	    if (*cp == '"')
		while (*cp && (*cp != '"'))
		    ++cp;
	    else
		++cp;
	}
    ...

Notice that the quote test in the inner loop is never true because the
outer loop condition will never be true when *cp == '"'. 
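
To make that concrete, here is a small sketch that isolates the outer loop
condition.  The helper outer_loop_skips() is hypothetical and exists only
for this illustration; the tspecials string is copied from the snippet
above:

```c
#include <ctype.h>
#include <string.h>

/* Same separator set as in the snippet above; there is no '"' in it. */
static const char *tspecials = " \t()<>@,;:\\/[]?={}";

/* Hypothetical helper: nonzero if the outer skip loop of next_token()
 * would be entered for character c, i.e. the loop condition holds. */
static int outer_loop_skips(int c)
{
    return c != '\0'
        && (iscntrl((unsigned char) c) || strchr(tspecials, c) != NULL);
}
```

outer_loop_skips('"') is 0, because '"' is neither a control character nor
in this tspecials string, so the if (*cp == '"') branch can never be taken.
Simply adding '"' to tspecials would not rescue the branch as written: the
inner loop would start on the opening quote and stop immediately without
advancing cp.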

Now, the headers that we use find_token() on are: 

    Connection  (both in and out)
    If-Match
    If-None-Match

For Connection: 

    Connection-header = "Connection" ":" 1#(connection-token)
    connection-token  = token

Quotes are part of a token, and find_token() would be wrong to ignore them
w.r.t. Connection. 

For the other two: 

    If-Match = "If-Match" ":" ( "*" | 1#entity-tag )

    If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag )

    entity-tag = [ weak ] opaque-tag

    weak       = "W/"
    opaque-tag = quoted-string

Quotes are required here, and things just happen to work right now because
find_token() doesn't deal with quotes the way it looks like it tries to. 
If you look through meets_conditions() you'll see the etag used for
comparisons is quoted as well, so find_token

So I'll be removing that dead code from next_token. 

Dean

P.S. I haven't looked at the etag stuff in RFC2068, but I'm assuming that
since we do nothing with weak in meets_conditions() that it's only
required for proxying.




---------- Forwarded message ----------
Date: Fri, 13 Mar 1998 17:39:28 -0800 (PST)
From: Dean Gaudet <dg...@arctic.org>
To: new-httpd@apache.org
Subject: Re: find_token bug
X-Comment: Visit http://www.arctic.org/~dgaudet/legal for information regarding copyright and disclaimer.
Reply-To: new-httpd@apache.org



On Fri, 13 Mar 1998, Dean Gaudet wrote:

> For the other two: 
> 
>     If-Match = "If-Match" ":" ( "*" | 1#entity-tag )
> 
>     If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag )
> 
>     entity-tag = [ weak ] opaque-tag
> 
>     weak       = "W/"
>     opaque-tag = quoted-string

Actually, using find_token here is completely bogus:

          quoted-string  = ( <"> *(qdtext) <"> )

          qdtext         = <any TEXT except <">>

          TEXT           = <any OCTET except CTLs,
                           but including LWS>

          CTL            = <any US-ASCII control character
                           (octets 0 - 31) and DEL (127)>

We need find_quoted_string().

That is to say, if a request includes:

If-Match: "abc def"

we'll do the wrong thing... since I'm already fixing this code I'll write
find_quoted_string(). 
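
A minimal sketch of what such a function might look like follows.  This is
an assumption for illustration only, not the code that was actually
written: the name find_quoted_string_sketch() and its signature are made
up, it follows the grammar quoted above (a quoted-string ends at the next
double quote, with no backslash handling), and it does not look at any W/
prefix:

```c
#include <string.h>

/* Hypothetical sketch.  Returns 1 if `line` contains a quoted-string
 * that is byte-for-byte equal to `qs` (both including the surrounding
 * double quotes), 0 otherwise. */
static int find_quoted_string_sketch(const char *line, const char *qs)
{
    size_t qlen = strlen(qs);
    const char *p = line;

    /* Walk the line one quoted-string at a time. */
    while ((p = strchr(p, '"')) != NULL) {
        const char *end = strchr(p + 1, '"');
        size_t len;

        if (end == NULL)
            return 0;                   /* unterminated quoted-string */

        len = (size_t) (end - p) + 1;   /* length including both quotes */
        if (len == qlen && memcmp(p, qs, len) == 0)
            return 1;

        p = end + 1;                    /* resume after the closing quote */
    }
    return 0;
}
```

With this sketch, a header value of "abc def" (quotes included) matches the
etag "abc def" as a whole, while a header listing "abc" and "def" as two
separate entity tags does not.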

Dean




Re: find_token

Posted by Koen Holtman <Ko...@cern.ch>.

On Wed, 20 Jan 1999, Dean Gaudet wrote:

[...]
> P.S. I haven't looked at the etag stuff in RFC2068, but I'm assuming that
> since we do nothing with weak in meets_conditions() that it's only
> required for proxying.

No, you need to deal with the W/ part too.  Quoting from the 1.1 spec
(section 13.3.3): 

   The only function that the HTTP/1.1 protocol defines on validators is
   comparison. There are two validator comparison functions, depending
   on whether the comparison context allows the use of weak validators
   or not:

      . The strong comparison function: in order to be considered
        equal, both validators MUST be identical in every way, and both
        MUST NOT be weak.

      . The weak comparison function: in order to be considered equal,
        both validators MUST be identical in every way, but either or
        both of them MAY be tagged as "weak" without affecting the
        result.

In 'strong comparison', the W/ part is significant.
In some cases you are allowed to use weak comparison, but (from 14.25):

   A server MUST use the strong comparison function (see section 3.11)
   to compare the entity tags in If-Match.

and you must support if-match so there is no way around implementing
strong comparison.
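
The two comparison functions quoted above could be sketched roughly as
follows.  All names here are hypothetical (they are not Apache functions),
each argument is assumed to be one complete entity tag such as "xyzzy" or
W/"xyzzy" including its quotes, and the W/ prefix is assumed to be matched
case-sensitively:

```c
#include <string.h>

/* Hypothetical helper: nonzero if the entity tag carries the W/ prefix. */
static int etag_is_weak(const char *etag)
{
    return strncmp(etag, "W/", 2) == 0;
}

/* Hypothetical helper: skip the optional W/ prefix, leaving the
 * opaque-tag (the quoted-string). */
static const char *etag_opaque(const char *etag)
{
    return etag_is_weak(etag) ? etag + 2 : etag;
}

/* Strong comparison: identical in every way, and neither tag is weak. */
static int etag_strong_match(const char *a, const char *b)
{
    return !etag_is_weak(a) && !etag_is_weak(b) && strcmp(a, b) == 0;
}

/* Weak comparison: the opaque-tags must be identical; a W/ on either
 * side does not affect the result. */
static int etag_weak_match(const char *a, const char *b)
{
    return strcmp(etag_opaque(a), etag_opaque(b)) == 0;
}
```

Under these assumptions W/"a" and "a" match weakly but not strongly, and
W/"a" compared with itself also fails the strong comparison.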

[...]
> We need find_quoted_string().
> 
> That is to say, if a request includes:
> 
> If-Match: "abc def"
> 
> we'll do the wrong thing... since I'm already fixing this code I'll write
> find_quoted_string(). 

As discussed above, using find_quoted_string() by itself is not enough for
strong comparison.  It is sufficient for weak comparison, but I would not
bother implementing weak comparison (and using it where HTTP/1.1 allows
it), because the benefits will be tiny given the low chance that Apache
generates a weak entity tag. 
 
> Dean

Koen.