Posted to dev@httpd.apache.org by Eric Covener <co...@gmail.com> on 2017/09/16 01:37:14 UTC

Re: svn commit: r1426877 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/http_core.h include/httpd.h modules/http/http_filters.c server/core.c server/protocol.c server/util.c server/vhost.c

On Sat, Dec 29, 2012 at 8:23 PM,  <sf...@apache.org> wrote:
> Author: sf
> Date: Sun Dec 30 01:23:24 2012
> New Revision: 1426877
>
> URL: http://svn.apache.org/viewvc?rev=1426877&view=rev
> Log:
> Add an option to enforce stricter HTTP conformance
>
> Modified: httpd/httpd/trunk/server/vhost.c

> URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/server/vhost.c?rev=1426877&r1=1426876&r2=1426877&view=diff
> ==============================================================================
> --- httpd/httpd/trunk/server/vhost.c (original)
> +++ httpd/httpd/trunk/server/vhost.c Sun Dec 30 01:23:24 2012
> @@ -735,6 +735,59 @@ static apr_status_t fix_hostname_non_v6(
>      return APR_SUCCESS;
>  }
>
> +/*
> + * If strict mode ever becomes the default, this should be folded into
> + * fix_hostname_non_v6()
> + */
> +static apr_status_t strict_hostname_check(request_rec *r, char *host,
> +                                          int logonly)
> +{
> +    char *ch;
> +    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
> +
> +    for (ch = host; *ch; ch++) {
> +        if (!apr_isascii(*ch)) {
> +            goto bad;
> +        }
> +        else if (apr_isalpha(*ch) || *ch == '-') {
> +            is_dotted_decimal = 0;
> +        }
> +        else if (ch[0] == '.') {
> +            dots++;
> +            if (ch[1] == '0' && apr_isdigit(ch[2]))
> +                leading_zeroes = 1;
> +        }
> +        else if (!apr_isdigit(*ch)) {
> +           /* also takes care of multiple Host headers by denying commas */
> +            goto bad;
> +        }
> +    }
> +    if (is_dotted_decimal) {
> +        if (host[0] == '.' || (host[0] == '0' && apr_isdigit(host[1])))
> +            leading_zeroes = 1;
> +        if (leading_zeroes || dots != 3) {
> +            /* RFC 3986 7.4 */
> +            goto bad;
> +        }
> +    }
> +    else {
> +        /* The top-level domain must start with a letter (RFC 1123 2.1) */
> +        while (ch > host && *ch != '.')
> +            ch--;
> +        if (ch[0] == '.' && ch[1] != '\0' && !apr_isalpha(ch[1]))
> +            goto bad;
> +    }
> +    return APR_SUCCESS;
> +
> +bad:
> +    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO()
> +                  "[strict] Invalid host name '%s'%s%.6s",
> +                  host, *ch ? ", problem near: " : "", ch);
> +    if (logonly)
> +        return APR_SUCCESS;
> +    return APR_EINVAL;
> +}

(sorry for the necromancy of this very old commit)

Re: the 1123 2.1 reference a dozen lines from the end of the function:
RFC 1123 2.1 seems to say the opposite. Just a bug or something over
my head?

  2.1  Host Names and Numbers

      The syntax of a legal Internet host name was specified in RFC-952
      [DNS:4].  One aspect of host name syntax is hereby changed: the
      restriction on the first character is relaxed to allow either a
      letter or a digit.  Host software MUST support this more liberal
      syntax.

Re: svn commit: r1426877 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/http_core.h include/httpd.h modules/http/http_filters.c server/core.c server/protocol.c server/util.c server/vhost.c

Posted by William A Rowe Jr <wr...@rowe-clan.net>.
This has been the subject of some debate; read Lisa's rejections of
errata IDs 1081 and 1353:

https://www.rfc-editor.org/errata/rfc1123



On Sep 16, 2017 10:00, "Eric Covener" <co...@gmail.com> wrote:

On Sat, Sep 16, 2017 at 9:48 AM, Yann Ylavic <yl...@gmail.com> wrote:
> On Sat, Sep 16, 2017 at 3:37 AM, Eric Covener <co...@gmail.com> wrote:
>> On Sat, Dec 29, 2012 at 8:23 PM,  <sf...@apache.org> wrote:
>>>
>>> +/*
>>> + * If strict mode ever becomes the default, this should be folded into
>>> + * fix_hostname_non_v6()
>>> + */
>>> +static apr_status_t strict_hostname_check(request_rec *r, char *host,
>>> +                                          int logonly)
>>> +{
>>> +    char *ch;
>>> +    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
>>> +
>>> +    for (ch = host; *ch; ch++) {
>>> +        if (!apr_isascii(*ch)) {
>>> +            goto bad;
>>> +        }
>>> +        else if (apr_isalpha(*ch) || *ch == '-') {
>>> +            is_dotted_decimal = 0;
>>> +        }
>>> +        else if (ch[0] == '.') {
>>> +            dots++;
>>> +            if (ch[1] == '0' && apr_isdigit(ch[2]))
>>> +                leading_zeroes = 1;
>>> +        }
>>> +        else if (!apr_isdigit(*ch)) {
>>> +           /* also takes care of multiple Host headers by denying commas */
>>> +            goto bad;
>>> +        }
>>> +    }
>>> +    if (is_dotted_decimal) {
>>> +        if (host[0] == '.' || (host[0] == '0' && apr_isdigit(host[1])))
>>> +            leading_zeroes = 1;
>>> +        if (leading_zeroes || dots != 3) {
>>> +            /* RFC 3986 7.4 */
>>> +            goto bad;
>>> +        }
>>> +    }
>>> +    else {
>>> +        /* The top-level domain must start with a letter (RFC 1123 2.1) */
>>> +        while (ch > host && *ch != '.')
>>> +            ch--;
>>> +        if (ch[0] == '.' && ch[1] != '\0' && !apr_isalpha(ch[1]))
>>> +            goto bad;
>>> +    }
>>> +    return APR_SUCCESS;
>>> +
>>> +bad:
>>> +    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO()
>>> +                  "[strict] Invalid host name '%s'%s%.6s",
>>> +                  host, *ch ? ", problem near: " : "", ch);
>>> +    if (logonly)
>>> +        return APR_SUCCESS;
>>> +    return APR_EINVAL;
>>> +}
>>
>> (sorry for the necromancy of this very old commit)
>>
>> Re: the 1123 2.1 reference a dozen lines from the end of the function:
>> RFC 1123 2.1 seems to say the opposite. Just a bug or something over
>> my head?
>>
>>   2.1  Host Names and Numbers
>>
>>       The syntax of a legal Internet host name was specified in RFC-952
>>       [DNS:4].  One aspect of host name syntax is hereby changed: the
>>       restriction on the first character is relaxed to allow either a
>>       letter or a digit.  Host software MUST support this more liberal
>>       syntax.
>
> RFC 1123 2.1 seems to be about the first character of the host, while
> the code checks the first one of the TLD. Are there TLDs starting with
> a digit?

I see, thanks.  The basis in 1123 is a bit later in 2.1 but doesn't
really seem normative:

           If a dotted-decimal number can be entered without such
           identifying delimiters, then a full syntactic check must be
           made, because a segment of a host domain name is now allowed
           to begin with a digit and could legally be entirely numeric
           (see Section 6.1.2.4).  However, a valid host name can never
           have the dotted-decimal form #.#.#.#, since at least the
           highest-level component label will be alphabetic.

The 6.1.2.4 reference is likely an error because that is about compression.

It seems like we'd reject "1foo" but accept "1foo.com", but I am not
sure if this warrants an exception or reconsidering the check.

(In the case that had me looking, a high TCP port was used as the
hostname AND port in the Host header, so it is clearly someone else's
bug at the core)

--
Eric Covener
covener@gmail.com

Re: svn commit: r1426877 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/http_core.h include/httpd.h modules/http/http_filters.c server/core.c server/protocol.c server/util.c server/vhost.c

Posted by Eric Covener <co...@gmail.com>.
On Sat, Sep 16, 2017 at 9:48 AM, Yann Ylavic <yl...@gmail.com> wrote:
> On Sat, Sep 16, 2017 at 3:37 AM, Eric Covener <co...@gmail.com> wrote:
>> On Sat, Dec 29, 2012 at 8:23 PM,  <sf...@apache.org> wrote:
>>>
>>> +/*
>>> + * If strict mode ever becomes the default, this should be folded into
>>> + * fix_hostname_non_v6()
>>> + */
>>> +static apr_status_t strict_hostname_check(request_rec *r, char *host,
>>> +                                          int logonly)
>>> +{
>>> +    char *ch;
>>> +    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
>>> +
>>> +    for (ch = host; *ch; ch++) {
>>> +        if (!apr_isascii(*ch)) {
>>> +            goto bad;
>>> +        }
>>> +        else if (apr_isalpha(*ch) || *ch == '-') {
>>> +            is_dotted_decimal = 0;
>>> +        }
>>> +        else if (ch[0] == '.') {
>>> +            dots++;
>>> +            if (ch[1] == '0' && apr_isdigit(ch[2]))
>>> +                leading_zeroes = 1;
>>> +        }
>>> +        else if (!apr_isdigit(*ch)) {
>>> +           /* also takes care of multiple Host headers by denying commas */
>>> +            goto bad;
>>> +        }
>>> +    }
>>> +    if (is_dotted_decimal) {
>>> +        if (host[0] == '.' || (host[0] == '0' && apr_isdigit(host[1])))
>>> +            leading_zeroes = 1;
>>> +        if (leading_zeroes || dots != 3) {
>>> +            /* RFC 3986 7.4 */
>>> +            goto bad;
>>> +        }
>>> +    }
>>> +    else {
>>> +        /* The top-level domain must start with a letter (RFC 1123 2.1) */
>>> +        while (ch > host && *ch != '.')
>>> +            ch--;
>>> +        if (ch[0] == '.' && ch[1] != '\0' && !apr_isalpha(ch[1]))
>>> +            goto bad;
>>> +    }
>>> +    return APR_SUCCESS;
>>> +
>>> +bad:
>>> +    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO()
>>> +                  "[strict] Invalid host name '%s'%s%.6s",
>>> +                  host, *ch ? ", problem near: " : "", ch);
>>> +    if (logonly)
>>> +        return APR_SUCCESS;
>>> +    return APR_EINVAL;
>>> +}
>>
>> (sorry for the necromancy of this very old commit)
>>
>> Re: the 1123 2.1 reference a dozen lines from the end of the function:
>> RFC 1123 2.1 seems to say the opposite. Just a bug or something over
>> my head?
>>
>>   2.1  Host Names and Numbers
>>
>>       The syntax of a legal Internet host name was specified in RFC-952
>>       [DNS:4].  One aspect of host name syntax is hereby changed: the
>>       restriction on the first character is relaxed to allow either a
>>       letter or a digit.  Host software MUST support this more liberal
>>       syntax.
>
> RFC 1123 2.1 seems to be about the first character of the host, while
> the code checks the first one of the TLD. Are there TLDs starting with
> a digit?

I see, thanks.  The basis in 1123 is a bit later in 2.1 but doesn't
really seem normative:

           If a dotted-decimal number can be entered without such
           identifying delimiters, then a full syntactic check must be
           made, because a segment of a host domain name is now allowed
           to begin with a digit and could legally be entirely numeric
           (see Section 6.1.2.4).  However, a valid host name can never
           have the dotted-decimal form #.#.#.#, since at least the
           highest-level component label will be alphabetic.

The 6.1.2.4 reference is likely an error because that is about compression.

It seems like we'd reject "1foo" but accept "1foo.com", but I am not
sure if this warrants an exception or reconsidering the check.

(In the case that had me looking, a high TCP port was used as the
hostname AND port in the Host header, so it is clearly someone else's
bug at the core)
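
For anyone who wants to poke at these cases outside of httpd, here is a
rough standalone sketch of the check as quoted in this thread. It is not
the httpd code: the name strict_hostname_ok() is made up for illustration,
<ctype.h> stands in for the apr_is*() macros, and the logging and the
logonly flag are left out. Tracing it by hand suggests that a dot-less
name such as "1foo" is accepted (there is no dot, so the TLD test never
fires), that "foo.1com" is rejected, and that an all-digit host such as a
bare port number is rejected by the dots != 3 test rather than by the TLD
test.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical standalone port of the quoted strict_hostname_check().
 * Returns 1 if the host would pass, 0 if it would hit the "bad" label.
 * Assumption: <ctype.h> in the C locale behaves like the apr_is*() macros.
 */
int strict_hostname_ok(const char *host)
{
    const char *ch;
    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;

    assert(host != NULL);

    for (ch = host; *ch; ch++) {
        unsigned char c = (unsigned char)*ch;

        if (c > 127) {
            return 0;                   /* non-ASCII byte */
        }
        else if (isalpha(c) || c == '-') {
            is_dotted_decimal = 0;      /* cannot be an IPv4 literal */
        }
        else if (c == '.') {
            dots++;
            /* a label starting with '0' followed by another digit */
            if (ch[1] == '0' && isdigit((unsigned char)ch[2]))
                leading_zeroes = 1;
        }
        else if (!isdigit(c)) {
            return 0;                   /* commas, spaces, etc. */
        }
    }

    if (is_dotted_decimal) {
        if (host[0] == '.'
            || (host[0] == '0' && isdigit((unsigned char)host[1])))
            leading_zeroes = 1;
        /* only digits and dots: must look like #.#.#.# */
        if (leading_zeroes || dots != 3)
            return 0;
    }
    else {
        /* walk back from the terminating NUL to the last dot, if any */
        while (ch > host && *ch != '.')
            ch--;
        /* the label after the last dot must start with a letter */
        if (ch[0] == '.' && ch[1] != '\0' && !isalpha((unsigned char)ch[1]))
            return 0;
    }
    return 1;
}
```

Calling strict_hostname_ok("1foo.com") or strict_hostname_ok("127.0.0.1")
from a small test program should return 1, while "127.0.0.01" and "50000"
should return 0.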

-- 
Eric Covener
covener@gmail.com

Re: svn commit: r1426877 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/http_core.h include/httpd.h modules/http/http_filters.c server/core.c server/protocol.c server/util.c server/vhost.c

Posted by Yann Ylavic <yl...@gmail.com>.
On Sat, Sep 16, 2017 at 3:37 AM, Eric Covener <co...@gmail.com> wrote:
> On Sat, Dec 29, 2012 at 8:23 PM,  <sf...@apache.org> wrote:
>>
>> +/*
>> + * If strict mode ever becomes the default, this should be folded into
>> + * fix_hostname_non_v6()
>> + */
>> +static apr_status_t strict_hostname_check(request_rec *r, char *host,
>> +                                          int logonly)
>> +{
>> +    char *ch;
>> +    int is_dotted_decimal = 1, leading_zeroes = 0, dots = 0;
>> +
>> +    for (ch = host; *ch; ch++) {
>> +        if (!apr_isascii(*ch)) {
>> +            goto bad;
>> +        }
>> +        else if (apr_isalpha(*ch) || *ch == '-') {
>> +            is_dotted_decimal = 0;
>> +        }
>> +        else if (ch[0] == '.') {
>> +            dots++;
>> +            if (ch[1] == '0' && apr_isdigit(ch[2]))
>> +                leading_zeroes = 1;
>> +        }
>> +        else if (!apr_isdigit(*ch)) {
>> +           /* also takes care of multiple Host headers by denying commas */
>> +            goto bad;
>> +        }
>> +    }
>> +    if (is_dotted_decimal) {
>> +        if (host[0] == '.' || (host[0] == '0' && apr_isdigit(host[1])))
>> +            leading_zeroes = 1;
>> +        if (leading_zeroes || dots != 3) {
>> +            /* RFC 3986 7.4 */
>> +            goto bad;
>> +        }
>> +    }
>> +    else {
>> +        /* The top-level domain must start with a letter (RFC 1123 2.1) */
>> +        while (ch > host && *ch != '.')
>> +            ch--;
>> +        if (ch[0] == '.' && ch[1] != '\0' && !apr_isalpha(ch[1]))
>> +            goto bad;
>> +    }
>> +    return APR_SUCCESS;
>> +
>> +bad:
>> +    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO()
>> +                  "[strict] Invalid host name '%s'%s%.6s",
>> +                  host, *ch ? ", problem near: " : "", ch);
>> +    if (logonly)
>> +        return APR_SUCCESS;
>> +    return APR_EINVAL;
>> +}
>
> (sorry for the necromancy of this very old commit)
>
> Re: the 1123 2.1 reference a dozen lines from the end of the function:
> RFC 1123 2.1 seems to say the opposite. Just a bug or something over
> my head?
>
>   2.1  Host Names and Numbers
>
>       The syntax of a legal Internet host name was specified in RFC-952
>       [DNS:4].  One aspect of host name syntax is hereby changed: the
>       restriction on the first character is relaxed to allow either a
>       letter or a digit.  Host software MUST support this more liberal
>       syntax.

RFC 1123 2.1 seems to be about the first character of the host, while
the code checks the first one of the TLD. Are there TLDs starting with
a digit?