You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@trafficserver.apache.org by shenzhang920 <gi...@git.apache.org> on 2016/03/29 02:49:41 UTC

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

GitHub user shenzhang920 opened a pull request:

    https://github.com/apache/trafficserver/pull/541

    TS-4312: Adding config to parse urls according to RFC

    Adding a config option "proxy.config.http.strict_uri_parsing" to sends http status code 400 back to client if the URL includes non-RFC 3986 compliant character

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shenzhang920/trafficserver master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/trafficserver/pull/541.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #541
    
----
commit ab5f0741591d8240d999542894c4b351df649460
Author: Shen Zhang <se...@linkedin.com>
Date:   2016-03-28T22:49:36Z

    ats-core sends http status code 400 back to client if the URL includes non-RFC 3986 compliant character (global option)

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by bgaff <gi...@git.apache.org>.
Github user bgaff commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-209696367
  
    Thanks @shenzhang920 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r59663218
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1846,4 +1823,42 @@ REGRESSION_TEST(VALIDATE_HDR_FIELD)(RegressionTest *t, int /* level ATS_UNUSED *
       }
     }
     
    +
    +const static struct {
    +  const char *const uri;
    +  bool valid;
    +} http_strict_uri_parsing_test_case[] = {{"/home", true},
    +                                         {"/path/data?key=value#id", true},
    +                                         {"/ABCDEFGHIJKLMNOPQRSTUVWXYZ", true},
    +                                         {"/abcdefghijklmnopqrstuvwxyz", true},
    +                                         {"/0123456789", true},
    +                                         {":/?#[]@", true},
    +                                         {"!$&'()*+,;=", true},
    +                                         {"-._~", true},
    +                                         {"%", true},
    +                                         {"\n", false},
    +                                         {"\"", false},
    +                                         {"<", false},
    +                                         {">", false},
    +                                         {"\\", false},
    +                                         {"^", false},
    +                                         {"`", false},
    +                                         {"{", false},
    +                                         {"|", false},
    +                                         {"}", false},
    +                                         {"é", false}};
    +
    +REGRESSION_TEST(STRICT_URI_PARSING)(RegressionTest *t, int /* level ATS_UNUSED */, int *pstatus)
    --- End diff --
    
    We don't use all caps ``STRICT_URI_PARSING``.  How about ``ParseRules_strict_URI`` or something?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by shenzhang920 <gi...@git.apache.org>.
Github user shenzhang920 closed the pull request at:

    https://github.com/apache/trafficserver/pull/541


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by shenzhang920 <gi...@git.apache.org>.
Github user shenzhang920 commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-209686860
  
    @jpeach I removed is_wildmat_BIT and added is_uri_BIT, please refer to commit d9d972d


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-208686127
  
    I definitely think that ``ParseRules.cc`` is the right place for this. I suggest we reclaim one of the obsolete bits for this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r59663241
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1846,4 +1823,42 @@ REGRESSION_TEST(VALIDATE_HDR_FIELD)(RegressionTest *t, int /* level ATS_UNUSED *
       }
     }
     
    +
    +const static struct {
    +  const char *const uri;
    +  bool valid;
    +} http_strict_uri_parsing_test_case[] = {{"/home", true},
    +                                         {"/path/data?key=value#id", true},
    +                                         {"/ABCDEFGHIJKLMNOPQRSTUVWXYZ", true},
    +                                         {"/abcdefghijklmnopqrstuvwxyz", true},
    +                                         {"/0123456789", true},
    +                                         {":/?#[]@", true},
    +                                         {"!$&'()*+,;=", true},
    +                                         {"-._~", true},
    +                                         {"%", true},
    +                                         {"\n", false},
    +                                         {"\"", false},
    +                                         {"<", false},
    +                                         {">", false},
    +                                         {"\\", false},
    +                                         {"^", false},
    +                                         {"`", false},
    +                                         {"{", false},
    +                                         {"|", false},
    +                                         {"}", false},
    +                                         {"é", false}};
    +
    +REGRESSION_TEST(STRICT_URI_PARSING)(RegressionTest *t, int /* level ATS_UNUSED */, int *pstatus)
    --- End diff --
    
    Wrap the test and it's data in ``#ifdef TS_HAS_TESTS``.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r57665851
  
    --- Diff: proxy/hdrs/HTTP.cc ---
    @@ -878,7 +878,7 @@ http_parser_clear(HTTPParser *parser)
     
     MIMEParseResult
     http_parser_parse_req(HTTPParser *parser, HdrHeap *heap, HTTPHdrImpl *hh, const char **start, const char *end,
    -                      bool must_copy_strings, bool eof)
    +                      bool must_copy_strings, bool eof, bool strict_uri_parsing)
    --- End diff --
    
    Rather than multiple ``bool``, maybe we ought to pass bitwise flags?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r58485657
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1158,9 +1158,51 @@ url_parse_scheme(HdrHeap *heap, URLImpl *url, const char **start, const char *en
       return PARSE_ERROR; // no non-whitespace found
     }
     
    +static bool
    +url_init_non_encoded_char_array(char *non_encoded_char, size_t size)
    +{
    +  const char *allowed_char = ":/?#[]@"                     // gen-delims
    +                              "!$&'()*+,;="                // sub-delims
    +                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" // ALPHA
    +                              "abcdefghijklmnopqrstuvwxyz"
    +                              "0123456789"                 // DIGIT
    +                              "-._~"                       // other unreserved
    +                              "%";                         // '%' should also not be encoded
    +
    +  memset(non_encoded_char, 0, size);
    +  for (size_t i = 0; i < strlen(allowed_char); ++i) {
    +    ink_assert((unsigned char)allowed_char[i] < size);
    +    non_encoded_char[(unsigned char)allowed_char[i]] = 1;
    +  }
    +
    +  return true;
    +}
    +
    +static char non_encoded_char[256];
    +static bool url_init_non_encoded_char_array_done = url_init_non_encoded_char_array(non_encoded_char, sizeof(non_encoded_char));
    --- End diff --
    
    Yes I see your point. I think that this properly belongs in [ParseRules.cc](https://github.com/apache/trafficserver/blob/master/lib/ts/ParseRules.cc), though all the parse bits are taken. Maybe we could reclaim something esoteric like ``is_wildmat_BIT``?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by shenzhang920 <gi...@git.apache.org>.
Github user shenzhang920 commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r58474731
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1158,9 +1158,51 @@ url_parse_scheme(HdrHeap *heap, URLImpl *url, const char **start, const char *en
       return PARSE_ERROR; // no non-whitespace found
     }
     
    +static bool
    +url_init_non_encoded_char_array(char *non_encoded_char, size_t size)
    +{
    +  const char *allowed_char = ":/?#[]@"                     // gen-delims
    +                              "!$&'()*+,;="                // sub-delims
    +                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" // ALPHA
    +                              "abcdefghijklmnopqrstuvwxyz"
    +                              "0123456789"                 // DIGIT
    +                              "-._~"                       // other unreserved
    +                              "%";                         // '%' should also not be encoded
    +
    +  memset(non_encoded_char, 0, size);
    +  for (size_t i = 0; i < strlen(allowed_char); ++i) {
    +    ink_assert((unsigned char)allowed_char[i] < size);
    +    non_encoded_char[(unsigned char)allowed_char[i]] = 1;
    +  }
    +
    +  return true;
    +}
    +
    +static char non_encoded_char[256];
    +static bool url_init_non_encoded_char_array_done = url_init_non_encoded_char_array(non_encoded_char, sizeof(non_encoded_char));
    --- End diff --
    
    Hardcoding would look like as below, it is error prone.  Should I change it?
    
    static const char non_encoded_char[256] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // ! # $ % & ' ( ) * + , - . /
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, // 0-9 : ; = ?
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // @ A-O
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, // P-Z [ ] _
        0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, // a-o
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0  // p-z ~
        // values of indices from 128 to 255 are all 0
      };



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by bgaff <gi...@git.apache.org>.
Github user bgaff commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-202656925
  
    I worked with @shenzhang920 on this and it looks good to me. :+1:  from me.
    
    @zwoop , mind taking a look?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r57665801
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1158,9 +1158,51 @@ url_parse_scheme(HdrHeap *heap, URLImpl *url, const char **start, const char *en
       return PARSE_ERROR; // no non-whitespace found
     }
     
    +static bool
    +url_init_non_encoded_char_array(char *non_encoded_char, size_t size)
    +{
    +  const char *allowed_char = ":/?#[]@"                     // gen-delims
    +                              "!$&'()*+,;="                // sub-delims
    +                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" // ALPHA
    +                              "abcdefghijklmnopqrstuvwxyz"
    +                              "0123456789"                 // DIGIT
    +                              "-._~"                       // other unreserved
    +                              "%";                         // '%' should also not be encoded
    +
    +  memset(non_encoded_char, 0, size);
    +  for (size_t i = 0; i < strlen(allowed_char); ++i) {
    +    ink_assert((unsigned char)allowed_char[i] < size);
    +    non_encoded_char[(unsigned char)allowed_char[i]] = 1;
    +  }
    +
    +  return true;
    +}
    +
    +static char non_encoded_char[256];
    +static bool url_init_non_encoded_char_array_done = url_init_non_encoded_char_array(non_encoded_char, sizeof(non_encoded_char));
    +
    +/**
    + *  This method will return TRUE if the uri is strictly compliant with
    + *  RFC 3986 and it will return FALSE if not.
    + */
    +static bool
    +url_is_strictly_compliant(const char *start, const char *end)
    +{
    +  for (const char *i = start; i < end; ++i) {
    +    if (!non_encoded_char[(unsigned char)*i]) {
    +      Debug("http", "Non-RFC compliant character [0x%.2X] found in URL\n", (unsigned char)*i);
    +      return false;
    +    }
    +  }
    +  return true;
    +}
    +
     MIMEParseResult
    -url_parse(HdrHeap *heap, URLImpl *url, const char **start, const char *end, bool copy_strings_p)
    +url_parse(HdrHeap *heap, URLImpl *url, const char **start, const char *end, bool copy_strings_p, bool strict_uri_parsing)
    --- End diff --
    
    Add regression tests please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r57665721
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1158,9 +1158,51 @@ url_parse_scheme(HdrHeap *heap, URLImpl *url, const char **start, const char *en
       return PARSE_ERROR; // no non-whitespace found
     }
     
    +static bool
    +url_init_non_encoded_char_array(char *non_encoded_char, size_t size)
    +{
    +  const char *allowed_char = ":/?#[]@"                     // gen-delims
    +                              "!$&'()*+,;="                // sub-delims
    +                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" // ALPHA
    +                              "abcdefghijklmnopqrstuvwxyz"
    +                              "0123456789"                 // DIGIT
    +                              "-._~"                       // other unreserved
    +                              "%";                         // '%' should also not be encoded
    +
    +  memset(non_encoded_char, 0, size);
    +  for (size_t i = 0; i < strlen(allowed_char); ++i) {
    +    ink_assert((unsigned char)allowed_char[i] < size);
    +    non_encoded_char[(unsigned char)allowed_char[i]] = 1;
    +  }
    +
    +  return true;
    +}
    +
    +static char non_encoded_char[256];
    +static bool url_init_non_encoded_char_array_done = url_init_non_encoded_char_array(non_encoded_char, sizeof(non_encoded_char));
    +
    +/**
    + *  This method will return TRUE if the uri is strictly compliant with
    + *  RFC 3986 and it will return FALSE if not.
    + */
    +static bool
    +url_is_strictly_compliant(const char *start, const char *end)
    +{
    +  for (const char *i = start; i < end; ++i) {
    +    if (!non_encoded_char[(unsigned char)*i]) {
    +      Debug("http", "Non-RFC compliant character [0x%.2X] found in URL\n", (unsigned char)*i);
    --- End diff --
    
    ``Debug`` doesn't need a ``\n``.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r57665750
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1158,9 +1158,51 @@ url_parse_scheme(HdrHeap *heap, URLImpl *url, const char **start, const char *en
       return PARSE_ERROR; // no non-whitespace found
     }
     
    +static bool
    +url_init_non_encoded_char_array(char *non_encoded_char, size_t size)
    +{
    +  const char *allowed_char = ":/?#[]@"                     // gen-delims
    +                              "!$&'()*+,;="                // sub-delims
    +                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" // ALPHA
    +                              "abcdefghijklmnopqrstuvwxyz"
    +                              "0123456789"                 // DIGIT
    +                              "-._~"                       // other unreserved
    +                              "%";                         // '%' should also not be encoded
    +
    +  memset(non_encoded_char, 0, size);
    --- End diff --
    
    You don't need to init BSS data (see other comment about initialization).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-209762909
  
    I'm ok with ``ParseRules::is_uri``. I made a couple of comments on the tests.
    
    Please take a look at [this](http://chris.beams.io/posts/git-commit/) article about git commit messages. I think we should squash this down into a single commit with a good summary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r57665699
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1158,9 +1158,51 @@ url_parse_scheme(HdrHeap *heap, URLImpl *url, const char **start, const char *en
       return PARSE_ERROR; // no non-whitespace found
     }
     
    +static bool
    +url_init_non_encoded_char_array(char *non_encoded_char, size_t size)
    +{
    +  const char *allowed_char = ":/?#[]@"                     // gen-delims
    +                              "!$&'()*+,;="                // sub-delims
    +                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" // ALPHA
    +                              "abcdefghijklmnopqrstuvwxyz"
    +                              "0123456789"                 // DIGIT
    +                              "-._~"                       // other unreserved
    +                              "%";                         // '%' should also not be encoded
    +
    +  memset(non_encoded_char, 0, size);
    +  for (size_t i = 0; i < strlen(allowed_char); ++i) {
    +    ink_assert((unsigned char)allowed_char[i] < size);
    +    non_encoded_char[(unsigned char)allowed_char[i]] = 1;
    +  }
    +
    +  return true;
    +}
    +
    +static char non_encoded_char[256];
    +static bool url_init_non_encoded_char_array_done = url_init_non_encoded_char_array(non_encoded_char, sizeof(non_encoded_char));
    --- End diff --
    
    This is a fixed initialization. Let's just hardcode it rather than generating it at startup.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by bgaff <gi...@git.apache.org>.
Github user bgaff commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-202657737
  
    This is actually how haproxy and nginx behave by default, this config option is disabled by default for backwards compatibility.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r59663226
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1846,4 +1823,42 @@ REGRESSION_TEST(VALIDATE_HDR_FIELD)(RegressionTest *t, int /* level ATS_UNUSED *
       }
     }
     
    +
    +const static struct {
    +  const char *const uri;
    +  bool valid;
    +} http_strict_uri_parsing_test_case[] = {{"/home", true},
    --- End diff --
    
    Move this inside the regression test.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by shenzhang920 <gi...@git.apache.org>.
Github user shenzhang920 commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-209688326
  
    @bgaff I search the code, no usage of is_wildmat()/is_wildmat_BIT.  @jpeach any suggestion to the name "is_uri"?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by bgaff <gi...@git.apache.org>.
Github user bgaff commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-209687483
  
    :+1: This looks good to me, one concern would be the bit name is_uri, can we find a more descriptive name for this bit? Also, @shenzhang920 you also confirmed there is no longer any usage of the is_wildmat_BIT?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by shenzhang920 <gi...@git.apache.org>.
Github user shenzhang920 commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r59641766
  
    --- Diff: mgmt/RecordsConfig.cc ---
    @@ -441,6 +441,8 @@ static const RecordElement RecordsConfig[] =
       ,
       {RECT_CONFIG, "proxy.config.http.post.check.content_length.enabled", RECD_INT, "1", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
       ,
    +  {RECT_CONFIG, "proxy.config.http.strict_uri_parsing", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
    --- End diff --
    
    Added the doc, please refer to commit 81ed4e2


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by jpeach <gi...@git.apache.org>.
Github user jpeach commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r57665768
  
    --- Diff: mgmt/RecordsConfig.cc ---
    @@ -441,6 +441,8 @@ static const RecordElement RecordsConfig[] =
       ,
       {RECT_CONFIG, "proxy.config.http.post.check.content_length.enabled", RECD_INT, "1", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
       ,
    +  {RECT_CONFIG, "proxy.config.http.strict_uri_parsing", RECD_INT, "0", RECU_DYNAMIC, RR_NULL, RECC_INT, "[0-1]", RECA_NULL}
    --- End diff --
    
    Add documentation please.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by shenzhang920 <gi...@git.apache.org>.
Github user shenzhang920 commented on a diff in the pull request:

    https://github.com/apache/trafficserver/pull/541#discussion_r59641878
  
    --- Diff: proxy/hdrs/URL.cc ---
    @@ -1158,9 +1158,51 @@ url_parse_scheme(HdrHeap *heap, URLImpl *url, const char **start, const char *en
       return PARSE_ERROR; // no non-whitespace found
     }
     
    +static bool
    +url_init_non_encoded_char_array(char *non_encoded_char, size_t size)
    +{
    +  const char *allowed_char = ":/?#[]@"                     // gen-delims
    +                              "!$&'()*+,;="                // sub-delims
    +                              "ABCDEFGHIJKLMNOPQRSTUVWXYZ" // ALPHA
    +                              "abcdefghijklmnopqrstuvwxyz"
    +                              "0123456789"                 // DIGIT
    +                              "-._~"                       // other unreserved
    +                              "%";                         // '%' should also not be encoded
    +
    +  memset(non_encoded_char, 0, size);
    +  for (size_t i = 0; i < strlen(allowed_char); ++i) {
    +    ink_assert((unsigned char)allowed_char[i] < size);
    +    non_encoded_char[(unsigned char)allowed_char[i]] = 1;
    +  }
    +
    +  return true;
    +}
    +
    +static char non_encoded_char[256];
    +static bool url_init_non_encoded_char_array_done = url_init_non_encoded_char_array(non_encoded_char, sizeof(non_encoded_char));
    +
    +/**
    + *  This method will return TRUE if the uri is strictly compliant with
    + *  RFC 3986 and it will return FALSE if not.
    + */
    +static bool
    +url_is_strictly_compliant(const char *start, const char *end)
    +{
    +  for (const char *i = start; i < end; ++i) {
    +    if (!non_encoded_char[(unsigned char)*i]) {
    +      Debug("http", "Non-RFC compliant character [0x%.2X] found in URL\n", (unsigned char)*i);
    +      return false;
    +    }
    +  }
    +  return true;
    +}
    +
     MIMEParseResult
    -url_parse(HdrHeap *heap, URLImpl *url, const char **start, const char *end, bool copy_strings_p)
    +url_parse(HdrHeap *heap, URLImpl *url, const char **start, const char *end, bool copy_strings_p, bool strict_uri_parsing)
    --- End diff --
    
    Regression test added in commit d9d972d


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by bgaff <gi...@git.apache.org>.
Github user bgaff commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-210576382
  
    Hey @shenzhang920 why did you close the pull request?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] trafficserver pull request: TS-4312: Adding config to parse urls a...

Posted by bgaff <gi...@git.apache.org>.
Github user bgaff commented on the pull request:

    https://github.com/apache/trafficserver/pull/541#issuecomment-208488556
  
    @jpeach I looked into the idea of using ParseRules.cc and CompileParseRules with @shenzhang920  but it does't look like there are any free bits at the moment, should we just bump it up to 64 bit so we can add another bit for it? Thoughts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---