You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@trafficserver.apache.org by GitBox <gi...@apache.org> on 2023/01/10 21:17:02 UTC

[GitHub] [trafficserver] maskit opened a new pull request, #9297: Optimize HPACK string comparison

maskit opened a new pull request, #9297:
URL: https://github.com/apache/trafficserver/pull/9297

   When compare strings in HPACK, we don't care which string comes first on a dictionary. We just want to know the two strings are exactly the same.
   
   This PR replaces memcmp(sv, sv) and strcasecmp(sv, sv) with local functions that immediately return `false` when the lengths don't match.
   
   Before:
   ```
   finished in 14.74s, 6784.85 req/s, 53.19MB/s
   requests: 100000 total, 100000 started, 100000 done, 100000 succeeded, 0 failed, 0 errored, 0 timeout
   status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
   traffic: 783.92MB (822000249) total, 976.78KB (1000219) headers (space savings 96.45%), 781.25MB (819200000) data
                        min         max         mean         sd        +/- sd
   time for request:      860us      3.60ms      1.45ms       179us    89.26%
   time for connect:     4.08ms      4.08ms      4.08ms         0us   100.00%
   time to 1st byte:     7.58ms      7.58ms      7.58ms         0us   100.00%
   req/s           :    6784.90     6784.90     6784.90        0.00   100.00%
   ```
   
   After:
   ```
   finished in 12.77s, 7829.25 req/s, 61.38MB/s
   requests: 100000 total, 100000 started, 100000 done, 100000 succeeded, 0 failed, 0 errored, 0 timeout
   status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
   traffic: 783.92MB (822000238) total, 976.77KB (1000208) headers (space savings 96.45%), 781.25MB (819200000) data
                        min         max         mean         sd        +/- sd
   time for request:      797us      2.85ms      1.25ms       155us    89.15%
   time for connect:     2.79ms      2.79ms      2.79ms         0us   100.00%
   time to 1st byte:     5.49ms      5.49ms      5.49ms         0us   100.00%
   req/s           :    7829.30     7829.30     7829.30        0.00   100.00%
   ```
   
   Before:
   <img width="740" alt="image" src="https://user-images.githubusercontent.com/153144/211663677-170934a4-8678-43bd-b95a-9e6733ae878e.png">
   
   After:
   <img width="730" alt="image" src="https://user-images.githubusercontent.com/153144/211664220-37a65713-e5be-416d-8940-66517d4a8256.png">


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] bneradt commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
bneradt commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1073958926


##########
proxy/http2/HPACK.cc:
##########
@@ -228,12 +266,12 @@ namespace HpackStaticTable
     HpackLookupResult result;
 
     for (unsigned int index = 1; index < TS_HPACK_STATIC_TABLE_ENTRY_NUM; ++index) {
-      std::string_view name  = STATIC_TABLE[index].name;
-      std::string_view value = STATIC_TABLE[index].value;
+      const std::string_view &name  = STATIC_TABLE[index].name;
+      const std::string_view &value = STATIC_TABLE[index].value;

Review Comment:
   In any case, if your profiling found that const references are more performant, let's run with that. Can you also add a comment here explaining that the const references were more performant than std::string_view copies?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] maskit commented on pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
maskit commented on PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#issuecomment-1396128229

   Thank you for the suggestions. I added the comments.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] bneradt commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
bneradt commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1073957840


##########
proxy/http2/HPACK.cc:
##########
@@ -189,6 +189,44 @@ hpack_field_is_literal(HpackField ftype)
   return ftype == HpackField::INDEXED_LITERAL || ftype == HpackField::NOINDEX_LITERAL || ftype == HpackField::NEVERINDEX_LITERAL;
 }
 
+// Try not to use memcmp(sv, sv) and strncasecmp(sv, sv) because we don't care which value comes first on a dictionary.
+// Return immediately if the lengths of given strings don't match.
+static inline bool
+match(const char *s1, int s1_len, const char *s2, int s2_len)

Review Comment:
   Sounds good. If you saw an improvement by taking char* and int, then that's good. Can you please add a comment explaining this though? Something like, `// We noticed with profiling that taking char* and int was more performant than taking std::string_view.` Otherwise someone might think to refactor by taking string_view parameters since that is what the users have.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] maskit commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
maskit commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1072969132


##########
proxy/http2/HPACK.cc:
##########
@@ -228,12 +266,12 @@ namespace HpackStaticTable
     HpackLookupResult result;
 
     for (unsigned int index = 1; index < TS_HPACK_STATIC_TABLE_ENTRY_NUM; ++index) {
-      std::string_view name  = STATIC_TABLE[index].name;
-      std::string_view value = STATIC_TABLE[index].value;
+      const std::string_view &name  = STATIC_TABLE[index].name;
+      const std::string_view &value = STATIC_TABLE[index].value;

Review Comment:
   This also depends on how it'll be optimized probably. I don't know why but I saw memcpy without this change. I presume, by having these as const references, they are handled as aliases (no copies but direct access to the name, and the value).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] bneradt commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
bneradt commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1073949254


##########
proxy/http2/HPACK.cc:
##########
@@ -228,12 +266,12 @@ namespace HpackStaticTable
     HpackLookupResult result;
 
     for (unsigned int index = 1; index < TS_HPACK_STATIC_TABLE_ENTRY_NUM; ++index) {
-      std::string_view name  = STATIC_TABLE[index].name;
-      std::string_view value = STATIC_TABLE[index].value;
+      const std::string_view &name  = STATIC_TABLE[index].name;
+      const std::string_view &value = STATIC_TABLE[index].value;

Review Comment:
   That is really strange. I wouldn't expect the creation of a string view to have a memcpy...a string_view is simply nothing more than a pointer to a char* and a length (int type) value. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] bneradt commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
bneradt commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1072938445


##########
proxy/http2/HPACK.cc:
##########
@@ -189,6 +189,44 @@ hpack_field_is_literal(HpackField ftype)
   return ftype == HpackField::INDEXED_LITERAL || ftype == HpackField::NOINDEX_LITERAL || ftype == HpackField::NEVERINDEX_LITERAL;
 }
 
+// Try not to use memcmp(sv, sv) and strncasecmp(sv, sv) because we don't care which value comes first on a dictionary.
+// Return immediately if the lengths of given strings don't match.
+static inline bool
+match(const char *s1, int s1_len, const char *s2, int s2_len)

Review Comment:
   It seems like the caller of these functions would be better served with these functions taking std::string_view parameters.



##########
proxy/http2/HPACK.cc:
##########
@@ -228,12 +266,12 @@ namespace HpackStaticTable
     HpackLookupResult result;
 
     for (unsigned int index = 1; index < TS_HPACK_STATIC_TABLE_ENTRY_NUM; ++index) {
-      std::string_view name  = STATIC_TABLE[index].name;
-      std::string_view value = STATIC_TABLE[index].value;
+      const std::string_view &name  = STATIC_TABLE[index].name;
+      const std::string_view &value = STATIC_TABLE[index].value;

Review Comment:
   Is the const reference cheaper than copying the pointer and length values of original `name` and `value` types, as the original did? Each time these are accessed, it will now require a pointer dereference.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] bneradt commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
bneradt commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1073957840


##########
proxy/http2/HPACK.cc:
##########
@@ -189,6 +189,44 @@ hpack_field_is_literal(HpackField ftype)
   return ftype == HpackField::INDEXED_LITERAL || ftype == HpackField::NOINDEX_LITERAL || ftype == HpackField::NEVERINDEX_LITERAL;
 }
 
+// Try not to use memcmp(sv, sv) and strncasecmp(sv, sv) because we don't care which value comes first on a dictionary.
+// Return immediately if the lengths of given strings don't match.
+static inline bool
+match(const char *s1, int s1_len, const char *s2, int s2_len)

Review Comment:
   Sounds good. If you saw an improvement by taking char* and int, then that's good. Can you please add a comment explaining this though? Something like, `// We noticed with profiling that taking char* and int was more performant than taking std::string_view.` Otherwise someone might think to refactor by taking string_view parameters since that is what the callers of this function have.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] bneradt commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
bneradt commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1073949254


##########
proxy/http2/HPACK.cc:
##########
@@ -228,12 +266,12 @@ namespace HpackStaticTable
     HpackLookupResult result;
 
     for (unsigned int index = 1; index < TS_HPACK_STATIC_TABLE_ENTRY_NUM; ++index) {
-      std::string_view name  = STATIC_TABLE[index].name;
-      std::string_view value = STATIC_TABLE[index].value;
+      const std::string_view &name  = STATIC_TABLE[index].name;
+      const std::string_view &value = STATIC_TABLE[index].value;

Review Comment:
   That is really strange. I wouldn't expect the creation of a string view to have a memcpy...a string_view is simply nothing more than a pointer (char*) and a length (int type) value. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] maskit commented on a diff in pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
maskit commented on code in PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297#discussion_r1072964369


##########
proxy/http2/HPACK.cc:
##########
@@ -189,6 +189,44 @@ hpack_field_is_literal(HpackField ftype)
   return ftype == HpackField::INDEXED_LITERAL || ftype == HpackField::NOINDEX_LITERAL || ftype == HpackField::NEVERINDEX_LITERAL;
 }
 
+// Try not to use memcmp(sv, sv) and strncasecmp(sv, sv) because we don't care which value comes first on a dictionary.
+// Return immediately if the lengths of given strings don't match.
+static inline bool
+match(const char *s1, int s1_len, const char *s2, int s2_len)

Review Comment:
   I found use of string_view actually costs a little in some cases. It probably depends on how it'll be optimized, but I was able to prune away time spent for string_view::data() and string_view::length() by receiving unwrapped pointers and lengths. This function is really busy and tiny optimization works, so I'd like to ensure remove any possibility of unfortunate cases.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [trafficserver] maskit merged pull request #9297: Optimize HPACK string comparison

Posted by GitBox <gi...@apache.org>.
maskit merged PR #9297:
URL: https://github.com/apache/trafficserver/pull/9297


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@trafficserver.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org