Posted to notifications@couchdb.apache.org by GitBox <gi...@apache.org> on 2022/07/06 16:19:46 UTC

[GitHub] [couchdb] nickva opened a new pull request, #4091: Optimize couch_util:to_hex/1

nickva opened a new pull request, #4091:
URL: https://github.com/apache/couchdb/pull/4091

   When profiling [1] a cluster acting as a replication source, noticed a lot of time spent in `couch_util:to_hex/1`. That function is used to emit revision ids, amongst other things. When processing a million documents with 1000 revisions each, it ends up in the hot path, so to speak.
   
   Remembering that in Erlang/OTP 24 there is a new [`binary:encode_hex/1`](https://www.erlang.org/doc/man/binary.html#encode_hex-1) function, decided to benchmark ours vs the OTP implementation. It turns out ours is slower [2], so let's try to use the OTP one.
   
   One difference from the OTP version is that ours emits lowercase hex letters, while the OTP one emits uppercase ones.
   
   As a bonus, replaced a few calls to `couch_util:to_hex/1` wrapped in `?l2b/1` or `list_to_binary/1` with just a single call to `couch_util:to_hex_bin/1`.
   
   The existing `couch_util:to_hex/1` version, which returns a list, was left in place; it now just calls `to_hex_bin/1` internally and converts the result to a list.
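   
   As a rough illustration of the case difference and the list/binary split described above, here is a minimal sketch. It is not the code in this PR: the module name `hex_otp_sketch` is made up, it assumes OTP 24 or later for `binary:encode_hex/1`, and it lowercases the OTP output with `string:lowercase/1`, which is simple but costs an extra pass over the result.
   
   ```erlang
   -module(hex_otp_sketch).
   -export([to_hex_bin/1, to_hex/1]).
   
   %% Binary in, lowercase hex binary out.
   %% binary:encode_hex/1 emits upper case letters, so lowercase the result.
   to_hex_bin(Bin) when is_binary(Bin) ->
       string:lowercase(binary:encode_hex(Bin)).
   
   %% List-returning variant for callers that still expect a string.
   to_hex(Bin) when is_binary(Bin) ->
       binary_to_list(to_hex_bin(Bin)).
   ```
   
   With this sketch, `hex_otp_sketch:to_hex_bin(<<210,90,95,232>>)` returns `<<"d25a5fe8">>`, whereas `binary:encode_hex/1` alone returns `<<"D25A5FE8">>`.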
   
   [1] 
   ```
   > ... eprof:analyze(total,[{sort, time}, {filter, [{time, 1000}, {calls, 100}]}]).
   
   FUNCTION                                                            CALLS        %      TIME  [uS / CALLS]
   --------                                                            -----  -------      ----  [----------]
   ...
   couch_doc:revid_to_str/1                                          1165641     1.71    402860  [      0.35]
   couch_key_tree:get_key_leafs_simple/4                             1304102     1.85    435700  [      0.33]
   erlang:list_to_integer/2                                           873172     2.00    471140  [      0.54]
   gen_server:loop/7                                                   62209     2.11    496932  [      7.99]
   couch_key_tree:map_simple/3                                       1829235     2.36    554429  [      0.30]
   couch_util:nibble_to_hex/1                                       37334650    16.67   3917127  [      0.10]
   couch_util:to_hex/1                                              19834192    34.37   8077050  [      0.41]
   ---------------------------------------------------------------  --------  -------  --------  [----------]
   Total:                                                           91072005  100.00%  23503072  [      0.26]
   ```
   
   [2] `to_hex1/1` is the existing version and `to_hex3/1` is the OTP version.
   
   ```
   % ~/src/erlperf/erlperf 'hex:to_hex1(<<210,90,95,232,68,185,66,248,160,33,184,103,181,221,158,96>>).' 'hex:to_hex3(<<210,90,95,232,68,185,66,248,160,33,184,103,181,221,158,96>>).'
   Code                                                                                ||        QPS       Time     Rel
   hex:to_hex3(<<210,90,95,232,68,185,66,248,160,33,184,103,181,221,158,96>>).          1    2746 Ki     364 ns    100%
   hex:to_hex1(<<210,90,95,232,68,185,66,248,160,33,184,103,181,221,158,96>>).          1    1593 Ki     627 ns     58%
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@couchdb.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [couchdb] davisp commented on a diff in pull request #4091: Optimize couch_util:to_hex/1

Posted by GitBox <gi...@apache.org>.
davisp commented on code in PR #4091:
URL: https://github.com/apache/couchdb/pull/4091#discussion_r915060435


##########
src/couch/src/couch_util.erl:
##########
@@ -212,29 +212,36 @@ validate_utf8_fast(B, O) ->
             false
     end.
 
-to_hex(<<Hi:4, Lo:4, Rest/binary>>) ->
-    [nibble_to_hex(Hi), nibble_to_hex(Lo) | to_hex(Rest)];
-to_hex(<<>>) ->
-    [];
+to_hex(Binary) when is_binary(Binary) ->
+    binary_to_list(to_hex_bin(Binary));
 to_hex(List) when is_list(List) ->
     to_hex(list_to_binary(List)).

Review Comment:
   Do we need this clause to recurse back into to_hex/1 just to have the binary turned back into a list? It'd change the error a bit if List isn't convertible to binary I guess? This is just me thinking out loud, I don't have a direct preference really.



##########
src/couch/src/couch_util.erl:
##########
@@ -792,3 +799,31 @@ version_to_binary(Ver) when is_list(Ver) ->
     Ver1 = lists:reverse(lists:dropwhile(IsZero, lists:reverse(Ver))),
     Ver2 = [erlang:integer_to_list(N) || N <- Ver1],
     ?l2b(lists:join(".", Ver2)).
+
+-compile({inline, [hex/1]}).
+
+%% erlfmt-ignore
+hex(X) ->
+    % Table of all pairs of hex characters for a byte:

Review Comment:
   Might add a note that we're encoding the hex as ascii directly here. Took me a few minutes to realize what was going on.
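   
   To make the "encoding the hex as ascii directly" point concrete, here is a hedged sketch of the table idea. The actual table in the PR is not shown in the quoted diff, so everything below is an assumption: the module name `hex_table_sketch` and the helpers `pair/1` and `digit/1` are hypothetical, and the table is rebuilt on every call for brevity, where a literal table would be used in practice.
   
   ```erlang
   -module(hex_table_sketch).
   -export([to_hex_bin/1, pair/1]).
   
   %% The two ASCII hex characters for one byte, packed into a 16-bit integer.
   %% For example 16#ab maps to 16#6162, which is the bytes $a, $b.
   pair(Byte) ->
       (digit(Byte bsr 4) bsl 8) bor digit(Byte band 16#0f).
   
   digit(N) when N < 10 -> $0 + N;
   digit(N) -> $a + N - 10.
   
   to_hex_bin(Bin) when is_binary(Bin) ->
       %% 256-entry table: element B + 1 holds the packed pair for byte B.
       Table = list_to_tuple([pair(B) || B <- lists:seq(0, 255)]),
       %% Emit each table entry as 16 bits, i.e. two ASCII characters per byte.
       << <<(element(B + 1, Table)):16>> || <<B>> <= Bin >>.
   ```
   
   Each input byte becomes one tuple lookup, and the looked-up integer is written out as two ASCII bytes without any per-nibble conversion.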





[GitHub] [couchdb] nickva merged pull request #4091: Optimize couch_util:to_hex/1

Posted by GitBox <gi...@apache.org>.
nickva merged PR #4091:
URL: https://github.com/apache/couchdb/pull/4091




[GitHub] [couchdb] nickva commented on a diff in pull request #4091: Optimize couch_util:to_hex/1

Posted by GitBox <gi...@apache.org>.
nickva commented on code in PR #4091:
URL: https://github.com/apache/couchdb/pull/4091#discussion_r915067040


##########
src/couch/src/couch_util.erl:
##########
@@ -212,29 +212,36 @@ validate_utf8_fast(B, O) ->
             false
     end.
 
-to_hex(<<Hi:4, Lo:4, Rest/binary>>) ->
-    [nibble_to_hex(Hi), nibble_to_hex(Lo) | to_hex(Rest)];
-to_hex(<<>>) ->
-    [];
+to_hex(Binary) when is_binary(Binary) ->
+    binary_to_list(to_hex_bin(Binary));
 to_hex(List) when is_list(List) ->
     to_hex(list_to_binary(List)).

Review Comment:
   Oh, it does look silly this way. I'd just have it call to_hex_bin/1 directly; it would look cleaner that way.
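   
   A sketch of what that shape might look like, with the list clause calling `to_hex_bin/1` directly instead of recursing into `to_hex/1`. The module name `hex_clauses_sketch` is hypothetical, and `to_hex_bin/1` here is a simple stand-in encoder (an assumption), not the table-based one from the PR.
   
   ```erlang
   -module(hex_clauses_sketch).
   -export([to_hex/1, to_hex_bin/1]).
   
   to_hex(Binary) when is_binary(Binary) ->
       binary_to_list(to_hex_bin(Binary));
   to_hex(List) when is_list(List) ->
       %% Call to_hex_bin/1 directly rather than going back through to_hex/1.
       binary_to_list(to_hex_bin(list_to_binary(List))).
   
   %% Stand-in lowercase encoder: one nibble at a time.
   to_hex_bin(Bin) when is_binary(Bin) ->
       << <<(digit(Hi)), (digit(Lo))>> || <<Hi:4, Lo:4>> <= Bin >>.
   
   digit(N) when N < 10 -> $0 + N;
   digit(N) -> $a + N - 10.
   ```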
   
   





[GitHub] [couchdb] nickva commented on a diff in pull request #4091: Optimize couch_util:to_hex/1

Posted by GitBox <gi...@apache.org>.
nickva commented on code in PR #4091:
URL: https://github.com/apache/couchdb/pull/4091#discussion_r915063424


##########
src/couch/src/couch_util.erl:
##########
@@ -792,3 +799,31 @@ version_to_binary(Ver) when is_list(Ver) ->
     Ver1 = lists:reverse(lists:dropwhile(IsZero, lists:reverse(Ver))),
     Ver2 = [erlang:integer_to_list(N) || N <- Ver1],
     ?l2b(lists:join(".", Ver2)).
+
+-compile({inline, [hex/1]}).
+
+%% erlfmt-ignore
+hex(X) ->
+    % Table of all pairs of hex characters for a byte:

Review Comment:
   good idea


