Posted to commits@cassandra.apache.org by "David Allsopp (JIRA)" <ji...@apache.org> on 2011/07/04 21:47:21 UTC

[jira] [Issue Comment Edited] (CASSANDRA-2850) Converting bytes to hex string is unnecessarily slow

    [ https://issues.apache.org/jira/browse/CASSANDRA-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059589#comment-13059589 ] 

David Allsopp edited comment on CASSANDRA-2850 at 7/4/11 7:46 PM:
------------------------------------------------------------------

I think you mean (bytes.remaining() * 2) not (bytes.remaining() / 2) - we need twice as many chars as bytes.

Also, shouldn't byteToChar[] have length 16, not 256?
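
To make the sizing and table-length points concrete, here is a minimal sketch (it is not the attached patch). The class name HexSketch and the HEX_DIGITS table are illustrative assumptions. It allocates exactly remaining() * 2 chars and indexes a 16-entry table once per nibble, so no 256-entry table is needed.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; names are hypothetical, not taken from the patch.
class HexSketch {
    // 16 entries: one hex digit per 4-bit nibble.
    private static final char[] HEX_DIGITS = "0123456789abcdef".toCharArray();

    static String bytesToHex(ByteBuffer bytes) {
        int offset = bytes.position();
        int size = bytes.remaining();
        // Two output chars per input byte, so the array is sized up front.
        char[] out = new char[size * 2];
        for (int i = 0; i < size; i++) {
            // Absolute get: leaves the buffer's position untouched.
            int b = bytes.get(offset + i);
            out[i * 2] = HEX_DIGITS[(b & 0xf0) >> 4];
            out[i * 2 + 1] = HEX_DIGITS[b & 0x0f];
        }
        return new String(out);
    }
}
```

Note that new String(out) still copies the char array once.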

Not sure what string creation you are referring to?

I attach two further versions of bytesToHex (in a third benchmark class, BytesToHexBenchmark3.java). Results are below (I've had to increase the number of repeats so the stats are significant!).

v3 uses 'normal' code and is another 20% faster for large values, and _another_ factor of 2 faster than v2, i.e. 7-10 times faster than the original.

v4 uses nasty reflection to avoid doing an arraycopy on the byte array - this avoids allocating and copying a large chunk of memory (all the previous solutions end up doing an arraycopy somewhere). This is now 11-13 times faster than the original.
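
The v4 code itself is in the attachment and is not reproduced in this message. As a rough illustration of the kind of reflection trick described, the sketch below assumes that the copy being avoided is the one made by new String(char[]), and that the JVM exposes a non-public String(int, int, char[]) constructor that shares the array. Both are assumptions: some older JVMs had such a constructor, many JVMs do not, and the names StringWrapSketch and wrap are hypothetical. When the constructor is missing or inaccessible, the sketch falls back to the ordinary copying constructor.

```java
import java.lang.reflect.Constructor;

// Illustrative sketch only; not the code from the attachment.
class StringWrapSketch {
    // Looked up once; null when the JVM has no such constructor or refuses access.
    private static final Constructor<String> SHARING_CTOR = findSharingConstructor();

    private static Constructor<String> findSharingConstructor() {
        try {
            // Assumption: a non-public String(int offset, int count, char[] value)
            // that wraps the array without copying. Absent on many JVMs.
            Constructor<String> c =
                String.class.getDeclaredConstructor(int.class, int.class, char[].class);
            c.setAccessible(true);
            return c;
        } catch (Exception e) {
            return null;
        }
    }

    // The caller must not modify chars afterwards: the String may share the array.
    static String wrap(char[] chars) {
        if (SHARING_CTOR != null) {
            try {
                return SHARING_CTOR.newInstance(0, chars.length, chars);
            } catch (Exception e) {
                // Fall through to the copying constructor.
            }
        }
        return new String(chars);
    }
}
```

The fallback keeps the result correct everywhere; only the saving from skipping the copy depends on the JVM.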

20M old: 1482
20M new: 360
20M  v2: 249
20M  v3: 203
20M  v4: 125
----
old: 2137
new: 859
 v2: 718
 v3: 203
 v4: 156
----
old: 2138
new: 843
 v2: 733
 v3: 188
 v4: 156
----
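
The speed-up factors quoted above are simply ratios of the 'old' timings to the newer ones. A small sketch of that arithmetic, using the figures from the three runs (the class and method names are hypothetical):

```java
// Illustrative sketch: speed-up = old time / new time, per benchmark run.
class SpeedupSketch {
    static double speedup(long oldTime, long newTime) {
        return (double) oldTime / newTime;
    }

    public static void main(String[] args) {
        // Timings copied from the three result blocks above.
        long[] old = {1482, 2137, 2138};
        long[] v3 = {203, 203, 188};
        long[] v4 = {125, 156, 156};
        for (int i = 0; i < old.length; i++) {
            System.out.printf("run %d: v3 %.1fx, v4 %.1fx%n",
                i + 1, speedup(old[i], v3[i]), speedup(old[i], v4[i]));
        }
    }
}
```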



      was (Author: dallsopp):
    I think you mean (bytes.remaining() * 2) not (bytes.remaining() / 2) - we need twice as many chars as bytes.

Also, shouldn't byteToChar[] have length 16, not 256?

Not sure what string creation you are referring to?

I attach 2 further versions of bytesToHex (as another benchmark class 3). Results are below (I've had to increase the number of repeats so the stats are significant!).

v3 uses 'normal' code and is another 20% faster for large values, and _another_ factor of 2 faster than v2, i.e. 7-10 times faster than the original.

v4 uses nasty reflection to avoid doing an arraycopy on the byte array - this avoids a large chunk of memory (all the previous solutions end up doing an arraycopy somewhere). This is now 11-13 times faster than the original.

20M old: 1482
20M new: 360
20M  v2: 249
20M  v3: 203
20M  v4: 125
----
old: 2137
new: 859
 v2: 718
 v3: 203
 v4: 156
----
old: 2138
new: 843
 v2: 733
 v3: 188
 v4: 156
----


  
> Converting bytes to hex string is unnecessarily slow
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2850
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2850
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.6, 0.8.1
>            Reporter: David Allsopp
>            Priority: Minor
>             Fix For: 0.8.2
>
>         Attachments: 2850-v2.patch, BytesToHexBenchmark.java, BytesToHexBenchmark2.java, BytesToHexBenchmark3.java, cassandra-2850a.diff
>
>
> ByteBufferUtil.bytesToHex() is unnecessarily slow - it doesn't pre-size the StringBuilder (so several re-sizes will be needed behind the scenes) and it makes quite a few method calls per byte.
> (OK, this may be a premature optimisation, but I couldn't resist, and it's a small change)
> Will attach patch shortly that speeds it up by about x3, plus benchmarking test.
