Posted to commits@cassandra.apache.org by "Stefan Miklosovic (Jira)" <ji...@apache.org> on 2021/04/16 21:02:00 UTC

[jira] [Comment Edited] (CASSANDRA-16610) Implement XXHashPartitioner

    [ https://issues.apache.org/jira/browse/CASSANDRA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324080#comment-17324080 ] 

Stefan Miklosovic edited comment on CASSANDRA-16610 at 4/16/21, 9:01 PM:
-------------------------------------------------------------------------

[~brandon.williams] Fair enough. I just had the urge to code this up and see the numbers in action for myself, and I wanted to share it in case somebody else is willing to try it too, or in case there is appetite to take it in at some point in the future. But your point makes total sense too, now that I think about it, yeah ...

 

edit: but for new deployments ... why not? Nobody is going to migrate a cluster just because of this, but when they start from scratch ...



> Implement XXHashPartitioner
> ---------------------------
>
>                 Key: CASSANDRA-16610
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16610
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Legacy/Core
>            Reporter: Stefan Miklosovic
>            Priority: Normal
>         Attachments: jmh-result.json
>
>
> I implemented a partitioner based on the XXHash algorithm.
> There are two branches. The first, xxhash, extracts the parts common with Murmur, as there is a lot of overlap between the two.
> The second branch, xxhash-2, just copies everything from Murmur and changes only the bits that are necessary.
> I am not sure which path we want to take, so I provided both to make them easier to discuss.
> I have written a microbenchmark measuring both partitioners, and the XXHash implementation is very fast: around 10x faster on larger payloads. The benchmark is included in the xxhash-2 branch.
> https://github.com/instaclustr/cassandra/tree/xxhash-2
> https://github.com/instaclustr/cassandra/tree/xxhash
> {code:java}
> [java] Benchmark                                  (bufferSize)  Mode  Cnt      Score   Error  Units
> [java] PartitionersBench.benchMurmur3Partitioner            31  avgt   20    157.942 ± 0.110  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner            67  avgt   20    204.670 ± 0.152  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner           131  avgt   20    361.068 ± 0.228  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner           517  avgt   20   1325.670 ± 1.255  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          1031  avgt   20   2594.651 ± 2.725  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          2041  avgt   20   5082.166 ± 1.721  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          4097  avgt   20  10112.020 ± 3.637  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             31  avgt   20     40.650 ± 0.025  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             67  avgt   20     53.305 ± 0.035  ns/op
> [java] PartitionersBench.benchXXHashPartitioner            131  avgt   20     67.098 ± 0.057  ns/op
> [java] PartitionersBench.benchXXHashPartitioner            517  avgt   20    150.415 ± 0.107  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           1031  avgt   20    265.614 ± 0.140  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           2041  avgt   20    365.796 ± 0.225  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           4097  avgt   20    925.841 ± 0.664  ns/op
> {code}
> {code:java}
> [java] Benchmark                                  (bufferSize)  Mode  Cnt   Score   Error  Units
> [java] PartitionersBench.benchMurmur3Partitioner             3  avgt    5  44.516 ± 0.345  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner             5  avgt    5  54.930 ± 0.450  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner             7  avgt    5  63.428 ± 0.266  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner             9  avgt    5  69.456 ± 0.467  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner            11  avgt    5  81.411 ± 0.535  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner            16  avgt    5  68.621 ± 0.417  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              3  avgt    5  26.820 ± 0.271  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              5  avgt    5  28.182 ± 0.139  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              7  avgt    5  31.557 ± 0.161  ns/op
> [java] PartitionersBench.benchXXHashPartitioner              9  avgt    5  31.017 ± 0.212  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             11  avgt    5  33.233 ± 0.136  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             16  avgt    5  31.386 ± 0.128  ns/op
> {code}
> https://github.com/OpenHFT/Zero-Allocation-Hashing
> https://cyan4973.github.io/xxHash/
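
To make the "around 10x faster" claim easier to check against the posted numbers, here is a minimal sketch that divides the Murmur3 score by the XXHash score for each bufferSize in the first JMH table above. The class name PartitionerSpeedup and the speedup helper are hypothetical and are not part of either branch; the only data used are the ns/op scores copied from that table.

```java
public class PartitionerSpeedup {
    // bufferSize parameter values and average times (ns/op), copied from the
    // first JMH table in the issue description. Nothing here is measured anew.
    static final int[] BUFFER_SIZES = {31, 67, 131, 517, 1031, 2041, 4097};
    static final double[] MURMUR3_NS =
            {157.942, 204.670, 361.068, 1325.670, 2594.651, 5082.166, 10112.020};
    static final double[] XXHASH_NS =
            {40.650, 53.305, 67.098, 150.415, 265.614, 365.796, 925.841};

    // Hypothetical helper: how many times faster the candidate is than the baseline.
    static double speedup(double baselineNs, double candidateNs) {
        return baselineNs / candidateNs;
    }

    public static void main(String[] args) {
        for (int i = 0; i < BUFFER_SIZES.length; i++) {
            System.out.printf("bufferSize=%4d  Murmur3/XXHash=%.1fx%n",
                    BUFFER_SIZES[i], speedup(MURMUR3_NS[i], XXHASH_NS[i]));
        }
    }
}
```

On these figures the ratio grows with bufferSize, from a little under 4x at 31 to between roughly 9x and 14x from 517 upwards, which is consistent with "around 10x" on larger payloads. The error column is ignored here, so treat the ratios as indicative only.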


