Posted to commits@cassandra.apache.org by "Stefan Miklosovic (Jira)" <ji...@apache.org> on 2021/04/16 21:02:00 UTC
[jira] [Comment Edited] (CASSANDRA-16610) Implement XXHashPartitioner
[ https://issues.apache.org/jira/browse/CASSANDRA-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17324080#comment-17324080 ]
Stefan Miklosovic edited comment on CASSANDRA-16610 at 4/16/21, 9:01 PM:
-------------------------------------------------------------------------
[~brandon.williams] Fair enough. I just had the urge to code this up and see the numbers for myself, and I wanted to share it in case somebody else wants to try it, or there is appetite to take it in sometime in the future. Your point makes total sense too, now that I think about it.
edit: But for new deployments, why not? Nobody is going to migrate a cluster just because of this, but when they start from scratch ...
> Implement XXHashPartitioner
> ---------------------------
>
> Key: CASSANDRA-16610
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16610
> Project: Cassandra
> Issue Type: New Feature
> Components: Legacy/Core
> Reporter: Stefan Miklosovic
> Priority: Normal
> Attachments: jmh-result.json
>
>
> I implemented a partitioner based on the XXHash algorithm.
> There are two branches. The first, xxhash, extracts the parts it has in common with Murmur3Partitioner, as there is a lot of overlap between the two.
> The second branch, xxhash-2, just copies everything from Murmur3Partitioner and changes only the bits that are necessary.
> I am not sure which path we want to take, so I provided both to make them easier to discuss.
> I have written a microbenchmark measuring both partitioners, and the XXHash implementation is very fast: around 10x faster on larger payloads (roughly 1 KB and up in the results below) and around 2x faster on payloads of 16 bytes or less. The benchmark is included in the xxhash-2 branch.
> https://github.com/instaclustr/cassandra/tree/xxhash-2
> https://github.com/instaclustr/cassandra/tree/xxhash
> {code:java}
> [java] Benchmark                                  (bufferSize)  Mode  Cnt      Score   Error  Units
> [java] PartitionersBench.benchMurmur3Partitioner            31  avgt   20    157.942 ± 0.110  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner            67  avgt   20    204.670 ± 0.152  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner           131  avgt   20    361.068 ± 0.228  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner           517  avgt   20   1325.670 ± 1.255  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          1031  avgt   20   2594.651 ± 2.725  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          2041  avgt   20   5082.166 ± 1.721  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner          4097  avgt   20  10112.020 ± 3.637  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             31  avgt   20     40.650 ± 0.025  ns/op
> [java] PartitionersBench.benchXXHashPartitioner             67  avgt   20     53.305 ± 0.035  ns/op
> [java] PartitionersBench.benchXXHashPartitioner            131  avgt   20     67.098 ± 0.057  ns/op
> [java] PartitionersBench.benchXXHashPartitioner            517  avgt   20    150.415 ± 0.107  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           1031  avgt   20    265.614 ± 0.140  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           2041  avgt   20    365.796 ± 0.225  ns/op
> [java] PartitionersBench.benchXXHashPartitioner           4097  avgt   20    925.841 ± 0.664  ns/op
> {code}
> {code:java}
> [java] PartitionersBench.benchMurmur3Partitioner   3  avgt  5  44.516 ± 0.345  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner   5  avgt  5  54.930 ± 0.450  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner   7  avgt  5  63.428 ± 0.266  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner   9  avgt  5  69.456 ± 0.467  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner  11  avgt  5  81.411 ± 0.535  ns/op
> [java] PartitionersBench.benchMurmur3Partitioner  16  avgt  5  68.621 ± 0.417  ns/op
> [java] PartitionersBench.benchXXHashPartitioner    3  avgt  5  26.820 ± 0.271  ns/op
> [java] PartitionersBench.benchXXHashPartitioner    5  avgt  5  28.182 ± 0.139  ns/op
> [java] PartitionersBench.benchXXHashPartitioner    7  avgt  5  31.557 ± 0.161  ns/op
> [java] PartitionersBench.benchXXHashPartitioner    9  avgt  5  31.017 ± 0.212  ns/op
> [java] PartitionersBench.benchXXHashPartitioner   11  avgt  5  33.233 ± 0.136  ns/op
> [java] PartitionersBench.benchXXHashPartitioner   16  avgt  5  31.386 ± 0.128  ns/op
> {code}
> https://github.com/OpenHFT/Zero-Allocation-Hashing
> https://cyan4973.github.io/xxHash/
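
The speedups implied by the JMH results quoted above can be checked with a short calculation. The sketch below is not part of either branch: the class name SpeedupCheck and its speedup helper are made up for illustration, and the only inputs are the average ns/op scores copied from the two result tables. It divides the Murmur3Partitioner time by the XXHashPartitioner time for each buffer size.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not from the xxhash or xxhash-2 branches.
final class SpeedupCheck {
    // Average ns/op copied from the quoted JMH output:
    // buffer size in bytes -> {Murmur3Partitioner, XXHashPartitioner}.
    static final Map<Integer, double[]> SCORES = new LinkedHashMap<>();
    static {
        SCORES.put(3,    new double[] {44.516, 26.820});
        SCORES.put(5,    new double[] {54.930, 28.182});
        SCORES.put(7,    new double[] {63.428, 31.557});
        SCORES.put(9,    new double[] {69.456, 31.017});
        SCORES.put(11,   new double[] {81.411, 33.233});
        SCORES.put(16,   new double[] {68.621, 31.386});
        SCORES.put(31,   new double[] {157.942, 40.650});
        SCORES.put(67,   new double[] {204.670, 53.305});
        SCORES.put(131,  new double[] {361.068, 67.098});
        SCORES.put(517,  new double[] {1325.670, 150.415});
        SCORES.put(1031, new double[] {2594.651, 265.614});
        SCORES.put(2041, new double[] {5082.166, 365.796});
        SCORES.put(4097, new double[] {10112.020, 925.841});
    }

    // How many times faster the candidate is than the baseline.
    static double speedup(double baselineNs, double candidateNs) {
        return baselineNs / candidateNs;
    }

    public static void main(String[] args) {
        // Print one line per buffer size with the Murmur3 / XXHash time ratio.
        for (Map.Entry<Integer, double[]> e : SCORES.entrySet()) {
            double[] s = e.getValue();
            System.out.printf("%5d bytes: %.1fx%n", e.getKey(), speedup(s[0], s[1]));
        }
    }
}
```

The ratio only compares the reported averages; it ignores the error column, which is small relative to the scores in every row.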