Posted to commits@cassandra.apache.org by "Aleksey Yeschenko (JIRA)" <ji...@apache.org> on 2019/07/05 15:18:00 UTC
[jira] [Updated] (CASSANDRA-15202) TBD

     [ https://issues.apache.org/jira/browse/CASSANDRA-15202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Yeschenko updated CASSANDRA-15202:
------------------------------------------
    Description: 
CASSANDRA-14096 took the first step towards addressing the heavy on-heap footprint of merkle trees on repair coordinators - by reducing the time frame over which they are referenced, and by more intelligently limiting the depth of the trees based on available heap size.

That alone improves the GC profile and prevents OOMs, but doesn’t address the issue entirely. The coordinator must still hold all the trees on heap at once until it’s done diffing them with each other, which keeps heap pressure high, and, by reducing depth, we lose precision and thus cause more overstreaming than before.

One way to improve the situation further is to build on CASSANDRA-14096 and move the trees entirely off-heap. This is a trivial endeavor, given that we are dealing with what should be full binary trees (though in practice they aren’t quite, yet). This JIRA takes the first step in that direction - by moving just deserialisation off-heap, while still leaving construction on the replicas on-heap.

Additionally, the proposed patch fixes the issue of replica coordinators sending merkle trees to themselves over loopback, costing us a ser/deser round trip per tree.
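
The loopback fix can be pictured with a minimal Java sketch. Every name in it (TreeResponseRouter, deliver, the string addresses) is hypothetical and is not taken from the patch or from Cassandra's messaging code; it only illustrates handing the tree over in-process when the coordinator is the local node, instead of serialising it onto the wire.

```java
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical sketch: not the patch's code and not a Cassandra API.
final class TreeResponseRouter<T>
{
    private final String localAddress;                // assumed identity of this node
    private final Consumer<T> localHandler;           // in-process delivery, no ser/deser
    private final BiConsumer<String, T> remoteSender; // stand-in for the network path

    TreeResponseRouter(String localAddress,
                       Consumer<T> localHandler,
                       BiConsumer<String, T> remoteSender)
    {
        this.localAddress = localAddress;
        this.localHandler = localHandler;
        this.remoteSender = remoteSender;
    }

    /** Returns true if the tree was handed over in-process rather than sent. */
    boolean deliver(String coordinatorAddress, T tree)
    {
        if (localAddress.equals(coordinatorAddress))
        {
            // The coordinator is this node: pass the object reference directly.
            localHandler.accept(tree);
            return true;
        }
        // Any other coordinator still goes through the (serialising) remote path.
        remoteSender.accept(coordinatorAddress, tree);
        return false;
    }
}
```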

Please note that there is more room for improvement here, and depending on the 4.0 timeline those improvements may or may not land in time. To name a few:
- with some minor modifications to init(), we can make sure that no matter the range, the tree is *always* perfectly full; this would allow us to get rid of child pointers in inner nodes, as child node addresses would be trivially calculable given the fixed size of nodes
- the trees can be easily constructed off-heap, so long as init() is run first to pre-size the tree and find out how large a buffer is needed
- the on-wire format doesn’t need to stream inner nodes, only leaves, and, really, only the hashes of the leaves
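
The three ideas above fit together, as the following Java sketch tries to show under stated assumptions. The class OffHeapPerfectTree, its methods and the 32-byte hash width are all hypothetical and are not the layout used by the patch. Nodes are stored in level order in a direct buffer, so child offsets follow from the node index alone; the buffer size is known from the depth before anything is written; and inner nodes are rebuilt from leaf hashes only. XOR is used here merely as a stand-in for whatever function combines child hashes.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a pointer-free, perfectly full tree held off-heap.
final class OffHeapPerfectTree
{
    static final int HASH_SIZE = 32; // assumed fixed hash width in bytes

    private final ByteBuffer buffer; // direct (off-heap) storage, level order

    OffHeapPerfectTree(int depth)
    {
        // The size is known up front from the depth alone (pre-sizing).
        this.buffer = ByteBuffer.allocateDirect(sizeInBytes(depth));
    }

    // A perfect tree of the given depth has 2^depth leaves and 2^(depth+1) - 1 nodes.
    // Note: int arithmetic limits this sketch to modest depths.
    static int nodeCount(int depth)   { return (1 << (depth + 1)) - 1; }
    static int sizeInBytes(int depth) { return nodeCount(depth) * HASH_SIZE; }
    static int firstLeaf(int depth)   { return (1 << depth) - 1; }

    // No child pointers: addresses are computed from the node index.
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }
    static int offset(int i)     { return i * HASH_SIZE; }

    /** Rebuilds the whole tree from leaf hashes, as a leaves-only wire format would allow. */
    static OffHeapPerfectTree fromLeafHashes(int depth, byte[][] leaves)
    {
        if (leaves.length != 1 << depth)
            throw new IllegalArgumentException("expected " + (1 << depth) + " leaves");

        OffHeapPerfectTree tree = new OffHeapPerfectTree(depth);
        int first = firstLeaf(depth);
        for (int l = 0; l < leaves.length; l++)
            tree.put(first + l, leaves[l]);

        // Fill inner nodes bottom-up; XOR is a stand-in combine function.
        for (int i = first - 1; i >= 0; i--)
        {
            byte[] left = tree.get(leftChild(i));
            byte[] right = tree.get(rightChild(i));
            byte[] combined = new byte[HASH_SIZE];
            for (int b = 0; b < HASH_SIZE; b++)
                combined[b] = (byte) (left[b] ^ right[b]);
            tree.put(i, combined);
        }
        return tree;
    }

    void put(int index, byte[] hash)
    {
        if (hash.length != HASH_SIZE)
            throw new IllegalArgumentException("hash must be " + HASH_SIZE + " bytes");
        for (int b = 0; b < HASH_SIZE; b++)
            buffer.put(offset(index) + b, hash[b]);
    }

    byte[] get(int index)
    {
        byte[] hash = new byte[HASH_SIZE];
        for (int b = 0; b < HASH_SIZE; b++)
            hash[b] = buffer.get(offset(index) + b);
        return hash;
    }
}
```

With depth 2, for example, the buffer holds 7 nodes (224 bytes at the assumed hash width), the leaves occupy indices 3 to 6, and the root at index 0 is derived entirely from the four leaf hashes.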

  was:TBD


> TBD
> ---
>
>                 Key: CASSANDRA-15202
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15202
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Priority: Normal


