Posted to commits@cassandra.apache.org by "Sylvain Lebresne (JIRA)" <ji...@apache.org> on 2015/11/24 13:27:12 UTC

[jira] [Commented] (CASSANDRA-10409) Specialize MultiCBuilder when building a single clustering

    [ https://issues.apache.org/jira/browse/CASSANDRA-10409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024388#comment-15024388 ] 

Sylvain Lebresne commented on CASSANDRA-10409:
----------------------------------------------

Sorry for not getting back to this sooner. I've pushed a rebased version to [the same branch|https://github.com/pcmanus/cassandra/commits/10409] that includes the 2 nits above. Waiting on CI to finish running: [utest|http://cassci.datastax.com/view/Dev/view/pcmanus/job/pcmanus-10409-testall/] and [dtests|http://cassci.datastax.com/view/Dev/view/pcmanus/job/pcmanus-10409-dtest/]

bq. I think that the change to RestrictionSet.size() is wrong. size should return the number of columns that have a restriction. If you have a MulticolumnRestriction there is only one restriction but several columns are restricted.

Modulo the inlining of {{getColumnDefs()}}, the change just replaces {{restrictions.keySet().size()}} by {{restrictions.size()}}, so I'm pretty confident the change made by the patch is strictly equivalent to what was done before. And it does "return the number of columns that have a restriction". What I could agree with is that this {{size}} method is confusing: it would be more logical for it to return the number of _restrictions_, not columns, since the class is called {{RestrictionSet}} (and so you'd expect it to behave as a set of restrictions as much as possible). But that's a concern that is completely orthogonal to this issue and patch.
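To illustrate the equivalence, here is a minimal sketch. It assumes {{restrictions}} is a map keyed by column (an assumption for illustration; the class name, the string keys and the string values below are hypothetical stand-ins, not the actual Cassandra types). For any {{java.util.Map}}, {{keySet().size()}} and {{size()}} return the same number, and a multi-column restriction registered under each of its columns is counted once per column.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: column name -> restriction text. Not the real RestrictionSet.
class RestrictionSizeSketch {
    static Map<String, String> sampleRestrictions() {
        Map<String, String> restrictions = new TreeMap<>();
        // One multi-column restriction, registered under both columns it covers.
        String multi = "(c1, c2) = (1, 2)";
        restrictions.put("c1", multi);
        restrictions.put("c2", multi);
        // One single-column restriction.
        restrictions.put("c3", "c3 > 5");
        return restrictions;
    }

    // Number of restricted columns, written the "old" way.
    static int columnCountViaKeySet(Map<String, String> restrictions) {
        return restrictions.keySet().size();
    }

    // Number of restricted columns, written the "new" way.
    static int columnCount(Map<String, String> restrictions) {
        return restrictions.size();
    }

    // Number of distinct restrictions, which is what the name "size" might suggest.
    static int restrictionCount(Map<String, String> restrictions) {
        return new HashSet<>(restrictions.values()).size();
    }

    public static void main(String[] args) {
        Map<String, String> r = sampleRestrictions();
        System.out.println("columns (keySet) = " + columnCountViaKeySet(r));
        System.out.println("columns (size)   = " + columnCount(r));
        System.out.println("restrictions     = " + restrictionCount(r));
    }
}
```

In this model both column counts are 3 while the number of distinct restrictions is 2, which is the source of the naming confusion.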


> Specialize MultiCBuilder when building a single clustering
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10409
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10409
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.x
>
>
> {{MultiCBuilder}} is used to build the {{Clustering}} and {{Slice.Bound}} used by queries. As the name implies, it's able to build multiple {{Clustering}}/{{Slice.Bound}} for when we have {{IN}}, but most queries don't use {{IN}} and in this (frequent) case, {{MultiCBuilder}} creates quite a few more objects than would be necessary (it creates 2 lists for its {{elementsList}}, then a {{CBuilder}} and a {{BTreeSet.Builder}}, even though we know the resulting set will have only one element in this case). Without being huge, this does show up as not entirely negligible when profiling some simple stress.
> We can easily know if the query has an {{IN}} and so we can know when only a single {{Clustering}}/{{Slice.Bound}} is built, and we can specialize the implementation in that case to be less wasteful.
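
The general idea of specializing a builder for the single-result case can be sketched as follows. This is only an illustration under stated assumptions: {{ClusteringsBuilderSketch}}, its {{create(boolean hasIn)}} factory and its methods are hypothetical names, clusterings are modelled as plain lists of strings, and none of this is the actual patch or the real {{MultiCBuilder}} API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a multi-clustering builder; not Cassandra code.
abstract class ClusteringsBuilderSketch {
    // Pick the cheap implementation when the query is known to have no IN.
    static ClusteringsBuilderSketch create(boolean hasIn) {
        return hasIn ? new Multi() : new Single();
    }

    // Append one value to every clustering being built.
    abstract ClusteringsBuilderSketch add(String value);

    // Append each alternative (the IN values), multiplying the clusterings.
    abstract ClusteringsBuilderSketch addEach(List<String> values);

    abstract List<List<String>> build();

    // Specialized case: exactly one clustering, so a single list is enough.
    static final class Single extends ClusteringsBuilderSketch {
        private final List<String> elements = new ArrayList<>();

        ClusteringsBuilderSketch add(String value) {
            elements.add(value);
            return this;
        }

        ClusteringsBuilderSketch addEach(List<String> values) {
            throw new UnsupportedOperationException("no IN expected for this builder");
        }

        List<List<String>> build() {
            return Collections.singletonList(elements);
        }
    }

    // General case: keeps a list of partial clusterings and expands it on IN.
    static final class Multi extends ClusteringsBuilderSketch {
        private List<List<String>> rows = new ArrayList<>();

        Multi() {
            rows.add(new ArrayList<>());
        }

        ClusteringsBuilderSketch add(String value) {
            for (List<String> row : rows)
                row.add(value);
            return this;
        }

        ClusteringsBuilderSketch addEach(List<String> values) {
            List<List<String>> expanded = new ArrayList<>();
            for (List<String> row : rows) {
                for (String v : values) {
                    List<String> copy = new ArrayList<>(row);
                    copy.add(v);
                    expanded.add(copy);
                }
            }
            rows = expanded;
            return this;
        }

        List<List<String>> build() {
            return rows;
        }
    }
}
```

In this sketch the caller decides once, from whether the query has an {{IN}}, which implementation to use; the single-clustering variant then allocates one list instead of a list of lists.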


