You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "Jason Brown (JIRA)" <ji...@apache.org> on 2017/11/08 22:42:00 UTC
[jira] [Comment Edited] (CASSANDRA-9988) Introduce leaf-only iterator

    [ https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244848#comment-16244848 ] 

Jason Brown edited comment on CASSANDRA-9988 at 11/8/17 10:41 PM:
------------------------------------------------------------------

Overall, I think the patch is pretty solid. A couple of comments and nits:

- is {{BTreeSearchIterator}} necessary anymore? It's just an empty interface now
- just to confirm, {{FullBTreeSearchIterator}} is just a rename of the previous {{BTreeSearchIterator}} class? My diff'ing seems to confirm that, but I'd like the extra level of sanity checking.

nits:
- {{BTreeSet}} switch imports back to original format
- in {{LeafBTreeSearchIterator}}, move {{#compareToFirst}} and {{#searchNext}} closer to the methods from where they are called. Or better yet, since each is only called from one place, just inline the functionality into those calling methods.

My biggest problem is with the microbench. I know this ticket has gone through a bunch on contortions over it's life, so maybe a little confusion is reasonable. However, what I was really hoping to see was a comparison of the leaf vs full (tree) iterators, as that's what this whole ticket is about. I took liberty of reworking the mircobench to explicitly test the differences bewtween leaf and tree iterators, as well as reduce of benchmark executions (due to every value between 0 and 31, inclusive, being a param to the {{targetIdx}} field. (I simply removed about 2/3 of those values, as I think we only really care about testing reading from the beginning, middle, and end of the btree entries; anything else just bloats total test execution time). Further, as {{#iteratorTree()}} and {{#multiSearchFound}} don't use the {{targetIdx}} param, I think they can be moved to another class as they would be executed once for each of the entries in {{targetIdx}} (I'm not awatre of a way to ignore that in JMH).

My version of your branch with the updated microbench is [here|https://github.com/jasobrown/cassandra/tree/9988]

EDIT: forgot to mention that with the updated microbench, the leaf iterator was anywhere from 3-20% faster, depending on the test.


was (Author: jasobrown):
Overall, I think the patch is pretty solid. A couple of comments and nits:

- is {{BTreeSearchIterator}} necessary anymore? It's just an empty interface now
- just to confirm, {{FullBTreeSearchIterator}} is just a rename of the previous {{BTreeSearchIterator}} class? My diff'ing seems to confirm that, but I'd like the extra level of sanity checking.

nits:
- {{BTreeSet}} switch imports back to original format
- in {{LeafBTreeSearchIterator}}, move {{#compareToFirst}} and {{#searchNext}} closer to the methods from where they are called. Or better yet, since each is only called from one place, just inline the functionality into those calling methods.

My biggest problem is with the microbench. I know this ticket has gone through a bunch on contortions over it's life, so maybe a little confusion is reasonable. However, what I was really hoping to see was a comparison of the leaf vs full (tree) iterators, as that's what this whole ticket is about. I took liberty of reworking the mircobench to explicitly test the differences bewtween leaf and tree iterators, as well as reduce of benchmark executions (due to every value between 0 and 31, inclusive, being a param to the {{targetIdx}} field. (I simply removed about 2/3 of those values, as I think we only really care about testing reading from the beginning, middle, and end of the btree entries; anything else just bloats total test execution time). Further, as {{#iteratorTree()}} and {{#multiSearchFound}} don't use the {{targetIdx}} param, I think they can be moved to another class as they would be executed once for each of the entries in {{targetIdx}} (I'm not awatre of a way to ignore that in JMH).

My version of your branch with the updated microbench is [here|https://github.com/jasobrown/cassandra/tree/9988]

> Introduce leaf-only iterator
> ----------------------------
>
>                 Key: CASSANDRA-9988
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Benedict
>            Assignee: Jay Zhuang
>            Priority: Minor
>              Labels: patch
>             Fix For: 4.0
>
>         Attachments: 9988-3tests.png, 9988-data.png, 9988-result.png, 9988-result2.png, 9988-result3.png, 9988-test-result-expsearch.xlsx, 9988-test-result-raw.png, 9988-test-result.xlsx, 9988-test-result3.png, 9988-trunk-new-update.txt, 9988-trunk-new.txt, trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf page. In this case it _may_ be more efficient to specialise our iterator.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org