Posted to commits@cassandra.apache.org by "Stu Hood (JIRA)" <ji...@apache.org> on 2011/04/15 07:27:05 UTC

[jira] [Issue Comment Edited] (CASSANDRA-2319) Promote row index

    [ https://issues.apache.org/jira/browse/CASSANDRA-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020195#comment-13020195 ] 

Stu Hood edited comment on CASSANDRA-2319 at 4/15/11 5:26 AM:
--------------------------------------------------------------

Output from some stress.java runs is attached as version-*.txt:
* version-f: trunk
* version-g: promoted index _without_ LZF, but with type specific compression
* version-g-lzf: promoted index with all the features of 2398

----

A slightly larger scale wide row test is described in promotion.pdf
* 90/10 writes/reads for 6 hours
** appending to 30,000 keys
** reading the tail of the row
* sstables-per-read histogram for a wide-row workload
* trunk (unpromoted) vs 2319 (promoted)
* ~12GB of data, ~3GB of page cache

Two takeaways:
# Promotion allowed one fewer sstable to be accessed per read, on average
** At the end of both runs, the largest sstable was 6GB, and it was the only file large enough for promotion to kick in (for 30,000 keys and column_index_size=64KB, an sstable must exceed 1.92GB before the nested index starts promoting column names)
# The promoted index served 25% more reads over the period covered by the test, likely because each read hit one fewer file
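
The 1.92GB figure above appears to be the key count multiplied by column_index_size, taking 1KB as 1,000 bytes. That reading is an assumption, as is the helper name below. Under that assumption, the threshold is the sstable size at which the average row fills one index block. A minimal sketch of the arithmetic:

```python
def promotion_threshold_bytes(num_keys: int, column_index_size_kb: int) -> int:
    """Hypothetical helper: sstable size at which the average row fills one index block.

    Assumes decimal units (1KB = 1,000 bytes) and rows of roughly equal size.
    """
    return num_keys * column_index_size_kb * 1000


# Values taken from the test description: 30,000 keys, column_index_size=64KB.
threshold = promotion_threshold_bytes(30_000, 64)
print(f"{threshold / 1e9:.2f} GB")
```

With binary units (64KB = 65,536 bytes) the same product is slightly larger, so the exact cutoff depends on how column_index_size is interpreted.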

> Promote row index
> -----------------
>
>                 Key: CASSANDRA-2319
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2319
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>              Labels: index, timeseries
>             Fix For: 1.0
>
>         Attachments: 2319-v1.tgz, promotion.pdf, version-f.txt, version-g-lzf.txt, version-g.txt
>
>
> The row index contains entries for configurably sized blocks of a wide row. For a row of appreciable size, the row index ends up directing the third seek (1. index, 2. row index, 3. content) to a position near the first column of a scan.
> Since the row index is always used for wide rows, and since it contains information that tells us whether or not the 3rd seek is necessary (the column range or name we are trying to slice may not exist in a given sstable), promoting the row index into the sstable index would allow us to drop the maximum number of seeks for wide rows back to 2, and, more importantly, would allow sstables to be eliminated using only the index.
> An example use case that benefits greatly from this change is time series data in wide rows, where data is appended to the beginning or end of the row. Our existing compaction strategy gets lucky and clusters the oldest data in the oldest sstables: for queries to recently appended data, we would be able to eliminate wide rows using only the sstable index, rather than needing to seek into the data file to determine that it isn't interesting. For narrow rows, this change would have no effect, as they will not reach the threshold for indexing anyway.
> A first-cut design for this change would look very similar to the file format design proposed on #674 (http://wiki.apache.org/cassandra/FileFormatDesignDoc): row keys clustered, column names clustered, and offsets clustered and delta encoded.
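
The index-only elimination described in the issue can be illustrated with a toy model. This is a sketch under stated assumptions, not Cassandra's on-disk format or API: `IndexEntry`, `may_contain`, and `sstables_to_read` are hypothetical names, column names are compared as plain strings, and each sstable index is modelled as an in-memory dict.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class IndexEntry:
    """Hypothetical promoted index entry for one row in one sstable."""
    row_offset: int
    # (first_column, last_column) for each column block of the row
    blocks: List[Tuple[str, str]]


def may_contain(entry: IndexEntry, start: str, end: str) -> bool:
    """True if any block's column range overlaps the slice [start, end]."""
    return any(first <= end and start <= last for first, last in entry.blocks)


def sstables_to_read(indexes: Dict[str, Dict[str, IndexEntry]],
                     key: str, start: str, end: str) -> List[str]:
    """Return the sstables that cannot be ruled out using the index alone."""
    hits = []
    for name, index in indexes.items():
        entry = index.get(key)
        if entry is not None and may_contain(entry, start, end):
            hits.append(name)
    return hits


# Time series row appended at the end: older columns sit in the older sstable.
indexes = {
    "old": {"sensor-1": IndexEntry(0, [("t0000", "t0499"), ("t0500", "t0999")])},
    "new": {"sensor-1": IndexEntry(0, [("t1000", "t1499"), ("t1500", "t1999")])},
}

# A read of the tail of the row only needs the newer sstable.
print(sstables_to_read(indexes, "sensor-1", "t1900", "t1999"))
```

In this model the "old" sstable is skipped without touching its data file, which is the saving the issue attributes to promoting the row index.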
