Posted to user@cassandra.apache.org by Ravikumar Govindarajan <ra...@gmail.com> on 2012/09/12 13:39:27 UTC

Composite Column Types Storage

We have a requirement where the storage format needs to use a composite
column that has 2 parts: a string and an integer.

Ex:

<Some-String> [column-name]
                     <id1>,<id2>,<id3>..... [composite column]

I did not choose a super-column because the number of sub-columns could
quickly grow large. However, I am not able to find info on how composite
columns are persisted to disk.

Is every <string>/<id> combination stored separately on disk, or is there
some form of composite-column-name sharing implemented to reduce disk space?

Regards,
Ravi

Re: Composite Column Types Storage

Posted by Sylvain Lebresne <sy...@datastax.com>.
> As I understand from the link below, burning column index-info onto the
> sstable index files will not only eliminate sstables but also reduce disk
> seeks from 3 to 2 for wide rows.

Yes.

> Shouldn't we be wary of the spike in heap usage by promoting column indexes
> to index file?

If you're talking about the index files getting bigger, that's not
really a problem per se: mmapped files are not part of the heap and
it's all dealt with by the file system.
Now it's true that the column index is also promoted in the index
summary that is loaded in memory. However, how much is loaded in this
summary is still configurable, so overall that shouldn't be a problem
either (FYI, https://issues.apache.org/jira/browse/CASSANDRA-4478 is
relevant to that discussion too).
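
As a rough illustration of why a configurable sampling rate bounds memory
use, the sketch below keeps only every Nth index entry in memory and scans
the remainder of one interval from the full index. The names build_summary,
lookup, and interval are hypothetical and only mirror the idea; this is not
Cassandra's implementation.

```python
from bisect import bisect_right

def build_summary(index_keys, interval):
    """Keep every `interval`-th key (and its position) in memory."""
    return [(key, pos) for pos, key in enumerate(index_keys) if pos % interval == 0]

def lookup(index_keys, summary, key, interval):
    """Find `key` by binary-searching the summary, then scanning one interval."""
    sampled = [k for k, _ in summary]
    slot = bisect_right(sampled, key) - 1
    if slot < 0:
        return None  # key sorts before the first index entry
    start = summary[slot][1]
    for pos in range(start, min(start + interval, len(index_keys))):
        if index_keys[pos] == key:
            return pos
    return None

# Stand-in for the sorted entries of an on-disk index file (assumed data).
index_keys = [f"key{n:04d}" for n in range(1000)]
small = build_summary(index_keys, interval=128)  # more memory, shorter scans
large = build_summary(index_keys, interval=512)  # less memory, longer scans
```

A larger interval shrinks the in-memory summary at the cost of a longer scan
per lookup, which is the trade-off being discussed.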

--
Sylvain

Re: Composite Column Types Storage

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
As I understand from the link below, burning column index-info onto the
sstable index files will not only eliminate sstables but also reduce disk
seeks from 3 to 2 for wide rows.

Our index files are always mmapped, so there is only one random seek for a
named column query. I think that is a wonderful improvement.

Shouldn't we be wary of the spike in heap usage by promoting column indexes
to index file?

Wouldn't it be nice to have, say, every 128th entry written out to disk while
loading only every 512th entry into memory during start-up, just as a
balancing factor?

--
Ravi

On Tue, Sep 18, 2012 at 4:47 PM, Sylvain Lebresne <sy...@datastax.com> wrote:

> > Range queries do not use bloom filters. It holds good for composite-columns
> > also right?
>
> Since I assume you are referring to column's bloom filters (key's bloom filters
> are always used) then yes, that holds good for composite columns. Currently,
> composite column name are completely opaque to the storage engine.
>
> > <Column-part-1> alone could have gone into the bloom-filter, speeding up my
> > queries really effectively
>
> True, though https://issues.apache.org/jira/browse/CASSANDRA-2319 (in 1.2 only
> however) should help quite a lot here. Basically it will allow to skip the
> sstable based on the column index. Granted, this is less fined grained than a
> bloom filter (though on the other side there is no false positive), but I
> suspect that in most real life workload it won't be too much worse.
>
> --
> Sylvain
>

Re: Composite Column Types Storage

Posted by Sylvain Lebresne <sy...@datastax.com>.
> Range queries do not use bloom filters. It holds good for composite-columns
> also right?

Since I assume you are referring to the column bloom filters (the key bloom
filters are always used), then yes, that holds for composite columns. Currently,
composite column names are completely opaque to the storage engine.

> <Column-part-1> alone could have gone into the bloom-filter, speeding up my
> queries really effectively

True, though https://issues.apache.org/jira/browse/CASSANDRA-2319 (in 1.2 only,
however) should help quite a lot here. Basically, it will allow skipping the
sstable based on the column index. Granted, this is less fine-grained than a
bloom filter (though on the other hand there are no false positives), but I
suspect that in most real-life workloads it won't be too much worse.
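
A minimal sketch of that idea, assuming a column index that records the
first and last column name of each block in an sstable. The helper
can_skip_sstable and the tuple layout are hypothetical and chosen only for
illustration; the actual on-disk format is not shown here.

```python
def can_skip_sstable(column_index, start, end):
    """Return True if no index block overlaps the slice [start, end]."""
    for first_name, last_name in column_index:
        if first_name <= end and start <= last_name:
            return False  # this block may contain columns in the slice
    return True

# Each entry is (first column name, last column name) of one block.
# Composite names are modelled as (part1, part2) tuples (an assumption).
column_index = [(("a", 1), ("a", 900)), (("c", 1), ("c", 500))]

# No block covers any ("b", *) name, so the sstable can be skipped.
skip_b = can_skip_sstable(column_index, ("b", 0), ("b", 10**9))
# The first block overlaps ("a", 100)..("a", 200), so it must be read.
skip_a = can_skip_sstable(column_index, ("a", 100), ("a", 200))
```

The check works per block rather than per column name, which is why it is
coarser than a bloom filter.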

--
Sylvain

Re: Composite Column Types Storage

Posted by aaron morton <aa...@thelastpickle.com>.
> It is slowly dawning on me that I need a super-column to use column blooms effectively and at the same time don't want the entire sub-column list deserialized. 
Queries by name use the row level bloom filter, regardless of the CF type. 

> In fact, for my use-case I also do not need a column sampling index. Rather I would much prefer a multi-level skip-list
Are you thinking about performance or functionality? If it's performance, do you have an example of something that needs optimisation?

> Is there a way to customize how cassandra writes/reads it's key/column indexes to SSTables.
No.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/09/2012, at 2:44 AM, Ravikumar Govindarajan <ra...@gmail.com> wrote:

> Yes Aaron, I was not clear about Bloom Filters. I was thinking about the column bloom filters when I specify an absolute value for Part1 of the composite column and a start/end value for Part2 of the composite column
> 
> It is slowly dawning on me that I need a super-column to use column blooms effectively and at the same time don't want the entire sub-column list deserialized. 
> 
> In fact, for my use-case I also do not need a column sampling index. Rather I would much prefer a multi-level skip-list
> 
> Is there a way to customize how cassandra writes/reads it's key/column indexes to SSTables. Any hooks/API that is available as of now should be greatly helpful
> 
> On Fri, Sep 14, 2012 at 10:33 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> Range queries do not use bloom filters. 
> Are you talking about row range queries ? Or a slice of columns in a row ? 
> 
> If you are getting a slice of columns from a single row, a bloom filter is used to locate the row. 
> If you are getting a slice of columns from a range of rows, the bloom filter is used to locate the first row. After that is a scan. 
> 
> There are also row level bloom filters for columns on a row. These are used when you columns by names. If you are doing a slice with a start the bloom filter is not used, instead the row level column index is used (if present). 
> 
> Hope that helps. 
> 
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 13/09/2012, at 2:30 AM, Ravikumar Govindarajan <ra...@gmail.com> wrote:
> 
>> Thanks for the clarification. Even though compression solves disk space issue, we might still have Memtable bloat right?
>> 
>> There is another issue to be handled for us. The queries are always going to be range queries with absolute match on part1 and range on part 2 of the composite columns
>> 
>> Ex: Query <some-key> <Column-part-1> <Start-Id-part-2> <Limit> 
>> 
>> Range queries do not use bloom filters. It holds good for composite-columns also right? I believe I will end up writing BF bytes only to skip it later.
>> 
>> If sharing had been possible, then <Column-part-1> alone could have gone into the bloom-filter, speeding up my queries really effectively.
>> 
>> But as I understand, there are many levels of nesting possible in a composite type and casing at every level is a big task
>> 
>> May be casing for the top-level or the first-part should be a good start?
>> 
>> --
>> Ravi
>> 
>> On Wed, Sep 12, 2012 at 5:46 PM, Sylvain Lebresne <sy...@datastax.com> wrote:
>> > Is every <string>/<id> combination stored separately in disk
>> 
>> Yes, each combination is stored separately on disk (the storage engine
>> itself doesn't have special casing for composite column, at least not
>> yet). But as far as disk space is concerned, I suspect that sstable
>> compression makes this largely a non issue.
>> 
>> --
>> Sylvain
>> 
> 
> 


Re: Composite Column Types Storage

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
Yes Aaron, I was not clear about Bloom Filters. I was thinking about the
column bloom filters when I specify an absolute value for Part1 of the
composite column and a start/end value for Part2 of the composite column.

It is slowly dawning on me that I need a super-column to use column blooms
effectively, but at the same time I don't want the entire sub-column list
deserialized.

In fact, for my use-case I also do not need a column sampling index. Rather,
I would much prefer a multi-level skip-list.

Is there a way to customize how Cassandra writes/reads its key/column
indexes to SSTables? Any hooks/API available as of now would be
greatly helpful.


Re: Composite Column Types Storage

Posted by aaron morton <aa...@thelastpickle.com>.
> Range queries do not use bloom filters. 
Are you talking about row range queries? Or a slice of columns in a row?

If you are getting a slice of columns from a single row, a bloom filter is used to locate the row.
If you are getting a slice of columns from a range of rows, the bloom filter is used to locate the first row. After that it is a scan.

There are also row-level bloom filters for the columns in a row. These are used when you query columns by name. If you are doing a slice with a start, the bloom filter is not used; instead the row-level column index is used (if present).
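
To see why a bloom filter helps only for queries by name, here is a toy
filter in Python. It is a sketch under assumptions: the class, its sizes,
and the hashing scheme are made up for illustration and are unrelated to
Cassandra's own filter. The point is that membership can be tested only for
an exact name, so a slice that supplies just a start has nothing to hash.

```python
import hashlib

class BloomFilter:
    """Tiny bloom filter: answers 'definitely absent' or 'possibly present'."""

    def __init__(self, size_bits=1024, hashes=3):
        self.size_bits = size_bits
        self.hashes = hashes
        self.bits = 0  # bit set stored in a single Python integer

    def _positions(self, name: bytes):
        # Derive `hashes` bit positions from the exact column name.
        for seed in range(self.hashes):
            digest = hashlib.sha256(bytes([seed]) + name).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, name: bytes):
        for pos in self._positions(name):
            self.bits |= 1 << pos

    def might_contain(self, name: bytes) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(name))

# Column names here are arbitrary byte strings (an assumption).
row_filter = BloomFilter()
for name in (b"inbox:5", b"inbox:9", b"sent:1"):
    row_filter.add(name)

present = row_filter.might_contain(b"inbox:9")  # an added name is never missed
```

A name that was never added usually tests negative, but it can test
positive by chance; a range of names cannot be tested at all.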

Hope that helps. 


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com



Re: Composite Column Types Storage

Posted by Ravikumar Govindarajan <ra...@gmail.com>.
Thanks for the clarification. Even though compression solves the disk space
issue, we might still have Memtable bloat, right?

There is another issue for us to handle. The queries are always going
to be range queries with an exact match on part 1 and a range on part 2 of the
composite columns.

Ex: Query <some-key> <Column-part-1> <Start-Id-part-2> <Limit>
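
This access pattern can be sketched in Python as follows, assuming the
columns of a row are kept sorted by composite name and modelled as
(part1, part2) tuples. The function slice_columns and the sample data are
hypothetical, for illustration only.

```python
from bisect import bisect_left

def slice_columns(columns, part1, start_id, limit):
    """Return up to `limit` columns with the given part1 and part2 >= start_id."""
    begin = bisect_left(columns, (part1, start_id))
    result = []
    for name in columns[begin:begin + limit]:
        if name[0] != part1:
            break  # left the part1 prefix, stop scanning
        result.append(name)
    return result

# Columns in one row, sorted by composite name (part1, part2).
columns = sorted([("inbox", 5), ("inbox", 9), ("inbox", 12), ("sent", 1), ("sent", 7)])

page = slice_columns(columns, "inbox", 6, 10)
```

The lookup relies only on the sort order of the names, not on any
per-name membership test.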

Range queries do not use bloom filters. That holds for composite columns
as well, right? I believe I will end up writing BF bytes only to skip them later.

If sharing had been possible, then <Column-part-1> alone could have gone
into the bloom filter, speeding up my queries really effectively.

But as I understand it, many levels of nesting are possible in a
composite type, and special-casing every level is a big task.

Maybe special-casing the top level or the first part would be a good start?

--
Ravi


Re: Composite Column Types Storage

Posted by Sylvain Lebresne <sy...@datastax.com>.
> Is every <string>/<id> combination stored separately in disk

Yes, each combination is stored separately on disk (the storage engine
itself doesn't have special casing for composite columns, at least not
yet). But as far as disk space is concerned, I suspect that sstable
compression makes this largely a non-issue.
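
To make the per-column cost concrete, here is a rough Python sketch of how
a two-part composite name could be laid out, assuming the CompositeType
layout of a 2-byte length, the component bytes, and a 1-byte
end-of-component marker for each part. The helper encode_composite and the
4-byte integer width are illustrative assumptions, not Cassandra's code.

```python
import struct

def encode_composite(text: str, ident: int) -> bytes:
    """Serialize a (string, int) pair as one composite column name (illustrative)."""
    parts = [text.encode("utf-8"), struct.pack(">i", ident)]
    out = bytearray()
    for part in parts:
        out += struct.pack(">H", len(part))  # 2-byte big-endian length
        out += part                          # component bytes
        out += b"\x00"                       # end-of-component marker
    return bytes(out)

names = [encode_composite("some-string", i) for i in (1, 2, 3)]

# Bytes that hold the string component: length + text + marker.
prefix = names[0][: 2 + len("some-string") + 1]
# Every name carries its own full copy of the string component.
repeated = all(name.startswith(prefix) for name in names)
sizes = [len(name) for name in names]
```

Since consecutive names share a long common prefix, this kind of data tends
to compress well, which fits the point about sstable compression.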

--
Sylvain