Posted to user@cassandra.apache.org by Lee Mighdoll <le...@underneath.ca> on 2013/12/18 18:26:20 UTC

how wide to make wide rows in practice?

I think the recommendation once upon a time was to keep wide storage engine
internal rows from growing too large.  e.g. for time series, it was
recommended to partition samples by day or by hour to keep the size
manageable.

What's the current cassandra 2.0 advice on sizing for wide storage engine
rows?  Can we drop the added complexity of managing day/hour partitioning
for time series stores?

And what do you watch out for if the storage engine rows are a bit
uncomfortably large?  Do extra large rows slow the read path at all?  Or
something subtle like added latency from GC pressure at compaction time?
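
For concreteness, the day-partitioning pattern I mean is something like this (a sketch only; the table and column names are made up):

```cql
CREATE TABLE samples (
  sensor_id text,
  day       text,       -- e.g. '2013-12-18'; bounds partition growth
  ts        timestamp,
  value     double,
  PRIMARY KEY ((sensor_id, day), ts)
);
```

Dropping the `day` component from the partition key would simplify writes and reads, but each sensor's partition would then grow without bound.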

Cheers,
Lee

Re: how wide to make wide rows in practice?

Posted by Lee Mighdoll <le...@underneath.ca>.
Hi Rob, thanks for the refresher, and for the issue link (fixed today too,
thanks Sylvain!).

Cheers,
Lee


On Wed, Dec 18, 2013 at 10:47 AM, Robert Coli <rc...@eventbrite.com> wrote:

> On Wed, Dec 18, 2013 at 9:26 AM, Lee Mighdoll <le...@underneath.ca> wrote:
>
>> What's the current cassandra 2.0 advice on sizing for wide storage engine
>> rows?  Can we drop the added complexity of managing day/hour partitioning
>> for time series stores?
>>
>
> "A few hundred megs" at very most is generally
> recommended. in_memory_compaction_limit_in_mb still defaults to 64mb, so
> rows greater than this size are compacted on disk...
>
> Cassandra 2.0 and CQL3 storage don't meaningfully change underlying
> storage assumptions. They just add an abstraction layer on top. Cassandra
> 2.1 moves some of that abstraction down into storage, but most fundamental
> assumptions will still remain the same.
>
> https://issues.apache.org/jira/browse/CASSANDRA-5417
>
> =Rob
>

Re: how wide to make wide rows in practice?

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Dec 18, 2013 at 9:26 AM, Lee Mighdoll <le...@underneath.ca> wrote:

> What's the current cassandra 2.0 advice on sizing for wide storage engine
> rows?  Can we drop the added complexity of managing day/hour partitioning
> for time series stores?
>

"A few hundred megs" at very most is generally
recommended. in_memory_compaction_limit_in_mb still defaults to 64mb, so
rows greater than this size are compacted on disk...
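For reference, the setting in question lives in cassandra.yaml; the default shown below matches the 64 MB figure above (threshold above which a row is compacted via the slower on-disk two-pass path rather than in memory):

```yaml
# cassandra.yaml (Cassandra 1.x / 2.0 era)
# Rows larger than this are compacted on disk instead of in memory.
in_memory_compaction_limit_in_mb: 64
```

Raising it trades heap pressure for faster compaction of large rows, so it's not a substitute for keeping partitions bounded in the first place.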

Cassandra 2.0 and CQL3 storage don't meaningfully change underlying storage
assumptions. They just add an abstraction layer on top. Cassandra 2.1 moves
some of that abstraction down into storage, but most fundamental
assumptions will still remain the same.

https://issues.apache.org/jira/browse/CASSANDRA-5417

=Rob