Posted to user@cassandra.apache.org by Andreas Finke <An...@solvians.com> on 2015/02/06 01:15:38 UTC

Writing the same column frequently - anti pattern?

Hi,

We are currently writing the same column within a row multiple times (up to 10 times a second). I am familiar with the concept of tombstones in SSTables. My question is this: I assume that in most cases the column still resides in the memtable when it gets overwritten. So I assume that in this particular case no tombstone is set; instead the column is replaced in memory, and only the 'newest' version is flushed to disk.

Is this assumption correct? Or is writing the same column this frequently an anti-pattern?

I am thankful for any input.

Regards
Andi


Re: Writing the same column frequently - anti pattern?

Posted by Robert Coli <rc...@eventbrite.com>.
On Thu, Feb 5, 2015 at 4:15 PM, Andreas Finke <An...@solvians.com>
wrote:

>  we are currently writing the same column within a row multiple times (up
> to 10 times a second). I am familiar with the concept of tombstones in
> SSTables. My question is: I assume that in our case in most cases when a
> column gets overwritten it still resides in the memtable. So I assume for
> that particular case no tombstone is set but the column is replaced in
> memory and then the 'newest' version is flushed to disk.
>

Memtables are periodically flushed; some percentage of these "bursts" will
always cross a flush boundary, leading to at least two writes to disk for
the same exact value. This is without considering any concerns of the class
that Jonathan Haddad mentions.

I personally would be conceptually uncomfortable with such a design, as it
has a strong smell of Doing It Wrong, but you should be able to design a
test which illustrates how badly (or not) it bloats your SSTables with
garbage in actual operation.
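
As a starting point for such a test, here is a minimal back-of-the-envelope
sketch in Python. It is not Cassandra code. It assumes, as a simplification,
that the memtable is flushed at a fixed interval (real flushes are triggered
by memory and commit log pressure), and that within one flush window only the
newest write of a column survives. The function name and the numbers are
hypothetical.

```python
def count_flushed_versions(write_times_ms, flush_interval_ms):
    """Count how many flushed SSTables end up holding a copy of one column.

    Simplifying assumption: the memtable is flushed every flush_interval_ms,
    and all writes to the column inside one window collapse to one version.
    """
    windows = {t // flush_interval_ms for t in write_times_ms}
    return len(windows)


FLUSH_INTERVAL_MS = 600_000  # hypothetical: one flush every 10 minutes

# A burst of 10 writes over one second that straddles the flush at t=600s.
straddling_burst = [599_500 + 100 * i for i in range(10)]
straddling = count_flushed_versions(straddling_burst, FLUSH_INTERVAL_MS)

# The same burst landing entirely inside one flush window.
inside_burst = [100_000 + 100 * i for i in range(10)]
inside = count_flushed_versions(inside_burst, FLUSH_INTERVAL_MS)

print(straddling, inside)
```

Under these assumptions the straddling burst leaves two versions on disk until
compaction merges them, while the burst inside a single window leaves one.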

> Or is writing the same column an anti-pattern?
>

Writing the same column 10 times in a second is likely to be an
anti-pattern for a log structured data-store with immutable data files.
Consider a memory oriented database?

=Rob

Re: Writing the same column frequently - anti pattern?

Posted by Jens Rantil <je...@tink.se>.
Hi,

If the writes are coming from the same machine, you could potentially use
request collapsing
<https://github.com/Netflix/Hystrix/wiki/How-To-Use#request-collapsing>
to avoid the duplicate writes.
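
To illustrate the idea (this is not the Hystrix API, just a minimal Python
sketch of client-side write coalescing), the hypothetical CoalescingWriter
below keeps only the latest value per (row, column) and sends one write per
key when flushed. The sink callable, the row name and the column name are
assumptions made for the example.

```python
class CoalescingWriter:
    """Buffers writes and keeps only the latest value per (row, column)."""

    def __init__(self, sink):
        self._sink = sink    # callable(row, column, value) doing the real write
        self._pending = {}   # (row, column) -> latest buffered value

    def write(self, row, column, value):
        # A later write to the same key simply replaces the buffered one.
        self._pending[(row, column)] = value

    def flush(self):
        """Send one write per buffered key; return the number of writes sent."""
        batch, self._pending = self._pending, {}
        for (row, column), value in batch.items():
            self._sink(row, column, value)
        return len(batch)


sent = []  # stands in for the database in this sketch
writer = CoalescingWriter(lambda row, col, val: sent.append((row, col, val)))

# Ten updates to the same column arrive before the next flush.
for price in range(10):
    writer.write("instrument-1", "last_price", price)

flushed = writer.flush()
```

In a real client, flush() would typically be called on a timer (for example
once per second), so ten updates per second would become one write per second
at the cost of up to one interval of added latency.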

Just an idea,
Jens


Re: Writing the same column frequently - anti pattern?

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Well... this is actually only true if your server times are perfectly in
sync. The reality is that if one server is 50ms ahead and another is 50ms
behind, you will actually end up with unpredictable results.
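
A small Python sketch of this effect, assuming write timestamps are assigned
by the servers that receive the writes and that conflicting versions are
resolved by keeping the one with the highest timestamp. The function names
and the concrete times are hypothetical.

```python
def assign_timestamp(real_time_ms, clock_offset_ms):
    """Timestamp a server stamps on a write, given its clock offset."""
    return real_time_ms + clock_offset_ms


def last_write_wins(cells):
    """Return the value of the (timestamp, value) cell with the highest timestamp."""
    return max(cells, key=lambda cell: cell[0])[1]


# "first" is written at real time 1000 ms via a server running 50 ms ahead.
first = (assign_timestamp(1_000, +50), "first")
# "second" is written 20 ms later via a server running 50 ms behind.
second = (assign_timestamp(1_020, -50), "second")

winner = last_write_wins([first, second])

# With perfectly synced clocks the later write wins, as expected.
synced_winner = last_write_wins(
    [(assign_timestamp(1_000, 0), "first"), (assign_timestamp(1_020, 0), "second")]
)
```

Here the write that actually happened first ends up with the higher timestamp
(1050 versus 970), so it shadows the later one. With updates only 100ms apart
and a skew of this size, which value survives depends on which server handled
each write.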

On Thu Feb 05 2015 at 4:22:43 PM Philip Thompson <
philip.thompson@datastax.com> wrote:

> You are correct. If an overwrite occurs while the original is still in the
> memtable, only the newest will be flushed to disk.

Re: Writing the same column frequently - anti pattern?

Posted by Philip Thompson <ph...@datastax.com>.
You are correct. If an overwrite occurs while the original is still in the
memtable, only the newest will be flushed to disk.
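
A toy model of this behaviour in Python, for illustration only. It is not how
Cassandra implements its memtable; ToyMemtable and the row and column names
are hypothetical, and the model assumes each write carries a timestamp and
that the highest timestamp wins.

```python
class ToyMemtable:
    """In-memory map that keeps one cell per (row, column)."""

    def __init__(self):
        self._cells = {}  # (row, column) -> (timestamp, value)

    def put(self, row, column, value, timestamp):
        key = (row, column)
        current = self._cells.get(key)
        # Overwrite in place when the incoming write is at least as new.
        if current is None or timestamp >= current[0]:
            self._cells[key] = (timestamp, value)

    def flush(self):
        """Return the sorted cells a new SSTable would contain, then reset."""
        flushed = sorted(self._cells.items())
        self._cells = {}
        return flushed


memtable = ToyMemtable()

# Ten overwrites of the same column arrive before the memtable is flushed.
for ts in range(10):
    memtable.put("row-1", "col-a", f"v{ts}", timestamp=ts)

sstable = memtable.flush()
```

After the ten overwrites, the flushed result holds a single cell for
("row-1", "col-a") with the newest value, and no tombstone is involved.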
