Posted to user@cassandra.apache.org by Kevin Burton <bu...@spinn3r.com> on 2015/01/23 18:59:43 UTC

Disabling the write ahead log with 2 data centers?

The WAL (and WALs in general) imposes a performance overhead.

If one were to just take a machine out of the cluster, permanently, whenever
it crashes, you could quickly get all the shards back up to N replicas.

So realistically, running with a WAL is somewhat redundant.

ESPECIALLY when you have 2 data centers at 3 replicas in each datacenter
(for a total of 6 replicas).

I think this would only be about a 15% performance overhead.

Additionally, on flash, if you lay out the SSTables properly, you arguably
don’t need a WAL because your SSTable itself can be a WAL and you could
run without memtables. This has been proposed in a number of situations,
especially on something like FusionIO …

Thoughts?

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>

Re: Disabling the write ahead log with 2 data centers?

Posted by Robert Coli <rc...@eventbrite.com>.
On Fri, Jan 23, 2015 at 9:59 AM, Kevin Burton <bu...@spinn3r.com> wrote:

> The WAL (and WALs in general) imposes a performance overhead.
>
> If one were to just take a machine out of the cluster, permanently, whenever
> it crashes, you could quickly get all the shards back up to N replicas.
>

It depends a bit on your consistency level and frequency of repair, and (as
Rusty says) how much you care about your data. But why take the machine
out? Just have it rejoin... taking it out means you lower the "unique
replica count" by one and lose any data that for whatever reason was only
propagated there.
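
To make the consistency-level point concrete, here is a minimal Python sketch
(not from the thread) of how many replicas must acknowledge a write for the
2 x 3 layout discussed here. The data center names and the function name are
made up for illustration; the quorum formula, floor(RF / 2) + 1, is the one
Cassandra documents. With the commit log disabled, an acknowledged write is
only guaranteed to be in the memtables of that many nodes until they flush.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a quorum write: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


# Layout from the thread: 2 data centers, 3 replicas each.
# The names "dc1" and "dc2" are hypothetical.
replicas_per_dc = {"dc1": 3, "dc2": 3}

total_rf = sum(replicas_per_dc.values())

# QUORUM counts replicas across all data centers.
quorum_acks = quorum(total_rf)

# LOCAL_QUORUM counts only replicas in the coordinator's data center.
local_quorum_acks = quorum(replicas_per_dc["dc1"])

print(total_rf, quorum_acks, local_quorum_acks)  # 6 4 2
```

So a LOCAL_QUORUM write may be acknowledged after reaching only 2 of the 6
replicas, which is why the answer depends on the consistency level in use.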

But yes, the overhead of the commit log is why the durable_writes = false
keyspace option exists. Some people are OK with losing everything
that was in a memtable on only the single crashed node; most are not, and
should not disable the commit log.
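
For reference, durable_writes is set per keyspace in CQL. The following is a
rough Python sketch that only builds the statement text; it does not connect
to a cluster. The helper name, the keyspace name "cache_ks" and the data
center names are assumptions for illustration, not anything from this thread.

```python
def create_keyspace_cql(name, replicas_per_dc, durable_writes=True):
    """Build a CREATE KEYSPACE statement (hypothetical helper, text only)."""
    # Render the per-data-center replica counts, e.g. 'dc1': 3, 'dc2': 3
    opts = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE {name} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {opts}}} "
        # durable_writes = false makes writes to this keyspace skip the commit log
        f"AND durable_writes = {str(durable_writes).lower()};"
    )


# Hypothetical cache-style keyspace with 3 replicas in each of 2 data centers.
statement = create_keyspace_cql("cache_ks", {"dc1": 3, "dc2": 3}, durable_writes=False)
print(statement)
```

Because the setting is per keyspace, it can be limited to data you can afford
to lose while other keyspaces keep the commit log.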

=Rob

Re: Disabling the write ahead log with 2 data centers?

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
Well... it depends. Are you saying that whenever a machine dies, for any
reason, you'd bootstrap a new one in its place?  Or do you just not care
about the data?

There are cases where it might be ok (if you're using Cassandra as a cache)
but if it's your source of truth I think this is likely to bite you more
than it's likely to be of benefit.
