Posted to user@cassandra.apache.org by Stefan Kaufmann <st...@gmail.com> on 2010/08/12 17:30:07 UTC

Data Distribution / Replication

Hello again,

Over the last days I ran several tests with Cassandra and learned quite a few things.

Of course, there are still plenty of things I need to
understand. One of them is how data replication works.
For my testing:
1. I set the replication factor to 3, started with 1 active node (the
seed) and inserted some test keys.
2. I started 2 more nodes, which joined the cluster.
3. I waited for the data to replicate, which didn't happen.
4. I inserted more keys, and it looked like they were distributed to
all three nodes.

So here is my question:
How can I ensure that every key exists on at least three nodes, so
that when I start with one node and later join 2 more, the data gets
distributed? Shouldn't this happen automatically? Am I just not
patient enough?
How is this handled in production environments? For instance, if one
node has a hardware failure and is exchanged for a new, blank one, how
does that node get its data back?

I searched the mailing list; the only answer I found was to copy the
data manually. Is this true?

I'm currently using Cassandra 0.6.4 in our testing environment. I
chose the RackUnawareStrategy.

Stefan

Re: Data Distribution / Replication

Posted by Benjamin Black <b...@b3k.us>.
#546
#1076
#1169
#1377

etc...

On Sat, Aug 14, 2010 at 12:05 PM, Bill de hÓra <bi...@dehora.net> wrote:
> That data suggests the inbuilt tools are a hazard and manual workarounds
> less so.
>
> Can you point me at the bugs?
>
> Bill
>
>
> On Fri, 2010-08-13 at 20:30 -0700, Benjamin Black wrote:
>> Number of bugs I've hit doing this with scp: 0
>> Number of bugs I've hit with streaming: 2 (and others found more)
>>
>> Also easier to monitor progress, manage bandwidth, etc.  I just prefer
>> using specialized tools that are really good at specific things.  This
>> is such a case.
>>
>> b
>>
>> On Fri, Aug 13, 2010 at 2:05 PM, Bill de hÓra <bi...@dehora.net> wrote:
>> > On Fri, 2010-08-13 at 09:51 -0700, Benjamin Black wrote:
>> >
>> >> My recommendation is to leave Autobootstrap disabled, copy the
>> >> datafiles over, and then run cleanup.  It is faster and more reliable
>> >> than streaming, in my experience.
>> >
>> > What is less reliable about streaming?
>> >
>> > Bill
>> >
>> >
>
>
>

Re: Data Distribution / Replication

Posted by Bill de hÓra <bi...@dehora.net>.
That data suggests the inbuilt tools are a hazard and manual workarounds
less so. 

Can you point me at the bugs?

Bill


On Fri, 2010-08-13 at 20:30 -0700, Benjamin Black wrote:
> Number of bugs I've hit doing this with scp: 0
> Number of bugs I've hit with streaming: 2 (and others found more)
> 
> Also easier to monitor progress, manage bandwidth, etc.  I just prefer
> using specialized tools that are really good at specific things.  This
> is such a case.
> 
> b
> 
> On Fri, Aug 13, 2010 at 2:05 PM, Bill de hÓra <bi...@dehora.net> wrote:
> > On Fri, 2010-08-13 at 09:51 -0700, Benjamin Black wrote:
> >
> >> My recommendation is to leave Autobootstrap disabled, copy the
> >> datafiles over, and then run cleanup.  It is faster and more reliable
> >> than streaming, in my experience.
> >
> > What is less reliable about streaming?
> >
> > Bill
> >
> >



Re: Data Distribution / Replication

Posted by Benjamin Black <b...@b3k.us>.
Number of bugs I've hit doing this with scp: 0
Number of bugs I've hit with streaming: 2 (and others found more)

Also easier to monitor progress, manage bandwidth, etc.  I just prefer
using specialized tools that are really good at specific things.  This
is such a case.
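
For instance (hosts, paths and the limit here are illustrative, not
from the thread), rsync covers both points out of the box:

    # Copy the SSTables with visible progress and bandwidth capped at ~8 MB/s.
    rsync -av --progress --bwlimit=8192 \
        10.0.0.2:/var/lib/cassandra/data/Keyspace1/ \
        /var/lib/cassandra/data/Keyspace1/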

b

On Fri, Aug 13, 2010 at 2:05 PM, Bill de hÓra <bi...@dehora.net> wrote:
> On Fri, 2010-08-13 at 09:51 -0700, Benjamin Black wrote:
>
>> My recommendation is to leave Autobootstrap disabled, copy the
>> datafiles over, and then run cleanup.  It is faster and more reliable
>> than streaming, in my experience.
>
> What is less reliable about streaming?
>
> Bill
>
>

Re: Data Distribution / Replication

Posted by Bill de hÓra <bi...@dehora.net>.
On Fri, 2010-08-13 at 09:51 -0700, Benjamin Black wrote:

> My recommendation is to leave Autobootstrap disabled, copy the
> datafiles over, and then run cleanup.  It is faster and more reliable
> than streaming, in my experience.

What is less reliable about streaming?

Bill


Re: Data Distribution / Replication

Posted by Benjamin Black <b...@b3k.us>.
On Fri, Aug 13, 2010 at 10:13 PM, Stefan Kaufmann <st...@gmail.com> wrote:
>> My recommendation is to leave Autobootstrap disabled, copy the
>> datafiles over, and then run cleanup.  It is faster and more reliable
>> than streaming, in my experience.
>
> I thought about copying the data manually. However, if I have a running
> environment and add a node (or replace a broken one), how do I determine
> from which node I have to copy the data?
>

A node's range starts at the previous token on the ring.  If you have
tokens 0 and 2, the node with token 2 owns the range (0, 2].  If you
add a node with token 1, it takes its range (0, 1] from the node with
token 2, the next token on the ring.  These are the basics of ring
operations.
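
As a sketch (host and tokens are made up for illustration), you can
read the token order off any live node and work out the source by hand:

    # List tokens and their owners (0.6 nodetool, default JMX port 8080).
    nodetool -host 10.0.0.1 ring
    # If the ring shows tokens 0 and 2 and your new node gets token 1,
    # the node holding token 2 currently owns (0, 2], so it is the one
    # to copy the new node's range (0, 1] from.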


b

Re: Data Distribution / Replication

Posted by Stefan Kaufmann <st...@gmail.com>.
> My recommendation is to leave Autobootstrap disabled, copy the
> datafiles over, and then run cleanup.  It is faster and more reliable
> than streaming, in my experience.

I thought about copying the data manually. However, if I have a running
environment and add a node (or replace a broken one), how do I determine
from which node I have to copy the data?

Stefan

Re: Data Distribution / Replication

Posted by Benjamin Black <b...@b3k.us>.
On Fri, Aug 13, 2010 at 9:48 AM, Oleg Anastasjev <ol...@gmail.com> wrote:
> Benjamin Black <b <at> b3k.us> writes:
>
>> > 3. I waited for the data to replicate, which didn't happen.
>>
>> Correct, you need to run nodetool repair because the nodes were not
>> present when the writes came in.  You can also use a higher
>> consistency level to force read repair before returning data, which
>> will incrementally repair things.
>>
>
> Alternatively you could configure new nodes to bootstrap in storage-conf.xml. If
> bootstrap is enabled, you'll get data replicated to them as soon as they join
> the cluster.
>

My recommendation is to leave Autobootstrap disabled, copy the
datafiles over, and then run cleanup.  It is faster and more reliable
than streaming, in my experience.
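
A minimal sketch of that procedure, assuming 0.6.x defaults (hosts and
paths are examples, not from the thread):

    # 1. In the new node's storage-conf.xml, leave <AutoBootstrap>false</AutoBootstrap>.
    # 2. Copy the keyspace's data files from the node that owns the range today.
    scp -r 10.0.0.2:/var/lib/cassandra/data/Keyspace1 /var/lib/cassandra/data/
    # 3. Start the new node, then drop no-longer-owned rows on the old owner.
    nodetool -host 10.0.0.2 cleanup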


b

Re: Data Distribution / Replication

Posted by Oleg Anastasjev <ol...@gmail.com>.
Benjamin Black <b <at> b3k.us> writes:

> > 3. I waited for the data to replicate, which didn't happen.
> 
> Correct, you need to run nodetool repair because the nodes were not
> present when the writes came in.  You can also use a higher
> consistency level to force read repair before returning data, which
> will incrementally repair things.
> 

Alternatively you could configure new nodes to bootstrap in storage-conf.xml. If
bootstrap is enabled, you'll get data replicated to them as soon as they join
the cluster.
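
For a new 0.6.x node that is a one-line change in storage-conf.xml
before its first start, along the lines of:

    <!-- Stream this node's ranges from the ring when it first joins. -->
    <AutoBootstrap>true</AutoBootstrap>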


Re: Data Distribution / Replication

Posted by Benjamin Black <b...@b3k.us>.
On Thu, Aug 12, 2010 at 8:30 AM, Stefan Kaufmann <st...@gmail.com> wrote:
> Hello again,
>
> Over the last days I ran several tests with Cassandra and learned quite a few things.
>
> Of course, there are still plenty of things I need to
> understand. One of them is how data replication works.
> For my testing:
> 1. I set the replication factor to 3, started with 1 active node (the
> seed) and inserted some test keys.

That is not what a seed is; I suggest you not use the word 'seed' for
that role.

> 2. I started 2 more nodes, which joined the cluster.
> 3. I waited for the data to replicate, which didn't happen.

Correct, you need to run nodetool repair because the nodes were not
present when the writes came in.  You can also use a higher
consistency level to force read repair before returning data, which
will incrementally repair things.
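
A minimal sketch of the first option (hosts are examples; 0.6's
nodetool talks to JMX on port 8080 by default, so no port is given):

    # Run anti-entropy repair on each node that was absent for the writes.
    nodetool -host 10.0.0.2 repair
    nodetool -host 10.0.0.3 repair

For the second option, reads issued through the Thrift API at
ConsistencyLevel.QUORUM (or ALL) will trigger read repair on the keys
they touch.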

> 4. I inserted more keys, and it looked like they were distributed to
> all three nodes.
>

Correct, they were up at the time and received the write operations directly.

Seems like you might benefit from reading the operations wiki:
http://wiki.apache.org/cassandra/Operations

b