Posted to user@cassandra.apache.org by Eric Czech <er...@nextbigsound.com> on 2011/10/02 22:25:50 UTC

unwanted node discovery

We're exploring a data processing procedure where we snapshot our production
cluster data and move that data to a new cluster for analysis but I'm having
some strange issues where the analysis cluster is still somehow aware of the
production cluster (i.e. the production cluster ring is trying to include
nodes from the other cluster with the same token).

The seed addresses in cassandra.yaml definitely prohibit this type of
intersection between the two clusters so I'm guessing that it has something
to do with the information in the system sstables.

Is there any way to duplicate raw sstables in an effort to "copy" a cluster
such that the copied cluster has a different name?  I know this usually
results in a "saved cluster name X != Y" sort of error but it looks like we
need to find some sort of way to do this logical separation.

Any help would be much appreciated!

Thanks.

Re: unwanted node discovery

Posted by Eric Czech <er...@nextbigsound.com>.
Another thing Edward if you don't mind, how does cassandra choose a node to
associate with a token if there is more than one node with the same token?
 I know that's definitely not a favorable situation to be in, but I'm
curious how my production ring chose to switch ownership of the tokens.

On Sun, Oct 2, 2011 at 3:14 PM, Edward Capriolo <ed...@gmail.com> wrote:

>
>
> On Sun, Oct 2, 2011 at 4:25 PM, Eric Czech <er...@nextbigsound.com> wrote:
>
>> We're exploring a data processing procedure where we snapshot our
>> production cluster data and move that data to a new cluster for analysis but
>> I'm having some strange issues where the analysis cluster is still somehow
>> aware of the production cluster (i.e. the production cluster ring is trying
>> to include nodes from the other cluster with the same token).
>>
>> The seed addresses in cassandra.yaml definitely prohibit this type of
>> intersection between the two clusters so I'm guessing that it has something
>> to do with the information in the system sstables.
>>
>> Is there any way to duplicate raw sstables in an effort to "copy" a cluster
>> such that the copied cluster has a different name?  I know this usually
>> results in a "saved cluster name X != Y" sort of error but it looks like we
>> need to find some sort of way to do this logical separation.
>>
>> Any help would be much appreciated!
>>
>> Thanks.
>>
>
> Cassandra stores information about the cluster topology in the system
> keyspace, in the LocationInfo column family. If you set AutoBootstrap to
> false, assign the Initial Token correctly, and wipe the LocationInfo
> column family, Cassandra will have no memory of the topology. (You can
> also wipe the entire system keyspace, but then you have to reinstall the
> schema.)
>

Re: unwanted node discovery

Posted by Eric Czech <er...@nextbigsound.com>.
Thanks Edward.  So would you say this is a good strategy:

1.  snapshot files from production cluster
2.  move snapshot files to analysis cluster in a one-to-one node fashion
(the system/LocationInfo* sstables could be excluded here but I'm moving
them all because the transfer is also part of our DR strategy)
3.  delete LocationInfo* sstables in system keyspace on each node in
analysis cluster
4.  configure analysis cluster nodes to have a different cluster name
5.  set initial token for each analysis node (in cassandra.yaml) to the
token claimed by the corresponding production cluster node
6.  start cassandra (or brisk really) on each analysis node to create
separate cluster

Any reason that procedure wouldn't work?
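
For what it's worth, step 3 could be scripted along these lines. This is
only a sketch: the function name is made up, and it assumes the data
directory has a system/ subdirectory holding the LocationInfo* files. It
should only be run while Cassandra is stopped on the node.

```python
import glob
import os


def remove_location_info(data_dir):
    """Delete LocationInfo* sstable files under <data_dir>/system.

    Hypothetical helper; the directory layout is an assumption.
    Returns the sorted list of paths that were removed.
    """
    pattern = os.path.join(data_dir, "system", "LocationInfo*")
    removed = []
    for path in sorted(glob.glob(pattern)):
        # Only touch regular files; leave anything else alone.
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Calling it a second time on the same directory returns an empty list, so
it is safe to re-run as part of a restore script.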

On Sun, Oct 2, 2011 at 3:14 PM, Edward Capriolo <ed...@gmail.com> wrote:

>
>
> On Sun, Oct 2, 2011 at 4:25 PM, Eric Czech <er...@nextbigsound.com> wrote:
>
>> We're exploring a data processing procedure where we snapshot our
>> production cluster data and move that data to a new cluster for analysis but
>> I'm having some strange issues where the analysis cluster is still somehow
>> aware of the production cluster (i.e. the production cluster ring is trying
>> to include nodes from the other cluster with the same token).
>>
>> The seed addresses in cassandra.yaml definitely prohibit this type of
>> intersection between the two clusters so I'm guessing that it has something
>> to do with the information in the system sstables.
>>
>> Is there any way to duplicate raw sstables in an effort to "copy" a cluster
>> such that the copied cluster has a different name?  I know this usually
>> results in a "saved cluster name X != Y" sort of error but it looks like we
>> need to find some sort of way to do this logical separation.
>>
>> Any help would be much appreciated!
>>
>> Thanks.
>>
>
> Cassandra stores information about the cluster topology in the system
> keyspace, in the LocationInfo column family. If you set AutoBootstrap to
> false, assign the Initial Token correctly, and wipe the LocationInfo
> column family, Cassandra will have no memory of the topology. (You can
> also wipe the entire system keyspace, but then you have to reinstall the
> schema.)
>

Re: unwanted node discovery

Posted by Edward Capriolo <ed...@gmail.com>.
On Sun, Oct 2, 2011 at 4:25 PM, Eric Czech <er...@nextbigsound.com> wrote:

> We're exploring a data processing procedure where we snapshot our
> production cluster data and move that data to a new cluster for analysis but
> I'm having some strange issues where the analysis cluster is still somehow
> aware of the production cluster (i.e. the production cluster ring is trying
> to include nodes from the other cluster with the same token).
>
> The seed addresses in cassandra.yaml definitely prohibit this type of
> intersection between the two clusters so I'm guessing that it has something
> to do with the information in the system sstables.
>
> Is there any way to duplicate raw sstables in an effort to "copy" a cluster
> such that the copied cluster has a different name?  I know this usually
> results in a "saved cluster name X != Y" sort of error but it looks like we
> need to find some sort of way to do this logical separation.
>
> Any help would be much appreciated!
>
> Thanks.
>

Cassandra stores information about the cluster topology in the system
keyspace, in the LocationInfo column family. If you set AutoBootstrap to
false, assign the Initial Token correctly, and wipe the LocationInfo column
family, Cassandra will have no memory of the topology. (You can also wipe
the entire system keyspace, but then you have to reinstall the schema.)
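
To make the settings concrete, here is a rough sketch that renders the
relevant cassandra.yaml lines for one node. The function name is
hypothetical, and the key names (cluster_name, initial_token,
auto_bootstrap) are assumed to match the cassandra.yaml shipped with your
version, so check them against your own file before relying on this.

```python
def yaml_overrides(cluster_name, initial_token):
    """Return cassandra.yaml lines for one node of the copied cluster.

    Hypothetical helper: it only formats text and does not edit any file.
    """
    return "\n".join([
        "cluster_name: '%s'" % cluster_name,
        "initial_token: %d" % initial_token,
        # Bootstrapping is disabled so the node keeps the assigned token.
        "auto_bootstrap: false",
    ])


print(yaml_overrides("Analysis", 0))
```

The output is meant to be merged by hand (or by your config management)
into each node's cassandra.yaml before the first start.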

Re: unwanted node discovery

Posted by Eric Czech <er...@nextbigsound.com>.
The tokens were different from the production cluster's, and after closer
inspection a lot of data wasn't queryable (as expected, I suppose).  I set
the tokens and everything seems ok now.

Auto bootstrap was false so no issues there.

Thanks for the insight Shyamal!  It's good to finally have this up and
running.


On Sun, Oct 2, 2011 at 8:29 PM, Shyamal Prasad <sh...@member.fsf.org> wrote:

> >>>>> "Eric" == Eric Czech <er...@nextbigsound.com> writes:
>
>     Eric> Yea that's not a mapping I'd like to maintain either -- as an
>    Eric> experiment, I copied production sstables to the analysis
>    Eric> cluster and ran brisk/cassandra without specifying an initial
>    Eric> token (after deleting the LocationInfo* files and renaming the
>    Eric> cluster).
>
> Based on my understanding this will allow everything to start up, yes.
>
>    Eric>  As far as I can tell, everything is running normally but I'm
>    Eric> not sure how the cluster chose tokens for the nodes given that
>    Eric> I didn't specify them after just dropping the raw sstables
>    Eric> in.  I can still read data as usual from the column families
>    Eric> that were copied but I'm not sure how not specifying the
>    Eric> tokens affects everything.
>
> Did you check the ring to see what tokens you got for the analysis
> cluster? I would be surprised if you got the same ring configuration as
> production.
>
>    Eric>  Is some of my data just unreachable now because the tokens
>    Eric> weren't manually defined?
>
> I suspect your data is messed up. But the best way to determine it would
> be to examine the ring (use nodetool) - if it is the same as your
> production cluster you are good to go.
>
> Also, did you set your (non seed) nodes in the analysis cluster to auto
> bootstrap or not? That impacts what happens.
>
>
>    Eric>  This doesn't appear to be the case but is this something you
>    Eric> have tried too or do you understand the storage / topology
>    Eric> logic well enough to know that this isn't a viable strategy?
>
> No and No. I have been reading the code. Line 497 of
> org.apache.cassandra.service.StorageService.java on trunk is a good
> place to start since what happens depends somewhat on your specific
> cassandra.yaml settings (specifically auto bootstrap).
>
> I would be betting you are getting random tokens (look for "Generated
> random token..." in your log). Don't trust me, read the code. I have all
> of two weeks of experience with this stuff (and it's not quite my day
> job to be doing it either :-)
>
> Bottom line: I think you need to fix the seeds for your use case.
>
> Cheers!
> Shyamal
>

Re: unwanted node discovery

Posted by Shyamal Prasad <sh...@member.fsf.org>.
>>>>> "Eric" == Eric Czech <er...@nextbigsound.com> writes:

    Eric> Yea that's not a mapping I'd like to maintain either -- as an
    Eric> experiment, I copied production sstables to the analysis
    Eric> cluster and ran brisk/cassandra without specifying an initial
    Eric> token (after deleting the LocationInfo* files and renaming the
    Eric> cluster). 

Based on my understanding this will allow everything to start up, yes.

    Eric>  As far as I can tell, everything is running normally but I'm
    Eric> not sure how the cluster chose tokens for the nodes given that
    Eric> I didn't specify them after just dropping the raw sstables
    Eric> in.  I can still read data as usual from the column families
    Eric> that were copied but I'm not sure how not specifying the
    Eric> tokens affects everything.

Did you check the ring to see what tokens you got for the analysis
cluster? I would be surprised if you got the same ring configuration as
production.

    Eric>  Is some of my data just unreachable now because the tokens
    Eric> weren't manually defined?

I suspect your data is messed up. But the best way to determine it would
be to examine the ring (use nodetool) - if it is the same as your
production cluster you are good to go.

Also, did you set your (non seed) nodes in the analysis cluster to auto
bootstrap or not? That impacts what happens.


    Eric>  This doesn't appear to be the case but is this something you
    Eric> have tried too or do you understand the storage / topology
    Eric> logic well enough to know that this isn't a viable strategy?

No and No. I have been reading the code. Line 497 of
org.apache.cassandra.service.StorageService.java on trunk is a good
place to start since what happens depends somewhat on your specific
cassandra.yaml settings (specifically auto bootstrap).

I would be betting you are getting random tokens (look for "Generated
random token..." in your log). Don't trust me, read the code. I have all
of two weeks of experience with this stuff (and it's not quite my day
job to be doing it either :-)

Bottom line: I think you need to fix the seeds for your use case.

Cheers!
Shyamal
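
A quick way to do the ring comparison suggested above, once the tokens
have been copied out of the nodetool output for each cluster by hand. This
is a sketch under that assumption: ring_diff is a made-up name and it does
not parse nodetool output itself.

```python
def ring_diff(prod_tokens, analysis_tokens):
    """Compare the token sets of two rings.

    Returns (missing, unexpected): tokens owned in production but absent
    from the analysis ring, and tokens present only in the analysis ring.
    Both lists are empty when the two rings use the same tokens.
    """
    prod = set(prod_tokens)
    analysis = set(analysis_tokens)
    return sorted(prod - analysis), sorted(analysis - prod)
```

If either list is non-empty, the analysis nodes did not come up with the
production tokens (for example because random tokens were generated).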

Re: unwanted node discovery

Posted by Eric Czech <er...@nextbigsound.com>.
Yea that's not a mapping I'd like to maintain either -- as an experiment, I
copied production sstables to the analysis cluster and ran brisk/cassandra
without specifying an initial token (after deleting the LocationInfo* files
and renaming the cluster).  As far as I can tell, everything is running
normally but I'm not sure how the cluster chose tokens for the nodes given
that I didn't specify them after just dropping the raw sstables in.  I can
still read data as usual from the column families that were copied but I'm
not sure how not specifying the tokens affects everything.  Is some of my
data just unreachable now because the tokens weren't manually defined?  This
doesn't appear to be the case but is this something you have tried too or do
you understand the storage / topology logic well enough to know that this
isn't a viable strategy?

On Sun, Oct 2, 2011 at 4:14 PM, Shyamal Prasad <sh...@member.fsf.org> wrote:

> >>>>> "Eric" == Eric Czech <er...@nextbigsound.com> writes:
>
>     Eric> Hi Shyamal, I was using the same cluster name but since
>    Eric> writing that first email, I've already had success bringing up
>    Eric> nodes in the analysis cluster with a different cluster name
>    Eric> after deleting the LocationInfo* tables.
>
>    Eric> How have you been setting the tokens in the copied version of
>    Eric> the cluster?  Are you just mapping them one-to-one on the
>    Eric> original cluster?
>
> Yep. It's brittle but it works. It's brittle because if the production
> cluster topology changes I have some work to do (since the copied
> SSTables will now not match the static mapping until I manually update
> it).
>
> I've yet to find a simpler way to do this (the LocationInfo CF stores
> (token, inetaddress) pairs, and those simply don't automagically
> transfer to a backup cluster).
>
> /Shyamal
>
>

Re: unwanted node discovery

Posted by Shyamal Prasad <sh...@member.fsf.org>.
>>>>> "Eric" == Eric Czech <er...@nextbigsound.com> writes:

    Eric> Hi Shyamal, I was using the same cluster name but since
    Eric> writing that first email, I've already had success bringing up
    Eric> nodes in the analysis cluster with a different cluster name
    Eric> after deleting the LocationInfo* tables.  

    Eric> How have you been setting the tokens in the copied version of
    Eric> the cluster?  Are you just mapping them one-to-one on the
    Eric> original cluster? 

Yep. It's brittle but it works. It's brittle because if the production
cluster topology changes I have some work to do (since the copied
SSTables will now not match the static mapping until I manually update
it).

I've yet to find a simpler way to do this (the LocationInfo CF stores
(token, inetaddress) pairs, and those simply don't automagically
transfer to a backup cluster).

/Shyamal
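
The static mapping could be kept as a small table and checked before each
restore, roughly as below. This is a sketch only: the function name and
the dictionary shapes are assumptions, and the production tokens have to
be collected separately.

```python
def tokens_for_analysis(prod_tokens, node_map):
    """Derive the initial token for each analysis node.

    prod_tokens: {production_host: token}
    node_map:    {production_host: analysis_host} (the static mapping)
    Returns {analysis_host: token}. Raises ValueError when the mapping no
    longer covers the production ring or is not one-to-one.
    """
    missing = sorted(set(prod_tokens) - set(node_map))
    if missing:
        raise ValueError("no analysis node mapped for: %s" % ", ".join(missing))
    targets = [node_map[host] for host in prod_tokens]
    if len(set(targets)) != len(targets):
        raise ValueError("mapping is not one-to-one")
    return {node_map[host]: token for host, token in prod_tokens.items()}
```

The ValueError is the useful part: it flags the case where the production
topology has changed and the mapping needs a manual update.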


Re: unwanted node discovery

Posted by Eric Czech <er...@nextbigsound.com>.
Hi Shyamal,

I was using the same cluster name but since writing that first email, I've
already had success bringing up nodes in the analysis cluster with a
different cluster name after deleting the LocationInfo* tables.

How have you been setting the tokens in the copied version of the cluster?
 Are you just mapping them one-to-one on the original cluster?

On Sun, Oct 2, 2011 at 3:49 PM, Shyamal Prasad <sh...@member.fsf.org> wrote:

> >>>>> "Eric" == Eric Czech <er...@nextbigsound.com> writes:
>
>    Eric> We're exploring a data processing procedure where we snapshot
>    Eric> our production cluster data and move that data to a new
>    Eric> cluster for analysis but I'm having some strange issues where
>    Eric> the analysis cluster is still somehow aware of the production
>    Eric> cluster (i.e. the production cluster ring is trying to include
>    Eric> nodes from the other cluster with the same token).
>
> Are you using the same cluster name for both clusters? If so, I would
> suggest you don't.
>
>    Eric> The seed addresses in cassandra.yaml definitely prohibit this
>    Eric> type of intersection between the two clusters so I'm guessing
>    Eric> that it has something to do with the information in the system
>    Eric> sstables.
>
> I'm sure you will get a more knowledgeable answer from people who have
> been doing this for a while, but I have to ask: are you copying over the
> LocationInfo* SSTables from the snapshot to the analysis cluster?
>
> The LocationInfo CF can record the endpoints in your production cluster.
> From the little I've read of the code (StorageService.java and
> SystemTable.java) it is possible (likely?) that endpoints from your
> production cluster will get added to your analysis cluster's Gossiper on
> startup. If you are using the same cluster name, well, there you have
> it.....
>
>    Eric> Is there any way to duplicate raw sstables in an effort to
>    Eric> "copy" a cluster such that the copied cluster has a different
>    Eric> name?  I know this usually results in a "saved cluster name X
>    Eric> != Y" sort of error but it looks like we need to find some
>    Eric> sort of way to do this logical separation.
>
> Copying the raw tables and ignoring/deleting the
> data/system/LocationInfo* files has worked for me. But I have to add the
> disclaimer that I'm definitely a Cassandra newbie!
>
> Cheers!
> Shyamal
>

Re: unwanted node discovery

Posted by Shyamal Prasad <sh...@member.fsf.org>.
>>>>> "Eric" == Eric Czech <er...@nextbigsound.com> writes:

    Eric> We're exploring a data processing procedure where we snapshot
    Eric> our production cluster data and move that data to a new
    Eric> cluster for analysis but I'm having some strange issues where
    Eric> the analysis cluster is still somehow aware of the production
    Eric> cluster (i.e. the production cluster ring is trying to include
    Eric> nodes from the other cluster with the same token).

Are you using the same cluster name for both clusters? If so, I would
suggest you don't.

    Eric> The seed addresses in cassandra.yaml definitely prohibit this
    Eric> type of intersection between the two clusters so I'm guessing
    Eric> that it has something to do with the information in the system
    Eric> sstables.

I'm sure you will get a more knowledgeable answer from people who have
been doing this for a while, but I have to ask: are you copying over the
LocationInfo* SSTables from the snapshot to the analysis cluster?

The LocationInfo CF can record the endpoints in your production cluster.
From the little I've read of the code (StorageService.java and
SystemTable.java) it is possible (likely?) that endpoints from your
production cluster will get added to your analysis cluster's Gossiper on
startup. If you are using the same cluster name, well, there you have
it.....

    Eric> Is there any way to duplicate raw sstables in an effort to
    Eric> "copy" a cluster such that the copied cluster has a different
    Eric> name?  I know this usually results in a "saved cluster name X
    Eric> != Y" sort of error but it looks like we need to find some
    Eric> sort of way to do this logical separation.

Copying the raw tables and ignoring/deleting the
data/system/LocationInfo* files has worked for me. But I have to add the
disclaimer that I'm definitely a Cassandra newbie!

Cheers!
Shyamal
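
For the "ignoring" variant, the copy itself can skip the files. A minimal
sketch using the Python standard library, assuming the snapshot has
already been laid out as a data directory tree. The function name is
invented, and note that the pattern skips any entry whose name starts with
LocationInfo at every level of the tree, not just under system/.

```python
import shutil


def copy_without_location_info(snapshot_dir, data_dir):
    """Copy a snapshot tree to data_dir, skipping LocationInfo* entries.

    data_dir must not exist yet; shutil.copytree creates it.
    """
    shutil.copytree(
        snapshot_dir,
        data_dir,
        ignore=shutil.ignore_patterns("LocationInfo*"),
    )
```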