Posted to user@cassandra.apache.org by Prakrati Agrawal <Pr...@mu-sigma.com> on 2012/06/08 08:49:47 UTC

Not getting all data from a 2 node cluster

Dear all

I am using Cassandra to retrieve a number of rows and columns stored in it.
Initially I had a 1-node cluster and I flooded it with data. When I ran Hector code to retrieve the data from it, I got the following output:
Total number of rows in the database are 396
Total number of columns in the database are 16316426
Now I added one more node to it by doing the following steps:

1. I added both nodes' IP addresses to the seeds property in the cassandra.yaml file.
2. I also changed rpc_address to 0.0.0.0 in both nodes' config files.
3. I changed listen_address on each node to its own IP address.
4. I specified the initial token in the new node's config file.
5. I did not specify the auto_bootstrap option anywhere because there is no such option available in Cassandra 1.1.0.
6. Then I restarted the first node and the new node.
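
For the initial token step, here is a minimal sketch of how evenly spaced initial_token values are commonly computed. It assumes the cluster uses RandomPartitioner, whose token space runs from 0 to 2**127; the post does not say which partitioner or which token was actually used, and the helper name initial_tokens is made up for illustration.

```python
# Evenly spaced initial_token values, assuming RandomPartitioner
# (token space 0 .. 2**127). The post does not state the partitioner,
# so this is an assumption.
RING_SIZE = 2 ** 127


def initial_tokens(node_count):
    """Return one evenly spaced token per node (hypothetical helper)."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


for node, token in enumerate(initial_tokens(2)):
    print(f"node {node}: initial_token: {token}")
```

With two nodes this gives 0 for the first node and 2**126 for the second, so each node is responsible for about half of the ring.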
Now, after adding the second node, when I run the same Hector code I get the following result:
Total number of rows in the database are 183
Total number of columns in the database are 7903753
I am using consistency level ONE, and I did not specify a replication factor when creating the keyspace. I used the following link for reference:
http://www.datastax.com/docs/0.7/getting_started/configuring
Please tell me which step I am doing wrong that keeps me from getting all the data on a 2-node cluster.
Thanks and Regards
Prakrati




________________________________
This email message may contain proprietary, private and confidential information. The information transmitted is intended only for the person(s) or entities to which it is addressed. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by persons or entities other than the intended recipient is prohibited and may be illegal. If you received this in error, please contact the sender and delete the message from your system.

Mu Sigma takes all reasonable steps to ensure that its electronic communications are free from viruses. However, given Internet accessibility, the Company cannot accept liability for any virus introduced by this e-mail or any attachment and you are advised to use up-to-date virus checking software.

Re: Not getting all data from a 2 node cluster

Posted by Boris Yen <yu...@gmail.com>.
My guess is that your RF is 1. When the new node joins the cluster, only part
of the data (depending on the token) goes to this new node.
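
To make the point about tokens concrete, here is a rough sketch of how a two-node ring with RF 1 splits key ownership. It rests on several assumptions that are not in the thread: RandomPartitioner modelled simply as the absolute value of the MD5 digest, evenly spaced tokens 0 and 2**126, and made-up row keys. The function names token_for and owner are hypothetical.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # RandomPartitioner token space (assumed partitioner)


def token_for(key):
    """Simplified model: absolute value of the MD5 digest read as a signed integer."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def owner(token, node_tokens):
    """Index of the node whose range (previous token, own token] contains `token`."""
    i = bisect.bisect_left(node_tokens, token)
    return i % len(node_tokens)  # past the highest token, wrap to the first node


node_tokens = [0, RING_SIZE // 2]        # hypothetical tokens: old node, new node
keys = [f"row-{n}" for n in range(396)]  # made-up keys; 396 matches the row count in the post
on_old_node = sum(1 for k in keys if owner(token_for(k), node_tokens) == 0)

print(f"keys owned by the old node: {on_old_node} of {len(keys)}")
print(f"rows reported after adding the node: 183 of 396 = {183 / 396:.1%}")
```

In this model each node owns about half of the keys. The post reports about 46% of the rows and 48% of the columns after the second node was added, which is at least consistent with a ring split roughly in half at RF 1.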

On Fri, Jun 8, 2012 at 2:49 PM, Prakrati Agrawal <Prakrati.Agrawal@mu-sigma.com> wrote:

> Dear all
>
> I am using Cassandra to retrieve a number of rows and columns stored in it.
>
> Initially I had a 1 node cluster and I flooded it with data. I ran a
> Hector code to retrieve data from it I got the following output:
>
> Total number of rows in the database are 396
>
> Total number of columns in the database are 16316426
>
> Now I added one more node to it by doing the following steps:
>
> 1. I added both the nodes ip addresses in the seeds property in
> Cassandra.yaml file.
>
> 2. I also changed the rpc_address to 0.0.0.0 in both the nodes
> config file.
>
> 3. I changed the listen_address to their respective ip
> addresses.
>
> 4. I specified the initial token in the new node config file
>
> 5. I did not specify auto_bootstrap option anywhere because
> there is no such option available in Cassandra 1.1.0
>
> 6. Then I restarted the first node and the new node
>
> Now after adding the second node when I run the same Hector code, I am
> getting the following result:
>
> Total number of rows in the database are 183
>
> Total number of columns in the database are 7903753
>
> I am using the consistency level 1 and I did not specify any replication
> factor while creating the keyspace. I used the following link for the
> reference:
>
> http://www.datastax.com/docs/0.7/getting_started/configuring
>
> Please tell me what step am I doing wrong to not get the entire data on a
> 2 node cluster ?
>
> Thanks and Regards
>
> Prakrati