Posted to user@cassandra.apache.org by James Lee <Ja...@metaswitch.com> on 2013/06/18 14:02:53 UTC

Data not fully replicated with 2 nodes and replication factor 2

Hello,

I'm seeing a strange problem with a 2-node Cassandra test deployment, where it seems that data isn't being replicated between the nodes as I would expect.  I suspect this may be a configuration issue of some kind, but I have been unable to figure out what I should change.

The setup is as follows:

*         Two Cassandra nodes in the cluster (they each have themselves and the other node as seeds in cassandra.yaml).

*         Create 40 keyspaces, each with simple replication strategy and replication factor 2.

*         Populate 125,000 rows into each keyspace, using a pycassa client with a connection pool pointed at both nodes (I've verified that pycassa does indeed send roughly half the writes to each node).  All writes use consistency level ONE.

*         Wait 30 minutes (to give replication a chance to complete).

*         Do random reads of the rows in the keyspaces, again using a pycassa client with a connection pool pointed at both nodes.  All reads also use consistency level ONE.  (A rough sketch of the test client follows this list.)
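
For reference, here is roughly what the test client does.  The node addresses and keyspace/column family names below are illustrative placeholders, not my real ones:

    import pycassa
    from pycassa.system_manager import SystemManager, SIMPLE_STRATEGY

    NODES = ['node1:9160', 'node2:9160']   # both cluster nodes

    # One-off schema setup, repeated for each of the 40 keyspaces:
    # simple replication strategy with replication factor 2.
    sysmgr = SystemManager(NODES[0])
    sysmgr.create_keyspace('ks_00', SIMPLE_STRATEGY,
                           {'replication_factor': '2'})
    sysmgr.create_column_family('ks_00', 'data')
    sysmgr.close()

    # Connection pool spanning both nodes, so requests are spread
    # across them; reads and writes both at consistency level ONE.
    pool = pycassa.ConnectionPool('ks_00', server_list=NODES)
    cf = pycassa.ColumnFamily(
        pool, 'data',
        write_consistency_level=pycassa.ConsistencyLevel.ONE,
        read_consistency_level=pycassa.ConsistencyLevel.ONE)

    # Populate 125,000 rows into this keyspace.
    for i in xrange(125000):
        cf.insert('row%07d' % i, {'col': 'value%07d' % i})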

I'm finding that the vast majority of reads succeed, but a small proportion (~0.1%) come back as Not Found.  If I manually look up those keys using cassandra-cli, they are returned when querying one of the nodes but not when querying the other.  So it seems some of the rows have simply not been replicated.
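
(The per-node check I describe above is equivalent to doing something like the following in pycassa; the key name here is a stand-in for one of the missing keys.  Since RF=2 on a two-node cluster, every node holds a replica of every row, so a consistency-ONE read against a single-node pool should normally be served from that node's local copy.)

    import pycassa

    # Check one of the missing keys against each node individually.
    for node in ['node1:9160', 'node2:9160']:
        pool = pycassa.ConnectionPool('ks_00', server_list=[node])
        cf = pycassa.ColumnFamily(
            pool, 'data',
            read_consistency_level=pycassa.ConsistencyLevel.ONE)
        try:
            print node, cf.get('row0012345')
        except pycassa.NotFoundException:
            print node, 'Not Found'
        pool.dispose()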

I'm not sure how to monitor the status of ongoing replication, but the system has been idle for many tens of minutes and the total database size is only about 5GB, so I don't think there is any further work in flight.

Any suggestions?  In case it's relevant, my setup is:

*         Cassandra 1.2.2, running on Linux

*         Sun Java 1.7.0_10-b18 64-bit

*         Java heap settings: -Xms8192M -Xmx8192M -Xmn2048M

Thank you,
James Lee