Posted to user@cassandra.apache.org by Maxime <ma...@gmail.com> on 2014/11/11 17:36:15 UTC

replication_factor mismatch

Hello, I have a curious behaviour occurring.

- 7-node cluster
- RF on the Keyspace is 3
- Latest version of everything (C* and Python Drivers)
- All queries are at QUORUM level

Some of my larger queries are timing out, which is OK; it can happen. But
looking at the log, I see the following:

ReadTimeout: code=1200 [Timeout during read request] message="Operation
timed out - received only 2 responses." info={'received_responses': 2,
'data_retrieved': True, 'required_responses': 3, 'consistency': 5}

The part confusing me is the "consistency" value: it says 5, while I would
normally expect 3 (the RF). I received 2 responses, which should be enough
for a quorum at RF 3 (floor(3/2) + 1 = 2).

Why is the consistency 5? Could it be because the data is actually located
physically on 5 nodes despite the RF of 3? I ask about this possibility
because I know for a fact my cluster is not in a good repaired state. My
attempts at repairing resulted in different OOMs and extreme numbers of
SSTables (which I assume are all remnants of my previous issues with C* and
Secondary Indexes (!!!)). I've had to reboot the nodes after each attempt
and do a cleanup, but something tells me things are still messed up.

Is the python driver somehow automatically determining where the data is
located (despite the RF being different) and using this number instead of
the RF in the Quorum computation?

Re: replication_factor mismatch

Posted by Philip Thompson <ph...@datastax.com>.

This question probably belongs on the Python driver's mailing list.

However, if your RF is in fact 3, then that failed query ran at CL.ALL,
not CL.QUORUM. You can tell because required_responses is 3, whereas
QUORUM at RF 3 would need only 2. The "consistency: 5" is not a replica
count: consistency levels are an enum class in the driver, as seen here:
http://datastax.github.io/python-driver/_modules/cassandra.html#ConsistencyLevel
and 5 corresponds to CL.ALL.
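
To make the decoding concrete, here is a minimal sketch in plain Python. The numeric codes mirror a subset of the driver's ConsistencyLevel constants; the helper names (required_responses, explain_timeout) are made up for illustration and are not part of the driver. It runs without a cluster and just interprets the info dict from the error above.

```python
# Subset of the numeric codes used by the Python driver's ConsistencyLevel class.
CONSISTENCY_NAMES = {0: "ANY", 1: "ONE", 2: "TWO", 3: "THREE", 4: "QUORUM", 5: "ALL"}


def required_responses(level_name, rf):
    """Replicas that must answer a read at this level (hypothetical helper)."""
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level_name in fixed:
        return fixed[level_name]
    if level_name == "QUORUM":
        return rf // 2 + 1  # majority of the replicas
    if level_name == "ALL":
        return rf  # every replica must respond
    raise ValueError("level not handled in this sketch: %s" % level_name)


def explain_timeout(info, rf):
    """Translate the 'consistency' code and compute the replicas it needs."""
    name = CONSISTENCY_NAMES[info["consistency"]]
    return name, required_responses(name, rf)


# The info dict from the ReadTimeout in the original post.
info = {"received_responses": 2, "data_retrieved": True,
        "required_responses": 3, "consistency": 5}

name, needed = explain_timeout(info, rf=3)
print(name, needed)  # ALL 3
```

With RF 3, a QUORUM read would have needed only 2 responses, so the 2 received would have been enough. The timeout therefore suggests that the statement was issued at ALL, for example through a session default or a per-statement setting that overrides the intended QUORUM.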
