Posted to user@cassandra.apache.org by Drew Kutcharian <dr...@venarc.com> on 2014/02/21 06:21:43 UTC

Consistency Level One Question

Hi Guys,

I wanted to get some clarification on what happens when you write and read at consistency level 1. Say I have a keyspace with replication factor of 3 and a table which will contain write-once/read-only wide rows. If I write at consistency level 1 and the write happens on node A and I read back at consistency level 1 from another node other than A, say B, will C* return “not found” or will it trigger a read-repair before responding? In addition, what’s the best consistency level for reading/writing write-once/read-only wide rows?

Thanks,

Drew

Re: Consistency Level One Question

Posted by graham sanderson <gr...@vast.com>.

Writing at a consistency level of ONE means that your write is acknowledged as soon as one replica confirms that it has written to its memtable and commit log (the commit log might not be fully synced to disk yet, but that’s a separate issue).
The write is sent to all replicas in parallel, so the data will very likely reach the other nodes quickly.

Reading at ONE means that only one node will be asked for the data (unless you have rapid-read-protection AND the node you asked is very slow to respond).

So writing/reading at ONE means that it is possible (depending on how long you wait and a bunch of other factors) that the read - if it goes to another replica - *may* not return the data.

The safest thing to do is QUORUM writes and reads - this way the write is only acknowledged when 2 of the 3 replicas have confirmed the data is written; your subsequent read will go to at least 2 nodes, at least one of which must therefore have the latest data, and the read returns the most up-to-date data among the responding nodes.
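
The overlap argument above can be checked with a little arithmetic. The following is a minimal sketch, not Cassandra code; the function names are made up for illustration. It assumes the usual rule that a read is guaranteed to see an acknowledged write when the number of replicas that acked the write plus the number of replicas read exceeds the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority of replicas (what QUORUM requires)."""
    return replication_factor // 2 + 1


def read_overlaps_write(rf: int, write_acks: int, read_replicas: int) -> bool:
    """True when any set of read replicas must share a node with the write acks."""
    return write_acks + read_replicas > rf


RF = 3
print(quorum(RF))                                        # 2
print(read_overlaps_write(RF, quorum(RF), quorum(RF)))   # True:  2 + 2 > 3
print(read_overlaps_write(RF, 1, 1))                     # False: 1 + 1 <= 3
```

With RF=3, ONE/ONE leaves room for the read to land on a replica that has not seen the write yet, while QUORUM/QUORUM does not.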

Re: Consistency Level One Question

Posted by Drew Kutcharian <dr...@venarc.com>.

Thanks, this clears things up. 

Re: Consistency Level One Question

Posted by Edward Capriolo <ed...@gmail.com>.

When you write at ONE, the ack is returned to the client as soon as one
node acknowledges the write. This means that if you quickly read from some
other node:
1) you may get the result, because by the time the read is processed the
data may already be on that node
2) the node you read from may or may not proxy the request to the node with
the data
3) you may get a column not found, because the read might hit a node where
the data does not exist yet.

Generally, even at level ONE, replication is fast. I have done an
experiment on what you are asking: write at ONE, then read from another
node as soon as the client gets an ack. Most of the time the data is
replicated by the time the second request is received. However, "most of
the time" is not a guarantee. If the nodes are geographically separate,
who is to say that the first request and the second will not route around
the internet in different ways, so that the second action arrives on a
node before the first? That is eventual consistency for you.
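
The race described above can be made concrete with a toy model. This is only an illustration under made-up numbers: the arrival times, node names and the `read_at_one` helper are all hypothetical, and nothing here talks to a real cluster.

```python
def read_at_one(arrival_ms: dict, replica: str, read_ms: float) -> str:
    """What a read at ONE sees when `replica` serves it at time `read_ms`."""
    return "found" if arrival_ms[replica] <= read_ms else "not found"


# Hypothetical times (ms after the client sent the write) at which each
# replica has applied it. C is far away, so the write reaches it late.
arrival_ms = {"A": 1.0, "B": 4.0, "C": 40.0}

# A write at ONE is acked once the first replica has it.
ack_ms = min(arrival_ms.values())

# The client reads 5 ms after receiving the ack.
read_ms = ack_ms + 5.0
for replica in sorted(arrival_ms):
    print(replica, read_at_one(arrival_ms, replica, read_ms))
# A found
# B found
# C not found
```

Most replicas already have the data, which matches the "most of the time" experience, but a read served by the slow replica still comes back empty.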

Re: Consistency Level One Question

Posted by graham sanderson <gr...@vast.com>.

My bad; should have checked the code:

    /**
     * This function executes local and remote reads, and blocks for the results:
     *
     * 1. Get the replica locations, sorted by response time according to the snitch
     * 2. Send a data request to the closest replica, and digest requests to either
     *    a) all the replicas, if read repair is enabled
     *    b) the closest R-1 replicas, where R is the number required to satisfy the ConsistencyLevel
     * 3. Wait for a response from R replicas
     * 4. If the digests (if any) match the data return the data
     * 5. else carry out read repair by getting data from all the nodes.
     */
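
The numbered steps in that comment can be sketched as a toy coordinator. This is a simplified model, not the actual Cassandra implementation: `Replica`, `digest` and `coordinator_read` are hypothetical names, every contacted replica is assumed to answer (the real code only blocks for R responses), and the repair step just picks the cell with the newest timestamp without writing it back.

```python
import hashlib
from typing import List, NamedTuple, Optional, Tuple

Cell = Optional[Tuple[int, str]]  # (write timestamp, value), or None if missing


class Replica(NamedTuple):
    name: str
    latency_ms: float
    cell: Cell


def digest(cell: Cell) -> str:
    """Stand-in for the digest a replica returns instead of full data."""
    return hashlib.md5(repr(cell).encode()).hexdigest()


def coordinator_read(replicas: List[Replica], required: int, read_repair: bool):
    """Return (cell, repaired) following steps 1-5 of the comment."""
    ordered = sorted(replicas, key=lambda r: r.latency_ms)            # step 1
    data = ordered[0].cell                                            # step 2: data request
    digest_from = ordered[1:] if read_repair else ordered[1:required]
    if all(digest(r.cell) == digest(data) for r in digest_from):      # steps 3-4
        return data, False
    cells = [r.cell for r in ordered if r.cell is not None]           # step 5
    return max(cells, key=lambda c: c[0], default=None), True


# The write has only reached A so far, and B happens to be the closest replica.
replicas = [Replica("A", 2.0, (100, "v1")), Replica("B", 1.0, None), Replica("C", 3.0, None)]
print(coordinator_read(replicas, required=1, read_repair=False))  # (None, False)
print(coordinator_read(replicas, required=2, read_repair=False))  # ((100, 'v1'), True)
print(coordinator_read(replicas, required=1, read_repair=True))   # ((100, 'v1'), True)
```

In this model a read at ONE without read repair returns "not found" from the stale replica, while either a QUORUM read or a read that includes all replicas notices the digest mismatch and returns the newest data.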

Re: Consistency Level One Question

Posted by Duncan Sands <du...@gmail.com>.

Hi Graham,

On 21/02/14 07:54, graham sanderson wrote:
> Note also; that reading at ONE there will be no read repair, since the coordinator does not know that another replica has stale data (remember at ONE, basically only one node is asked for the answer).

I don't think this is right.  My understanding is that while only one node will 
be sent a direct read request, all other replicas will (not on every query - it 
depends on the value of read_repair_chance) get a background read repair 
request.  You can test this experimentally using cqlsh and turning tracing on: 
issue a read request many times.  Most of the time you will see that the 
coordinator sends a message to one node, but from time to time (depending on 
read_repair_chance) you will see it sending messages to many nodes.

Best wishes, Duncan.
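
The behaviour to look for in the traces can be pictured with a small simulation. It is a toy model only: `replicas_contacted` is a hypothetical helper that treats read_repair_chance as the probability that a read at ONE also messages the remaining replicas, which is the understanding described above rather than a copy of Cassandra's logic.

```python
import random


def replicas_contacted(num_replicas: int, read_repair_chance: float,
                       rng: random.Random) -> int:
    """Number of replicas a single read at ONE messages in this toy model."""
    return num_replicas if rng.random() < read_repair_chance else 1


rng = random.Random(42)  # fixed seed so repeated runs give the same counts
counts = [replicas_contacted(3, 0.1, rng) for _ in range(1000)]
print("reads touching one replica: ", counts.count(1))
print("reads touching all replicas:", counts.count(3))
```

Most simulated reads touch a single replica and a minority fan out to all three, which is the pattern you would expect to see when repeating the same traced query.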

Re: Consistency Level One Question

Posted by graham sanderson <gr...@vast.com>.

Note also; that reading at ONE there will be no read repair, since the coordinator does not know that another replica has stale data (remember at ONE, basically only one node is asked for the answer).

In practice for our use cases, we always write at LOCAL_QUORUM (failing the whole update if that doesn’t work - stale data is OK if >1 node is down), and we read at LOCAL_QUORUM, but (because stale data is better than no data), we will fall back per read request to LOCAL_ONE if we detect that there were insufficient nodes - this lets us cope with 2 down nodes in a 3 replica environment (or more if the nodes are not consecutive in the ring).
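
The per-request fallback described above might look roughly like this. It is a driver-agnostic sketch: `UnavailableError`, `read_with_fallback` and `make_fake_query` are hypothetical names, and the fake query function only imitates a cluster with a given number of live replicas; a real client would use its driver's own "not enough replicas" exception.

```python
class UnavailableError(Exception):
    """Hypothetical stand-in for a driver's 'not enough live replicas' error."""


def read_with_fallback(run_query, primary="LOCAL_QUORUM", fallback="LOCAL_ONE"):
    """Try the read at `primary`; on unavailability retry once at `fallback`."""
    try:
        return run_query(primary), primary
    except UnavailableError:
        # Stale data is better than no data, so accept a weaker read.
        return run_query(fallback), fallback


def make_fake_query(live_replicas: int):
    """Build a fake query function for a 3-replica setup (illustration only)."""
    needed = {"LOCAL_QUORUM": 2, "LOCAL_ONE": 1}

    def run_query(consistency_level: str) -> str:
        if live_replicas < needed[consistency_level]:
            raise UnavailableError(consistency_level)
        return "row"

    return run_query


print(read_with_fallback(make_fake_query(3)))  # ('row', 'LOCAL_QUORUM')
print(read_with_fallback(make_fake_query(1)))  # ('row', 'LOCAL_ONE')
```

With all replicas down the fallback read raises as well, so the caller still has to handle a complete outage.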
