Posted to user@cassandra.apache.org by Pierre Chalamet <pi...@chalamet.net> on 2011/09/14 15:33:12 UTC

Get CL ONE / NTS

Hello,

I have 2 datacenters. Cassandra is configured as follows:
- RackInferringSnitch
- NetworkTopologyStrategy for CF
- strategy_options: DC1:3 DC2:3

Data are written using CL LOCAL_QUORUM so data written from one datacenter will eventually be replicated to the other datacenter. Data is always written exactly once. 
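
For reference, the number of replica responses each consistency level waits for with this configuration follows from the usual quorum formula, floor(RF / 2) + 1. Here is a minimal Python sketch of that arithmetic; the function names are made up for illustration and are not part of any Cassandra API:

```python
# Replication factors taken from strategy_options above.
REPLICATION = {"DC1": 3, "DC2": 3}

def quorum(rf):
    """Majority of rf replicas: floor(rf / 2) + 1."""
    return rf // 2 + 1

def required_acks(cl, local_dc, replication=REPLICATION):
    """Replica responses the coordinator waits for, keyed by datacenter."""
    if cl == "ONE":
        return {"any": 1}
    if cl == "LOCAL_QUORUM":
        return {local_dc: quorum(replication[local_dc])}
    if cl == "EACH_QUORUM":
        return {dc: quorum(rf) for dc, rf in replication.items()}
    if cl == "QUORUM":
        return {"any": quorum(sum(replication.values()))}
    raise ValueError("unsupported consistency level: " + cl)

print(required_acks("LOCAL_QUORUM", "DC1"))  # {'DC1': 2}
print(required_acks("EACH_QUORUM", "DC1"))   # {'DC1': 2, 'DC2': 2}
print(required_acks("QUORUM", "DC1"))        # {'any': 4}
```

So a LOCAL_QUORUM write coordinated in DC1 returns once 2 of the 3 DC1 replicas have acknowledged it, without waiting for any replica in DC2.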

On the other side, I'd like to improve the read path. I'm currently using CL ONE since data is only written once (i.e. the timestamp is more or less meaningless in my case).

This is where I have some doubts: if data is written on DC1 and tentatively read from DC2 while the data is still not replicated or partially replicated (for whatever good reason since replication is async), what is the behavior of Get with CL ONE / NTS ? 
1/ Will I have an error because DC2 does not have any copy of the data ? 
2/ Will Cassandra try to get the data from DC1 if nothing is found in DC2 ?
3/ In case of partial replication to DC2, will I sometimes see errors about servers not holding the data in DC2 ?
4/ Does Get at CL ONE fail as soon as the fastest server to answer says it does not have the data, or does it wait until all servers say they do not have the data ?

Thanks a lot,
- Pierre

Re: Get CL ONE / NTS

Posted by Pierre Chalamet <pi...@chalamet.net>.
I will look at that to understand the behavior of ONE with NTS.
Thanks.

- Pierre



Re: Get CL ONE / NTS

Posted by aaron morton <aa...@thelastpickle.com>.
> What I’m missing is a clear behavior for CL.ONE. I’m unsure about what nodes are used by ONE and how the filtering of missing data/error is done. I’ve landed in ReadCallback.java but error handling is out of my reach for the moment.

Start with StorageProxy.fetch() to see which nodes are considered to be part of the request. ReadCallback.ctor() will decide which are actually involved based on the CL and whether RR is enabled.

At CL ONE there is no checking of the replica responses for consistency, as there is only one response. If RR is enabled it will start from ReadCallback.maybeResolveForRepair().
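
A rough mental model of that selection, not a transcription of the Cassandra source: the sketch below assumes the candidate list holds the replicas from both datacenters sorted closest first, and that read repair is triggered with probability read_repair_chance. The function and node names are hypothetical.

```python
import random

def contacted_replicas(replicas_by_proximity, block_for, read_repair_chance, rng=random):
    """Toy model: pick the replicas a coordinator sends a read to.

    replicas_by_proximity: replicas sorted closest first (assumed to be the snitch's job).
    block_for: responses required by the consistency level (1 for ONE).
    """
    if rng.random() < read_repair_chance:
        # Read repair triggered: ask every replica so the answers can be compared.
        return list(replicas_by_proximity)
    # Otherwise only the closest block_for replicas are asked.
    return list(replicas_by_proximity[:block_for])

replicas = ["dc2-a", "dc2-b", "dc2-c", "dc1-a", "dc1-b", "dc1-c"]
print(contacted_replicas(replicas, 1, 0.0))       # ['dc2-a']
print(len(contacted_replicas(replicas, 1, 1.0)))  # 6
```

In this model a read at CL ONE without read repair only ever hears from the closest replica, so whatever that replica holds (or does not hold) is what the client gets.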

Cheers



-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com


RE: Get CL ONE / NTS

Posted by Pierre Chalamet <pi...@chalamet.net>.
I do not agree here. I trade "consistency" (it's more data miss than
consistency here) for performance in my case.

I'm okay to handle the popping of the Spanish inquisition in the current DC
by triggering a new read with a stronger CL somewhere else (for example in
other DCs).

If the data is nowhere to be found or nothing is reachable, well, it's sad
but true but it will be the end of the game. Fine.
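
The fallback described above can be sketched in Python as follows. This is only an illustration: `get`, `NotFound` and `fake_get` are hypothetical stand-ins for whatever the real client library provides, and the order of consistency levels is an assumption.

```python
class NotFound(Exception):
    """Raised by the hypothetical client when no data is returned."""

def read_with_fallback(get, key, levels=("ONE", "LOCAL_QUORUM", "EACH_QUORUM")):
    """Try each consistency level in turn and return the first hit.

    get: callable (key, consistency_level) -> value, raising NotFound on a miss.
    """
    for cl in levels:
        try:
            return get(key, cl), cl
        except NotFound:
            continue  # data miss at this level, escalate to the next one
    raise NotFound(key)  # nowhere to be found: end of the game

# Stand-in for a real client: the row is only visible at EACH_QUORUM.
def fake_get(key, cl):
    if cl == "EACH_QUORUM":
        return "value-for-" + key
    raise NotFound(key)

print(read_with_fallback(fake_get, "row1"))  # ('value-for-row1', 'EACH_QUORUM')
```

A real client would also have to decide how to treat timeouts and unavailable replicas, which this sketch ignores.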

 

What I'm missing is a clear behavior for CL.ONE. I'm unsure about what nodes
are used by ONE and how the filtering of missing data/error is done. I've
landed in ReadCallback.java but error handling is out of my reach for the
moment.

 

Thanks,

- Pierre

 


Re: Get CL ONE / NTS

Posted by aaron morton <aa...@thelastpickle.com>.
> Are you saying that CL.ONE is not worth it when considering
> read performance ?
Consistency is not performance, it's a whole new thing to tune in your application. If you have performance issues, deal with those as performance issues: better code / data model / hardware.

> By the way, I do not have consistency problem at all - data is only written
> once
Nobody expects a consistency problem. Its chief weapon is surprise. Surprise and fear. Its two weapons are fear and surprise. And so forth http://www.youtube.com/watch?v=Ixgc_FGam3s

If you write at LOCAL QUORUM in DC 1 and DC 2 is down at the start of the request, a hint will be stored in DC 1. Some time later, when DC 2 comes back, that hint will be sent to DC 2. If in the meantime you read from DC 2 at CL ONE, you will not get that change. With Read Repair enabled it will repair in the background and you may get a different response on the next read (am guessing here, cannot remember exactly how RR works cross DC).
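
To make the timing in that scenario concrete, here is a toy Python model. It is not how Cassandra is implemented: each datacenter is collapsed into a single dictionary, and the class and method names are invented for the example.

```python
class ToyCluster:
    """Toy model of hinted handoff between two datacenters (not real Cassandra)."""

    def __init__(self):
        self.data = {"DC1": {}, "DC2": {}}
        self.up = {"DC1": True, "DC2": True}
        self.hints = []  # (target_dc, key, value) kept while the target is down

    def write_local_quorum(self, local_dc, key, value):
        # The write succeeds as soon as the local datacenter has it.
        self.data[local_dc][key] = value
        for dc in self.data:
            if dc == local_dc:
                continue
            if self.up[dc]:
                self.data[dc][key] = value
            else:
                self.hints.append((dc, key, value))

    def deliver_hints(self):
        remaining = []
        for dc, key, value in self.hints:
            if self.up[dc]:
                self.data[dc][key] = value
            else:
                remaining.append((dc, key, value))
        self.hints = remaining

    def read_one(self, dc, key):
        return self.data[dc].get(key)

cluster = ToyCluster()
cluster.up["DC2"] = False
cluster.write_local_quorum("DC1", "k", "v")  # succeeds, hint stored for DC2
cluster.up["DC2"] = True
print(cluster.read_one("DC2", "k"))          # None: hint not delivered yet
cluster.deliver_hints()
print(cluster.read_one("DC2", "k"))          # v
```

The window between DC2 coming back and the hint being delivered is exactly where a read at CL ONE in DC2 comes back empty.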

 Cheers



-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com


RE: Get CL ONE / NTS

Posted by Pierre Chalamet <pi...@chalamet.net>.
Thanks Aaron, I didn't see your answer before sending mine.

I do agree on 2/, I might get read errors. Good suggestion to use
EACH_QUORUM - it could be a good trade-off to read at this level if ONE
fails.

Maybe using LOCAL_QUORUM would be a good answer and would avoid headaches
after all. Are you saying that CL.ONE is not worth it when considering
read performance ?

By the way, I do not have consistency problem at all - data is only written
once (and if more it is always the same data) and read several times across
DC. I only have replication problems. That's why I'm more inclined to use
CL.ONE for read if possible.

Thanks,
- Pierre



Re: Get CL ONE / NTS

Posted by aaron morton <aa...@thelastpickle.com>.
Your current approach to Consistency opens the door to some inconsistent behavior. 

> 1/ Will I have an error because DC2 does not have any copy of the data ? 
If you read from DC2 at CL ONE and the data is not replicated it will not be returned. 

> 2/ Will Cassandra try to get the data from DC1 if nothing is found in DC2 ?
Not at CL ONE. If you used CL EACH QUORUM then the read will go to all the DCs. If DC2 is behind DC1 then you will get the data from DC1.

> 3/ In case of partial replication to DC2, will I see sometimes errors about servers not holding the data in DC2 ?
Depending on the API call and the client, working at CL ONE, you will see either errors or missing data. 

> 4/ Does Get CL ONE failed as soon as the fastest server to answer tell it does not have the data or does it waits until all servers tell they do not have the data ? 
 yes

Consider 

- using LOCAL QUORUM for both write and read. This makes things a bit more consistent without adding inter-DC overhead to the request latency. It is still possible to not get data in DC2 if it is totally disconnected from DC1.

- writing at LOCAL QUORUM and reading at EACH QUORUM. That way you can always read the data, but requests in DC2 will fail if DC1 is not reachable.
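
The availability side of these options can be sketched in a few lines of Python. This is a simplified model under the DC1:3 / DC2:3 configuration from the original question; `can_serve_read` is a made-up helper, not a Cassandra API.

```python
REPLICATION = {"DC1": 3, "DC2": 3}

def quorum(rf):
    """Majority of rf replicas: floor(rf / 2) + 1."""
    return rf // 2 + 1

def can_serve_read(cl, local_dc, live, replication=REPLICATION):
    """live maps each datacenter to the number of replicas the coordinator can reach."""
    if cl == "ONE":
        return sum(live.values()) >= 1
    if cl == "LOCAL_QUORUM":
        return live[local_dc] >= quorum(replication[local_dc])
    if cl == "EACH_QUORUM":
        return all(live[dc] >= quorum(rf) for dc, rf in replication.items())
    raise ValueError("unsupported consistency level: " + cl)

# Coordinator in DC2 while the link to DC1 is down.
partitioned = {"DC1": 0, "DC2": 3}
for cl in ("ONE", "LOCAL_QUORUM", "EACH_QUORUM"):
    print(cl, can_serve_read(cl, "DC2", partitioned))
# ONE True
# LOCAL_QUORUM True
# EACH_QUORUM False
```

ONE and LOCAL_QUORUM keep answering during the partition but may miss data that only reached DC1, while EACH_QUORUM refuses to answer at all.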

Hope that helps. 

 
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com


RE: Get CL ONE / NTS

Posted by Pierre Chalamet <pi...@chalamet.net>.
After reading the Cassandra source code, I will try to answer myself. It's kind
of a good exercise :)

>1/ Will I have an error because DC2 does not have any copy of the data ? 
I've not been able to find how endpoints are determined for the read
request, but I guess endpoints are just coming from the current datacenter.

>2/ Will Cassandra try to get the data from DC1 if nothing is found in DC2 ?
Probably not, given 1/

>3/ In case of partial replication to DC2, will I see sometimes errors about
servers not holding the data in DC2 ?
It seems to depend on RR. If read_repair_chance is set to 1 (default value),
RR happens all the time : the answer is no.
In case read_repair_chance is below 1, it seems CL.ONE will fail if the
single read request fails.

>4/ Does Get CL ONE failed as soon as the fastest server to answer tell it
does not have the data or does it waits until all servers tell they do not
have the data ?
It seems to depend on RR as in 3/

Are the answers right ?

- Pierre

