Posted to dev@cassandra.apache.org by Dikang Gu <di...@gmail.com> on 2017/06/09 02:28:56 UTC

Definition of QUORUM consistency level

Hello there,

We have some use cases that issue consistent read/write requests, and our
setup has 4 replicas in that cluster.

What's interesting to me is that both read and write quorum requests block
for 4/2+1 = 3 replicas, so together they access 3 (for writes) + 3 (for
reads) = 6 replicas, which is 2 more than the 4 replicas we have.

I don't think it's necessary to have 2 overlapping nodes when the
replication factor is even.

I suggest changing the `quorumFor(keyspace)` code to handle read and write
requests separately, so that we can drop one replica request from the read
path.
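
The arithmetic above can be sketched in a few lines of Python. The helper names `quorum_for` and `guaranteed_overlap` are made up for illustration and are not Cassandra's API; the only thing taken from the thread is that a quorum is RF/2 + 1 with integer division.

```python
def quorum_for(replication_factor: int) -> int:
    """Quorum size as described in the thread: RF/2 + 1, integer division."""
    return replication_factor // 2 + 1


def guaranteed_overlap(rf: int, reads: int, writes: int) -> int:
    """Minimum number of replicas shared by any read set and any write set."""
    return max(0, reads + writes - rf)


rf = 4
q = quorum_for(rf)
print(q, guaranteed_overlap(rf, q, q))     # prints "3 2": quorum read + quorum write
print(guaranteed_overlap(rf, rf // 2, q))  # prints "1": read from RF/2 replicas instead
```

With RF=4, reading from 2 replicas while writing to 3 still leaves one replica guaranteed to be in both sets, which is the single overlap the proposal relies on.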

Any concerns?

Thanks!


-- 
Dikang

Re: Definition of QUORUM consistency level

Posted by Nate McCall <na...@thelastpickle.com>.

>> So, for the quorum, what we really want is that there is one overlap among
>> the nodes in write path and read path. It actually was my assumption for a
>> long time that we need (N/2 + 1) for write and just need (N/2) for read,
>> because it's enough to provide the strong consistency.
>
> You are write about ...
>
*right (lol!).

Re: Definition of QUORUM consistency level

Posted by Nate McCall <na...@thelastpickle.com>.

> So, for the quorum, what we really want is that there is one overlap among
> the nodes in write path and read path. It actually was my assumption for a
> long time that we need (N/2 + 1) for write and just need (N/2) for read,
> because it's enough to provide the strong consistency.
>

You are write about strong consistency with that calculation, but if I want
to issue a QUORUM read just by itself, I would expect a majority of nodes
to reply. How it was written might be immaterial to my use case of reading
'from a majority.'

-- 
-----------------
Nate McCall
Wellington, NZ
@zznate

CTO
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: Definition of QUORUM consistency level

Posted by Jeff Jirsa <jj...@gmail.com>.

Would love to see real pluggable consistency levels. Sorta sad it got
wont-fixed - may be time to revisit that, perhaps it's more feasible now.

https://issues.apache.org/jira/browse/CASSANDRA-8119 is also semi-related,
but a different approach (CL-as-UDF)

On Thu, Jun 8, 2017 at 9:26 PM, Brandon Williams <dr...@gmail.com> wrote:

> I don't disagree with you there and have never liked TWO/THREE.  This is
> somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
>
> I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
> also not sure what is.
>
>
> On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <di...@gmail.com> wrote:
>
> > To me, CL.TWO and CL.THREE are more like workarounds for the problem. For
> > example, they do not work if the number of replicas goes to 8, which is
> > possible in our environment (2 replicas in each of 4 DCs).
> >
> > What people want from quorum is a strong consistency guarantee. As long as
> > R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1);
> > c) R=(n/2+1), W=(n/2). What Cassandra does right now is option a), which
> > is the most expensive option.
> >
> > I cannot think of a reason why people would want a quorum read not for
> > strong consistency, but just to read from (n/2+1) nodes. If they want
> > strong consistency, then the read only needs (n/2) nodes; we are purely
> > wasting the one extra request, which hurts read latency as well.
> >
> > Thanks
> > Dikang.
> >
> > On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com>
> > wrote:
> >
> >>
> >> We have CL.TWO.
> >>>
> >>>
> >>>
> >> This was actually the original motivation for CL.TWO and CL.THREE if
> >> memory serves:
> >> https://issues.apache.org/jira/browse/CASSANDRA-2013
> >>
> >
> >
> >
> > --
> > Dikang
> >
> >
>
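
The three R/W options quoted above can be compared with a short Python sketch. The function names `options` and `is_strongly_consistent` are hypothetical, and n/2 is assumed to mean integer division, as in the quoted formulas.

```python
def options(n: int) -> dict:
    """(R, W) for options a), b) and c) from the thread, using integer division."""
    half, quorum = n // 2, n // 2 + 1
    return {"a": (quorum, quorum), "b": (half, quorum), "c": (quorum, half)}


def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """The R + W > N condition quoted in the thread."""
    return r + w > n


for n in (4, 5, 8):
    for name, (r, w) in options(n).items():
        # Columns: N, option, R, W, replicas touched, satisfies R + W > N
        print(n, name, r, w, r + w, is_strongly_consistent(n, r, w))
```

Under that integer-division assumption, options b) and c) satisfy R + W > N only when N is even; for N=5 they give R + W = 5, so only option a) holds there.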

Re: Definition of QUORUM consistency level

Posted by Dikang Gu <di...@gmail.com>.

https://issues.apache.org/jira/browse/CASSANDRA-13645

On Wed, Jun 28, 2017 at 4:59 PM, Dikang Gu <di...@gmail.com> wrote:

> We implement the patch internally, and deploy to our production clusters,
> we see 2X drop of the P99 quorum read latency, because we can reduce one
> unnecessary cross region read. This is a huge improvement since performance
> is very critical to our customers.
>
> Again, I'm not trying to change the definition of the QUORUM consistency
> level, instead, we want to improve the quorum read latency, by removing
> unnecessary replica requests, which I think can benefit Cassandra users in
> general.
>
> I will create a JIRA, and we can move discussions there.
>
>
> Thanks!
> 
>
> On Thu, Jun 8, 2017 at 10:12 PM, Jeff Jirsa <jj...@gmail.com> wrote:
>
>> Short of actually making ConsistencyLevel pluggable or adding/changing
>> one of the existing levels, an alternative approach would be to divide up
>> the cluster into either real or pseudo-datacenters (with RF=2 in each DC),
>> and then write with QUORUM (which would be 3 nodes, across any combination
>> of datacenters), and read with LOCAL_QUORUM (which would be 2 nodes in the
>> datacenter of the coordinator). You don't have to have distinct physical
>> DCs for this, but you'd need tooling to guarantee an even number of
>> replicas in each virtual datacenter.
>>
>> It's an ugly workaround, but it'd work.
>>
>> Pluggable CL would be nicer, though.
>>
>>
>> On Thu, Jun 8, 2017 at 9:51 PM, Justin Cameron <ju...@instaclustr.com>
>> wrote:
>>
>>> Firstly, this situation only occurs if you need strong consistency and
>>> are
>>> using an even replication factor (RF4, RF6, etc).
>>> Secondly, either the read or write still need to be performed at a
>>> minimum
>>> level of QUORUM. This means there are no extra availability benefits from
>>> your proposal (i.e. a minimum of QUORUM replicas still need to be online
>>> and available)
>>>
>>> So the only potential benefit I can think of is a theoretical performance
>>> boost. If you write with QUORUM, then you'll need to read with
>>> QUORUM-1/HALF (e.g. RF4, write with QUORUM, read with TWO, RF6 write with
>>> QUORUM, read with THREE, RF8 write with QUORUM, read with FOUR, ...). At
>>> most you'd only reduce the number of replicas that the client needs to
>>> block on by 1.
>>>
>>> I'd guess that the performance benefits that you'd gain will probably be
>>> quite small - but I'd happily be proven wrong if you feel like running
>>> some
>>> benchmarks :)
>>>
>>> Cheers,
>>> Justin
>>>
>>> On Fri, 9 Jun 2017 at 14:26 Brandon Williams <dr...@gmail.com> wrote:
>>>
>>> > I don't disagree with you there and have never liked TWO/THREE.  This is
>>> > somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
>>> >
>>> > I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
>>> > also not sure what is.
>>> >
>>> >
>>> > On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <di...@gmail.com> wrote:
>>> >
>>> >> To me, CL.TWO and CL.THREE are more like work around of the problem,
>>> for
>>> >> example, they do not work if the number of replicas go to 8, which
>>> does
>>> >> possible in our environment (2 replicas in each of 4 DCs).
>>> >>
>>> >> What people want from quorum is strong consistency guarantee, as long
>>> as
>>> >> R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2),
>>> W=(n/2+1); c)
>>> >> R=(n/2+1), W=(n/2). What Cassandra doing right now, is the option a),
>>> which
>>> >> is the most expensive option.
>>> >>
>>> >> I can not think of a reason, that people want the quorum read, not for
>>> >> strong consistency reason, but just to read from (n/2+1) nodes. If
>>> they
>>> >> want strong consistency, then the read just needs (n/2) nodes, we are
>>> >> purely waste the one extra request, and hurts read latency as well.
>>> >>
>>> >> Thanks
>>> >> Dikang.
>>> >>
>>> >> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com>
>>> >> wrote:
>>> >>
>>> >>>
>>> >>> We have CL.TWO.
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>> This was actually the original motivation for CL.TWO and CL.THREE if
>>> >>> memory serves:
>>> >>> https://issues.apache.org/jira/browse/CASSANDRA-2013
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Dikang
>>> >>
>>> >>
>>> > --
>>>
>>>
>>> *Justin Cameron*Senior Software Engineer
>>>
>>>
>>> <https://www.instaclustr.com/>
>>>
>>>
>>> This email has been sent on behalf of Instaclustr Pty. Limited
>>> (Australia)
>>> and Instaclustr Inc (USA).
>>>
>>> This email and any attachments may contain confidential and legally
>>> privileged information.  If you are not the intended recipient, do not
>>> copy
>>> or disclose its content, but please reply to this email immediately and
>>> highlight the error to the sender and then immediately delete the
>>> message.
>>>
>>
>>
>
>
> --
> Dikang
>
>


-- 
Dikang

Re: Definition of QUORUM consistency level

Posted by Dikang Gu <di...@gmail.com>.

We implemented the patch internally and deployed it to our production
clusters. We see a 2x drop in P99 quorum read latency, because we eliminate
one unnecessary cross-region read. This is a huge improvement, since
performance is critical to our customers.
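
A rough way to see why dropping one replica from the read path can matter this much is the toy model below. It assumes a coordinator that simply waits for the k-th fastest response; the latency numbers and the two-local/two-remote layout are invented for illustration and are not measurements from our clusters. It also ignores digest reads, speculative retries and everything else a real read path does.

```python
def read_latency_ms(replica_latencies_ms, block_for):
    """A coordinator that blocks for `block_for` responses finishes when the
    block_for-th fastest replica has answered."""
    return sorted(replica_latencies_ms)[block_for - 1]


# Hypothetical round-trip times in milliseconds: two replicas in the local
# region, two in a remote region. These numbers are assumptions.
latencies = [2, 3, 80, 85]

print(read_latency_ms(latencies, 3))  # prints 80: must wait for a cross-region replica
print(read_latency_ms(latencies, 2))  # prints 3: both local replicas are enough
```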

Again, I'm not trying to change the definition of the QUORUM consistency
level. Instead, we want to improve quorum read latency by removing
unnecessary replica requests, which I think can benefit Cassandra users in
general.

I will create a JIRA, and we can move discussions there.


Thanks!


On Thu, Jun 8, 2017 at 10:12 PM, Jeff Jirsa <jj...@gmail.com> wrote:

> Short of actually making ConsistencyLevel pluggable or adding/changing one
> of the existing levels, an alternative approach would be to divide up the
> cluster into either real or pseudo-datacenters (with RF=2 in each DC), and
> then write with QUORUM (which would be 3 nodes, across any combination of
> datacenters), and read with LOCAL_QUORUM (which would be 2 nodes in the
> datacenter of the coordinator). You don't have to have distinct physical
> DCs for this, but you'd need tooling to guarantee an even number of
> replicas in each virtual datacenter.
>
> It's an ugly workaround, but it'd work.
>
> Pluggable CL would be nicer, though.
>
>
> On Thu, Jun 8, 2017 at 9:51 PM, Justin Cameron <ju...@instaclustr.com>
> wrote:
>
>> Firstly, this situation only occurs if you need strong consistency and are
>> using an even replication factor (RF4, RF6, etc).
>> Secondly, either the read or write still need to be performed at a minimum
>> level of QUORUM. This means there are no extra availability benefits from
>> your proposal (i.e. a minimum of QUORUM replicas still need to be online
>> and available)
>>
>> So the only potential benefit I can think of is a theoretical performance
>> boost. If you write with QUORUM, then you'll need to read with
>> QUORUM-1/HALF (e.g. RF4, write with QUORUM, read with TWO, RF6 write with
>> QUORUM, read with THREE, RF8 write with QUORUM, read with FOUR, ...). At
>> most you'd only reduce the number of replicas that the client needs to
>> block on by 1.
>>
>> I'd guess that the performance benefits that you'd gain will probably be
>> quite small - but I'd happily be proven wrong if you feel like running
>> some
>> benchmarks :)
>>
>> Cheers,
>> Justin
>>
>> On Fri, 9 Jun 2017 at 14:26 Brandon Williams <dr...@gmail.com> wrote:
>>
>> > I don't disagree with you there and have never liked TWO/THREE.  This is
>> > somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
>> >
>> > I don't think going to CL.FOUR, etc, is a good long-term solution, but
>> I'm
>> > also not sure what is.
>> >
>> >
>> > On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <di...@gmail.com> wrote:
>> >
>> >> To me, CL.TWO and CL.THREE are more like work around of the problem,
>> for
>> >> example, they do not work if the number of replicas go to 8, which does
>> >> possible in our environment (2 replicas in each of 4 DCs).
>> >>
>> >> What people want from quorum is strong consistency guarantee, as long
>> as
>> >> R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2),
>> W=(n/2+1); c)
>> >> R=(n/2+1), W=(n/2). What Cassandra doing right now, is the option a),
>> which
>> >> is the most expensive option.
>> >>
>> >> I can not think of a reason, that people want the quorum read, not for
>> >> strong consistency reason, but just to read from (n/2+1) nodes. If they
>> >> want strong consistency, then the read just needs (n/2) nodes, we are
>> >> purely waste the one extra request, and hurts read latency as well.
>> >>
>> >> Thanks
>> >> Dikang.
>> >>
>> >> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com>
>> >> wrote:
>> >>
>> >>>
>> >>> We have CL.TWO.
>> >>>>
>> >>>>
>> >>>>
>> >>> This was actually the original motivation for CL.TWO and CL.THREE if
>> >>> memory serves:
>> >>> https://issues.apache.org/jira/browse/CASSANDRA-2013
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Dikang
>> >>
>> >>
>> > --
>>
>>
>> *Justin Cameron*Senior Software Engineer
>>
>>
>> <https://www.instaclustr.com/>
>>
>>
>> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
>> and Instaclustr Inc (USA).
>>
>> This email and any attachments may contain confidential and legally
>> privileged information.  If you are not the intended recipient, do not
>> copy
>> or disclose its content, but please reply to this email immediately and
>> highlight the error to the sender and then immediately delete the message.
>>
>
>


-- 
Dikang

Re: Definition of QUORUM consistency level

Posted by Jeff Jirsa <jj...@gmail.com>.

Short of actually making ConsistencyLevel pluggable or adding/changing one
of the existing levels, an alternative approach would be to divide up the
cluster into either real or pseudo-datacenters (with RF=2 in each DC), and
then write with QUORUM (which would be 3 nodes, across any combination of
datacenters), and read with LOCAL_QUORUM (which would be 2 nodes in the
datacenter of the coordinator). You don't have to have distinct physical
DCs for this, but you'd need tooling to guarantee an even number of
replicas in each virtual datacenter.
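
The overlap claim for this layout can be checked by brute force. The Python sketch below is only a model: it assumes QUORUM is (total replicas)/2 + 1 and LOCAL_QUORUM is (replicas in the coordinator's DC)/2 + 1, and the function name is made up.

```python
from itertools import combinations


def local_quorum_read_sees_quorum_write(num_dcs: int, rf_per_dc: int = 2) -> bool:
    """Check every QUORUM write set against every LOCAL_QUORUM read set."""
    replicas = [(dc, i) for dc in range(num_dcs) for i in range(rf_per_dc)]
    quorum = len(replicas) // 2 + 1
    local_quorum = rf_per_dc // 2 + 1
    for write_set in combinations(replicas, quorum):
        for dc in range(num_dcs):  # the coordinator may sit in any DC
            local = [r for r in replicas if r[0] == dc]
            for read_set in combinations(local, local_quorum):
                if not set(write_set) & set(read_set):
                    return False  # this read could miss the latest write
    return True


print(local_quorum_read_sees_quorum_write(num_dcs=2))  # prints True
```

With two DCs of RF=2 each (4 replicas), every 3-node write shares at least one node with the 2-node local read. Under the same assumptions the check returns False for four DCs of RF=2 each (8 replicas), because a 5-node write can land entirely outside the coordinator's DC, so the sketch only supports the 4-replica case described here.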

It's an ugly workaround, but it'd work.

Pluggable CL would be nicer, though.


On Thu, Jun 8, 2017 at 9:51 PM, Justin Cameron <ju...@instaclustr.com>
wrote:

> Firstly, this situation only occurs if you need strong consistency and are
> using an even replication factor (RF4, RF6, etc).
> Secondly, either the read or write still need to be performed at a minimum
> level of QUORUM. This means there are no extra availability benefits from
> your proposal (i.e. a minimum of QUORUM replicas still need to be online
> and available)
>
> So the only potential benefit I can think of is a theoretical performance
> boost. If you write with QUORUM, then you'll need to read with
> QUORUM-1/HALF (e.g. RF4, write with QUORUM, read with TWO, RF6 write with
> QUORUM, read with THREE, RF8 write with QUORUM, read with FOUR, ...). At
> most you'd only reduce the number of replicas that the client needs to
> block on by 1.
>
> I'd guess that the performance benefits that you'd gain will probably be
> quite small - but I'd happily be proven wrong if you feel like running some
> benchmarks :)
>
> Cheers,
> Justin
>
> On Fri, 9 Jun 2017 at 14:26 Brandon Williams <dr...@gmail.com> wrote:
>
> > I don't disagree with you there and have never liked TWO/THREE.  This is
> > somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
> >
> > I don't think going to CL.FOUR, etc, is a good long-term solution, but
> I'm
> > also not sure what is.
> >
> >
> > On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <di...@gmail.com> wrote:
> >
> >> To me, CL.TWO and CL.THREE are more like work around of the problem, for
> >> example, they do not work if the number of replicas go to 8, which does
> >> possible in our environment (2 replicas in each of 4 DCs).
> >>
> >> What people want from quorum is strong consistency guarantee, as long as
> >> R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2),
> W=(n/2+1); c)
> >> R=(n/2+1), W=(n/2). What Cassandra doing right now, is the option a),
> which
> >> is the most expensive option.
> >>
> >> I can not think of a reason, that people want the quorum read, not for
> >> strong consistency reason, but just to read from (n/2+1) nodes. If they
> >> want strong consistency, then the read just needs (n/2) nodes, we are
> >> purely waste the one extra request, and hurts read latency as well.
> >>
> >> Thanks
> >> Dikang.
> >>
> >> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com>
> >> wrote:
> >>
> >>>
> >>> We have CL.TWO.
> >>>>
> >>>>
> >>>>
> >>> This was actually the original motivation for CL.TWO and CL.THREE if
> >>> memory serves:
> >>> https://issues.apache.org/jira/browse/CASSANDRA-2013
> >>>
> >>
> >>
> >>
> >> --
> >> Dikang
> >>
> >>
> > --
>
>
> *Justin Cameron*Senior Software Engineer
>
>
> <https://www.instaclustr.com/>
>
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>

Re: Definition of QUORUM consistency level

Posted by Justin Cameron <ju...@instaclustr.com>.

Firstly, this situation only occurs if you need strong consistency and are
using an even replication factor (RF4, RF6, etc).
Secondly, either the read or the write still needs to be performed at a
minimum level of QUORUM. This means there are no extra availability benefits
from your proposal (i.e. a minimum of QUORUM replicas still needs to be
online and available).

So the only potential benefit I can think of is a theoretical performance
boost. If you write with QUORUM, then you'll need to read with
QUORUM-1/HALF (e.g. RF4, write with QUORUM, read with TWO, RF6 write with
QUORUM, read with THREE, RF8 write with QUORUM, read with FOUR, ...). At
most you'd only reduce the number of replicas that the client needs to
block on by 1.
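
The pairing described above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the helper names `quorum`, `half` and `levels` are made up for illustration, and HALF is not an existing Cassandra consistency level.

```python
def quorum(rf: int) -> int:
    """Replicas a QUORUM request blocks for: floor(rf / 2) + 1."""
    return rf // 2 + 1


def half(rf: int) -> int:
    """Hypothetical HALF level (illustrative only): floor(rf / 2)."""
    return rf // 2


def levels(rf: int) -> tuple:
    """Return (write_replicas, read_replicas, replicas_saved_on_read)."""
    w, r = quorum(rf), half(rf)
    return w, r, quorum(rf) - r


for rf in (4, 6, 8):
    w, r, saved = levels(rf)
    # Read and write sets are guaranteed to overlap only if R + W > RF.
    print(f"RF{rf}: write={w} read={r} overlap={r + w > rf} saved={saved}")
```

For every even replication factor the read blocks on exactly one replica fewer than QUORUM, which matches the "at most 1" estimate.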

I'd guess that the performance benefits that you'd gain will probably be
quite small - but I'd happily be proven wrong if you feel like running some
benchmarks :)

Cheers,
Justin

On Fri, 9 Jun 2017 at 14:26 Brandon Williams <dr...@gmail.com> wrote:

> I don't disagree with you there and have never liked TWO/THREE.  This is
> somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
>
> I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
> also not sure what is.
>
>
> On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <di...@gmail.com> wrote:
>
>> To me, CL.TWO and CL.THREE are more like work around of the problem, for
>> example, they do not work if the number of replicas go to 8, which does
>> possible in our environment (2 replicas in each of 4 DCs).
>>
>> What people want from quorum is strong consistency guarantee, as long as
>> R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1); c)
>> R=(n/2+1), W=(n/2). What Cassandra doing right now, is the option a), which
>> is the most expensive option.
>>
>> I can not think of a reason, that people want the quorum read, not for
>> strong consistency reason, but just to read from (n/2+1) nodes. If they
>> want strong consistency, then the read just needs (n/2) nodes, we are
>> purely waste the one extra request, and hurts read latency as well.
>>
>> Thanks
>> Dikang.
>>
>> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com>
>> wrote:
>>
>>>
>>> We have CL.TWO.
>>>>
>>>>
>>>>
>>> This was actually the original motivation for CL.TWO and CL.THREE if
>>> memory serves:
>>> https://issues.apache.org/jira/browse/CASSANDRA-2013
>>>
>>
>>
>>
>> --
>> Dikang
>>
>>
> --

*Justin Cameron*Senior Software Engineer

<https://www.instaclustr.com/>

This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
and Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally
privileged information.  If you are not the intended recipient, do not copy
or disclose its content, but please reply to this email immediately and
highlight the error to the sender and then immediately delete the message.

Re: Definition of QUORUM consistency level

Posted by Jeff Jirsa <jj...@gmail.com>.

Would love to see real pluggable consistency levels. Sorta sad it got
wont-fixed - may be time to revisit that, perhaps it's more feasible now.

https://issues.apache.org/jira/browse/CASSANDRA-8119 is also semi-related,
but a different approach (CL-as-UDF)

On Thu, Jun 8, 2017 at 9:26 PM, Brandon Williams <dr...@gmail.com> wrote:

> I don't disagree with you there and have never liked TWO/THREE.  This is
> somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338
>
> I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
> also not sure what is.
>
>
> On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <di...@gmail.com> wrote:
>
> > To me, CL.TWO and CL.THREE are more like work around of the problem, for
> > example, they do not work if the number of replicas go to 8, which does
> > possible in our environment (2 replicas in each of 4 DCs).
> >
> > What people want from quorum is strong consistency guarantee, as long as
> > R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1);
> c)
> > R=(n/2+1), W=(n/2). What Cassandra doing right now, is the option a),
> which
> > is the most expensive option.
> >
> > I can not think of a reason, that people want the quorum read, not for
> > strong consistency reason, but just to read from (n/2+1) nodes. If they
> > want strong consistency, then the read just needs (n/2) nodes, we are
> > purely waste the one extra request, and hurts read latency as well.
> >
> > Thanks
> > Dikang.
> >
> > On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com>
> > wrote:
> >
> >>
> >> We have CL.TWO.
> >>>
> >>>
> >>>
> >> This was actually the original motivation for CL.TWO and CL.THREE if
> >> memory serves:
> >> https://issues.apache.org/jira/browse/CASSANDRA-2013
> >>
> >
> >
> >
> > --
> > Dikang
> >
> >
>

Re: Definition of QUORUM consistency level

Posted by Brandon Williams <dr...@gmail.com>.

I don't disagree with you there and have never liked TWO/THREE.  This is
somewhat relevant: https://issues.apache.org/jira/browse/CASSANDRA-2338

I don't think going to CL.FOUR, etc, is a good long-term solution, but I'm
also not sure what is.


On Thu, Jun 8, 2017 at 11:20 PM, Dikang Gu <di...@gmail.com> wrote:

> To me, CL.TWO and CL.THREE are more like work around of the problem, for
> example, they do not work if the number of replicas go to 8, which does
> possible in our environment (2 replicas in each of 4 DCs).
>
> What people want from quorum is strong consistency guarantee, as long as
> R+W > N, there are three options: a) R=W=(n/2+1); b) R=(n/2), W=(n/2+1); c)
> R=(n/2+1), W=(n/2). What Cassandra doing right now, is the option a), which
> is the most expensive option.
>
> I can not think of a reason, that people want the quorum read, not for
> strong consistency reason, but just to read from (n/2+1) nodes. If they
> want strong consistency, then the read just needs (n/2) nodes, we are
> purely waste the one extra request, and hurts read latency as well.
>
> Thanks
> Dikang.
>
> On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com>
> wrote:
>
>>
>> We have CL.TWO.
>>>
>>>
>>>
>> This was actually the original motivation for CL.TWO and CL.THREE if
>> memory serves:
>> https://issues.apache.org/jira/browse/CASSANDRA-2013
>>
>
>
>
> --
> Dikang
>
>

Re: Definition of QUORUM consistency level

Posted by Dikang Gu <di...@gmail.com>.

To me, CL.TWO and CL.THREE are more like a workaround for the problem. For
example, they do not work if the number of replicas goes to 8, which is
possible in our environment (2 replicas in each of 4 DCs).

What people want from quorum is a strong consistency guarantee. As long as
R+W > N, there are three options: a) R=W=(N/2+1); b) R=(N/2), W=(N/2+1); c)
R=(N/2+1), W=(N/2). What Cassandra is doing right now is option a), which
is the most expensive option.

I cannot think of a reason why people would want a quorum read just to read
from (N/2+1) nodes rather than for strong consistency. If they want strong
consistency, then the read only needs (N/2) nodes; we are purely wasting
the one extra request, which hurts read latency as well.
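
The three options can be compared against the R+W > N condition with a short Python sketch. It assumes that N/2 means integer division, which the message does not state, and the function names are illustrative only.

```python
def options(n: int) -> dict:
    """(R, W) for options a, b and c, assuming N/2 is integer division."""
    return {
        "a": (n // 2 + 1, n // 2 + 1),
        "b": (n // 2, n // 2 + 1),
        "c": (n // 2 + 1, n // 2),
    }


def overlaps(r: int, w: int, n: int) -> bool:
    """True when every read set must share a replica with every write set."""
    return r + w > n


for n in (3, 4, 5, 8):
    for name, (r, w) in options(n).items():
        print(f"N={n} option {name}: R={r} W={w} R+W>N={overlaps(r, w, n)}")
```

Under that assumption, options b) and c) satisfy R+W > N only when N is even; for odd N they give R+W = N, so option a) is the only one of the three that holds for every N.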

Thanks
Dikang.

On Thu, Jun 8, 2017 at 8:20 PM, Nate McCall <na...@thelastpickle.com> wrote:

>
> We have CL.TWO.
>>
>>
>>
> This was actually the original motivation for CL.TWO and CL.THREE if
> memory serves:
> https://issues.apache.org/jira/browse/CASSANDRA-2013
>

-- 
Dikang

Re: Definition of QUORUM consistency level

Posted by Nate McCall <na...@thelastpickle.com>.

> We have CL.TWO.
>
>
>
This was actually the original motivation for CL.TWO and CL.THREE if memory
serves:
https://issues.apache.org/jira/browse/CASSANDRA-2013

Re: Definition of QUORUM consistency level

Posted by Brandon Williams <dr...@gmail.com>.

We have CL.TWO.

On Thu, Jun 8, 2017 at 10:03 PM, Dikang Gu <di...@gmail.com> wrote:

> So, for the quorum, what we really want is that there is one overlap among
> the nodes in write path and read path. It actually was my assumption for a
> long time that we need (N/2 + 1) for write and just need (N/2) for read,
> because it's enough to provide the strong consistency.
>
> On Thu, Jun 8, 2017 at 7:47 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
>
>> It would be a little weird to change the definition of QUORUM, which
>> means majority, to mean something other than majority for a single use
>> case. Sounds like you want to introduce a new CL, HALF.
>> On Thu, Jun 8, 2017 at 7:43 PM Dikang Gu <di...@gmail.com> wrote:
>>
>>> Justin, what I suggest is that for QUORUM consistent level, the block
>>> for write should be (num_replica/2)+1, this is same as today, but for read
>>> request, we just need to access (num_replica/2) nodes, which should provide
>>> enough strong consistency.
>>>
>>> Dikang.
>>>
>>> On Thu, Jun 8, 2017 at 7:38 PM, Justin Cameron <ju...@instaclustr.com>
>>> wrote:
>>>
>>>> 2/4 for write and 2/4 for read would not be sufficient to achieve
>>>> strong consistency, as there is no overlap.
>>>>
>>>> In your particular case you could potentially use QUORUM for write and
>>>> TWO for read (or vice-versa) and still achieve strong consistency. If you
>>>> add additional nodes in the future this would obviously no longer work.
>>>> Also the benefit of this is dubious, since 3/4 nodes still need to be
>>>> accessible to perform writes. I'd also guess that it's unlikely to provide
>>>> any significant performance increase.
>>>>
>>>> Justin
>>>>
>>>> On Fri, 9 Jun 2017 at 12:29 Dikang Gu <di...@gmail.com> wrote:
>>>>
>>>>> Hello there,
>>>>>
>>>>> We have some use cases are doing consistent read/write requests, and
>>>>> we have 4 replicas in that cluster, according to our setup.
>>>>>
>>>>> What's interesting to me is that, for both read and write quorum
>>>>> requests, they are blocked for 4/2+1 = 3 replicas, so we are accessing 3
>>>>> (for write) + 3 (for reads) = 6 replicas in quorum requests, which is 2
>>>>> replicas more than 4.
>>>>>
>>>>> I think it's not necessary to have 2 overlap nodes in even replication
>>>>> factor case.
>>>>>
>>>>> I suggest to change the `quorumFor(keyspace)` code, separate the case
>>>>> for read and write requests, so that we can reduce one replica request in
>>>>> read path.
>>>>>
>>>>> Any concerns?
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>> --
>>>>> Dikang
>>>>>
>>>
>>>
>>>
>>> --
>>> Dikang
>>>
>>>
>
>
> --
> Dikang
>
>

Re: Definition of QUORUM consistency level

Posted by Nate McCall <na...@thelastpickle.com>.

> So, for the quorum, what we really want is that there is one overlap among
> the nodes in write path and read path. It actually was my assumption for a
> long time that we need (N/2 + 1) for write and just need (N/2) for read,
> because it's enough to provide the strong consistency.
>

You are right about strong consistency with that calculation, but if I want
to issue a QUORUM read just by itself, I would expect a majority of nodes
to reply. How it was written might be immaterial to my use case of reading
'from a majority.'

-- 
-----------------
Nate McCall
Wellington, NZ
@zznate

CTO
Apache Cassandra Consulting
http://www.thelastpickle.com

Re: Definition of QUORUM consistency level

Posted by Dikang Gu <di...@gmail.com>.

So, for the quorum, what we really want is at least one node of overlap
between the write path and the read path. It was actually my assumption for
a long time that we need (N/2 + 1) for writes and just (N/2) for reads,
because that is enough to provide strong consistency.
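
The overlap argument can be verified by brute force for small clusters. The following Python sketch enumerates every possible write set and read set and reports the smallest intersection; `min_overlap` is a hypothetical helper written for this illustration, not anything from the Cassandra code base.

```python
from itertools import combinations


def min_overlap(n: int, w: int, r: int) -> int:
    """Smallest intersection between any w-replica write set and any
    r-replica read set, out of n replicas in total."""
    replicas = range(n)
    return min(
        len(set(write_set) & set(read_set))
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


print(min_overlap(4, 3, 3))  # QUORUM write + QUORUM read at RF4
print(min_overlap(4, 3, 2))  # QUORUM write + read from two replicas
print(min_overlap(4, 2, 2))  # two replicas each way
```

The result always equals max(0, R + W - N): two replicas of overlap for 3+3 of 4, one for 3+2, and none for 2+2.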

On Thu, Jun 8, 2017 at 7:47 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:

> It would be a little weird to change the definition of QUORUM, which means
> majority, to mean something other than majority for a single use case.
> Sounds like you want to introduce a new CL, HALF.
> On Thu, Jun 8, 2017 at 7:43 PM Dikang Gu <di...@gmail.com> wrote:
>
>> Justin, what I suggest is that for QUORUM consistent level, the block for
>> write should be (num_replica/2)+1, this is same as today, but for read
>> request, we just need to access (num_replica/2) nodes, which should provide
>> enough strong consistency.
>>
>> Dikang.
>>
>> On Thu, Jun 8, 2017 at 7:38 PM, Justin Cameron <ju...@instaclustr.com>
>> wrote:
>>
>>> 2/4 for write and 2/4 for read would not be sufficient to achieve strong
>>> consistency, as there is no overlap.
>>>
>>> In your particular case you could potentially use QUORUM for write and
>>> TWO for read (or vice-versa) and still achieve strong consistency. If you
>>> add additional nodes in the future this would obviously no longer work.
>>> Also the benefit of this is dubious, since 3/4 nodes still need to be
>>> accessible to perform writes. I'd also guess that it's unlikely to provide
>>> any significant performance increase.
>>>
>>> Justin
>>>
>>> On Fri, 9 Jun 2017 at 12:29 Dikang Gu <di...@gmail.com> wrote:
>>>
>>>> Hello there,
>>>>
>>>> We have some use cases are doing consistent read/write requests, and we
>>>> have 4 replicas in that cluster, according to our setup.
>>>>
>>>> What's interesting to me is that, for both read and write quorum
>>>> requests, they are blocked for 4/2+1 = 3 replicas, so we are accessing 3
>>>> (for write) + 3 (for reads) = 6 replicas in quorum requests, which is 2
>>>> replicas more than 4.
>>>>
>>>> I think it's not necessary to have 2 overlap nodes in even replication
>>>> factor case.
>>>>
>>>> I suggest to change the `quorumFor(keyspace)` code, separate the case
>>>> for read and write requests, so that we can reduce one replica request in
>>>> read path.
>>>>
>>>> Any concerns?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>> --
>>>> Dikang
>>>>
>>
>>
>>
>> --
>> Dikang
>>
>>


-- 
Dikang

Re: Definition of QUORUM consistency level

Posted by Jonathan Haddad <jo...@jonhaddad.com>.

It would be a little weird to change the definition of QUORUM, which means
majority, to mean something other than majority for a single use case.
Sounds like you want to introduce a new CL, HALF.

On Thu, Jun 8, 2017 at 7:43 PM Dikang Gu <di...@gmail.com> wrote:

> Justin, what I suggest is that for QUORUM consistent level, the block for
> write should be (num_replica/2)+1, this is same as today, but for read
> request, we just need to access (num_replica/2) nodes, which should provide
> enough strong consistency.
>
> Dikang.
>
> On Thu, Jun 8, 2017 at 7:38 PM, Justin Cameron <ju...@instaclustr.com>
> wrote:
>
>> 2/4 for write and 2/4 for read would not be sufficient to achieve strong
>> consistency, as there is no overlap.
>>
>> In your particular case you could potentially use QUORUM for write and
>> TWO for read (or vice-versa) and still achieve strong consistency. If you
>> add additional nodes in the future this would obviously no longer work.
>> Also the benefit of this is dubious, since 3/4 nodes still need to be
>> accessible to perform writes. I'd also guess that it's unlikely to provide
>> any significant performance increase.
>>
>> Justin
>>
>> On Fri, 9 Jun 2017 at 12:29 Dikang Gu <di...@gmail.com> wrote:
>>
>>> Hello there,
>>>
>>> We have some use cases are doing consistent read/write requests, and we
>>> have 4 replicas in that cluster, according to our setup.
>>>
>>> What's interesting to me is that, for both read and write quorum
>>> requests, they are blocked for 4/2+1 = 3 replicas, so we are accessing 3
>>> (for write) + 3 (for reads) = 6 replicas in quorum requests, which is 2
>>> replicas more than 4.
>>>
>>> I think it's not necessary to have 2 overlap nodes in even replication
>>> factor case.
>>>
>>> I suggest to change the `quorumFor(keyspace)` code, separate the case
>>> for read and write requests, so that we can reduce one replica request in
>>> read path.
>>>
>>> Any concerns?
>>>
>>> Thanks!
>>>
>>>
>>> --
>>> Dikang
>>>
>
>
>
> --
> Dikang
>
>

Re: Definition of QUORUM consistency level

Posted by Dikang Gu <di...@gmail.com>.

Justin, what I suggest is that for the QUORUM consistency level, the block
for writes should be (num_replica/2)+1, the same as today, but for read
requests we just need to access (num_replica/2) nodes, which should be
enough to provide strong consistency.
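
A rough Python sketch of what such a read/write split could look like follows. The name `block_for` and its signature are hypothetical and this is not Cassandra's actual code. The guard for odd replication factors is an added assumption, not part of the proposal: with integer division, (num_replica/2) reads plus (num_replica/2)+1 writes sum to exactly num_replica when it is odd, which leaves no guaranteed overlap.

```python
def block_for(rf: int, is_read: bool) -> int:
    """Hypothetical number of replicas to block for at QUORUM."""
    majority = rf // 2 + 1
    if is_read and rf % 2 == 0:
        # Even RF: rf/2 read replicas still intersect every majority write.
        return rf // 2
    # Writes, and reads at odd RF, keep the majority.
    return majority


for rf in (3, 4, 5, 8):
    reads, writes = block_for(rf, True), block_for(rf, False)
    print(f"RF{rf}: read={reads} write={writes} overlap={reads + writes > rf}")
```

With this guard the change only affects even replication factors, where it removes one replica from the read path.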

Dikang.

On Thu, Jun 8, 2017 at 7:38 PM, Justin Cameron <ju...@instaclustr.com>
wrote:

> 2/4 for write and 2/4 for read would not be sufficient to achieve strong
> consistency, as there is no overlap.
>
> In your particular case you could potentially use QUORUM for write and TWO
> for read (or vice-versa) and still achieve strong consistency. If you add
> additional nodes in the future this would obviously no longer work. Also
> the benefit of this is dubious, since 3/4 nodes still need to be accessible
> to perform writes. I'd also guess that it's unlikely to provide any
> significant performance increase.
>
> Justin
>
> On Fri, 9 Jun 2017 at 12:29 Dikang Gu <di...@gmail.com> wrote:
>
>> Hello there,
>>
>> We have some use cases that issue consistent read/write requests, and we
>> have 4 replicas in that cluster, according to our setup.
>>
>> What's interesting to me is that both read and write quorum requests
>> block for 4/2+1 = 3 replicas, so we are accessing 3 (for writes) + 3 (for
>> reads) = 6 replicas in quorum requests, which is 2 replicas more than 4.
>>
>> I think it's not necessary to have 2 overlapping nodes in the even
>> replication factor case.
>>
>> I suggest changing the `quorumFor(keyspace)` code to separate the read and
>> write cases, so that we can save one replica request in the read path.
>>
>> Any concerns?
>>
>> Thanks!
>>
>>
>> --
>> Dikang
>>
>> --
>
>
> *Justin Cameron*Senior Software Engineer
>
>
> <https://www.instaclustr.com/>
>



-- 
Dikang

Re: Definition of QUORUM consistency level

Posted by Justin Cameron <ju...@instaclustr.com>.

2/4 for write and 2/4 for read would not be sufficient to achieve strong
consistency, as there is no overlap.

In your particular case you could potentially use QUORUM for write and TWO
for read (or vice-versa) and still achieve strong consistency. If you add
additional nodes in the future this would obviously no longer work. Also
the benefit of this is dubious, since 3/4 nodes still need to be accessible
to perform writes. I'd also guess that it's unlikely to provide any
significant performance increase.
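
The rule behind these numbers is that a read is guaranteed to see the latest
acknowledged write only when W + R > RF. A small Python sketch, using
hypothetical helper names and modelling only the QUORUM and TWO levels,
checks the combinations mentioned above. It reads "adding nodes" as raising
the replication factor to 5, which is an assumption.

```python
def block_for(level: str, rf: int) -> int:
    """Replicas that must respond for a consistency level (illustrative subset)."""
    if level == "QUORUM":
        return rf // 2 + 1
    if level == "TWO":
        return 2
    raise ValueError(f"unsupported level: {level}")


def overlaps(rf: int, write_level: str, read_level: str) -> bool:
    """True when every read set must intersect every write set (W + R > RF)."""
    return block_for(write_level, rf) + block_for(read_level, rf) > rf


def tolerated_down(level: str, rf: int) -> int:
    """Replicas that can be unavailable while a request at this level still succeeds."""
    return rf - block_for(level, rf)


for rf in (4, 5):
    for write_level, read_level in (("QUORUM", "QUORUM"), ("QUORUM", "TWO"), ("TWO", "TWO")):
        print(
            f"RF={rf} write={write_level} read={read_level} "
            f"overlap={overlaps(rf, write_level, read_level)} "
            f"write_tolerates_down={tolerated_down(write_level, rf)}"
        )
```

At RF=4, QUORUM writes with TWO reads still overlap, but writes tolerate only
one unavailable replica, exactly as with QUORUM/QUORUM. At RF=5 the same
QUORUM/TWO pairing no longer overlaps.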

Justin

On Fri, 9 Jun 2017 at 12:29 Dikang Gu <di...@gmail.com> wrote:

> Hello there,
>
> We have some use cases that issue consistent read/write requests, and we
> have 4 replicas in that cluster, according to our setup.
>
> What's interesting to me is that both read and write quorum requests block
> for 4/2+1 = 3 replicas, so we are accessing 3 (for writes) + 3 (for reads)
> = 6 replicas in quorum requests, which is 2 replicas more than 4.
>
> I think it's not necessary to have 2 overlapping nodes in the even
> replication factor case.
>
> I suggest changing the `quorumFor(keyspace)` code to separate the read and
> write cases, so that we can save one replica request in the read path.
>
> Any concerns?
>
> Thanks!
>
>
> --
> Dikang
>
> --

*Justin Cameron* Senior Software Engineer

<https://www.instaclustr.com/>

This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
and Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally
privileged information.  If you are not the intended recipient, do not copy
or disclose its content, but please reply to this email immediately and
highlight the error to the sender and then immediately delete the message.
