Posted to dev@cassandra.apache.org by Benjamin Lerer <be...@datastax.com> on 2020/11/11 16:03:11 UTC

[DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

CASSANDRA-12126 addresses one correctness issue with Lightweight
Transactions (LWTs). Unfortunately, the current patch developed by Sylvain and
Benedict requires an extra round trip between the coordinator and the
replicas for SERIAL and LOCAL_SERIAL reads.
After some experimentation, Benedict discovered that this extra round trip
could lead to a significant increase in timeouts for read-heavy workloads.

Users for whom this behavior is a problem will be able to switch back to
the old behavior using a system property, thereby choosing performance
over correctness.
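
As a rough illustration of the kind of switch described above, here is a minimal Java sketch of a JVM system property that opts back into the old behavior. The property name `example.lwt.legacy_serial_reads` and the class are hypothetical and chosen only for this example; they are not the flag or code from the actual patch.

```java
// Illustrative sketch only: the property name below is hypothetical,
// not the flag introduced by the CASSANDRA-12126 patch.
final class SerialReadMode {
    static final String LEGACY_PROPERTY = "example.lwt.legacy_serial_reads";

    // True when the operator explicitly opted back into the old behavior,
    // e.g. by starting the JVM with -Dexample.lwt.legacy_serial_reads=true.
    static boolean legacyBehaviorRequested() {
        return Boolean.getBoolean(LEGACY_PROPERTY);
    }

    // By default (property unset) the correct behavior is used, which costs
    // an extra coordinator-to-replica round trip on SERIAL/LOCAL_SERIAL reads.
    static boolean requiresExtraRoundTrip() {
        return !legacyBehaviorRequested();
    }

    public static void main(String[] args) {
        System.out.println("extra round trip for SERIAL reads: " + requiresExtraRoundTrip());
    }
}
```

With this shape, correctness is the default and the performance-oriented behavior has to be requested explicitly at startup.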

On the side, Benedict has worked on another approach that does not suffer
from that performance problem and also addresses some LWT correctness
issues that can happen when adding or removing nodes. He initially intended
to deliver that improvement in 4.X but can try to incorporate it into 4.0.

Regarding CASSANDRA-12126 and 4.0, we are facing several options, and
Benedict, Sylvain and I would like the community's feedback on them.

We can:

   1. Try to use Benedict's proposal for 4.0 if the community has the
   appetite for it. The main issue there is some potential extra delay for 4.0.
   2. Do nothing for 4.0, meaning we do not commit the current patch. We have
   lived a long time with that issue and we can probably wait a bit more for a
   proper solution.
   3. Commit the patch as is, fixing the correctness issue but potentially
   introducing some performance issues until we release a better solution.
   4. Change the patch to default to the current behavior while allowing
   people to enable the new one if correctness is a problem for them.

  Thanks in advance for your feedback.

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Joshua McKenzie <jm...@apache.org>.

Got it.

Thanks for the extra context.

No real opinion here. :)

On Wed, Nov 11, 2020 at 11:29 AM Benedict Elliott Smith <be...@apache.org>
wrote:

> It's been there since the beginning.
>
> If we were to consider the alternative proposal for 4.0, it would not have
> to be blocking for release. I had planned to come forward after 4.0,
> primarily because I did not want to create further political complexities
> for the project at this time, but also because I do not presently have the
> time to produce all of the documentation we might like for such a proposal.
> However, the work is ready, has already been reviewed by multiple
> committers, has had more extensive testing than any feature I'm aware of to
> date, and could be made available for 4.0 in fairly short order. While the
> work itself is non-trivial, the work to integrate it is not complex.  It
> would also be optional, and configurable at runtime.
>
> The only likely blocker would be the process of review, and any other due
> diligence the project might want to undertake.  Absolutely not something I
> advocate for or against an accelerated timescale on.  I have no personal
> preference for the approach taken, just providing this for context.
>
>
> On 11/11/2020, 16:18, "Joshua McKenzie" <jm...@apache.org> wrote:
>
>     How old is the C-12126 surfaced defect? i.e. is this a thing we've had
>     since initial introduction of paxos or is it a regression we introduced
>     somewhere along the way?
>
>     On Wed, Nov 11, 2020 at 11:03 AM Benjamin Lerer <
> benjamin.lerer@datastax.com>
>     wrote:
>
>     > CASSANDRA-12126 addresses one correctness issue of Light Weight
>     > Transactions. Unfortunately, the current patch developed by Sylvain
> and
>     > Benedict requires an extra round trip between the coordinator and the
>     > replicas for SERIAL and LOCAL_SERIAL reads.
>     > After some experimentations, Benedict discovered that this extra
> round trip
>     > could lead to a significant increase in timeouts for read-heavy
> workloads.
>     >
>     > Users for which this behavior is a problem will be able to switch
> back to
>     > the old behavior using a system property, therefore choosing
> performance
>     > versus correctness.
>     >
>     > On the side, Benedict has worked on another approach that does not
> suffer
>     > from that performance problem and also addresses some LWT correctness
>     > issues that can happen when adding or removing nodes. He initially
> intended
>     > to deliver that improvement in 4.X but can try to incorporate it
> into 4.0.
>     >
>     > Regarding CASSANDRA-12126 and 4.0 we are facing several options and
>     > Benedict, Sylvain and I wanted to get the community feedback on them.
>     >
>     > We can:
>     >
>     >    1. Try to use Benedict proposal for 4.0 if the community has the
>     >    appetite for it. The main issue there is some potential extra
> delay for
>     > 4.0
>     >    2. Do nothing for 4.0. Meaning do not commit the current patch.
> We have
>     >    lived a long time with that issue and we can probably wait a bit
> more
>     > for a
>     >    proper solution.
>     >    3. Commit the patch as such, fixing the correctness but
> introducing
>     >    potentially some performance issue until we release a better
> solution.
>     >    4. Changing the patch to default to the current behavior but
> allowing
>     >    people to enable the new one if the correctness is a problem for
> them.
>     >
>     >   Thanks in advance for your feedback.
>     >
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

It's been there since the beginning.

If we were to consider the alternative proposal for 4.0, it would not have to be blocking for release. I had planned to come forward after 4.0, primarily because I did not want to create further political complexities for the project at this time, but also because I do not presently have the time to produce all of the documentation we might like for such a proposal. However, the work is ready, has already been reviewed by multiple committers, has had more extensive testing than any feature I'm aware of to date, and could be made available for 4.0 in fairly short order. While the work itself is non-trivial, the work to integrate it is not complex.  It would also be optional, and configurable at runtime.

The only likely blocker would be the process of review, and any other due diligence the project might want to undertake.  I am not advocating for or against an accelerated timescale, and I have no personal preference for the approach taken; I am just providing this for context.

On 11/11/2020, 16:18, "Joshua McKenzie" <jm...@apache.org> wrote:

    How old is the C-12126 surfaced defect? i.e. is this a thing we've had
    since initial introduction of paxos or is it a regression we introduced
    somewhere along the way?

    On Wed, Nov 11, 2020 at 11:03 AM Benjamin Lerer <be...@datastax.com>
    wrote:

    > CASSANDRA-12126 addresses one correctness issue of Light Weight
    > Transactions. Unfortunately, the current patch developed by Sylvain and
    > Benedict requires an extra round trip between the coordinator and the
    > replicas for SERIAL and LOCAL_SERIAL reads.
    > After some experimentations, Benedict discovered that this extra round trip
    > could lead to a significant increase in timeouts for read-heavy workloads.
    >
    > Users for which this behavior is a problem will be able to switch back to
    > the old behavior using a system property, therefore choosing performance
    > versus correctness.
    >
    > On the side, Benedict has worked on another approach that does not suffer
    > from that performance problem and also addresses some LWT correctness
    > issues that can happen when adding or removing nodes. He initially intended
    > to deliver that improvement in 4.X but can try to incorporate it into 4.0.
    >
    > Regarding CASSANDRA-12126 and 4.0 we are facing several options and
    > Benedict, Sylvain and I wanted to get the community feedback on them.
    >
    > We can:
    >
    >    1. Try to use Benedict proposal for 4.0 if the community has the
    >    appetite for it. The main issue there is some potential extra delay for
    > 4.0
    >    2. Do nothing for 4.0. Meaning do not commit the current patch. We have
    >    lived a long time with that issue and we can probably wait a bit more
    > for a
    >    proper solution.
    >    3. Commit the patch as such, fixing the correctness but introducing
    >    potentially some performance issue until we release a better solution.
    >    4. Changing the patch to default to the current behavior but allowing
    >    people to enable the new one if the correctness is a problem for them.
    >
    >   Thanks in advance for your feedback.
    >

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Joshua McKenzie <jm...@apache.org>.

How old is the C-12126 surfaced defect? i.e. is this a thing we've had
since initial introduction of paxos or is it a regression we introduced
somewhere along the way?

On Wed, Nov 11, 2020 at 11:03 AM Benjamin Lerer <be...@datastax.com>
wrote:

> CASSANDRA-12126 addresses one correctness issue of Light Weight
> Transactions. Unfortunately, the current patch developed by Sylvain and
> Benedict requires an extra round trip between the coordinator and the
> replicas for SERIAL and LOCAL_SERIAL reads.
> After some experimentations, Benedict discovered that this extra round trip
> could lead to a significant increase in timeouts for read-heavy workloads.
>
> Users for which this behavior is a problem will be able to switch back to
> the old behavior using a system property, therefore choosing performance
> versus correctness.
>
> On the side, Benedict has worked on another approach that does not suffer
> from that performance problem and also addresses some LWT correctness
> issues that can happen when adding or removing nodes. He initially intended
> to deliver that improvement in 4.X but can try to incorporate it into 4.0.
>
> Regarding CASSANDRA-12126 and 4.0 we are facing several options and
> Benedict, Sylvain and I wanted to get the community feedback on them.
>
> We can:
>
>    1. Try to use Benedict proposal for 4.0 if the community has the
>    appetite for it. The main issue there is some potential extra delay for
> 4.0
>    2. Do nothing for 4.0. Meaning do not commit the current patch. We have
>    lived a long time with that issue and we can probably wait a bit more
> for a
>    proper solution.
>    3. Commit the patch as such, fixing the correctness but introducing
>    potentially some performance issue until we release a better solution.
>    4. Changing the patch to default to the current behavior but allowing
>    people to enable the new one if the correctness is a problem for them.
>
>   Thanks in advance for your feedback.
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Paulo Motta <pa...@gmail.com>.

Fair points. I retract the yaml suggestion and +1 to go with the
correctness route.

On Tue, Nov 24, 2020 at 11:13, Benjamin Lerer <
benjamin.lerer@datastax.com> wrote:

> Paulo, what you propose with the yaml seems different from default to
> *correctness*. It means to me that we are forcing the user to choose
> between *correctness *and *performance*. Most of us have a good
> understanding of the problem and it is a hard choice for us. I imagine that
> most of the users do not fully understand LWTs and will not know what to
> choose. Some might not even use LWTs and will suddenly be forced to make a
> choice that they do not understand. It does not feel right to me to push
> them to make that choice.
>
> I also agree with Benedict and Mick that it is a risky thing to do.
>
> something that can bring a cluster down upon an unprepared user.
>
>
> I do not think that it will be the case (feel free to correct me Benedict).
> The impact will probably be an increase in the number of write/read
> timeouts for the LWTs read/writes. For a heavy load that would cause the
> services depending on those queries to become unreliable. On the other hand
> the impact of the current problem is that we can hit some correctness issue
> without even knowing it.
>
> We need to choose between two imperfect solutions and we have some
> difficulties to agree on which one to choose.
>
> Benedict suggested that Sylvain and I made the choice. Sylvain did not want
> to make the final call.
> I chose correctness. If it is a problem and people prefer to vote. It is
> perfectly fine for me too :-)
>
> I just want us to move forward.
>
>
>
> On Tue, Nov 24, 2020 at 12:52 PM Mick Semb Wever <mc...@apache.org> wrote:
>
> > > I think the keyword there is "normally" - if we can't say _certainly_,
> > > then this is probably an unsafe change to make.
> > >
> > > I can imagine any number of hacky upgrade processes that would be
> > > dangerous with this change.
> > >
> >
> >
> > I agree. We just don't know what users are doing, this is risky.
> >
> > IMO the same applies to a performance degradation, i.e. something that
> can
> > bring a cluster down upon an unprepared user. Despite our best efforts
> with
> > NEWS.txt we should still look after such users. IMHO the imperfection of
> > LWTs on past branches we have to carry. I'm well aware this is easier
> said
> > than done, even for far simpler changes. Having the flag there to switch
> to
> > "correct LWT" is still a huge win for users.
> >
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benjamin Lerer <be...@datastax.com>.

Thank you Sylvain and Benedict for the patch, and thank you to everybody
who took the time to contribute to this discussion :-)



On Fri, Nov 27, 2020 at 5:15 PM Sylvain Lebresne <le...@gmail.com> wrote:

> I hope I haven't misread this, but it appears we've reached a kind of
> consensus for committing the fix, so I went ahead and did it.
> I added a NEWS entry that I hope is clear (and points to the flag that
> disables the fix if someone wants to go that route), but any committers can
> feel free to ninja-nitpick that NEWS entry if they so wish.
>
> Many thanks to Benjamin for driving the discussion here.
> --
> Sylvain
>
>
> On Tue, Nov 24, 2020 at 3:43 PM Ekaterina Dimitrova <e.dimitrova@gmail.com
> >
> wrote:
>
> > I am +1 on Benjamin’s proposal
> > and less interruptions during upgrades. For more visibility maybe we can
> > also write a short article about the options and the tradeoffs, further
> to
> > NEWS.txt (that’s not something to decide now, of course :-) )
> >
> >
> > On Tue, 24 Nov 2020 at 9:13, Benjamin Lerer <benjamin.lerer@datastax.com
> >
> > wrote:
> >
> > > Paulo, what you propose with the yaml seems different from default to
> > > *correctness*. It means to me that we are forcing the user to choose
> > > between *correctness *and *performance*. Most of us have a good
> > > understanding of the problem and it is a hard choice for us. I imagine
> > that
> > > most of the users do not fully understand LWTs and will not know what
> to
> > > choose. Some might not even use LWTs and will suddenly be forced to
> make
> > a
> > > choice that they do not understand. It does not feel right to me to
> push
> > > them to make that choice.
> > >
> > > I also agree with Benedict and Mick that it is a risky thing to do.
> > >
> > > something that can bring a cluster down upon an unprepared user.
> > >
> > >
> > > I do not think that it will be the case (feel free to correct me
> > Benedict).
> > > The impact will probably be an increase in the number of write/read
> > > timeouts for the LWTs read/writes. For a heavy load that would cause
> the
> > > services depending on those queries to become unreliable. On the other
> > hand
> > > the impact of the current problem is that we can hit some correctness
> > issue
> > > without even knowing it.
> > >
> > > We need to choose between two imperfect solutions and we have some
> > > difficulties to agree on which one to choose.
> > >
> > > Benedict suggested that Sylvain and I made the choice. Sylvain did not
> > want
> > > to make the final call.
> > > I chose correctness. If it is a problem and people prefer to vote. It
> is
> > > perfectly fine for me too :-)
> > >
> > > I just want us to move forward.
> > >
> > >
> > >
> > > On Tue, Nov 24, 2020 at 12:52 PM Mick Semb Wever <mc...@apache.org>
> wrote:
> > >
> > > > > I think the keyword there is "normally" - if we can't say
> > _certainly_,
> > > > > then this is probably an unsafe change to make.
> > > > >
> > > > > I can imagine any number of hacky upgrade processes that would be
> > > > > dangerous with this change.
> > > > >
> > > >
> > > >
> > > > I agree. We just don't know what users are doing, this is risky.
> > > >
> > > > IMO the same applies to a performance degradation, i.e. something
> that
> > > can
> > > > bring a cluster down upon an unprepared user. Despite our best
> efforts
> > > with
> > > > NEWS.txt we should still look after such users. IMHO the imperfection
> > of
> > > > LWTs on past branches we have to carry. I'm well aware this is easier
> > > said
> > > > than done, even for far simpler changes. Having the flag there to
> > switch
> > > to
> > > > "correct LWT" is still a huge win for users.
> > > >
> > >
> >
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Sylvain Lebresne <le...@gmail.com>.

I hope I haven't misread this, but it appears we've reached a kind of
consensus for committing the fix, so I went ahead and did it.
I added a NEWS entry that I hope is clear (and points to the flag that
disables the fix if someone wants to go that route), but any committers can
feel free to ninja-nitpick that NEWS entry if they so wish.

Many thanks to Benjamin for driving the discussion here.
--
Sylvain


On Tue, Nov 24, 2020 at 3:43 PM Ekaterina Dimitrova <e....@gmail.com>
wrote:

> I am +1 on Benjamin’s proposal
> and less interruptions during upgrades. For more visibility maybe we can
> also write a short article about the options and the tradeoffs, further to
> NEWS.txt (that’s not something to decide now, of course :-) )
>
>
> On Tue, 24 Nov 2020 at 9:13, Benjamin Lerer <be...@datastax.com>
> wrote:
>
> > Paulo, what you propose with the yaml seems different from default to
> > *correctness*. It means to me that we are forcing the user to choose
> > between *correctness *and *performance*. Most of us have a good
> > understanding of the problem and it is a hard choice for us. I imagine
> that
> > most of the users do not fully understand LWTs and will not know what to
> > choose. Some might not even use LWTs and will suddenly be forced to make
> a
> > choice that they do not understand. It does not feel right to me to push
> > them to make that choice.
> >
> > I also agree with Benedict and Mick that it is a risky thing to do.
> >
> > something that can bring a cluster down upon an unprepared user.
> >
> >
> > I do not think that it will be the case (feel free to correct me
> Benedict).
> > The impact will probably be an increase in the number of write/read
> > timeouts for the LWTs read/writes. For a heavy load that would cause the
> > services depending on those queries to become unreliable. On the other
> hand
> > the impact of the current problem is that we can hit some correctness
> issue
> > without even knowing it.
> >
> > We need to choose between two imperfect solutions and we have some
> > difficulties to agree on which one to choose.
> >
> > Benedict suggested that Sylvain and I made the choice. Sylvain did not
> want
> > to make the final call.
> > I chose correctness. If it is a problem and people prefer to vote. It is
> > perfectly fine for me too :-)
> >
> > I just want us to move forward.
> >
> >
> >
> > On Tue, Nov 24, 2020 at 12:52 PM Mick Semb Wever <mc...@apache.org> wrote:
> >
> > > > I think the keyword there is "normally" - if we can't say
> _certainly_,
> > > > then this is probably an unsafe change to make.
> > > >
> > > > I can imagine any number of hacky upgrade processes that would be
> > > > dangerous with this change.
> > > >
> > >
> > >
> > > I agree. We just don't know what users are doing, this is risky.
> > >
> > > IMO the same applies to a performance degradation, i.e. something that
> > can
> > > bring a cluster down upon an unprepared user. Despite our best efforts
> > with
> > > NEWS.txt we should still look after such users. IMHO the imperfection
> of
> > > LWTs on past branches we have to carry. I'm well aware this is easier
> > said
> > > than done, even for far simpler changes. Having the flag there to
> switch
> > to
> > > "correct LWT" is still a huge win for users.
> > >
> >
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Ekaterina Dimitrova <e....@gmail.com>.

I am +1 on Benjamin’s proposal
and fewer interruptions during upgrades. For more visibility maybe we can
also write a short article about the options and the tradeoffs, further to
NEWS.txt (that’s not something to decide now, of course :-) )


On Tue, 24 Nov 2020 at 9:13, Benjamin Lerer <be...@datastax.com>
wrote:

> Paulo, what you propose with the yaml seems different from default to
> *correctness*. It means to me that we are forcing the user to choose
> between *correctness *and *performance*. Most of us have a good
> understanding of the problem and it is a hard choice for us. I imagine that
> most of the users do not fully understand LWTs and will not know what to
> choose. Some might not even use LWTs and will suddenly be forced to make a
> choice that they do not understand. It does not feel right to me to push
> them to make that choice.
>
> I also agree with Benedict and Mick that it is a risky thing to do.
>
> something that can bring a cluster down upon an unprepared user.
>
>
> I do not think that it will be the case (feel free to correct me Benedict).
> The impact will probably be an increase in the number of write/read
> timeouts for the LWTs read/writes. For a heavy load that would cause the
> services depending on those queries to become unreliable. On the other hand
> the impact of the current problem is that we can hit some correctness issue
> without even knowing it.
>
> We need to choose between two imperfect solutions and we have some
> difficulties to agree on which one to choose.
>
> Benedict suggested that Sylvain and I made the choice. Sylvain did not want
> to make the final call.
> I chose correctness. If it is a problem and people prefer to vote. It is
> perfectly fine for me too :-)
>
> I just want us to move forward.
>
>
>
> On Tue, Nov 24, 2020 at 12:52 PM Mick Semb Wever <mc...@apache.org> wrote:
>
> > > I think the keyword there is "normally" - if we can't say _certainly_,
> > > then this is probably an unsafe change to make.
> > >
> > > I can imagine any number of hacky upgrade processes that would be
> > > dangerous with this change.
> > >
> >
> >
> > I agree. We just don't know what users are doing, this is risky.
> >
> > IMO the same applies to a performance degradation, i.e. something that
> can
> > bring a cluster down upon an unprepared user. Despite our best efforts
> with
> > NEWS.txt we should still look after such users. IMHO the imperfection of
> > LWTs on past branches we have to carry. I'm well aware this is easier
> said
> > than done, even for far simpler changes. Having the flag there to switch
> to
> > "correct LWT" is still a huge win for users.
> >
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Michael Semb Wever <mc...@apache.org>.

> Benedict suggested that Sylvain and I made the choice. Sylvain did not want
> to make the final call.
> I chose correctness. If it is a problem and people prefer to vote. It is
> perfectly fine for me too :-)


+1
Appreciate it having been raised for exposure and discussion Benjamin, and happy to leave the final say to those carrying the work on, especially in this case :-)

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benjamin Lerer <be...@datastax.com>.

Paulo, what you propose with the yaml seems different from defaulting to
*correctness*. It means to me that we are forcing the user to choose
between *correctness* and *performance*. Most of us have a good
understanding of the problem and it is a hard choice for us. I imagine that
most of the users do not fully understand LWTs and will not know what to
choose. Some might not even use LWTs and will suddenly be forced to make a
choice that they do not understand. It does not feel right to me to push
them to make that choice.

I also agree with Benedict and Mick that it is a risky thing to do.

> something that can bring a cluster down upon an unprepared user.

I do not think that it will be the case (feel free to correct me Benedict).
The impact will probably be an increase in the number of write/read
timeouts for the LWTs read/writes. For a heavy load that would cause the
services depending on those queries to become unreliable. On the other hand
the impact of the current problem is that we can hit some correctness issue
without even knowing it.

We need to choose between two imperfect solutions, and we are having
difficulty agreeing on which one to choose.

Benedict suggested that Sylvain and I make the choice. Sylvain did not want
to make the final call.
I chose correctness. If that is a problem and people prefer to vote, that is
perfectly fine with me too :-)

I just want us to move forward.

On Tue, Nov 24, 2020 at 12:52 PM Mick Semb Wever <mc...@apache.org> wrote:

> > I think the keyword there is "normally" - if we can't say _certainly_,
> > then this is probably an unsafe change to make.
> >
> > I can imagine any number of hacky upgrade processes that would be
> > dangerous with this change.
> >
>
>
> I agree. We just don't know what users are doing, this is risky.
>
> IMO the same applies to a performance degradation, i.e. something that can
> bring a cluster down upon an unprepared user. Despite our best efforts with
> NEWS.txt we should still look after such users. IMHO the imperfection of
> LWTs on past branches we have to carry. I'm well aware this is easier said
> than done, even for far simpler changes. Having the flag there to switch to
> "correct LWT" is still a huge win for users.
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Mick Semb Wever <mc...@apache.org>.

> I think the keyword there is "normally" - if we can't say _certainly_,
> then this is probably an unsafe change to make.
>
> I can imagine any number of hacky upgrade processes that would be
> dangerous with this change.
>


I agree. We just don't know what users are doing, this is risky.

IMO the same applies to a performance degradation, i.e. something that can
bring a cluster down upon an unprepared user. Despite our best efforts with
NEWS.txt we should still look after such users. IMHO the imperfection of
LWTs on past branches we have to carry. I'm well aware this is easier said
than done, even for far simpler changes. Having the flag there to switch to
"correct LWT" is still a huge win for users.

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

I think the keyword there is "normally" - if we can't say _certainly_, then this is probably an unsafe change to make.

I can imagine any number of hacky upgrade processes that would be dangerous with this change.

But, happy to defer to the consensus of others.
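
For readers following the quoted exchange below, the mandatory-property idea under debate could be sketched roughly as follows in Java. This is only an illustration under stated assumptions: the key name `lwt_legacy_mode` comes from the proposal quoted in this thread, while the class, the method, and the use of a plain `Map` standing in for a parsed cassandra.yaml are hypothetical and not taken from any patch.

```java
import java.util.Map;

// Illustrative sketch only: models a required setting with no implicit default.
// The class and method are hypothetical; only the key name comes from the thread.
final class LwtConfigCheck {
    static final String KEY = "lwt_legacy_mode";

    // Returns the operator's explicit choice, or refuses to proceed if the
    // property is absent (the "node does not start" behavior being debated).
    static boolean resolveLegacyMode(Map<String, Object> parsedYaml) {
        Object value = parsedYaml.get(KEY);
        if (value == null) {
            throw new IllegalStateException(
                "Missing required property '" + KEY + "': set it to false for the "
                + "correct LWT behavior or to true for the previous behavior");
        }
        if (!(value instanceof Boolean)) {
            throw new IllegalStateException("Property '" + KEY + "' must be true or false");
        }
        return (Boolean) value;
    }
}
```

The sketch makes the trade-off in the discussion concrete: an old configuration without the key fails fast at startup instead of silently picking up new behavior, which is exactly the property that one side sees as raising awareness and the other as an outage risk.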



On 24/11/2020, 11:04, "Paulo Motta" <pa...@gmail.com> wrote:

     In this case the breaking change is a feature, not a bug. The exact
    intention of this is to require manual intervention to raise awareness
    about the potential performance degradation. This sounds reasonable, once
    we already broke the contract of not introducing performance regressions in
    a minor.

    I don't see how this can pose an outage risk to the cluster given upgrades
    are normally performed in a rolling restart fashion, so the worst that
    could happen is the first node in the sequence not starting, so the upgrade
    would not proceed. In my view this would be far less harmful than figuring
    out about a performance regression after all your nodes are upgraded.

    Nevertheless, I'm pretty fine on retracting the suggestion to move forward
    with the proposal if you feel strongly about it.

    On Tue, Nov 24, 2020 at 07:26, Benedict Elliott Smith <
    benedict@apache.org> wrote:

    > In my parlance the config property would be a breaking change, whereas the
    > LWT behaviour would be a performance regression.  This latter might cause
    > partial outages or service degradation, but refusing to start a prod
    > cluster without manual intervention is potentially a much worse situation,
    > and even more surprising for a patch upgrade.
    >
    > On 24/11/2020, 01:05, "Paulo Motta" <pa...@gmail.com> wrote:
    >
    >     Isn't the plan to change LWT implementation (and performance
    > expectation)
    >     in a patch version? This is a breaking change by itself, I'm just
    > proposing
    >     to make the trade-off choice explicit in the yaml to prevent unexpected
    >     performance degradation during upgrade (for users who are not aware of
    > the
    >     change).
    >
    >     Just to make it clear, I'm proposing having a "lwt_legacy_mode: false"
    >     uncommented in the default yaml with a descriptive comment about
    >     CASSANDRA-12126, so new users will always get the new behavior, but
    > users
    >     using a yaml template based on a previous 3.X version will not be able
    > to
    >     start the node because this property will be missing. I believe the
    >     majority of operators will just update their yaml with
    > "lwt_legacy_mode:
    >     false" and move on with their upgrades, but people wanting to keep the
    >     previous performance will become aware of the breaking change and set
    > it to
    >     true.
    >
    >     On Mon, Nov 23, 2020 at 21:07, Benedict Elliott Smith <
    >     benedict@apache.org> wrote:
    >
    >     > What do you mean by minor upgrade? We can't break patch upgrades for
    > any
    >     > of 3.x, as this could also cause surprise outages.
    >     >
    >     > On 23/11/2020, 23:51, "Paulo Motta" <pa...@gmail.com>
    > wrote:
    >     >
    >     >      I was thinking about the YAML requirement during the 3.X minor
    >     > upgrade to
    >     >     make the decision explicit (need to update yaml) rather than
    > implicit
    >     > (by
    >     >     upgrading you agree with the change), since the latter can go
    >     > unnoticed by
    >     >     those who don't pay attention to NEWS.txt
    >     >
    >     >     On Mon, Nov 23, 2020 at 20:03, Benedict Elliott Smith <
    >     >     benedict@apache.org> wrote:
    >     >
    >     >     > What's the value of the yaml? The user is likely to have
    > upgraded to
    >     >     > latest 3.x as part of the upgrade process to 4.0, so they'll
    > already
    >     > have
    >     >     > had a decision made for them. If correctness didn't break
    > anything,
    >     > there
    >     >     > doesn't any longer seem much point in offering a choice?
    >     >     >
    >     >     > On 23/11/2020, 22:45, "Brandon Williams" <dr...@gmail.com>
    > wrote:
    >     >     >
    >     >     >     +1 to both as well.
    >     >     >
    >     >     >     On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston
    >     >     > <be...@apple.com.invalid>
    >     >     >     wrote:
    >     >     >
    >     >     >     > +1 to correctness, and I like the yaml idea
    >     >     >     >
    >     >     >     > > On Nov 23, 2020, at 4:20 AM, Paulo Motta <
    >     > pauloricardomg@gmail.com
    >     >     > >
    >     >     >     > wrote:
    >     >     >     > >
    >     >     >     > > +1 to defaulting for correctness.
    >     >     >     > >
    >     >     >     > > In addition to that, how about making it a mandatory
    >     > cassandra.yaml
    >     >     >     > > property defaulting to correctness? This would make
    > upgrades
    >     > with
    >     >     > an old
    >     >     >     > > cassandra.yaml fail unless an option is explicitly
    > specified,
    >     >     > making
    >     >     >     > > operators aware of the issue and forcing them to make a
    >     > choice.
    >     >     >     > >
    >     >     >     > >> On Mon, Nov 23, 2020 at 07:30, Benjamin Lerer <
    >     >     >     > >> benjamin.lerer@datastax.com> wrote:
    >     >     >     > >>
    >     >     >     > >> Thank you very much to everybody that provided
    > feedback. It
    >     >     > helped a
    >     >     >     > lot to
    >     >     >     > >> limit our options.
    >     >     >     > >>
    >     >     >     > >> Unfortunately, it seems that some poor soul (me,
    > really!!!)
    >     > will
    >     >     > have to
    >     >     >     > >> make the final call between #3 and #4.
    >     >     >     > >>
    >     >     >     > >> If I reformulate the question to: Do we default to
    >     > *correctness
    >     >     > *or to
    >     >     >     > >> *performance*?
    >     >     >     > >>
    >     >     >     > >> I would choose to default to *correctness*.
    >     >     >     > >>
    >     >     >     > >> Of course the situation is more complex than that but
    > it
    >     > seems
    >     >     > that
    >     >     >     > >> somebody has to make a call and live with it. It
    > seems to
    >     > me that
    >     >     > being
    >     >     >     > >> blamed for choosing correctness is easier to live
    > with ;-)
    >     >     >     > >>
    >     >     >     > >> Benjamin
    >     >     >     > >>
    >     >     >     > >> PS: I tried to push the choice on Sylvain but he
    > dodged the
    >     >     > bullet.
    >     >     >     > >>
    >     >     >     > >> On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott
    > Smith <
    >     >     >     > >> benedict@apache.org>
    >     >     >     > >> wrote:
    >     >     >     > >>
    >     >     >     > >>> I think I meant #4
    >     >     >     > >>>
    >     >     >     > >>> On 20/11/2020, 21:11, "Blake Eggleston"
    >     >     > <beggleston@apple.com.INVALID
    >     >     >     > >
    >     >     >     > >>> wrote:
    >     >     >     > >>>
    >     >     >     > >>>    I’d also prefer #3 over #4
    >     >     >     > >>>
    >     >     >     > >>>> On Nov 20, 2020, at 10:03 AM, Benedict Elliott
    > Smith <
    >     >     >     > >>> benedict@apache.org> wrote:
    >     >     >     > >>>>
    >     >     >     > >>>> Well, I expressed a preference for #3 over #4,
    >     > particularly for
    >     >     >     > >> the
    >     >     >     > >>> 3.x series.  However at this point, I think the lack
    > of a
    >     > clear
    >     >     > project
    >     >     >     > >>> decision means we can punt it back to you and
    > Sylvain to
    >     > make
    >     >     > the final
    >     >     >     > >>> call.
    >     >     >     > >>>>
    >     >     >     > >>>> On 20/11/2020, 16:23, "Benjamin Lerer" <
    >     >     >     > >> benjamin.lerer@datastax.com>
    >     >     >     > >>> wrote:
    >     >     >     > >>>>
    >     >     >     > >>>>   I will try to summarize the discussion to clarify
    > the
    >     > outcome.
    >     >     >     > >>>>
    >     >     >     > >>>>   Mick is in favor of #4
    >     >     >     > >>>>   Summanth is in favor of #4
    >     >     >     > >>>>   Sylvain answer was not clear for me. I understood
    > it
    >     > like I
    >     >     >     > >>> prefer #3 to #4
    >     >     >     > >>>>   and I am also fine with #1
    >     >     >     > >>>>   Jeff is in favor of #3 and will understand #4
    >     >     >     > >>>>   David is in favor #3 (fix bug and add flag to
    > roll back
    >     > to old
    >     >     >     > >>> behavior) in
    >     >     >     > >>>>   4.0 and #4 in 3.0 and 3.11
    >     >     >     > >>>>
    >     >     >     > >>>>   Do not hesitate to correct me if I misunderstood
    > your
    >     > answer.
    >     >     >     > >>>>
    >     >     >     > >>>>   Based on these answers it seems clear that most
    > people
    >     > prefer
    >     >     > to
    >     >     >     > >>> go for #3
    >     >     >     > >>>>   or #4.
    >     >     >     > >>>>
    >     >     >     > >>>>   The choice between #3 (fix correctness opt-in to
    > current
    >     >     >     > >>> behavior) and #4
    >     >     >     > >>>>   (current behavior opt-in to correctness) is a bit
    > less
    >     > clear
    >     >     >     > >>> specially if
    >     >     >     > >>>>   we consider the 3.X branches or 4.0.
    >     >     >     > >>>>
    >     >     >     > >>>>   Does anybody have some idea on how to choose between
    >     > those 2
    >     >     >     > >>> choices or some
    >     >     >     > >>>>   extra opinions on #3 versus #4?
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>>>>   On Wed, Nov 18, 2020 at 9:45 PM David Capwell <
    >     >     >     > >>> dcapwell@gmail.com> wrote:
    >     >     >     > >>>>>
    >     >     >     > >>>>> I feel that #4 (fix bug and add flag to roll back
    > to old
    >     >     > behavior)
    >     >     >     > >>> is best.
    >     >     >     > >>>>>
    >     >     >     > >>>>> About the alternative implementation, I am fine
    > adding
    >     > it to
    >     >     > 3.x
    >     >     >     > >>> and 4.0,
    >     >     >     > >>>>> but should treat it as a different path disabled by
    >     > default
    >     >     > that
    >     >     >     > >>> you can
    >     >     >     > >>>>> opt-into, with a plan to opt-in by default
    > "eventually".
    >     >     >     > >>>>>
    >     >     >     > >>>>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott
    > Smith <
    >     >     >     > >>>>> benedict@apache.org>
    >     >     >     > >>>>> wrote:
    >     >     >     > >>>>>
    >     >     >     > >>>>>> Perhaps there might be broader appetite to weigh
    > in on
    >     > which
    >     >     >     > >> major
    >     >     >     > >>>>>> releases we might target for work that fixes the
    >     > correctness
    >     >     > bug
    >     >     >     > >>> without
    >     >     >     > >>>>>> serious performance regression?
    >     >     >     > >>>>>>
    >     >     >     > >>>>>> i.e., if we were to fix the correctness bug now,
    >     > introducing a
    >     >     >     > >>> serious
    >     >     >     > >>>>>> performance regression (either opt-in or
    > opt-out), but
    >     > were to
    >     >     >     > >>> land work
    >     >     >     > >>>>>> without this problem for 5.0, would there be
    > appetite to
    >     >     > backport
    >     >     >     > >>> this
    >     >     >     > >>>>> work
    >     >     >     > >>>>>> to any of 4.0, 3.11 or 3.0?
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>>>> On 18/11/2020, 18:31, "Jeff Jirsa" <
    > jjirsa@gmail.com>
    >     > wrote:
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>   This is complicated and relatively few people
    > on earth
    >     >     >     > >>> understand it,
    >     >     >     > >>>>>> so
    >     >     >     > >>>>>>   having little feedback is mostly expected,
    >     > unfortunately.
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>   My normal emotional response is "correctness is
    >     > required,
    >     >     >     > >>> opt-in to
    >     >     >     > >>>>>>   performance improvements that sacrifice strict
    >     > correctness",
    >     >     >     > >>> but I'm
    >     >     >     > >>>>>> also
    >     >     >     > >>>>>>   sure this is going to surprise people, and would
    >     > understand
    >     >     > /
    >     >     >     > >>> accept
    >     >     >     > >>>>> #4
    >     >     >     > >>>>>>   (default to current, opt-in to correct).
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>   On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott
    >     > Smith <
    >     >     >     > >>>>>> benedict@apache.org>
    >     >     >     > >>>>>>   wrote:
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>> It doesn't seem like there's much enthusiasm for
    > any
    >     > of the
    >     >     >     > >>> options
    >     >     >     > >>>>>>> available here...
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
    >     >     >     > >>>>> benedict@apache.org
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>> wrote:
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>> Is the new implementation a separate, distinctly
    >     > modularized
    >     >     >     > >>>>>> new
    >     >     >     > >>>>>>> body of work
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>   It’s primarily a distinct, modularised and new
    > body
    >     > of
    >     >     > work,
    >     >     >     > >>>>>> however
    >     >     >     > >>>>>>> there is some shared code that has been modified
    > -
    >     > namely
    >     >     >     > >>>>>> PaxosState, in
    >     >     >     > >>>>>>> which legacy code is maintained but modified for
    >     >     > compatibility,
    >     >     >     > >>> and
    >     >     >     > >>>>>> the
    >     >     >     > >>>>>>> system.paxos table (which receives a new column,
    > and
    >     > slightly
    >     >     >     > >>>>>> modified
    >     >     >     > >>>>>>> serialization code).  It is conceptually an
    > optimised
    >     >     > version of
    >     >     >     > >>>>> the
    >     >     >     > >>>>>>> existing algorithm.
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>   If there's a chance of being of value to 4.0,
    > I can
    >     > try to
    >     >     >     > >> put
    >     >     >     > >>>>>> up a
    >     >     >     > >>>>>>> patch next week alongside a high level
    > description of
    >     > the
    >     >     >     > >> changes.
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>> But a performance regression is a regression,
    > I'm not
    >     >     >     > >>>>>> shrugging it
    >     >     >     > >>>>>>> off.
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>   I don't want to give the impression I'm
    > shrugging
    >     > off the
    >     >     >     > >>>>>> correctness
    >     >     >     > >>>>>>> issue either. It's a serious issue to fix, but
    > since
    >     > all
    >     >     >     > >>> successful
    >     >     >     > >>>>>> updates
    >     >     >     > >>>>>>> to the database are linearizable, I think it's
    > likely
    >     > that
    >     >     > many
    >     >     >     > >>>>>>> applications behave correctly with the present
    >     > semantics, or
    >     >     > at
    >     >     >     > >>>>> least
    >     >     >     > >>>>>>> encounter only transient errors. No doubt many
    > also do
    >     > not,
    >     >     > but
    >     >     >     > >> I
    >     >     >     > >>>>>> have no
    >     >     >     > >>>>>>> idea of the ratio.
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>   The regression isn't itself a simple issue
    > either -
    >     >     > depending
    >     >     >     > >>>>> on
    >     >     >     > >>>>>> the
    >     >     >     > >>>>>>> topology and message latencies it is not
    > difficult to
    >     > produce
    >     >     >     > >>>>>> inescapable
    >     >     >     > >>>>>>> contention, i.e. guaranteed timeouts - that might
    >     > persist as
    >     >     >     > >> long
    >     >     >     > >>>>> as
    >     >     >     > >>>>>>> clients continue to retry. It could be quite a
    > serious
    >     >     >     > >> degradation
    >     >     >     > >>>>> of
    >     >     >     > >>>>>>> service to impose on our users.
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>   I don't pretend to know the correct way to
    > make a
    >     > decision
    >     >     >     > >>>>>> balancing
    >     >     >     > >>>>>>> these considerations, but I am perhaps more
    > concerned
    >     > about
    >     >     >     > >>>>> imposing
    >     >     >     > >>>>>>> service outages than I am temporarily maintaining
    >     > semantics
    >     >     > our
    >     >     >     > >>>>>> users have
    >     >     >     > >>>>>>> apparently accepted for years - though I
    > absolutely
    >     > share
    >     >     > your
    >     >     >     > >>>>>>> embarrassment there.
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>   On 12/11/2020, 12:41, "Joshua McKenzie" <
    >     >     >     > >> jmckenzie@apache.org
    >     >     >     > >>>>>>
    >     >     >     > >>>>>> wrote:
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>       Is the new implementation a separate,
    > distinctly
    >     >     >     > >>>>> modularized
    >     >     >     > >>>>>> new
    >     >     >     > >>>>>>> body of
    >     >     >     > >>>>>>>       work or does it make substantial changes to
    >     > existing
    >     >     >     > >>>>>>> implementation and
    >     >     >     > >>>>>>>       subsume it?
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>       On Thu, Nov 12, 2020 at 3:56 AM Sylvain
    > Lebresne
    >     > <
    >     >     >     > >>>>>>> lebresne@gmail.com> wrote:
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>> Regarding option #4, I'll remark that experience
    >     > tends to
    >     >     >     > >>>>>>> suggest users
    >     >     >     > >>>>>>>> don't consistently read the `NEWS.txt` file on
    >     > upgrade,
    >     >     >     > >>>>> so
    >     >     >     > >>>>>>> option #4 will
    >     >     >     > >>>>>>>> likely essentially mean "LWT has a correctness
    > issue,
    >     > but
    >     >     >     > >>>>>> once
    >     >     >     > >>>>>>> it broke
    >     >     >     > >>>>>>>> your data enough that you'll notice, you'll be
    > able to
    >     >     >     > >>>>> dig
    >     >     >     > >>>>>> the
    >     >     >     > >>>>>>> proper flag
    >     >     >     > >>>>>>>> to fix it for next time". I guess it's better
    > than
    >     >     >     > >>>>>> nothing, of
    >     >     >     > >>>>>>> course, but
    >     >     >     > >>>>>>>> I'll admit that defaulting to "opt-in
    > correctness",
    >     >     >     > >>>>>> especially
    >     >     >     > >>>>>>> for a
    >     >     >     > >>>>>>>> feature (LWT) that exists uniquely to provide
    >     > additional
    >     >     >     > >>>>>>> guarantees, is
    >     >     >     > >>>>>>>> something I have a hard time rallying behind.
    >     >     >     > >>>>>>>>
    >     >     >     > >>>>>>>> But a performance regression is a regression,
    > I'm not
    >     >     >     > >>>>>> shrugging
    >     >     >     > >>>>>>> it off.
    >     >     >     > >>>>>>>> Still, I feel we shouldn't leave LWT with a
    > fairly
    >     >     >     > >>>>> serious
    >     >     >     > >>>>>> known
    >     >     >     > >>>>>>>> correctness bug and I frankly feel bad for "the
    >     > project"
    >     >     >     > >>>>>> that
    >     >     >     > >>>>>>> this has been
    >     >     >     > >>>>>>>> known for so long without action, so I'm a bit
    > biased
    >     > in
    >     >     >     > >>>>>> wanting
    >     >     >     > >>>>>>> to get it
    >     >     >     > >>>>>>>> fixed asap.
    >     >     >     > >>>>>>>>
    >     >     >     > >>>>>>>> But maybe I'm overstating the urgency here, and
    > maybe
    >     >     >     > >>>>>> option #1
    >     >     >     > >>>>>>> is a better
    >     >     >     > >>>>>>>> way forward.
    >     >     >     > >>>>>>>>
    >     >     >     > >>>>>>>> --
    >     >     >     > >>>>>>>> Sylvain
    >     >     >     > >>>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>
    >     >     >
    > ---------------------------------------------------------------------
    >     >     >     > >>>>>>>   To unsubscribe, e-mail:
    >     >     > dev-unsubscribe@cassandra.apache.org
    >     >     >     > >>>>>>>   For additional commands, e-mail:
    >     >     >     > >> dev-help@cassandra.apache.org
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>
    >     >     >     > >>>
    >     >     >
    > ---------------------------------------------------------------------
    >     >     >     > >>>>>>> To unsubscribe, e-mail:
    >     > dev-unsubscribe@cassandra.apache.org
    >     >     >     > >>>>>>> For additional commands, e-mail:
    >     >     > dev-help@cassandra.apache.org
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>
    >     >     >
    > ---------------------------------------------------------------------
    >     >     >     > >>>>>> To unsubscribe, e-mail:
    >     > dev-unsubscribe@cassandra.apache.org
    >     >     >     > >>>>>> For additional commands, e-mail:
    >     >     > dev-help@cassandra.apache.org
    >     >     >     > >>>>>>
    >     >     >     > >>>>>>
    >     >     >     > >>>>>
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>>>
    >     >     >     > >>
    >     >     >
    > ---------------------------------------------------------------------
    >     >     >     > >>>> To unsubscribe, e-mail:
    >     > dev-unsubscribe@cassandra.apache.org
    >     >     >     > >>>> For additional commands, e-mail:
    >     > dev-help@cassandra.apache.org
    >     >     >     > >>>>
    >     >     >     > >>>
    >     >     >     > >>>
    >     >     >     >
    >     > ---------------------------------------------------------------------
    >     >     >     > >>>    To unsubscribe, e-mail:
    >     > dev-unsubscribe@cassandra.apache.org
    >     >     >     > >>>    For additional commands, e-mail:
    >     >     > dev-help@cassandra.apache.org
    >     >     >     > >>>
    >     >     >     > >>>
    >     >     >     > >>>
    >     >     >     > >>>
    >     >     >     > >>>
    >     >     >
    > ---------------------------------------------------------------------
    >     >     >     > >>> To unsubscribe, e-mail:
    >     > dev-unsubscribe@cassandra.apache.org
    >     >     >     > >>> For additional commands, e-mail:
    >     > dev-help@cassandra.apache.org
    >     >     >     > >>>
    >     >     >     > >>>
    >     >     >     > >>
    >     >     >     >
    >     >     >     >
    >     > ---------------------------------------------------------------------
    >     >     >     > To unsubscribe, e-mail:
    > dev-unsubscribe@cassandra.apache.org
    >     >     >     > For additional commands, e-mail:
    > dev-help@cassandra.apache.org
    >     >     >     >
    >     >     >     >
    >     >     >
    >     >     >
    >     >     >
    >     >     >
    > ---------------------------------------------------------------------
    >     >     > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
    >     >     > For additional commands, e-mail: dev-help@cassandra.apache.org
    >     >     >
    >     >     >
    >     >
    >     >
    >     >
    >     > ---------------------------------------------------------------------
    >     > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
    >     > For additional commands, e-mail: dev-help@cassandra.apache.org
    >     >
    >     >
    >
    >
    >
    > ---------------------------------------------------------------------
    > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
    > For additional commands, e-mail: dev-help@cassandra.apache.org
    >
    >



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Paulo Motta <pa...@gmail.com>.

In this case the breaking change is a feature, not a bug. Its exact
intention is to require manual intervention, raising awareness of the
potential performance degradation. This sounds reasonable, since we have
already broken the contract of not introducing performance regressions in
a minor release.

I don't see how this can pose an outage risk to the cluster, given that
upgrades are normally performed as rolling restarts: the worst that could
happen is that the first node in the sequence fails to start and the
upgrade does not proceed. In my view this would be far less harmful than
finding out about a performance regression after all your nodes are upgraded.

Nevertheless, I'm fine with retracting the suggestion so that the proposal
can move forward, if you feel strongly about it.
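
To make the idea concrete, here is a rough sketch of the kind of startup
check being discussed. It is only an illustration under assumptions: the
property name "lwt_legacy_mode" is the one proposed in this thread, while
the function names, the ConfigurationError class and the simplified
"key: value" parsing are hypothetical and do not reflect Cassandra's actual
configuration loader.

```python
# Hypothetical sketch of a mandatory-property check at node startup.
# Nothing here is Cassandra code; names are illustrative only.


class ConfigurationError(Exception):
    """Raised when the configuration does not allow the node to start."""


def parse_flat_yaml(text):
    """Parse top-level `key: value` lines, ignoring blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        # Drop anything after '#', so commented-out properties count as absent.
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def require_lwt_legacy_mode(settings):
    """Return the operator's explicit choice, or refuse to start."""
    if "lwt_legacy_mode" not in settings:
        raise ConfigurationError(
            "lwt_legacy_mode must be set explicitly (see CASSANDRA-12126): "
            "false = correct but slower SERIAL reads, true = previous behavior"
        )
    value = settings["lwt_legacy_mode"].lower()
    if value not in ("true", "false"):
        raise ConfigurationError("lwt_legacy_mode must be true or false")
    return value == "true"
```

With a check like this, a yaml carried over from an older 3.X template (no
such property, or only a commented-out one) stops the first node of a
rolling upgrade, while a yaml containing "lwt_legacy_mode: false" starts
normally with the new behavior.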

On Tue, Nov 24, 2020 at 07:26, Benedict Elliott Smith <
benedict@apache.org> wrote:

> In my parlance the config property would be a breaking change, whereas the
> LWT behaviour would be a performance regression.  This latter might cause
> partial outages or service degradation, but refusing to start a prod
> cluster without manual intervention is potentially a much worse situation,
> and even more surprising for a patch upgrade.
>
> On 24/11/2020, 01:05, "Paulo Motta" <pa...@gmail.com> wrote:
>
>     Isn't the plan to change LWT implementation (and performance
> expectation)
>     in a patch version? This is a breaking change by itself, I'm just
> proposing
>     to make the trade-off choice explicit in the yaml to prevent unexpected
>     performance degradation during upgrade (for users who are not aware of
> the
>     change).
>
>     Just to make it clear, I'm proposing having a "lwt_legacy_mode: false"
>     uncommented in the default yaml with a descriptive comment about
>     CASSANDRA-12126, so new users will always get the new behavior, but
> users
>     using a yaml template based on a previous 3.X version will not be able
> to
>     start the node because this property will be missing. I believe the
>     majority of operators will just update their yaml with
> "lwt_legacy_mode:
>     false" and move on with their upgrades, but people wanting to keep the
>     previous performance will become aware of the breaking change and set
> it to
>     true.
>
>     On Mon, Nov 23, 2020 at 21:07, Benedict Elliott Smith <
>     benedict@apache.org> wrote:
>
>     > What do you mean by minor upgrade? We can't break patch upgrades for
> any
>     > of 3.x, as this could also cause surprise outages.
>     >
>     > On 23/11/2020, 23:51, "Paulo Motta" <pa...@gmail.com>
> wrote:
>     >
>     >      I was thinking about the YAML requirement during the 3.X minor
>     > upgrade to
>     >     make the decision explicit (need to update yaml) rather than
> implicit
>     > (by
>     >     upgrading you agree with the change), since the latter can go
>     > unnoticed by
>     >     those who don't pay attention to NEWS.txt
>     >
>     >     On Mon, Nov 23, 2020 at 20:03, Benedict Elliott Smith <
>     >     benedict@apache.org> wrote:
>     >
>     >     > What's the value of the yaml? The user is likely to have
> upgraded to
>     >     > latest 3.x as part of the upgrade process to 4.0, so they'll
> already
>     > have
>     >     > had a decision made for them. If correctness didn't break
> anything,
>     > there
>     >     > doesn't any longer seem much point in offering a choice?
>     >     >
>     >     > On 23/11/2020, 22:45, "Brandon Williams" <dr...@gmail.com>
> wrote:
>     >     >
>     >     >     +1 to both as well.
>     >     >
>     >     >     On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston
>     >     > <be...@apple.com.invalid>
>     >     >     wrote:
>     >     >
>     >     >     > +1 to correctness, and I like the yaml idea
>     >     >     >
>     >     >     > > On Nov 23, 2020, at 4:20 AM, Paulo Motta <
>     > pauloricardomg@gmail.com
>     >     > >
>     >     >     > wrote:
>     >     >     > >
>     >     >     > > +1 to defaulting for correctness.
>     >     >     > >
>     >     >     > > In addition to that, how about making it a mandatory
>     > cassandra.yaml
>     >     >     > > property defaulting to correctness? This would make
> upgrades
>     > with
>     >     > an old
>     >     >     > > cassandra.yaml fail unless an option is explicitly
> specified,
>     >     > making
>     >     >     > > operators aware of the issue and forcing them to make a
>     > choice.
>     >     >     > >
>     >     >     > >> On Mon, Nov 23, 2020 at 07:30, Benjamin Lerer <
>     >     >     > >> benjamin.lerer@datastax.com> wrote:
>     >     >     > >>
>     >     >     > >> Thank you very much to everybody that provided
> feedback. It
>     >     > helped a
>     >     >     > lot to
>     >     >     > >> limit our options.
>     >     >     > >>
>     >     >     > >> Unfortunately, it seems that some poor soul (me,
> really!!!)
>     > will
>     >     > have to
>     >     >     > >> make the final call between #3 and #4.
>     >     >     > >>
>     >     >     > >> If I reformulate the question to: Do we default to
>     > *correctness
>     >     > *or to
>     >     >     > >> *performance*?
>     >     >     > >>
>     >     >     > >> I would choose to default to *correctness*.
>     >     >     > >>
>     >     >     > >> Of course the situation is more complex than that but
> it
>     > seems
>     >     > that
>     >     >     > >> somebody has to make a call and live with it. It
> seems to
>     > me that
>     >     > being
>     >     >     > >> blamed for choosing correctness is easier to live
> with ;-)
>     >     >     > >>
>     >     >     > >> Benjamin
>     >     >     > >>
>     >     >     > >> PS: I tried to push the choice on Sylvain but he
> dodged the
>     >     > bullet.
>     >     >     > >>
>     >     >     > >> On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott
> Smith <
>     >     >     > >> benedict@apache.org>
>     >     >     > >> wrote:
>     >     >     > >>
>     >     >     > >>> I think I meant #4
>     >     >     > >>>
>     >     >     > >>> On 20/11/2020, 21:11, "Blake Eggleston"
>     >     > <beggleston@apple.com.INVALID
>     >     >     > >
>     >     >     > >>> wrote:
>     >     >     > >>>
>     >     >     > >>>    I’d also prefer #3 over #4
>     >     >     > >>>
>     >     >     > >>>> On Nov 20, 2020, at 10:03 AM, Benedict Elliott
> Smith <
>     >     >     > >>> benedict@apache.org> wrote:
>     >     >     > >>>>
>     >     >     > >>>> Well, I expressed a preference for #3 over #4,
>     > particularly for
>     >     >     > >> the
>     >     >     > >>> 3.x series.  However at this point, I think the lack
> of a
>     > clear
>     >     > project
>     >     >     > >>> decision means we can punt it back to you and
> Sylvain to
>     > make
>     >     > the final
>     >     >     > >>> call.
>     >     >     > >>>>
>     >     >     > >>>> On 20/11/2020, 16:23, "Benjamin Lerer" <
>     >     >     > >> benjamin.lerer@datastax.com>
>     >     >     > >>> wrote:
>     >     >     > >>>>
>     >     >     > >>>>   I will try to summarize the discussion to clarify
> the
>     > outcome.
>     >     >     > >>>>
>     >     >     > >>>>   Mick is in favor of #4
>     >     >     > >>>>   Summanth is in favor of #4
>     >     >     > >>>>   Sylvain answer was not clear for me. I understood
> it
>     > like I
>     >     >     > >>> prefer #3 to #4
>     >     >     > >>>>   and I am also fine with #1
>     >     >     > >>>>   Jeff is in favor of #3 and will understand #4
>     >     >     > >>>>   David is in favor #3 (fix bug and add flag to
> roll back
>     > to old
>     >     >     > >>> behavior) in
>     >     >     > >>>>   4.0 and #4 in 3.0 and 3.11
>     >     >     > >>>>
>     >     >     > >>>>   Do not hesitate to correct me if I misunderstood
> your
>     > answer.
>     >     >     > >>>>
>     >     >     > >>>>   Based on these answers it seems clear that most
> people
>     > prefer
>     >     > to
>     >     >     > >>> go for #3
>     >     >     > >>>>   or #4.
>     >     >     > >>>>
>     >     >     > >>>>   The choice between #3 (fix correctness opt-in to
> current
>     >     >     > >>> behavior) and #4
>     >     >     > >>>>   (current behavior opt-in to correctness) is a bit
> less
>     > clear
>     >     >     > >>> specially if
>     >     >     > >>>>   we consider the 3.X branches or 4.0.
>     >     >     > >>>>
>     >     >     > >>>>   Does anybody have some idea on how to choose between
>     > those 2
>     >     >     > >>> choices or some
>     >     >     > >>>>   extra opinions on #3 versus #4?
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>
>     >     >     > >>>>>   On Wed, Nov 18, 2020 at 9:45 PM David Capwell <
>     >     >     > >>> dcapwell@gmail.com> wrote:
>     >     >     > >>>>>
>     >     >     > >>>>> I feel that #4 (fix bug and add flag to roll back
> to old
>     >     > behavior)
>     >     >     > >>> is best.
>     >     >     > >>>>>
>     >     >     > >>>>> About the alternative implementation, I am fine
> adding
>     > it to
>     >     > 3.x
>     >     >     > >>> and 4.0,
>     >     >     > >>>>> but should treat it as a different path disabled by
>     > default
>     >     > that
>     >     >     > >>> you can
>     >     >     > >>>>> opt-into, with a plan to opt-in by default
> "eventually".
>     >     >     > >>>>>
>     >     >     > >>>>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott
> Smith <
>     >     >     > >>>>> benedict@apache.org>
>     >     >     > >>>>> wrote:
>     >     >     > >>>>>
>     >     >     > >>>>>> Perhaps there might be broader appetite to weigh
> in on
>     > which
>     >     >     > >> major
>     >     >     > >>>>>> releases we might target for work that fixes the
>     > correctness
>     >     > bug
>     >     >     > >>> without
>     >     >     > >>>>>> serious performance regression?
>     >     >     > >>>>>>
>     >     >     > >>>>>> i.e., if we were to fix the correctness bug now,
>     > introducing a
>     >     >     > >>> serious
>     >     >     > >>>>>> performance regression (either opt-in or
> opt-out), but
>     > were to
>     >     >     > >>> land work
>     >     >     > >>>>>> without this problem for 5.0, would there be
> appetite to
>     >     > backport
>     >     >     > >>> this
>     >     >     > >>>>> work
>     >     >     > >>>>>> to any of 4.0, 3.11 or 3.0?
>     >     >     > >>>>>>
>     >     >     > >>>>>>
>     >     >     > >>>>>> On 18/11/2020, 18:31, "Jeff Jirsa" <
> jjirsa@gmail.com>
>     > wrote:
>     >     >     > >>>>>>
>     >     >     > >>>>>>   This is complicated and relatively few people
> on earth
>     >     >     > >>> understand it,
>     >     >     > >>>>>> so
>     >     >     > >>>>>>   having little feedback is mostly expected,
>     > unfortunately.
>     >     >     > >>>>>>
>     >     >     > >>>>>>   My normal emotional response is "correctness is
>     > required,
>     >     >     > >>> opt-in to
>     >     >     > >>>>>>   performance improvements that sacrifice strict
>     > correctness",
>     >     >     > >>> but I'm
>     >     >     > >>>>>> also
>     >     >     > >>>>>>   sure this is going to surprise people, and would
>     > understand
>     >     > /
>     >     >     > >>> accept
>     >     >     > >>>>> #4
>     >     >     > >>>>>>   (default to current, opt-in to correct).
>     >     >     > >>>>>>
>     >     >     > >>>>>>
>     >     >     > >>>>>>   On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott
>     > Smith <
>     >     >     > >>>>>> benedict@apache.org>
>     >     >     > >>>>>>   wrote:
>     >     >     > >>>>>>
>     >     >     > >>>>>>> It doesn't seem like there's much enthusiasm for
> any
>     > of the
>     >     >     > >>> options
>     >     >     > >>>>>>> available here...
>     >     >     > >>>>>>>
>     >     >     > >>>>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
>     >     >     > >>>>> benedict@apache.org
>     >     >     > >>>>>>>
>     >     >     > >>>>>>> wrote:
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>> Is the new implementation a separate, distinctly
>     > modularized
>     >     >     > >>>>>> new
>     >     >     > >>>>>>> body of work
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   It’s primarily a distinct, modularised and new
> body
>     > of
>     >     > work,
>     >     >     > >>>>>> however
>     >     >     > >>>>>>> there is some shared code that has been modified
> -
>     > namely
>     >     >     > >>>>>> PaxosState, in
>     >     >     > >>>>>>> which legacy code is maintained but modified for
>     >     > compatibility,
>     >     >     > >>> and
>     >     >     > >>>>>> the
>     >     >     > >>>>>>> system.paxos table (which receives a new column,
> and
>     > slightly
>     >     >     > >>>>>> modified
>     >     >     > >>>>>>> serialization code).  It is conceptually an
> optimised
>     >     > version of
>     >     >     > >>>>> the
>     >     >     > >>>>>>> existing algorithm.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   If there's a chance of being of value to 4.0,
> I can
>     > try to
>     >     >     > >> put
>     >     >     > >>>>>> up a
>     >     >     > >>>>>>> patch next week alongside a high level
> description of
>     > the
>     >     >     > >> changes.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>> But a performance regression is a regression,
> I'm not
>     >     >     > >>>>>> shrugging it
>     >     >     > >>>>>>> off.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   I don't want to give the impression I'm
> shrugging
>     > off the
>     >     >     > >>>>>> correctness
>     >     >     > >>>>>>> issue either. It's a serious issue to fix, but
> since
>     > all
>     >     >     > >>> successful
>     >     >     > >>>>>> updates
>     >     >     > >>>>>>> to the database are linearizable, I think it's
> likely
>     > that
>     >     > many
>     >     >     > >>>>>>> applications behave correctly with the present
>     > semantics, or
>     >     > at
>     >     >     > >>>>> least
>     >     >     > >>>>>>> encounter only transient errors. No doubt many
> also do
>     > not,
>     >     > but
>     >     >     > >> I
>     >     >     > >>>>>> have no
>     >     >     > >>>>>>> idea of the ratio.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   The regression isn't itself a simple issue
> either -
>     >     > depending
>     >     >     > >>>>> on
>     >     >     > >>>>>> the
>     >     >     > >>>>>>> topology and message latencies it is not
> difficult to
>     > produce
>     >     >     > >>>>>> inescapable
>     >     >     > >>>>>>> contention, i.e. guaranteed timeouts - that might
>     > persist as
>     >     >     > >> long
>     >     >     > >>>>> as
>     >     >     > >>>>>>> clients continue to retry. It could be quite a
> serious
>     >     >     > >> degradation
>     >     >     > >>>>> of
>     >     >     > >>>>>>> service to impose on our users.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   I don't pretend to know the correct way to
> make a
>     > decision
>     >     >     > >>>>>> balancing
>     >     >     > >>>>>>> these considerations, but I am perhaps more
> concerned
>     > about
>     >     >     > >>>>> imposing
>     >     >     > >>>>>>> service outages than I am temporarily maintaining
>     > semantics
>     >     > our
>     >     >     > >>>>>> users have
>     >     >     > >>>>>>> apparently accepted for years - though I
> absolutely
>     > share
>     >     > your
>     >     >     > >>>>>>> embarrassment there.
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>   On 12/11/2020, 12:41, "Joshua McKenzie" <
>     >     >     > >> jmckenzie@apache.org
>     >     >     > >>>>>>
>     >     >     > >>>>>> wrote:
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>       Is the new implementation a separate,
> distinctly
>     >     >     > >>>>> modularized
>     >     >     > >>>>>> new
>     >     >     > >>>>>>> body of
>     >     >     > >>>>>>>       work or does it make substantial changes to
>     > existing
>     >     >     > >>>>>>> implementation and
>     >     >     > >>>>>>>       subsume it?
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>       On Thu, Nov 12, 2020 at 3:56 AM Sylvain
> Lebresne
>     > <
>     >     >     > >>>>>>> lebresne@gmail.com> wrote:
>     >     >     > >>>>>>>
>     >     >     > >>>>>>>> Regarding option #4, I'll remark that experience
>     > tends to
>     >     >     > >>>>>>> suggest users
>     >     >     > >>>>>>>> don't consistently read the `NEWS.txt` file on
>     > upgrade,
>     >     >     > >>>>> so
>     >     >     > >>>>>>> option #4 will
>     >     >     > >>>>>>>> likely essentially mean "LWT has a correctness
> issue,
>     > but
>     >     >     > >>>>>> once
>     >     >     > >>>>>>> it broke
>     >     >     > >>>>>>>> your data enough that you'll notice, you'll be
> able to
>     >     >     > >>>>> dig
>     >     >     > >>>>>> the
>     >     >     > >>>>>>> proper flag
>     >     >     > >>>>>>>> to fix it for next time". I guess it's better
> than
>     >     >     > >>>>>> nothing, of
>     >     >     > >>>>>>> course, but
>     >     >     > >>>>>>>> I'll admit that defaulting to "opt-in
> correctness",
>     >     >     > >>>>>> especially
>     >     >     > >>>>>>> for a
>     >     >     > >>>>>>>> feature (LWT) that exists uniquely to provide
>     > additional
>     >     >     > >>>>>>> guarantees, is
>     >     >     > >>>>>>>> something I have a hard rallying behind.
>     >     >     > >>>>>>>>
>     >     >     > >>>>>>>> But a performance regression is a regression,
> I'm not
>     >     >     > >>>>>> shrugging
>     >     >     > >>>>>>> it off.
>     >     >     > >>>>>>>> Still, I feel we shouldn't leave LWT with a
> fairly
>     >     >     > >>>>> serious
>     >     >     > >>>>>> known
>     >     >     > >>>>>>>> correctness bug and I frankly feel bad for "the
>     > project"
>     >     >     > >>>>>> that
>     >     >     > >>>>>>> this has been
>     >     >     > >>>>>>>> known for so long without action, so I'm a bit
> biased
>     > in
>     >     >     > >>>>>> wanting
>     >     >     > >>>>>>> to get it
>     >     >     > >>>>>>>> fixed asap.
>     >     >     > >>>>>>>>
>     >     >     > >>>>>>>> But maybe I'm overstating the urgency here, and
> maybe
>     >     >     > >>>>>> option #1
>     >     >     > >>>>>>> is a better
>     >     >     > >>>>>>>> way forward.
>     >     >     > >>>>>>>>
>     >     >     > >>>>>>>> --
>     >     >     > >>>>>>>> Sylvain
>     >     >     > >>>>>>>>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

In my parlance the config property would be a breaking change, whereas the LWT behaviour would be a performance regression.  The latter might cause partial outages or service degradation, but refusing to start a prod cluster without manual intervention is potentially a much worse situation, and even more surprising for a patch upgrade.
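
To make the two failure modes concrete, here is a minimal sketch of the "mandatory property" behaviour under discussion, written in Python purely for illustration. The name `lwt_legacy_mode` is the one proposed in this thread; it is not known to exist in any released cassandra.yaml, and `resolve_lwt_legacy_mode` and `ConfigurationError` are hypothetical helpers, not Cassandra APIs. The sketch assumes the yaml has already been parsed into a plain dictionary.

```python
# Hypothetical sketch only: `lwt_legacy_mode` is the property name proposed in
# this thread, and nothing here reflects actual Cassandra startup code.


class ConfigurationError(Exception):
    """Raised when a mandatory setting is missing or has the wrong type."""


def resolve_lwt_legacy_mode(settings: dict) -> bool:
    """Return the operator's explicit choice, or refuse to start.

    `settings` stands in for an already-parsed cassandra.yaml.
    True  -> keep the old (faster, less strict) SERIAL read behaviour.
    False -> use the corrected behaviour with the extra round trip.
    """
    if "lwt_legacy_mode" not in settings:
        # This is the "breaking change" case: an old yaml template has no
        # such key, so the node would not start until the operator adds it.
        raise ConfigurationError(
            "lwt_legacy_mode must be set explicitly (see CASSANDRA-12126)"
        )
    value = settings["lwt_legacy_mode"]
    if not isinstance(value, bool):
        raise ConfigurationError("lwt_legacy_mode must be true or false")
    return value
```

With a check like this, a yaml carried over from an earlier 3.X install fails fast at startup, whereas a defaulted property would start normally and only show up later as changed LWT latency.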

On 24/11/2020, 01:05, "Paulo Motta" <pa...@gmail.com> wrote:

    Isn't the plan to change LWT implementation (and performance expectation)
    in a patch version? This is a breaking change by itself, I'm just proposing
    to make the trade-off choice explicit in the yaml to prevent unexpected
    performance degradation during upgrade (for users who are not aware of the
    change).

    Just to make it clear, I'm proposing having a "lwt_legacy_mode: false"
    uncommented in the default yaml with a descriptive comment about
    CASSANDRA-12126, so new users will always get the new behavior, but users
    using a yaml template based on a previous 3.X version will not be able to
    start the node because this property will be missing. I believe the
    majority of operators will just update their yaml with "lwt_legacy_mode:
    false" and move on with their upgrades, but people wanting to keep the
    previous performance will become aware of the breaking change and set it to
    true.

    On Mon, Nov 23, 2020 at 21:07, Benedict Elliott Smith <
    benedict@apache.org> wrote:

    > What do you mean by minor upgrade? We can't break patch upgrades for any
    > of 3.x, as this could also cause surprise outages.
    >
    > On 23/11/2020, 23:51, "Paulo Motta" <pa...@gmail.com> wrote:
    >
    >      I was thinking about the YAML requirement during the 3.X minor
    > upgrade to
    >     make the decision explicit (need to update yaml) rather than implicit
    > (by
    >     upgrading you agree with the change), since the latter can go
    > unnoticed by
    >     those who don't pay attention to NEWS.txt
    >
    >     On Mon, Nov 23, 2020 at 20:03, Benedict Elliott Smith <
    >     benedict@apache.org> wrote:
    >
    >     > What's the value of the yaml? The user is likely to have upgraded to
    >     > latest 3.x as part of the upgrade process to 4.0, so they'll already
    > have
    >     > had a decision made for them. If correctness didn't break anything,
    > there
    >     > doesn't any longer seem much point in offering a choice?
    >     >
    >     > On 23/11/2020, 22:45, "Brandon Williams" <dr...@gmail.com> wrote:
    >     >
    >     >     +1 to both as well.
    >     >
    >     >     On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston
    >     > <be...@apple.com.invalid>
    >     >     wrote:
    >     >
    >     >     > +1 to correctness, and I like the yaml idea
    >     >     >
    >     >     > > On Nov 23, 2020, at 4:20 AM, Paulo Motta <
    > pauloricardomg@gmail.com
    >     > >
    >     >     > wrote:
    >     >     > >
    >     >     > > +1 to defaulting for correctness.
    >     >     > >
    >     >     > > In addition to that, how about making it a mandatory
    > cassandra.yaml
    >     >     > > property defaulting to correctness? This would make upgrades
    > with
    >     > an old
    >     >     > > cassandra.yaml fail unless an option is explicitly specified,
    >     > making
    >     >     > > operators aware of the issue and forcing them to make a
    > choice.
    >     >     > >
    >     >     > >> On Mon, Nov 23, 2020 at 07:30, Benjamin Lerer <
    >     >     > >> benjamin.lerer@datastax.com> wrote:
    >     >     > >>
    >     >     > >> Thank you very much to everybody that provided feedback. It
    >     > helped a
    >     >     > lot to
    >     >     > >> limit our options.
    >     >     > >>
    >     >     > >> Unfortunately, it seems that some poor soul (me, really!!!)
    > will
    >     > have to
    >     >     > >> make the final call between #3 and #4.
    >     >     > >>
    >     >     > >> If I reformulate the question to: Do we default to *correctness*
    >     >     > >> or to *performance*?
    >     >     > >>
    >     >     > >> I would choose to default to *correctness*.
    >     >     > >>
    >     >     > >> Of course the situation is more complex than that but it
    > seems
    >     > that
    >     >     > >> somebody has to make a call and live with it. It seems to
    > me that
    >     > being
    >     >     > >> blamed for choosing correctness is easier to live with ;-)
    >     >     > >>
    >     >     > >> Benjamin
    >     >     > >>
    >     >     > >> PS: I tried to push the choice on Sylvain but he dodged the
    >     > bullet.
    >     >     > >>
    >     >     > >> On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott Smith <
    >     >     > >> benedict@apache.org>
    >     >     > >> wrote:
    >     >     > >>
    >     >     > >>> I think I meant #4
    >     >     > >>>
    >     >     > >>> On 20/11/2020, 21:11, "Blake Eggleston"
    >     > <beggleston@apple.com.INVALID
    >     >     > >
    >     >     > >>> wrote:
    >     >     > >>>
    >     >     > >>>    I’d also prefer #3 over #4
    >     >     > >>>
    >     >     > >>>> On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <
    >     >     > >>> benedict@apache.org> wrote:
    >     >     > >>>>
    >     >     > >>>> Well, I expressed a preference for #3 over #4,
    > particularly for
    >     >     > >> the
    >     >     > >>> 3.x series.  However at this point, I think the lack of a
    > clear
    >     > project
    >     >     > >>> decision means we can punt it back to you and Sylvain to
    > make
    >     > the final
    >     >     > >>> call.
    >     >     > >>>>
    >     >     > >>>> On 20/11/2020, 16:23, "Benjamin Lerer" <
    >     >     > >> benjamin.lerer@datastax.com>
    >     >     > >>> wrote:
    >     >     > >>>>
    >     >     > >>>>   I will try to summarize the discussion to clarify the outcome.
    >     >     > >>>>
    >     >     > >>>>   Mick is in favor of #4
    >     >     > >>>>   Summanth is in favor of #4
    >     >     > >>>>   Sylvain's answer was not clear to me. I understood it as:
    >     >     > >>>>   I prefer #3 to #4, and I am also fine with #1
    >     >     > >>>>   Jeff is in favor of #3 and will understand #4
    >     >     > >>>>   David is in favor of #3 (fix bug and add flag to roll back to
    >     >     > >>>>   old behavior) in 4.0 and #4 in 3.0 and 3.11
    >     >     > >>>>
    >     >     > >>>>   Do not hesitate to correct me if I misunderstood your answer.
    >     >     > >>>>
    >     >     > >>>>   Based on these answers it seems clear that most people prefer
    >     >     > >>>>   to go for #3 or #4.
    >     >     > >>>>
    >     >     > >>>>   The choice between #3 (fix correctness, opt-in to current
    >     >     > >>>>   behavior) and #4 (current behavior, opt-in to correctness) is a
    >     >     > >>>>   bit less clear, especially if we consider the 3.X branches or
    >     >     > >>>>   4.0.
    >     >     > >>>>
    >     >     > >>>>   Does anybody have some idea on how to choose between those 2
    >     >     > >>>>   choices, or some extra opinions on #3 versus #4?
    >     >     > >>>>
    >     >     > >>>>
    >     >     > >>>>
    >     >     > >>>>
    >     >     > >>>>
    >     >     > >>>>
    >     >     > >>>>>   On Wed, Nov 18, 2020 at 9:45 PM David Capwell <
    >     >     > >>> dcapwell@gmail.com> wrote:
    >     >     > >>>>>
    >     >     > >>>>> I feel that #4 (fix bug and add flag to roll back to old
    >     > behavior)
    >     >     > >>> is best.
    >     >     > >>>>>
    >     >     > >>>>> About the alternative implementation, I am fine adding
    > it to
    >     > 3.x
    >     >     > >>> and 4.0,
    >     >     > >>>>> but should treat it as a different path disabled by
    > default
    >     > that
    >     >     > >>> you can
    >     >     > >>>>> opt-into, with a plan to opt-in by default "eventually".
    >     >     > >>>>>
    >     >     > >>>>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
    >     >     > >>>>> benedict@apache.org>
    >     >     > >>>>> wrote:
    >     >     > >>>>>
    >     >     > >>>>>> Perhaps there might be broader appetite to weigh in on
    > which
    >     >     > >> major
    >     >     > >>>>>> releases we might target for work that fixes the
    > correctness
    >     > bug
    >     >     > >>> without
    >     >     > >>>>>> serious performance regression?
    >     >     > >>>>>>
    >     >     > >>>>>> i.e., if we were to fix the correctness bug now,
    > introducing a
    >     >     > >>> serious
    >     >     > >>>>>> performance regression (either opt-in or opt-out), but
    > were to
    >     >     > >>> land work
    >     >     > >>>>>> without this problem for 5.0, would there be appetite to
    >     > backport
    >     >     > >>> this
    >     >     > >>>>> work
    >     >     > >>>>>> to any of 4.0, 3.11 or 3.0?
    >     >     > >>>>>>
    >     >     > >>>>>>
    >     >     > >>>>>> On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com>
    > wrote:
    >     >     > >>>>>>
    >     >     > >>>>>>   This is complicated and relatively few people on earth
    >     >     > >>> understand it,
    >     >     > >>>>>> so
    >     >     > >>>>>>   having little feedback is mostly expected,
    > unfortunately.
    >     >     > >>>>>>
    >     >     > >>>>>>   My normal emotional response is "correctness is
    > required,
    >     >     > >>> opt-in to
    >     >     > >>>>>>   performance improvements that sacrifice strict
    > correctness",
    >     >     > >>> but I'm
    >     >     > >>>>>> also
    >     >     > >>>>>>   sure this is going to surprise people, and would
    > understand
    >     > /
    >     >     > >>> accept
    >     >     > >>>>> #4
    >     >     > >>>>>>   (default to current, opt-in to correct).
    >     >     > >>>>>>
    >     >     > >>>>>>
    >     >     > >>>>>>   On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott
    > Smith <
    >     >     > >>>>>> benedict@apache.org>
    >     >     > >>>>>>   wrote:
    >     >     > >>>>>>
    >     >     > >>>>>>> It doesn't seem like there's much enthusiasm for any
    > of the
    >     >     > >>> options
    >     >     > >>>>>>> available here...
    >     >     > >>>>>>>
    >     >     > >>>>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
    >     >     > >>>>> benedict@apache.org
    >     >     > >>>>>>>
    >     >     > >>>>>>> wrote:
    >     >     > >>>>>>>
    >     >     > >>>>>>>> Is the new implementation a separate, distinctly
    > modularized
    >     >     > >>>>>> new
    >     >     > >>>>>>> body of work
    >     >     > >>>>>>>
    >     >     > >>>>>>>   It’s primarily a distinct, modularised and new body
    > of
    >     > work,
    >     >     > >>>>>> however
    >     >     > >>>>>>> there is some shared code that has been modified -
    > namely
    >     >     > >>>>>> PaxosState, in
    >     >     > >>>>>>> which legacy code is maintained but modified for
    >     > compatibility,
    >     >     > >>> and
    >     >     > >>>>>> the
    >     >     > >>>>>>> system.paxos table (which receives a new column, and
    > slightly
    >     >     > >>>>>> modified
    >     >     > >>>>>>> serialization code).  It is conceptually an optimised
    >     > version of
    >     >     > >>>>> the
    >     >     > >>>>>>> existing algorithm.
    >     >     > >>>>>>>
    >     >     > >>>>>>>   If there's a chance of being of value to 4.0, I can
    > try to
    >     >     > >> put
    >     >     > >>>>>> up a
    >     >     > >>>>>>> patch next week alongside a high level description of
    > the
    >     >     > >> changes.
    >     >     > >>>>>>>
    >     >     > >>>>>>>> But a performance regression is a regression, I'm not
    >     >     > >>>>>> shrugging it
    >     >     > >>>>>>> off.
    >     >     > >>>>>>>
    >     >     > >>>>>>>   I don't want to give the impression I'm shrugging
    > off the
    >     >     > >>>>>> correctness
    >     >     > >>>>>>> issue either. It's a serious issue to fix, but since
    > all
    >     >     > >>> successful
    >     >     > >>>>>> updates
    >     >     > >>>>>>> to the database are linearizable, I think it's likely
    > that
    >     > many
    >     >     > >>>>>>> applications behave correctly with the present
    > semantics, or
    >     > at
    >     >     > >>>>> least
    >     >     > >>>>>>> encounter only transient errors. No doubt many also do
    > not,
    >     > but
    >     >     > >> I
    >     >     > >>>>>> have no
    >     >     > >>>>>>> idea of the ratio.
    >     >     > >>>>>>>
    >     >     > >>>>>>>   The regression isn't itself a simple issue either -
    >     > depending
    >     >     > >>>>> on
    >     >     > >>>>>> the
    >     >     > >>>>>>> topology and message latencies it is not difficult to
    > produce
    >     >     > >>>>>> inescapable
    >     >     > >>>>>>> contention, i.e. guaranteed timeouts - that might
    > persist as
    >     >     > >> long
    >     >     > >>>>> as
    >     >     > >>>>>>> clients continue to retry. It could be quite a serious
    >     >     > >> degradation
    >     >     > >>>>> of
    >     >     > >>>>>>> service to impose on our users.
    >     >     > >>>>>>>
    >     >     > >>>>>>>   I don't pretend to know the correct way to make a
    > decision
    >     >     > >>>>>> balancing
    >     >     > >>>>>>> these considerations, but I am perhaps more concerned
    > about
    >     >     > >>>>> imposing
    >     >     > >>>>>>> service outages than I am temporarily maintaining
    > semantics
    >     > our
    >     >     > >>>>>> users have
    >     >     > >>>>>>> apparently accepted for years - though I absolutely
    > share
    >     > your
    >     >     > >>>>>>> embarrassment there.
    >     >     > >>>>>>>
    >     >     > >>>>>>>
    >     >     > >>>>>>>   On 12/11/2020, 12:41, "Joshua McKenzie" <
    >     >     > >> jmckenzie@apache.org
    >     >     > >>>>>>
    >     >     > >>>>>> wrote:
    >     >     > >>>>>>>
    >     >     > >>>>>>>       Is the new implementation a separate, distinctly
    >     >     > >>>>> modularized
    >     >     > >>>>>> new
    >     >     > >>>>>>> body of
    >     >     > >>>>>>>       work or does it make substantial changes to
    > existing
    >     >     > >>>>>>> implementation and
    >     >     > >>>>>>>       subsume it?
    >     >     > >>>>>>>
    >     >     > >>>>>>>       On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne
    > <
    >     >     > >>>>>>> lebresne@gmail.com> wrote:
    >     >     > >>>>>>>
    >     >     > >>>>>>>> Regarding option #4, I'll remark that experience tends to
    >     >     > >>>>>>>> suggest users don't consistently read the `NEWS.txt` file on
    >     >     > >>>>>>>> upgrade, so option #4 will likely essentially mean "LWT has a
    >     >     > >>>>>>>> correctness issue, but once it broke your data enough that
    >     >     > >>>>>>>> you'll notice, you'll be able to dig up the proper flag to fix
    >     >     > >>>>>>>> it for next time". I guess it's better than nothing, of course,
    >     >     > >>>>>>>> but I'll admit that defaulting to "opt-in correctness",
    >     >     > >>>>>>>> especially for a feature (LWT) that exists uniquely to provide
    >     >     > >>>>>>>> additional guarantees, is something I have a hard time rallying
    >     >     > >>>>>>>> behind.
    >     >     > >>>>>>>>
    >     >     > >>>>>>>> But a performance regression is a regression, I'm not
    >     >     > >>>>>> shrugging
    >     >     > >>>>>>> it off.
    >     >     > >>>>>>>> Still, I feel we shouldn't leave LWT with a fairly
    >     >     > >>>>> serious
    >     >     > >>>>>> known
    >     >     > >>>>>>>> correctness bug and I frankly feel bad for "the
    > project"
    >     >     > >>>>>> that
    >     >     > >>>>>>> this has been
    >     >     > >>>>>>>> known for so long without action, so I'm a bit biased
    > in
    >     >     > >>>>>> wanting
    >     >     > >>>>>>> to get it
    >     >     > >>>>>>>> fixed asap.
    >     >     > >>>>>>>>
    >     >     > >>>>>>>> But maybe I'm overstating the urgency here, and maybe
    >     >     > >>>>>> option #1
    >     >     > >>>>>>> is a better
    >     >     > >>>>>>>> way forward.
    >     >     > >>>>>>>>
    >     >     > >>>>>>>> --
    >     >     > >>>>>>>> Sylvain
    >     >     > >>>>>>>>



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Paulo Motta <pa...@gmail.com>.

Isn't the plan to change the LWT implementation (and its performance
expectations) in a patch version? That is a breaking change by itself. I'm
just proposing to make the trade-off explicit in the yaml, to prevent
unexpected performance degradation during upgrade for users who are not
aware of the change.

To be clear, I'm proposing to ship "lwt_legacy_mode: false" uncommented in
the default yaml, with a descriptive comment about CASSANDRA-12126. New
users will always get the new behavior, while users whose yaml is based on
a template from a previous 3.X version will not be able to start the node,
because the property will be missing. I believe the majority of operators
will just add "lwt_legacy_mode: false" to their yaml and move on with their
upgrades, but people wanting to keep the previous performance will become
aware of the breaking change and set it to true.
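
A minimal sketch of the startup check this proposal would imply, written in
Python for brevity (Cassandra itself is Java). The option name
lwt_legacy_mode is the one proposed above, not an existing setting; the
function and exception names are hypothetical, and the dict stands in for an
already-parsed cassandra.yaml.

```python
class ConfigurationError(Exception):
    """Raised when a mandatory setting is absent or malformed (hypothetical)."""


def resolve_lwt_legacy_mode(settings: dict) -> bool:
    """Return the operator's explicit choice, or refuse to start.

    `settings` is assumed to be the parsed contents of cassandra.yaml.
    """
    # No implicit default: a yaml carried over from an older 3.X template
    # lacks the key, so the node fails fast instead of silently changing
    # LWT behavior.
    if "lwt_legacy_mode" not in settings:
        raise ConfigurationError(
            "lwt_legacy_mode must be set explicitly (see CASSANDRA-12126): "
            "false for the corrected behavior, true for the previous performance"
        )
    value = settings["lwt_legacy_mode"]
    if not isinstance(value, bool):
        raise ConfigurationError("lwt_legacy_mode must be true or false")
    return value
```

With this shape, the default yaml (which sets the property to false) starts
normally, and only a yaml that omits the property is rejected.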

On Mon, Nov 23, 2020 at 21:07, Benedict Elliott Smith <
benedict@apache.org> wrote:

> What do you mean by minor upgrade? We can't break patch upgrades for any
> of 3.x, as this could also cause surprise outages.
>
> On 23/11/2020, 23:51, "Paulo Motta" <pa...@gmail.com> wrote:
>
>     I was thinking about the YAML requirement during the 3.X minor upgrade to
>     make the decision explicit (need to update yaml) rather than implicit (by
>     upgrading you agree with the change), since the latter can go unnoticed by
>     those who don't pay attention to NEWS.txt
>
>     On Mon, Nov 23, 2020 at 20:03, Benedict Elliott Smith <
>     benedict@apache.org> wrote:
>
>     > What's the value of the yaml? The user is likely to have upgraded to
>     > latest 3.x as part of the upgrade process to 4.0, so they'll already have
>     > had a decision made for them. If correctness didn't break anything, there
>     > doesn't any longer seem much point in offering a choice?
>     >
>     > On 23/11/2020, 22:45, "Brandon Williams" <dr...@gmail.com> wrote:
>     >
>     >     +1 to both as well.
>     >
>     >     On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston
>     > <be...@apple.com.invalid>
>     >     wrote:
>     >
>     >     > +1 to correctness, and I like the yaml idea
>     >     >
>     >     > > On Nov 23, 2020, at 4:20 AM, Paulo Motta <
> pauloricardomg@gmail.com
>     > >
>     >     > wrote:
>     >     > >
>     >     > > +1 to defaulting for correctness.
>     >     > >
>     >     > > In addition to that, how about making it a mandatory
> cassandra.yaml
>     >     > > property defaulting to correctness? This would make upgrades
> with
>     > an old
>     >     > > cassandra.yaml fail unless an option is explicitly specified,
>     > making
>     >     > > operators aware of the issue and forcing them to make a
> choice.
>     >     > >
>     >     >     > >> On Mon, Nov 23, 2020 at 07:30, Benjamin Lerer <
>     >     >     > >> benjamin.lerer@datastax.com> wrote:
>     >     > >>
>     >     > >> Thank you very much to everybody that provided feedback. It
>     > helped a
>     >     > lot to
>     >     > >> limit our options.
>     >     > >>
>     >     > >> Unfortunately, it seems that some poor soul (me, really!!!)
> will
>     > have to
>     >     > >> make the final call between #3 and #4.
>     >     > >>
>     >     > >> If I reformulate the question to: Do we default to
> *correctness
>     > *or to
>     >     > >> *performance*?
>     >     > >>
>     >     > >> I would choose to default to *correctness*.
>     >     > >>
>     >     > >> Of course the situation is more complex than that but it
> seems
>     > that
>     >     > >> somebody has to make a call and live with it. It seems to
> me that
>     > being
>     >     > >> blamed for choosing correctness is easier to live with ;-)
>     >     > >>
>     >     > >> Benjamin
>     >     > >>
>     >     > >> PS: I tried to push the choice on Sylvain but he dodged the
>     > bullet.
>     >     > >>
>     >     > >> On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott Smith <
>     >     > >> benedict@apache.org>
>     >     > >> wrote:
>     >     > >>
>     >     >     > >>> I think I meant #4
>     >     > >>>
>     >     > >>> On 20/11/2020, 21:11, "Blake Eggleston"
>     > <beggleston@apple.com.INVALID
>     >     > >
>     >     > >>> wrote:
>     >     > >>>
>     >     > >>>    I’d also prefer #3 over #4
>     >     > >>>
>     >     > >>>> On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <
>     >     > >>> benedict@apache.org> wrote:
>     >     > >>>>
>     >     > >>>> Well, I expressed a preference for #3 over #4,
> particularly for
>     >     > >> the
>     >     > >>> 3.x series.  However at this point, I think the lack of a
> clear
>     > project
>     >     > >>> decision means we can punt it back to you and Sylvain to
> make
>     > the final
>     >     > >>> call.
>     >     > >>>>
>     >     > >>>> On 20/11/2020, 16:23, "Benjamin Lerer" <
>     >     > >> benjamin.lerer@datastax.com>
>     >     > >>> wrote:
>     >     > >>>>
>     >     > >>>>   I will try to summarize the discussion to clarify the
> outcome.
>     >     > >>>>
>     >     > >>>>   Mick is in favor of #4
>     >     > >>>>   Summanth is in favor of #4
>     >     > >>>>   Sylvain answer was not clear for me. I understood it
> like I
>     >     > >>> prefer #3 to #4
>     >     > >>>>   and I am also fine with #1
>     >     > >>>>   Jeff is in favor of #3 and will understand #4
>     >     > >>>>   David is in favor #3 (fix bug and add flag to roll back
> to old
>     >     > >>> behavior) in
>     >     > >>>>   4.0 and #4 in 3.0 and 3.11
>     >     > >>>>
>     >     > >>>>   Do not hesitate to correct me if I misunderstood your
> answer.
>     >     > >>>>
>     >     > >>>>   Based on these answers it seems clear that most people
> prefer
>     > to
>     >     > >>> go for #3
>     >     > >>>>   or #4.
>     >     > >>>>
>     >     > >>>>   The choice between #3 (fix correctness opt-in to current
>     >     > >>> behavior) and #4
>     >     > >>>>   (current behavior opt-in to correctness) is a bit less
> clear
>     >     > >>> specially if
>     >     > >>>>   we consider the 3.X branches or 4.0.
>     >     > >>>>
>     >     >     > >>>>   Does anybody have some idea on how to choose between
> those 2
>     >     > >>> choices or some
>     >     > >>>>   extra opinions on #3 versus #4?
>     >     > >>>>
>     >     > >>>>
>     >     > >>>>
>     >     > >>>>
>     >     > >>>>
>     >     > >>>>
>     >     > >>>>>   On Wed, Nov 18, 2020 at 9:45 PM David Capwell <
>     >     > >>> dcapwell@gmail.com> wrote:
>     >     > >>>>>
>     >     > >>>>> I feel that #4 (fix bug and add flag to roll back to old
>     > behavior)
>     >     > >>> is best.
>     >     > >>>>>
>     >     > >>>>> About the alternative implementation, I am fine adding
> it to
>     > 3.x
>     >     > >>> and 4.0,
>     >     > >>>>> but should treat it as a different path disabled by
> default
>     > that
>     >     > >>> you can
>     >     > >>>>> opt-into, with a plan to opt-in by default "eventually".
>     >     > >>>>>
>     >     > >>>>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
>     >     > >>>>> benedict@apache.org>
>     >     > >>>>> wrote:
>     >     > >>>>>
>     >     > >>>>>> Perhaps there might be broader appetite to weigh in on
> which
>     >     > >> major
>     >     > >>>>>> releases we might target for work that fixes the
> correctness
>     > bug
>     >     > >>> without
>     >     > >>>>>> serious performance regression?
>     >     > >>>>>>
>     >     > >>>>>> i.e., if we were to fix the correctness bug now,
> introducing a
>     >     > >>> serious
>     >     > >>>>>> performance regression (either opt-in or opt-out), but
> were to
>     >     > >>> land work
>     >     > >>>>>> without this problem for 5.0, would there be appetite to
>     > backport
>     >     > >>> this
>     >     > >>>>> work
>     >     > >>>>>> to any of 4.0, 3.11 or 3.0?
>     >     > >>>>>>
>     >     > >>>>>>
>     >     > >>>>>> On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com>
> wrote:
>     >     > >>>>>>
>     >     > >>>>>>   This is complicated and relatively few people on earth
>     >     > >>> understand it,
>     >     > >>>>>> so
>     >     > >>>>>>   having little feedback is mostly expected,
> unfortunately.
>     >     > >>>>>>
>     >     > >>>>>>   My normal emotional response is "correctness is
> required,
>     >     > >>> opt-in to
>     >     > >>>>>>   performance improvements that sacrifice strict
> correctness",
>     >     > >>> but I'm
>     >     > >>>>>> also
>     >     > >>>>>>   sure this is going to surprise people, and would
> understand
>     > /
>     >     > >>> accept
>     >     > >>>>> #4
>     >     > >>>>>>   (default to current, opt-in to correct).
>     >     > >>>>>>
>     >     > >>>>>>
>     >     > >>>>>>   On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott
> Smith <
>     >     > >>>>>> benedict@apache.org>
>     >     > >>>>>>   wrote:
>     >     > >>>>>>
>     >     > >>>>>>> It doesn't seem like there's much enthusiasm for any
> of the
>     >     > >>> options
>     >     > >>>>>>> available here...
>     >     > >>>>>>>
>     >     > >>>>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
>     >     > >>>>> benedict@apache.org
>     >     > >>>>>>>
>     >     > >>>>>>> wrote:
>     >     > >>>>>>>
>     >     > >>>>>>>> Is the new implementation a separate, distinctly
> modularized
>     >     > >>>>>> new
>     >     > >>>>>>> body of work
>     >     > >>>>>>>
>     >     > >>>>>>>   It’s primarily a distinct, modularised and new body
> of
>     > work,
>     >     > >>>>>> however
>     >     > >>>>>>> there is some shared code that has been modified -
> namely
>     >     > >>>>>> PaxosState, in
>     >     > >>>>>>> which legacy code is maintained but modified for
>     > compatibility,
>     >     > >>> and
>     >     > >>>>>> the
>     >     > >>>>>>> system.paxos table (which receives a new column, and
> slightly
>     >     > >>>>>> modified
>     >     > >>>>>>> serialization code).  It is conceptually an optimised
>     > version of
>     >     > >>>>> the
>     >     > >>>>>>> existing algorithm.
>     >     > >>>>>>>
>     >     > >>>>>>>   If there's a chance of being of value to 4.0, I can
> try to
>     >     > >> put
>     >     > >>>>>> up a
>     >     > >>>>>>> patch next week alongside a high level description of
> the
>     >     > >> changes.
>     >     > >>>>>>>
>     >     > >>>>>>>> But a performance regression is a regression, I'm not
>     >     > >>>>>> shrugging it
>     >     > >>>>>>> off.
>     >     > >>>>>>>
>     >     > >>>>>>>   I don't want to give the impression I'm shrugging
> off the
>     >     > >>>>>> correctness
>     >     > >>>>>>> issue either. It's a serious issue to fix, but since
> all
>     >     > >>> successful
>     >     > >>>>>> updates
>     >     > >>>>>>> to the database are linearizable, I think it's likely
> that
>     > many
>     >     > >>>>>>> applications behave correctly with the present
> semantics, or
>     > at
>     >     > >>>>> least
>     >     > >>>>>>> encounter only transient errors. No doubt many also do
> not,
>     > but
>     >     > >> I
>     >     > >>>>>> have no
>     >     > >>>>>>> idea of the ratio.
>     >     > >>>>>>>
>     >     > >>>>>>>   The regression isn't itself a simple issue either -
>     > depending
>     >     > >>>>> on
>     >     > >>>>>> the
>     >     > >>>>>>> topology and message latencies it is not difficult to
> produce
>     >     > >>>>>> inescapable
>     >     > >>>>>>> contention, i.e. guaranteed timeouts - that might
> persist as
>     >     > >> long
>     >     > >>>>> as
>     >     > >>>>>>> clients continue to retry. It could be quite a serious
>     >     > >> degradation
>     >     > >>>>> of
>     >     > >>>>>>> service to impose on our users.
>     >     > >>>>>>>
>     >     > >>>>>>>   I don't pretend to know the correct way to make a
> decision
>     >     > >>>>>> balancing
>     >     > >>>>>>> these considerations, but I am perhaps more concerned
> about
>     >     > >>>>> imposing
>     >     > >>>>>>> service outages than I am temporarily maintaining
> semantics
>     > our
>     >     > >>>>>> users have
>     >     > >>>>>>> apparently accepted for years - though I absolutely
> share
>     > your
>     >     > >>>>>>> embarrassment there.
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>>   On 12/11/2020, 12:41, "Joshua McKenzie" <
>     >     > >> jmckenzie@apache.org
>     >     > >>>>>>
>     >     > >>>>>> wrote:
>     >     > >>>>>>>
>     >     > >>>>>>>       Is the new implementation a separate, distinctly
>     >     > >>>>> modularized
>     >     > >>>>>> new
>     >     > >>>>>>> body of
>     >     > >>>>>>>       work or does it make substantial changes to
> existing
>     >     > >>>>>>> implementation and
>     >     > >>>>>>>       subsume it?
>     >     > >>>>>>>
>     >     > >>>>>>>       On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne
> <
>     >     > >>>>>>> lebresne@gmail.com> wrote:
>     >     > >>>>>>>
>     >     > >>>>>>>> Regarding option #4, I'll remark that experience
> tends to
>     >     > >>>>>>> suggest users
>     >     > >>>>>>>> don't consistently read the `NEWS.txt` file on
> upgrade,
>     >     > >>>>> so
>     >     > >>>>>>> option #4 will
>     >     > >>>>>>>> likely essentially mean "LWT has a correctness issue,
> but
>     >     > >>>>>> once
>     >     > >>>>>>> it broke
>     >     > >>>>>>>> your data enough that you'll notice, you'll be able to
>     >     > >>>>> dig
>     >     > >>>>>> the
>     >     > >>>>>>> proper flag
>     >     > >>>>>>>> to fix it for next time". I guess it's better than
>     >     > >>>>>> nothing, of
>     >     > >>>>>>> course, but
>     >     > >>>>>>>> I'll admit that defaulting to "opt-in correctness",
>     >     > >>>>>> especially
>     >     > >>>>>>> for a
>     >     > >>>>>>>> feature (LWT) that exists uniquely to provide
> additional
>     >     > >>>>>>> guarantees, is
>     >     >     > >>>>>>>> something I have a hard time rallying behind.
>     >     > >>>>>>>>
>     >     > >>>>>>>> But a performance regression is a regression, I'm not
>     >     > >>>>>> shrugging
>     >     > >>>>>>> it off.
>     >     > >>>>>>>> Still, I feel we shouldn't leave LWT with a fairly
>     >     > >>>>> serious
>     >     > >>>>>> known
>     >     > >>>>>>>> correctness bug and I frankly feel bad for "the
> project"
>     >     > >>>>>> that
>     >     > >>>>>>> this has been
>     >     > >>>>>>>> known for so long without action, so I'm a bit biased
> in
>     >     > >>>>>> wanting
>     >     > >>>>>>> to get it
>     >     > >>>>>>>> fixed asap.
>     >     > >>>>>>>>
>     >     > >>>>>>>> But maybe I'm overstating the urgency here, and maybe
>     >     > >>>>>> option #1
>     >     > >>>>>>> is a better
>     >     > >>>>>>>> way forward.
>     >     > >>>>>>>>
>     >     > >>>>>>>> --
>     >     > >>>>>>>> Sylvain
>     >     > >>>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>
>     >     > >>>
>     > ---------------------------------------------------------------------
>     >     > >>>>>>>   To unsubscribe, e-mail:
>     > dev-unsubscribe@cassandra.apache.org
>     >     > >>>>>>>   For additional commands, e-mail:
>     >     > >> dev-help@cassandra.apache.org
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>
>     >     > >>>
>     > ---------------------------------------------------------------------
>     >     > >>>>>>> To unsubscribe, e-mail:
> dev-unsubscribe@cassandra.apache.org
>     >     > >>>>>>> For additional commands, e-mail:
>     > dev-help@cassandra.apache.org
>     >     > >>>>>>>
>     >     > >>>>>>>
>     >     > >>>>>>
>     >     > >>>>>>
>     >     > >>>>>>
>     >     > >>>>>>
>     >     > >>>
>     > ---------------------------------------------------------------------
>     >     > >>>>>> To unsubscribe, e-mail:
> dev-unsubscribe@cassandra.apache.org
>     >     > >>>>>> For additional commands, e-mail:
>     > dev-help@cassandra.apache.org
>     >     > >>>>>>
>     >     > >>>>>>
>     >     > >>>>>
>     >     > >>>>
>     >     > >>>>
>     >     > >>>>
>     >     > >>>>
>     >     > >>
>     > ---------------------------------------------------------------------
>     >     > >>>> To unsubscribe, e-mail:
> dev-unsubscribe@cassandra.apache.org
>     >     > >>>> For additional commands, e-mail:
> dev-help@cassandra.apache.org
>     >     > >>>>
>     >     > >>>
>     >     > >>>
>     >     >
> ---------------------------------------------------------------------
>     >     > >>>    To unsubscribe, e-mail:
> dev-unsubscribe@cassandra.apache.org
>     >     > >>>    For additional commands, e-mail:
>     > dev-help@cassandra.apache.org
>     >     > >>>
>     >     > >>>
>     >     > >>>
>     >     > >>>
>     >     > >>>
>     > ---------------------------------------------------------------------
>     >     > >>> To unsubscribe, e-mail:
> dev-unsubscribe@cassandra.apache.org
>     >     > >>> For additional commands, e-mail:
> dev-help@cassandra.apache.org
>     >     > >>>
>     >     > >>>
>     >     > >>
>     >     >
>     >     >
> ---------------------------------------------------------------------
>     >     > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>     >     > For additional commands, e-mail: dev-help@cassandra.apache.org
>     >     >
>     >     >
>     >
>     >
>     >
>     > ---------------------------------------------------------------------
>     > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>     > For additional commands, e-mail: dev-help@cassandra.apache.org
>     >
>     >
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

What do you mean by minor upgrade? We can't break patch upgrades for any of 3.x, as this could also cause surprise outages.

On 23/11/2020, 23:51, "Paulo Motta" <pa...@gmail.com> wrote:

     I was thinking about the YAML requirement during the 3.X minor upgrade to
    make the decision explicit (need to update yaml) rather than implicit (by
    upgrading you agree with the change), since the latter can go unnoticed by
    those who don't pay attention to NEWS.txt

    On Mon, Nov 23, 2020 at 20:03, Benedict Elliott Smith <
    benedict@apache.org> wrote:

    > What's the value of the yaml? The user is likely to have upgraded to
    > latest 3.x as part of the upgrade process to 4.0, so they'll already have
    > had a decision made for them. If correctness didn't break anything, there
    > doesn't any longer seem much point in offering a choice?
    >
    > On 23/11/2020, 22:45, "Brandon Williams" <dr...@gmail.com> wrote:
    >
    >     +1 to both as well.
    >
    >     On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston
    > <be...@apple.com.invalid>
    >     wrote:
    >
    >     > +1 to correctness, and I like the yaml idea
    >     >
    >     > > On Nov 23, 2020, at 4:20 AM, Paulo Motta <pauloricardomg@gmail.com
    > >
    >     > wrote:
    >     > >
    >     > > +1 to defaulting for correctness.
    >     > >
    >     > > In addition to that, how about making it a mandatory cassandra.yaml
    >     > > property defaulting to correctness? This would make upgrades with
    > an old
    >     > > cassandra.yaml fail unless an option is explicitly specified,
    > making
    >     > > operators aware of the issue and forcing them to make a choice.
    >     > >
    >     >     > >> On Mon, Nov 23, 2020 at 07:30, Benjamin Lerer <
    >     >     > >> benjamin.lerer@datastax.com> wrote:
    >     > >>
    >     > >> Thank you very much to everybody that provided feedback. It
    > helped a
    >     > lot to
    >     > >> limit our options.
    >     > >>
    >     > >> Unfortunately, it seems that some poor soul (me, really!!!) will
    > have to
    >     > >> make the final call between #3 and #4.
    >     > >>
    >     > >> If I reformulate the question to: Do we default to *correctness
    > *or to
    >     > >> *performance*?
    >     > >>
    >     > >> I would choose to default to *correctness*.
    >     > >>
    >     > >> Of course the situation is more complex than that but it seems
    > that
    >     > >> somebody has to make a call and live with it. It seems to me that
    > being
    >     > >> blamed for choosing correctness is easier to live with ;-)
    >     > >>
    >     > >> Benjamin
    >     > >>
    >     > >> PS: I tried to push the choice on Sylvain but he dodged the
    > bullet.
    >     > >>
    >     > >> On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott Smith <
    >     > >> benedict@apache.org>
    >     > >> wrote:
    >     > >>
    >     >     > >>> I think I meant #4
    >     > >>>
    >     > >>> On 20/11/2020, 21:11, "Blake Eggleston"
    > <beggleston@apple.com.INVALID
    >     > >
    >     > >>> wrote:
    >     > >>>
    >     > >>>    I’d also prefer #3 over #4
    >     > >>>
    >     > >>>> On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <
    >     > >>> benedict@apache.org> wrote:
    >     > >>>>
    >     > >>>> Well, I expressed a preference for #3 over #4, particularly for
    >     > >> the
    >     > >>> 3.x series.  However at this point, I think the lack of a clear
    > project
    >     > >>> decision means we can punt it back to you and Sylvain to make
    > the final
    >     > >>> call.
    >     > >>>>
    >     > >>>> On 20/11/2020, 16:23, "Benjamin Lerer" <
    >     > >> benjamin.lerer@datastax.com>
    >     > >>> wrote:
    >     > >>>>
    >     > >>>>   I will try to summarize the discussion to clarify the outcome.
    >     > >>>>
    >     > >>>>   Mick is in favor of #4
    >     > >>>>   Summanth is in favor of #4
    >     > >>>>   Sylvain answer was not clear for me. I understood it like I
    >     > >>> prefer #3 to #4
    >     > >>>>   and I am also fine with #1
    >     > >>>>   Jeff is in favor of #3 and will understand #4
    >     > >>>>   David is in favor #3 (fix bug and add flag to roll back to old
    >     > >>> behavior) in
    >     > >>>>   4.0 and #4 in 3.0 and 3.11
    >     > >>>>
    >     > >>>>   Do not hesitate to correct me if I misunderstood your answer.
    >     > >>>>
    >     > >>>>   Based on these answers it seems clear that most people prefer
    > to
    >     > >>> go for #3
    >     > >>>>   or #4.
    >     > >>>>
    >     > >>>>   The choice between #3 (fix correctness opt-in to current
    >     > >>> behavior) and #4
    >     > >>>>   (current behavior opt-in to correctness) is a bit less clear
    >     > >>> specially if
    >     > >>>>   we consider the 3.X branches or 4.0.
    >     > >>>>
    >     >     > >>>>   Does anybody have some idea on how to choose between those 2
    >     > >>> choices or some
    >     > >>>>   extra opinions on #3 versus #4?
    >     > >>>>
    >     > >>>>
    >     > >>>>
    >     > >>>>
    >     > >>>>
    >     > >>>>
    >     > >>>>>   On Wed, Nov 18, 2020 at 9:45 PM David Capwell <
    >     > >>> dcapwell@gmail.com> wrote:
    >     > >>>>>
    >     > >>>>> I feel that #4 (fix bug and add flag to roll back to old
    > behavior)
    >     > >>> is best.
    >     > >>>>>
    >     > >>>>> About the alternative implementation, I am fine adding it to
    > 3.x
    >     > >>> and 4.0,
    >     > >>>>> but should treat it as a different path disabled by default
    > that
    >     > >>> you can
    >     > >>>>> opt-into, with a plan to opt-in by default "eventually".
    >     > >>>>>
    >     > >>>>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
    >     > >>>>> benedict@apache.org>
    >     > >>>>> wrote:
    >     > >>>>>
    >     > >>>>>> Perhaps there might be broader appetite to weigh in on which
    >     > >> major
    >     > >>>>>> releases we might target for work that fixes the correctness
    > bug
    >     > >>> without
    >     > >>>>>> serious performance regression?
    >     > >>>>>>
    >     > >>>>>> i.e., if we were to fix the correctness bug now, introducing a
    >     > >>> serious
    >     > >>>>>> performance regression (either opt-in or opt-out), but were to
    >     > >>> land work
    >     > >>>>>> without this problem for 5.0, would there be appetite to
    > backport
    >     > >>> this
    >     > >>>>> work
    >     > >>>>>> to any of 4.0, 3.11 or 3.0?
    >     > >>>>>>
    >     > >>>>>>
    >     > >>>>>> On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com> wrote:
    >     > >>>>>>
    >     > >>>>>>   This is complicated and relatively few people on earth
    >     > >>> understand it,
    >     > >>>>>> so
    >     > >>>>>>   having little feedback is mostly expected, unfortunately.
    >     > >>>>>>
    >     > >>>>>>   My normal emotional response is "correctness is required,
    >     > >>> opt-in to
    >     > >>>>>>   performance improvements that sacrifice strict correctness",
    >     > >>> but I'm
    >     > >>>>>> also
    >     > >>>>>>   sure this is going to surprise people, and would understand
    > /
    >     > >>> accept
    >     > >>>>> #4
    >     > >>>>>>   (default to current, opt-in to correct).
    >     > >>>>>>
    >     > >>>>>>
    >     > >>>>>>   On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <
    >     > >>>>>> benedict@apache.org>
    >     > >>>>>>   wrote:
    >     > >>>>>>
    >     > >>>>>>> It doesn't seem like there's much enthusiasm for any of the
    >     > >>> options
    >     > >>>>>>> available here...
    >     > >>>>>>>
    >     > >>>>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
    >     > >>>>> benedict@apache.org
    >     > >>>>>>>
    >     > >>>>>>> wrote:
    >     > >>>>>>>
    >     > >>>>>>>> Is the new implementation a separate, distinctly modularized
    >     > >>>>>> new
    >     > >>>>>>> body of work
    >     > >>>>>>>
    >     > >>>>>>>   It’s primarily a distinct, modularised and new body of
    > work,
    >     > >>>>>> however
    >     > >>>>>>> there is some shared code that has been modified - namely
    >     > >>>>>> PaxosState, in
    >     > >>>>>>> which legacy code is maintained but modified for
    > compatibility,
    >     > >>> and
    >     > >>>>>> the
    >     > >>>>>>> system.paxos table (which receives a new column, and slightly
    >     > >>>>>> modified
    >     > >>>>>>> serialization code).  It is conceptually an optimised
    > version of
    >     > >>>>> the
    >     > >>>>>>> existing algorithm.
    >     > >>>>>>>
    >     > >>>>>>>   If there's a chance of being of value to 4.0, I can try to
    >     > >> put
    >     > >>>>>> up a
    >     > >>>>>>> patch next week alongside a high level description of the
    >     > >> changes.
    >     > >>>>>>>
    >     > >>>>>>>> But a performance regression is a regression, I'm not
    >     > >>>>>> shrugging it
    >     > >>>>>>> off.
    >     > >>>>>>>
    >     > >>>>>>>   I don't want to give the impression I'm shrugging off the
    >     > >>>>>> correctness
    >     > >>>>>>> issue either. It's a serious issue to fix, but since all
    >     > >>> successful
    >     > >>>>>> updates
    >     > >>>>>>> to the database are linearizable, I think it's likely that
    > many
    >     > >>>>>>> applications behave correctly with the present semantics, or
    > at
    >     > >>>>> least
    >     > >>>>>>> encounter only transient errors. No doubt many also do not,
    > but
    >     > >> I
    >     > >>>>>> have no
    >     > >>>>>>> idea of the ratio.
    >     > >>>>>>>
    >     > >>>>>>>   The regression isn't itself a simple issue either -
    > depending
    >     > >>>>> on
    >     > >>>>>> the
    >     > >>>>>>> topology and message latencies it is not difficult to produce
    >     > >>>>>> inescapable
    >     > >>>>>>> contention, i.e. guaranteed timeouts - that might persist as
    >     > >> long
    >     > >>>>> as
    >     > >>>>>>> clients continue to retry. It could be quite a serious
    >     > >> degradation
    >     > >>>>> of
    >     > >>>>>>> service to impose on our users.
    >     > >>>>>>>
    >     > >>>>>>>   I don't pretend to know the correct way to make a decision
    >     > >>>>>> balancing
    >     > >>>>>>> these considerations, but I am perhaps more concerned about
    >     > >>>>> imposing
    >     > >>>>>>> service outages than I am temporarily maintaining semantics
    > our
    >     > >>>>>> users have
    >     > >>>>>>> apparently accepted for years - though I absolutely share
    > your
    >     > >>>>>>> embarrassment there.
    >     > >>>>>>>
    >     > >>>>>>>
    >     > >>>>>>>   On 12/11/2020, 12:41, "Joshua McKenzie" <
    >     > >> jmckenzie@apache.org
    >     > >>>>>>
    >     > >>>>>> wrote:
    >     > >>>>>>>
    >     > >>>>>>>       Is the new implementation a separate, distinctly
    >     > >>>>> modularized
    >     > >>>>>> new
    >     > >>>>>>> body of
    >     > >>>>>>>       work or does it make substantial changes to existing
    >     > >>>>>>> implementation and
    >     > >>>>>>>       subsume it?
    >     > >>>>>>>
    >     > >>>>>>>       On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <
    >     > >>>>>>> lebresne@gmail.com> wrote:
    >     > >>>>>>>
    >     > >>>>>>>> Regarding option #4, I'll remark that experience tends to
    >     > >>>>>>> suggest users
    >     > >>>>>>>> don't consistently read the `NEWS.txt` file on upgrade,
    >     > >>>>> so
    >     > >>>>>>> option #4 will
    >     > >>>>>>>> likely essentially mean "LWT has a correctness issue, but
    >     > >>>>>> once
    >     > >>>>>>> it broke
    >     > >>>>>>>> your data enough that you'll notice, you'll be able to
    >     > >>>>> dig
    >     > >>>>>> the
    >     > >>>>>>> proper flag
    >     > >>>>>>>> to fix it for next time". I guess it's better than
    >     > >>>>>> nothing, of
    >     > >>>>>>> course, but
    >     > >>>>>>>> I'll admit that defaulting to "opt-in correctness",
    >     > >>>>>> especially
    >     > >>>>>>> for a
    >     > >>>>>>>> feature (LWT) that exists uniquely to provide additional
    >     > >>>>>>> guarantees, is
    >     > >>>>>>>> something I have a hard time rallying behind.
    >     > >>>>>>>>
    >     > >>>>>>>> But a performance regression is a regression, I'm not
    >     > >>>>>> shrugging
    >     > >>>>>>> it off.
    >     > >>>>>>>> Still, I feel we shouldn't leave LWT with a fairly
    >     > >>>>> serious
    >     > >>>>>> known
    >     > >>>>>>>> correctness bug and I frankly feel bad for "the project"
    >     > >>>>>> that
    >     > >>>>>>> this has been
    >     > >>>>>>>> known for so long without action, so I'm a bit biased in
    >     > >>>>>> wanting
    >     > >>>>>>> to get it
    >     > >>>>>>>> fixed asap.
    >     > >>>>>>>>
    >     > >>>>>>>> But maybe I'm overstating the urgency here, and maybe
    >     > >>>>>> option #1
    >     > >>>>>>> is a better
    >     > >>>>>>>> way forward.
    >     > >>>>>>>>
    >     > >>>>>>>> --
    >     > >>>>>>>> Sylvain
    >     > >>>>>>>>




Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Paulo Motta <pa...@gmail.com>.

I was thinking of the YAML requirement for the 3.X minor upgrade, to make
the decision explicit (you need to update the yaml) rather than implicit (by
upgrading you agree with the change), since the latter can go unnoticed by
those who don't pay attention to NEWS.txt.
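
For illustration, the "explicit rather than implicit" idea could look roughly
like the sketch below. It is only a sketch under assumptions: the key name
lwt_serial_read_mode, its two values and the ConfigurationError type are
hypothetical and do not correspond to any existing cassandra.yaml option, and
Python is used purely for brevity. The point is that a missing key aborts
startup instead of silently picking a default.

```python
from enum import Enum


class LwtReadMode(Enum):
    # Hypothetical values, invented for this sketch.
    CORRECT = "correct"  # fixed behaviour: extra round trip on SERIAL reads
    LEGACY = "legacy"    # old behaviour: faster, keeps the known correctness issue


class ConfigurationError(Exception):
    """Raised when the node must refuse to start."""


# Hypothetical yaml key; not a real cassandra.yaml option.
LWT_KEY = "lwt_serial_read_mode"


def load_lwt_read_mode(config: dict) -> LwtReadMode:
    """Return the configured mode, failing loudly if the operator made no choice."""
    if LWT_KEY not in config:
        # An old yaml carried over during an upgrade ends up here.
        raise ConfigurationError(
            f"'{LWT_KEY}' must be set explicitly to one of "
            f"{[m.value for m in LwtReadMode]}; see NEWS.txt"
        )
    raw = str(config[LWT_KEY]).strip().lower()
    try:
        return LwtReadMode(raw)
    except ValueError:
        raise ConfigurationError(f"invalid value for '{LWT_KEY}': {raw!r}") from None
```

With this shape, the yaml shipped with the release would set the key to the
correct mode, while a node started with an older yaml would refuse to boot
until the operator adds the key and thereby makes the choice knowingly.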

On Mon, Nov 23, 2020 at 20:03, Benedict Elliott Smith <
benedict@apache.org> wrote:

> What's the value of the yaml? The user is likely to have upgraded to the
> latest 3.x as part of the upgrade process to 4.0, so they'll already have
> had a decision made for them. If correctness didn't break anything, there
> no longer seems to be much point in offering a choice.

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

What's the value of the yaml? The user is likely to have upgraded to the latest 3.x as part of the upgrade process to 4.0, so they'll already have had a decision made for them. If correctness didn't break anything, there no longer seems to be much point in offering a choice.

On 23/11/2020, 22:45, "Brandon Williams" <dr...@gmail.com> wrote:

    +1 to both as well.

    On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston <be...@apple.com.invalid>
    wrote:

    > +1 to correctness, and I like the yaml idea
    >
    > > On Nov 23, 2020, at 4:20 AM, Paulo Motta <pa...@gmail.com>
    > wrote:
    > >
    > > +1 to defaulting for correctness.
    > >
    > > In addition to that, how about making it a mandatory cassandra.yaml
    > > property defaulting to correctness? This would make upgrades with an old
    > > cassandra.yaml fail unless an option is explicitly specified, making
    > > operators aware of the issue and forcing them to make a choice.
    > >
    > >> On Mon, Nov 23, 2020 at 07:30, Benjamin Lerer <
    > >> benjamin.lerer@datastax.com> wrote:
    > >>
    > >> Thank you very much to everybody that provided feedback. It helped a
    > >> lot to limit our options.
    > >>
    > >> Unfortunately, it seems that some poor soul (me, really!!!) will have to
    > >> make the final call between #3 and #4.
    > >>
    > >> If I reformulate the question to: Do we default to *correctness* or to
    > >> *performance*?
    > >>
    > >> I would choose to default to *correctness*.
    > >>
    > >> Of course the situation is more complex than that but it seems that
    > >> somebody has to make a call and live with it. It seems to me that being
    > >> blamed for choosing correctness is easier to live with ;-)
    > >>
    > >> Benjamin
    > >>
    > >> PS: I tried to push the choice on Sylvain but he dodged the bullet.
    > >>
    > >> On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott Smith <
    > >> benedict@apache.org>
    > >> wrote:
    > >>
    > >>> I think I meant #4
    > >>>
    > >>> On 20/11/2020, 21:11, "Blake Eggleston" <beggleston@apple.com.INVALID
    > >
    > >>> wrote:
    > >>>
    > >>>    I’d also prefer #3 over #4
    > >>>
    > >>>> On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <
    > >>> benedict@apache.org> wrote:
    > >>>>
    > >>>> Well, I expressed a preference for #3 over #4, particularly for the
    > >>>> 3.x series.  However at this point, I think the lack of a clear
    > >>>> project decision means we can punt it back to you and Sylvain to make
    > >>>> the final call.
    > >>>>
    > >>>> On 20/11/2020, 16:23, "Benjamin Lerer" <
    > >> benjamin.lerer@datastax.com>
    > >>> wrote:
    > >>>>
    > >>>>   I will try to summarize the discussion to clarify the outcome.
    > >>>>
    > >>>>   Mick is in favor of #4
    > >>>>   Summanth is in favor of #4
    > >>>>   Sylvain's answer was not clear to me. I understood it as "I prefer
    > >>>>   #3 to #4 and I am also fine with #1"
    > >>>>   Jeff is in favor of #3 and will understand #4
    > >>>>   David is in favor of #3 (fix bug and add flag to roll back to old
    > >>>>   behavior) in 4.0 and #4 in 3.0 and 3.11
    > >>>>
    > >>>>   Do not hesitate to correct me if I misunderstood your answer.
    > >>>>
    > >>>>   Based on these answers it seems clear that most people prefer to go
    > >>>>   for #3 or #4.
    > >>>>
    > >>>>   The choice between #3 (fix correctness, opt-in to current behavior)
    > >>>>   and #4 (current behavior, opt-in to correctness) is a bit less clear,
    > >>>>   especially if we consider the 3.X branches or 4.0.
    > >>>>
    > >>>>   Does anybody have some idea on how to choose between those 2 choices,
    > >>>>   or some extra opinions on #3 versus #4?
    > >>>>
    > >>>>
    > >>>>
    > >>>>
    > >>>>
    > >>>>
    > >>>>>   On Wed, Nov 18, 2020 at 9:45 PM David Capwell <
    > >>> dcapwell@gmail.com> wrote:
    > >>>>>
    > >>>>> I feel that #4 (fix bug and add flag to roll back to old behavior)
    > >>>>> is best.
    > >>>>>
    > >>>>> About the alternative implementation, I am fine adding it to 3.x
    > >>>>> and 4.0, but we should treat it as a different path, disabled by
    > >>>>> default, that you can opt into, with a plan to enable it by default
    > >>>>> "eventually".
    > >>>>>
    > >>>>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
    > >>>>> benedict@apache.org>
    > >>>>> wrote:
    > >>>>>
    > >>>>>> Perhaps there might be broader appetite to weigh in on which major
    > >>>>>> releases we might target for work that fixes the correctness bug
    > >>>>>> without serious performance regression?
    > >>>>>>
    > >>>>>> i.e., if we were to fix the correctness bug now, introducing a
    > >>>>>> serious performance regression (either opt-in or opt-out), but were
    > >>>>>> to land work without this problem for 5.0, would there be appetite
    > >>>>>> to backport this work to any of 4.0, 3.11 or 3.0?
    > >>>>>>
    > >>>>>>
    > >>>>>> On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com> wrote:
    > >>>>>>
    > >>>>>>   This is complicated and relatively few people on earth understand
    > >>>>>>   it, so having little feedback is mostly expected, unfortunately.
    > >>>>>>
    > >>>>>>   My normal emotional response is "correctness is required, opt-in to
    > >>>>>>   performance improvements that sacrifice strict correctness", but I'm
    > >>>>>>   also sure this is going to surprise people, and would understand /
    > >>>>>>   accept #4 (default to current, opt-in to correct).
    > >>>>>>
    > >>>>>>
    > >>>>>>   On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <
    > >>>>>> benedict@apache.org>
    > >>>>>>   wrote:
    > >>>>>>
    > >>>>>>> It doesn't seem like there's much enthusiasm for any of the options
    > >>>>>>> available here...
    > >>>>>>>
    > >>>>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
    > >>>>> benedict@apache.org
    > >>>>>>>
    > >>>>>>> wrote:
    > >>>>>>>
    > >>>>>>>> Is the new implementation a separate, distinctly modularized new
    > >>>>>>>> body of work
    > >>>>>>>
    > >>>>>>>   It’s primarily a distinct, modularised and new body of work,
    > >>>>>>> however there is some shared code that has been modified - namely
    > >>>>>>> PaxosState, in which legacy code is maintained but modified for
    > >>>>>>> compatibility, and the system.paxos table (which receives a new
    > >>>>>>> column, and slightly modified serialization code).  It is
    > >>>>>>> conceptually an optimised version of the existing algorithm.
    > >>>>>>>
    > >>>>>>>   If there's a chance of being of value to 4.0, I can try to put up
    > >>>>>>> a patch next week alongside a high level description of the changes.
    > >>>>>>>
    > >>>>>>>> But a performance regression is a regression, I'm not shrugging it
    > >>>>>>>> off.
    > >>>>>>>
    > >>>>>>>   I don't want to give the impression I'm shrugging off the
    > >>>>>>> correctness issue either. It's a serious issue to fix, but since all
    > >>>>>>> successful updates to the database are linearizable, I think it's
    > >>>>>>> likely that many applications behave correctly with the present
    > >>>>>>> semantics, or at least encounter only transient errors. No doubt
    > >>>>>>> many also do not, but I have no idea of the ratio.
    > >>>>>>>
    > >>>>>>>   The regression isn't itself a simple issue either - depending on
    > >>>>>>> the topology and message latencies it is not difficult to produce
    > >>>>>>> inescapable contention, i.e. guaranteed timeouts - that might
    > >>>>>>> persist as long as clients continue to retry. It could be quite a
    > >>>>>>> serious degradation of service to impose on our users.
    > >>>>>>>
    > >>>>>>>   I don't pretend to know the correct way to make a decision
    > >>>>>>> balancing these considerations, but I am perhaps more concerned
    > >>>>>>> about imposing service outages than I am temporarily maintaining
    > >>>>>>> semantics our users have apparently accepted for years - though I
    > >>>>>>> absolutely share your embarrassment there.
    > >>>>>>>
    > >>>>>>>
    > >>>>>>>   On 12/11/2020, 12:41, "Joshua McKenzie" <
    > >> jmckenzie@apache.org
    > >>>>>>
    > >>>>>> wrote:
    > >>>>>>>
    > >>>>>>>       Is the new implementation a separate, distinctly
    > >>>>> modularized
    > >>>>>> new
    > >>>>>>> body of
    > >>>>>>>       work or does it make substantial changes to existing
    > >>>>>>> implementation and
    > >>>>>>>       subsume it?
    > >>>>>>>
    > >>>>>>>       On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <
    > >>>>>>> lebresne@gmail.com> wrote:
    > >>>>>>>
    > >>>>>>>> Regarding option #4, I'll remark that experience tends to suggest
    > >>>>>>>> users don't consistently read the `NEWS.txt` file on upgrade, so
    > >>>>>>>> option #4 will likely essentially mean "LWT has a correctness
    > >>>>>>>> issue, but once it has broken your data enough that you notice,
    > >>>>>>>> you'll be able to dig up the proper flag to fix it for next time".
    > >>>>>>>> I guess it's better than nothing, of course, but I'll admit that
    > >>>>>>>> defaulting to "opt-in correctness", especially for a feature (LWT)
    > >>>>>>>> that exists uniquely to provide additional guarantees, is
    > >>>>>>>> something I have a hard time rallying behind.
    > >>>>>>>>
    > >>>>>>>> But a performance regression is a regression, I'm not shrugging
    > >>>>>>>> it off. Still, I feel we shouldn't leave LWT with a fairly serious
    > >>>>>>>> known correctness bug and I frankly feel bad for "the project"
    > >>>>>>>> that this has been known for so long without action, so I'm a bit
    > >>>>>>>> biased in wanting to get it fixed asap.
    > >>>>>>>>
    > >>>>>>>> But maybe I'm overstating the urgency here, and maybe option #1 is
    > >>>>>>>> a better way forward.
    > >>>>>>>>
    > >>>>>>>> --
    > >>>>>>>> Sylvain
    > >>>>>>>>




Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Brandon Williams <dr...@gmail.com>.

+1 to both as well.

On Mon, Nov 23, 2020, 4:42 PM Blake Eggleston <be...@apple.com.invalid>
wrote:

> +1 to correctness, and I like the yaml idea

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Blake Eggleston <be...@apple.com.INVALID>.

+1 to correctness, and I like the yaml idea

> On Nov 23, 2020, at 4:20 AM, Paulo Motta <pa...@gmail.com> wrote:
> 
> +1 to defaulting for correctness.
> 
> In addition to that, how about making it a mandatory cassandra.yaml
> property defaulting to correctness? This would make upgrades with an old
> cassandra.yaml fail unless an option is explicitly specified, making
> operators aware of the issue and forcing them to make a choice.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Paulo Motta <pa...@gmail.com>.

+1 to defaulting to correctness.

In addition to that, how about making it a mandatory cassandra.yaml
property defaulting to correctness? This would make upgrades with an old
cassandra.yaml fail unless an option is explicitly specified, making
operators aware of the issue and forcing them to make a choice.
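
To make the idea concrete, here is a minimal sketch in Python of a startup
check that refuses to run without an explicit choice. The property name
lwt_serial_read_mode and its two values are hypothetical, invented only for
illustration; they are not existing cassandra.yaml options, and a real check
would live in Cassandra's own Java configuration loading.

```python
# Hypothetical sketch: "lwt_serial_read_mode" is NOT a real cassandra.yaml
# option. It only illustrates a mandatory, explicitly chosen setting.
REQUIRED_KEY = "lwt_serial_read_mode"
ALLOWED_VALUES = ("correct", "legacy")


class ConfigurationError(Exception):
    """Raised when the node should refuse to start."""


def resolve_lwt_mode(config):
    """Return the operator's explicit choice, or fail if it is absent.

    `config` is the already-parsed YAML as a plain dict. A new yaml
    template would ship with the key set to "correct"; an old yaml,
    which lacks the key, fails here instead of silently picking a mode.
    """
    if REQUIRED_KEY not in config:
        raise ConfigurationError(
            f"{REQUIRED_KEY} must be set explicitly to one of {ALLOWED_VALUES}"
        )
    value = config[REQUIRED_KEY]
    if value not in ALLOWED_VALUES:
        raise ConfigurationError(
            f"{REQUIRED_KEY}={value!r} is not one of {ALLOWED_VALUES}"
        )
    return value
```

With this shape, a fresh install gets the correct behavior from the shipped
yaml, while an upgrade that reuses an old yaml stops with a clear error until
the operator picks a value.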

On Mon, Nov 23, 2020 at 07:30, Benjamin Lerer <benjamin.lerer@datastax.com>
wrote:

> Thank you very much to everybody that provided feedback. It helped a lot to
> limit our options.
>
> Unfortunately, it seems that some poor soul (me, really!!!) will have to
> make the final call between #3 and #4.
>
> If I reformulate the question to: Do we default to *correctness* or to
> *performance*?
>
> I would choose to default to *correctness*.
>
> Of course the situation is more complex than that but it seems that
> somebody has to make a call and live with it. It seems to me that being
> blamed for choosing correctness is easier to live with ;-)
>
> Benjamin
>
> PS: I tried to push the choice on Sylvain but he dodged the bullet.

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benjamin Lerer <be...@datastax.com>.

Thank you very much to everybody who provided feedback. It helped a lot to
narrow down our options.

Unfortunately, it seems that some poor soul (me, really!!!) will have to
make the final call between #3 and #4.

If I reformulate the question as: do we default to *correctness* or to
*performance*?

I would choose to default to *correctness*.

Of course, the situation is more complex than that, but it seems that
somebody has to make a call and live with it. It seems to me that being
blamed for choosing correctness is easier to live with ;-)

Benjamin

PS: I tried to push the choice on Sylvain but he dodged the bullet.

On Sat, Nov 21, 2020 at 12:30 AM Benedict Elliott Smith <be...@apache.org>
wrote:

> I think I meant #4
>
> On 20/11/2020, 21:11, "Blake Eggleston" <be...@apple.com.INVALID> wrote:
>
>     I’d also prefer #3 over #4
>
>     > On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <benedict@apache.org> wrote:
>     >
>     > Well, I expressed a preference for #3 over #4, particularly for the 3.x series.  However at this point, I think the lack of a clear project decision means we can punt it back to you and Sylvain to make the final call.
>     >
>     > On 20/11/2020, 16:23, "Benjamin Lerer" <be...@datastax.com> wrote:
>     >
>     >    I will try to summarize the discussion to clarify the outcome.
>     >
>     >    Mick is in favor of #4
>     >    Summanth is in favor of #4
>     >    Sylvain's answer was not clear to me. I understood it as "I prefer #3 to #4
>     >    and I am also fine with #1"
>     >    Jeff is in favor of #3 and will understand #4
>     >    David is in favor of #3 (fix bug and add flag to roll back to old behavior) in
>     >    4.0 and #4 in 3.0 and 3.11
>     >
>     >    Do not hesitate to correct me if I misunderstood your answer.
>     >
>     >    Based on these answers it seems clear that most people prefer to go for #3
>     >    or #4.
>     >
>     >    The choice between #3 (fix correctness, opt-in to current behavior) and #4
>     >    (current behavior, opt-in to correctness) is a bit less clear, especially if
>     >    we consider the 3.X branches or 4.0.
>     >
>     >    Does anybody have some idea on how to choose between those 2 choices, or some
>     >    extra opinions on #3 versus #4?
>     >
>     >
>     >
>     >
>     >
>     >
>     >>    On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dcapwell@gmail.com> wrote:
>     >>
>     >> I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
>     >>
>     >> About the alternative implementation, I am fine adding it to 3.x and 4.0,
>     >> but should treat it as a different path disabled by default that you can
>     >> opt-into, with a plan to opt-in by default "eventually".
>     >>
>     >> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
>     >> benedict@apache.org>
>     >> wrote:
>     >>
>     >>> Perhaps there might be broader appetite to weigh in on which major
>     >>> releases we might target for work that fixes the correctness bug
> without
>     >>> serious performance regression?
>     >>>
>     >>> i.e., if we were to fix the correctness bug now, introducing a
> serious
>     >>> performance regression (either opt-in or opt-out), but were to
> land work
>     >>> without this problem for 5.0, would there be appetite to backport
> this
>     >> work
>     >>> to any of 4.0, 3.11 or 3.0?
>     >>>
>     >>>
>     >>> On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com> wrote:
>     >>>
>     >>>    This is complicated and relatively few people on earth
> understand it,
>     >>> so
>     >>>    having little feedback is mostly expected, unfortunately.
>     >>>
>     >>>    My normal emotional response is "correctness is required,
> opt-in to
>     >>>    performance improvements that sacrifice strict correctness",
> but I'm
>     >>> also
>     >>>    sure this is going to surprise people, and would understand /
> accept
>     >> #4
>     >>>    (default to current, opt-in to correct).
>     >>>
>     >>>
>     >>>    On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <
>     >>> benedict@apache.org>
>     >>>    wrote:
>     >>>
>     >>>> It doesn't seem like there's much enthusiasm for any of the
> options
>     >>>> available here...
>     >>>>
>     >>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
>     >> benedict@apache.org
>     >>>>
>     >>>> wrote:
>     >>>>
>     >>>>> Is the new implementation a separate, distinctly modularized
>     >>> new
>     >>>> body of work
>     >>>>
>     >>>>    It’s primarily a distinct, modularised and new body of work,
>     >>> however
>     >>>> there is some shared code that has been modified - namely
>     >>> PaxosState, in
>     >>>> which legacy code is maintained but modified for compatibility,
> and
>     >>> the
>     >>>> system.paxos table (which receives a new column, and slightly
>     >>> modified
>     >>>> serialization code).  It is conceptually an optimised version of
>     >> the
>     >>>> existing algorithm.
>     >>>>
>     >>>>    If there's a chance of being of value to 4.0, I can try to put
>     >>> up a
>     >>>> patch next week alongside a high level description of the changes.
>     >>>>
>     >>>>> But a performance regression is a regression, I'm not
>     >>> shrugging it
>     >>>> off.
>     >>>>
>     >>>>    I don't want to give the impression I'm shrugging off the
>     >>> correctness
>     >>>> issue either. It's a serious issue to fix, but since all
> successful
>     >>> updates
>     >>>> to the database are linearizable, I think it's likely that many
>     >>>> applications behave correctly with the present semantics, or at
>     >> least
>     >>>> encounter only transient errors. No doubt many also do not, but I
>     >>> have no
>     >>>> idea of the ratio.
>     >>>>
>     >>>>    The regression isn't itself a simple issue either - depending
>     >> on
>     >>> the
>     >>>> topology and message latencies it is not difficult to produce
>     >>> inescapable
>     >>>> contention, i.e. guaranteed timeouts - that might persist as long
>     >> as
>     >>>> clients continue to retry. It could be quite a serious degradation
>     >> of
>     >>>> service to impose on our users.
>     >>>>
>     >>>>    I don't pretend to know the correct way to make a decision
>     >>> balancing
>     >>>> these considerations, but I am perhaps more concerned about
>     >> imposing
>     >>>> service outages than I am temporarily maintaining semantics our
>     >>> users have
>     >>>> apparently accepted for years - though I absolutely share your
>     >>>> embarrassment there.
>     >>>>
>     >>>>
>     >>>>    On 12/11/2020, 12:41, "Joshua McKenzie" <jmckenzie@apache.org
>     >>>
>     >>> wrote:
>     >>>>
>     >>>>        Is the new implementation a separate, distinctly
>     >> modularized
>     >>> new
>     >>>> body of
>     >>>>        work or does it make substantial changes to existing
>     >>>> implementation and
>     >>>>        subsume it?
>     >>>>
>     >>>>        On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <
>     >>>> lebresne@gmail.com> wrote:
>     >>>>
>     >>>>> Regarding option #4, I'll remark that experience tends to
>     >>>> suggest users
>     >>>>> don't consistently read the `NEWS.txt` file on upgrade,
>     >> so
>     >>>> option #4 will
>     >>>>> likely essentially mean "LWT has a correctness issue, but
>     >>> once
>     >>>> it broke
>     >>>>> your data enough that you'll notice, you'll be able to
>     >> dig
>     >>> the
>     >>>> proper flag
>     >>>>> to fix it for next time". I guess it's better than
>     >>> nothing, of
>     >>>> course, but
>     >>>>> I'll admit that defaulting to "opt-in correctness",
>     >>> especially
>     >>>> for a
>     >>>>> feature (LWT) that exists uniquely to provide additional
>     >>>> guarantees, is
>     >>>>> something I have a hard time rallying behind.
>     >>>>>
>     >>>>> But a performance regression is a regression, I'm not
>     >>> shrugging
>     >>>> it off.
>     >>>>> Still, I feel we shouldn't leave LWT with a fairly
>     >> serious
>     >>> known
>     >>>>> correctness bug and I frankly feel bad for "the project"
>     >>> that
>     >>>> this has been
>     >>>>> known for so long without action, so I'm a bit biased in
>     >>> wanting
>     >>>> to get it
>     >>>>> fixed asap.
>     >>>>>
>     >>>>> But maybe I'm overstating the urgency here, and maybe
>     >>> option #1
>     >>>> is a better
>     >>>>> way forward.
>     >>>>>
>     >>>>> --
>     >>>>> Sylvain
>     >>>>>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

I think I meant #4

On 20/11/2020, 21:11, "Blake Eggleston" <be...@apple.com.INVALID> wrote:

    I’d also prefer #3 over #4

    > On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <be...@apache.org> wrote:
    > 
    > Well, I expressed a preference for #3 over #4, particularly for the 3.x series.  However at this point, I think the lack of a clear project decision means we can punt it back to you and Sylvain to make the final call.
    > 
    > On 20/11/2020, 16:23, "Benjamin Lerer" <be...@datastax.com> wrote:
    > 
    >    I will try to summarize the discussion to clarify the outcome.
    > 
    >    Mick is in favor of #4
    >    Summanth is in favor of #4
    >    Sylvain's answer was not clear to me. I understood it as "I prefer #3 to #4
    >    and I am also fine with #1"
    >    Jeff is in favor of #3 and will understand #4
    >    David is in favor of #3 (fix bug and add flag to roll back to old behavior) in
    >    4.0 and #4 in 3.0 and 3.11
    > 
    >    Do not hesitate to correct me if I misunderstood your answer.
    > 
    >    Based on these answers it seems clear that most people prefer to go for #3
    >    or #4.
    > 
    >    The choice between #3 (fix correctness, opt-in to current behavior) and #4
    >    (current behavior, opt-in to correctness) is a bit less clear, especially if
    >    we consider the 3.X branches or 4.0.
    > 
    >    Does anybody have some idea on how to choose between those 2 choices, or some
    >    extra opinions on #3 versus #4?
    > 
    > 
    > 
    > 
    > 
    > 
    >>    On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dc...@gmail.com> wrote:
    >> 
    >> I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
    >> 
    >> About the alternative implementation, I am fine adding it to 3.x and 4.0,
    >> but should treat it as a different path disabled by default that you can
    >> opt-into, with a plan to opt-in by default "eventually".
    >> 
    >> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
    >> benedict@apache.org>
    >> wrote:
    >> 
    >>> Perhaps there might be broader appetite to weigh in on which major
    >>> releases we might target for work that fixes the correctness bug without
    >>> serious performance regression?
    >>> 
    >>> i.e., if we were to fix the correctness bug now, introducing a serious
    >>> performance regression (either opt-in or opt-out), but were to land work
    >>> without this problem for 5.0, would there be appetite to backport this
    >> work
    >>> to any of 4.0, 3.11 or 3.0?
    >>> 
    >>> 
    >>> On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com> wrote:
    >>> 
    >>>    This is complicated and relatively few people on earth understand it,
    >>> so
    >>>    having little feedback is mostly expected, unfortunately.
    >>> 
    >>>    My normal emotional response is "correctness is required, opt-in to
    >>>    performance improvements that sacrifice strict correctness", but I'm
    >>> also
    >>>    sure this is going to surprise people, and would understand / accept
    >> #4
    >>>    (default to current, opt-in to correct).
    >>> 
    >>> 
    >>>    On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <
    >>> benedict@apache.org>
    >>>    wrote:
    >>> 
    >>>> It doesn't seem like there's much enthusiasm for any of the options
    >>>> available here...
    >>>> 
    >>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
    >> benedict@apache.org
    >>>> 
    >>>> wrote:
    >>>> 
    >>>>> Is the new implementation a separate, distinctly modularized
    >>> new
    >>>> body of work
    >>>> 
    >>>>    It’s primarily a distinct, modularised and new body of work,
    >>> however
    >>>> there is some shared code that has been modified - namely
    >>> PaxosState, in
    >>>> which legacy code is maintained but modified for compatibility, and
    >>> the
    >>>> system.paxos table (which receives a new column, and slightly
    >>> modified
    >>>> serialization code).  It is conceptually an optimised version of
    >> the
    >>>> existing algorithm.
    >>>> 
    >>>>    If there's a chance of being of value to 4.0, I can try to put
    >>> up a
    >>>> patch next week alongside a high level description of the changes.
    >>>> 
    >>>>> But a performance regression is a regression, I'm not
    >>> shrugging it
    >>>> off.
    >>>> 
    >>>>    I don't want to give the impression I'm shrugging off the
    >>> correctness
    >>>> issue either. It's a serious issue to fix, but since all successful
    >>> updates
    >>>> to the database are linearizable, I think it's likely that many
    >>>> applications behave correctly with the present semantics, or at
    >> least
    >>>> encounter only transient errors. No doubt many also do not, but I
    >>> have no
    >>>> idea of the ratio.
    >>>> 
    >>>>    The regression isn't itself a simple issue either - depending
    >> on
    >>> the
    >>>> topology and message latencies it is not difficult to produce
    >>> inescapable
    >>>> contention, i.e. guaranteed timeouts - that might persist as long
    >> as
    >>>> clients continue to retry. It could be quite a serious degradation
    >> of
    >>>> service to impose on our users.
    >>>> 
    >>>>    I don't pretend to know the correct way to make a decision
    >>> balancing
    >>>> these considerations, but I am perhaps more concerned about
    >> imposing
    >>>> service outages than I am temporarily maintaining semantics our
    >>> users have
    >>>> apparently accepted for years - though I absolutely share your
    >>>> embarrassment there.
    >>>> 
    >>>> 
    >>>>    On 12/11/2020, 12:41, "Joshua McKenzie" <jmckenzie@apache.org
    >>> 
    >>> wrote:
    >>>> 
    >>>>        Is the new implementation a separate, distinctly
    >> modularized
    >>> new
    >>>> body of
    >>>>        work or does it make substantial changes to existing
    >>>> implementation and
    >>>>        subsume it?
    >>>> 
    >>>>        On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <
    >>>> lebresne@gmail.com> wrote:
    >>>> 
    >>>>> Regarding option #4, I'll remark that experience tends to
    >>>> suggest users
    >>>>> don't consistently read the `NEWS.txt` file on upgrade,
    >> so
    >>>> option #4 will
    >>>>> likely essentially mean "LWT has a correctness issue, but
    >>> once
    >>>> it broke
    >>>>> your data enough that you'll notice, you'll be able to
    >> dig
    >>> the
    >>>> proper flag
    >>>>> to fix it for next time". I guess it's better than
    >>> nothing, of
    >>>> course, but
    >>>>> I'll admit that defaulting to "opt-in correctness",
    >>> especially
    >>>> for a
    >>>>> feature (LWT) that exists uniquely to provide additional
    >>>> guarantees, is
    >>>>> something I have a hard time rallying behind.
    >>>>> 
    >>>>> But a performance regression is a regression, I'm not
    >>> shrugging
    >>>> it off.
    >>>>> Still, I feel we shouldn't leave LWT with a fairly
    >> serious
    >>> known
    >>>>> correctness bug and I frankly feel bad for "the project"
    >>> that
    >>>> this has been
    >>>>> known for so long without action, so I'm a bit biased in
    >>> wanting
    >>>> to get it
    >>>>> fixed asap.
    >>>>> 
    >>>>> But maybe I'm overstating the urgency here, and maybe
    >>> option #1
    >>>> is a better
    >>>>> way forward.
    >>>>> 
    >>>>> --
    >>>>> Sylvain
    >>>>> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Blake Eggleston <be...@apple.com.INVALID>.

I’d also prefer #3 over #4

> On Nov 20, 2020, at 10:03 AM, Benedict Elliott Smith <be...@apache.org> wrote:
> 
> Well, I expressed a preference for #3 over #4, particularly for the 3.x series.  However at this point, I think the lack of a clear project decision means we can punt it back to you and Sylvain to make the final call.
> 
> On 20/11/2020, 16:23, "Benjamin Lerer" <be...@datastax.com> wrote:
> 
>    I will try to summarize the discussion to clarify the outcome.
> 
>    Mick is in favor of #4
>    Summanth is in favor of #4
>    Sylvain's answer was not clear to me. I understood it as "I prefer #3 to #4
>    and I am also fine with #1"
>    Jeff is in favor of #3 and will understand #4
>    David is in favor of #3 (fix bug and add flag to roll back to old behavior) in
>    4.0 and #4 in 3.0 and 3.11
> 
>    Do not hesitate to correct me if I misunderstood your answer.
> 
>    Based on these answers it seems clear that most people prefer to go for #3
>    or #4.
> 
>    The choice between #3 (fix correctness, opt-in to current behavior) and #4
>    (current behavior, opt-in to correctness) is a bit less clear, especially if
>    we consider the 3.X branches or 4.0.
> 
>    Does anybody have some idea on how to choose between those 2 choices, or some
>    extra opinions on #3 versus #4?
> 
> 
> 
> 
> 
> 
>>    On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dc...@gmail.com> wrote:
>> 
>> I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
>> 
>> About the alternative implementation, I am fine adding it to 3.x and 4.0,
>> but should treat it as a different path disabled by default that you can
>> opt-into, with a plan to opt-in by default "eventually".
>> 
>> On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
>> benedict@apache.org>
>> wrote:
>> 
>>> Perhaps there might be broader appetite to weigh in on which major
>>> releases we might target for work that fixes the correctness bug without
>>> serious performance regression?
>>> 
>>> i.e., if we were to fix the correctness bug now, introducing a serious
>>> performance regression (either opt-in or opt-out), but were to land work
>>> without this problem for 5.0, would there be appetite to backport this
>> work
>>> to any of 4.0, 3.11 or 3.0?
>>> 
>>> 
>>> On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com> wrote:
>>> 
>>>    This is complicated and relatively few people on earth understand it,
>>> so
>>>    having little feedback is mostly expected, unfortunately.
>>> 
>>>    My normal emotional response is "correctness is required, opt-in to
>>>    performance improvements that sacrifice strict correctness", but I'm
>>> also
>>>    sure this is going to surprise people, and would understand / accept
>> #4
>>>    (default to current, opt-in to correct).
>>> 
>>> 
>>>    On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <
>>> benedict@apache.org>
>>>    wrote:
>>> 
>>>> It doesn't seem like there's much enthusiasm for any of the options
>>>> available here...
>>>> 
>>>> On 12/11/2020, 14:37, "Benedict Elliott Smith" <
>> benedict@apache.org
>>>> 
>>>> wrote:
>>>> 
>>>>> Is the new implementation a separate, distinctly modularized
>>> new
>>>> body of work
>>>> 
>>>>    It’s primarily a distinct, modularised and new body of work,
>>> however
>>>> there is some shared code that has been modified - namely
>>> PaxosState, in
>>>> which legacy code is maintained but modified for compatibility, and
>>> the
>>>> system.paxos table (which receives a new column, and slightly
>>> modified
>>>> serialization code).  It is conceptually an optimised version of
>> the
>>>> existing algorithm.
>>>> 
>>>>    If there's a chance of being of value to 4.0, I can try to put
>>> up a
>>>> patch next week alongside a high level description of the changes.
>>>> 
>>>>> But a performance regression is a regression, I'm not
>>> shrugging it
>>>> off.
>>>> 
>>>>    I don't want to give the impression I'm shrugging off the
>>> correctness
>>>> issue either. It's a serious issue to fix, but since all successful
>>> updates
>>>> to the database are linearizable, I think it's likely that many
>>>> applications behave correctly with the present semantics, or at
>> least
>>>> encounter only transient errors. No doubt many also do not, but I
>>> have no
>>>> idea of the ratio.
>>>> 
>>>>    The regression isn't itself a simple issue either - depending
>> on
>>> the
>>>> topology and message latencies it is not difficult to produce
>>> inescapable
>>>> contention, i.e. guaranteed timeouts - that might persist as long
>> as
>>>> clients continue to retry. It could be quite a serious degradation
>> of
>>>> service to impose on our users.
>>>> 
>>>>    I don't pretend to know the correct way to make a decision
>>> balancing
>>>> these considerations, but I am perhaps more concerned about
>> imposing
>>>> service outages than I am temporarily maintaining semantics our
>>> users have
>>>> apparently accepted for years - though I absolutely share your
>>>> embarrassment there.
>>>> 
>>>> 
>>>>    On 12/11/2020, 12:41, "Joshua McKenzie" <jmckenzie@apache.org
>>> 
>>> wrote:
>>>> 
>>>>        Is the new implementation a separate, distinctly
>> modularized
>>> new
>>>> body of
>>>>        work or does it make substantial changes to existing
>>>> implementation and
>>>>        subsume it?
>>>> 
>>>>        On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <
>>>> lebresne@gmail.com> wrote:
>>>> 
>>>>> Regarding option #4, I'll remark that experience tends to
>>>> suggest users
>>>>> don't consistently read the `NEWS.txt` file on upgrade,
>> so
>>>> option #4 will
>>>>> likely essentially mean "LWT has a correctness issue, but
>>> once
>>>> it broke
>>>>> your data enough that you'll notice, you'll be able to
>> dig
>>> the
>>>> proper flag
>>>>> to fix it for next time". I guess it's better than
>>> nothing, of
>>>> course, but
>>>>> I'll admit that defaulting to "opt-in correctness",
>>> especially
>>>> for a
>>>>> feature (LWT) that exists uniquely to provide additional
>>>> guarantees, is
>>>>> something I have a hard time rallying behind.
>>>>> 
>>>>> But a performance regression is a regression, I'm not
>>> shrugging
>>>> it off.
>>>>> Still, I feel we shouldn't leave LWT with a fairly
>> serious
>>> known
>>>>> correctness bug and I frankly feel bad for "the project"
>>> that
>>>> this has been
>>>>> known for so long without action, so I'm a bit biased in
>>> wanting
>>>> to get it
>>>>> fixed asap.
>>>>> 
>>>>> But maybe I'm overstating the urgency here, and maybe
>>> option #1
>>>> is a better
>>>>> way forward.
>>>>> 
>>>>> --
>>>>> Sylvain
>>>>> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

Well, I expressed a preference for #3 over #4, particularly for the 3.x series.  However at this point, I think the lack of a clear project decision means we can punt it back to you and Sylvain to make the final call.

On 20/11/2020, 16:23, "Benjamin Lerer" <be...@datastax.com> wrote:

    I will try to summarize the discussion to clarify the outcome.

    Mick is in favor of #4
    Summanth is in favor of #4
    Sylvain's answer was not clear to me. I understood it as "I prefer #3 to #4
    and I am also fine with #1"
    Jeff is in favor of #3 and will understand #4
    David is in favor of #3 (fix bug and add flag to roll back to old behavior) in
    4.0 and #4 in 3.0 and 3.11

    Do not hesitate to correct me if I misunderstood your answer.

    Based on these answers it seems clear that most people prefer to go for #3
    or #4.

    The choice between #3 (fix correctness, opt-in to current behavior) and #4
    (current behavior, opt-in to correctness) is a bit less clear, especially if
    we consider the 3.X branches or 4.0.

    Does anybody have some idea on how to choose between those 2 choices, or some
    extra opinions on #3 versus #4?






    On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dc...@gmail.com> wrote:

    > I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
    >
    > About the alternative implementation, I am fine adding it to 3.x and 4.0,
    > but should treat it as a different path disabled by default that you can
    > opt-into, with a plan to opt-in by default "eventually".
    >
    > On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <
    > benedict@apache.org>
    > wrote:
    >
    > > Perhaps there might be broader appetite to weigh in on which major
    > > releases we might target for work that fixes the correctness bug without
    > > serious performance regression?
    > >
    > > i.e., if we were to fix the correctness bug now, introducing a serious
    > > performance regression (either opt-in or opt-out), but were to land work
    > > without this problem for 5.0, would there be appetite to backport this
    > work
    > > to any of 4.0, 3.11 or 3.0?
    > >
    > >
    > > On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com> wrote:
    > >
    > >     This is complicated and relatively few people on earth understand it,
    > > so
    > >     having little feedback is mostly expected, unfortunately.
    > >
    > >     My normal emotional response is "correctness is required, opt-in to
    > >     performance improvements that sacrifice strict correctness", but I'm
    > > also
    > >     sure this is going to surprise people, and would understand / accept
    > #4
    > >     (default to current, opt-in to correct).
    > >
    > >
    > >     On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <
    > > benedict@apache.org>
    > >     wrote:
    > >
    > >     > It doesn't seem like there's much enthusiasm for any of the options
    > >     > available here...
    > >     >
    > >     > On 12/11/2020, 14:37, "Benedict Elliott Smith" <
    > benedict@apache.org
    > > >
    > >     > wrote:
    > >     >
    > >     >     > Is the new implementation a separate, distinctly modularized
    > > new
    > >     > body of work
    > >     >
    > >     >     It’s primarily a distinct, modularised and new body of work,
    > > however
    > >     > there is some shared code that has been modified - namely
    > > PaxosState, in
    > >     > which legacy code is maintained but modified for compatibility, and
    > > the
    > >     > system.paxos table (which receives a new column, and slightly
    > > modified
    > >     > serialization code).  It is conceptually an optimised version of
    > the
    > >     > existing algorithm.
    > >     >
    > >     >     If there's a chance of being of value to 4.0, I can try to put
    > > up a
    > >     > patch next week alongside a high level description of the changes.
    > >     >
    > >     >     > But a performance regression is a regression, I'm not
    > > shrugging it
    > >     > off.
    > >     >
    > >     >     I don't want to give the impression I'm shrugging off the
    > > correctness
    > >     > issue either. It's a serious issue to fix, but since all successful
    > > updates
    > >     > to the database are linearizable, I think it's likely that many
    > >     > applications behave correctly with the present semantics, or at
    > least
    > >     > encounter only transient errors. No doubt many also do not, but I
    > > have no
    > >     > idea of the ratio.
    > >     >
    > >     >     The regression isn't itself a simple issue either - depending
    > on
    > > the
    > >     > topology and message latencies it is not difficult to produce
    > > inescapable
    > >     > contention, i.e. guaranteed timeouts - that might persist as long
    > as
    > >     > clients continue to retry. It could be quite a serious degradation
    > of
    > >     > service to impose on our users.
    > >     >
    > >     >     I don't pretend to know the correct way to make a decision
    > > balancing
    > >     > these considerations, but I am perhaps more concerned about
    > imposing
    > >     > service outages than I am temporarily maintaining semantics our
    > > users have
    > >     > apparently accepted for years - though I absolutely share your
    > >     > embarrassment there.
    > >     >
    > >     >
    > >     >     On 12/11/2020, 12:41, "Joshua McKenzie" <jmckenzie@apache.org
    > >
    > > wrote:
    > >     >
    > >     >         Is the new implementation a separate, distinctly
    > modularized
    > > new
    > >     > body of
    > >     >         work or does it make substantial changes to existing
    > >     > implementation and
    > >     >         subsume it?
    > >     >
    > >     >         On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <
    > >     > lebresne@gmail.com> wrote:
    > >     >
    > >     >         > Regarding option #4, I'll remark that experience tends to
    > >     > suggest users
    > >     >         > don't consistently read the `NEWS.txt` file on upgrade,
    > so
    > >     > option #4 will
    > >     >         > likely essentially mean "LWT has a correctness issue, but
    > > once
    > >     > it broke
    > >     >         > your data enough that you'll notice, you'll be able to
    > dig
    > > the
    > >     > proper flag
    > >     >         > to fix it for next time". I guess it's better than
    > > nothing, of
    > >     > course, but
    > >     >         > I'll admit that defaulting to "opt-in correctness",
    > > especially
    > >     > for a
    > >     >         > feature (LWT) that exists uniquely to provide additional
    > >     > guarantees, is
    > >     >         > something I have a hard rallying behind.
    > >     >         >
    > >     >         > But a performance regression is a regression, I'm not
    > > shrugging
    > >     > it off.
    > >     >         > Still, I feel we shouldn't leave LWT with a fairly
    > serious
    > > known
    > >     >         > correctness bug and I frankly feel bad for "the project"
    > > that
    > >     > this has been
    > >     >         > known for so long without action, so I'm a bit biased in
    > > wanting
    > >     > to get it
    > >     >         > fixed asap.
    > >     >         >
    > >     >         > But maybe I'm overstating the urgency here, and maybe
    > > option #1
    > >     > is a better
    > >     >         > way forward.
    > >     >         >
    > >     >         > --
    > >     >         > Sylvain
    > >     >         >
    > >     >
    > >     >
    > >     >
    > >     >
    > >  ---------------------------------------------------------------------
    > >     >     To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
    > >     >     For additional commands, e-mail: dev-help@cassandra.apache.org
    > >     >
    > >     >
    > >     >
    > >     >
    > >     >
    > ---------------------------------------------------------------------
    > >     > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
    > >     > For additional commands, e-mail: dev-help@cassandra.apache.org
    > >     >
    > >     >
    > >
    > >
    > >
    > > ---------------------------------------------------------------------
    > > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
    > > For additional commands, e-mail: dev-help@cassandra.apache.org
    > >
    > >
    >



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benjamin Lerer <be...@datastax.com>.

I will try to summarize the discussion to clarify the outcome.

Mick is in favor of #4.
Summanth is in favor of #4.
Sylvain's answer was not clear to me. I understood it as "I prefer #3 to #4
and I am also fine with #1".
Jeff is in favor of #3 and would understand #4.
David is in favor of #3 (fix the bug and add a flag to roll back to the old
behavior) in 4.0, and of #4 in 3.0 and 3.11.

Do not hesitate to correct me if I misunderstood your answer.

Based on these answers, it seems clear that most people prefer to go for #3
or #4.

The choice between #3 (correct by default, opt-in to the current behavior)
and #4 (current behavior by default, opt-in to correctness) is a bit less
clear, especially if we consider the 3.X branches or 4.0.

Does anybody have ideas on how to choose between those two options, or
additional opinions on #3 versus #4?
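
To make the difference between #3 and #4 concrete, here is a minimal sketch
in Python. It assumes a single boolean switch; the property name and the
helper functions are hypothetical and are not taken from the patch. The only
thing it is meant to show is that the two options behave identically for
anyone who sets the flag explicitly, and differ only for users who never
touch it.

```python
# Hypothetical property name, used only for illustration. It is NOT the
# name of any real Cassandra system property.
PROPERTY = "example.lwt.linearizable_serial_reads"

# Option #3: correct by default, operators may opt out for performance.
OPTION_3_DEFAULT = True
# Option #4: current behavior by default, operators may opt in to correctness.
OPTION_4_DEFAULT = False


def linearizable_serial_reads(properties, default):
    """Return True if SERIAL reads should take the corrected (slower) path."""
    raw = properties.get(PROPERTY)
    if raw is None:
        # The flag was never set: the shipped default decides.
        return default
    return raw.strip().lower() == "true"


def describe(properties):
    """Show what each option would do for a given set of properties."""
    return {
        "option_3": linearizable_serial_reads(properties, OPTION_3_DEFAULT),
        "option_4": linearizable_serial_reads(properties, OPTION_4_DEFAULT),
    }


print(describe({}))                   # flag never set: the options differ
print(describe({PROPERTY: "true"}))   # flag set: both options agree
print(describe({PROPERTY: "false"}))  # flag set: both options agree
```

Under this reading, the choice between #3 and #4 is really a choice about
which behavior users get when they upgrade without reading the release notes.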






On Wed, Nov 18, 2020 at 9:45 PM David Capwell <dc...@gmail.com> wrote:

> I feel that #4 (fix bug and add flag to roll back to old behavior) is best.
>
> About the alternative implementation, I am fine adding it to 3.x and 4.0,
> but should treat it as a different path disabled by default that you can
> opt-into, with a plan to opt-in by default "eventually".

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by David Capwell <dc...@gmail.com>.

I feel that #4 (fix bug and add flag to roll back to old behavior) is best.

About the alternative implementation, I am fine adding it to 3.x and 4.0,
but should treat it as a different path disabled by default that you can
opt-into, with a plan to opt-in by default "eventually".

On Wed, Nov 18, 2020 at 11:10 AM Benedict Elliott Smith <be...@apache.org>
wrote:

> Perhaps there might be broader appetite to weigh in on which major
> releases we might target for work that fixes the correctness bug without
> serious performance regression?
>
> i.e., if we were to fix the correctness bug now, introducing a serious
> performance regression (either opt-in or opt-out), but were to land work
> without this problem for 5.0, would there be appetite to backport this work
> to any of 4.0, 3.11 or 3.0?

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

Perhaps there might be broader appetite to weigh in on which major releases we might target for work that fixes the correctness bug without serious performance regression?

i.e., if we were to fix the correctness bug now, introducing a serious performance regression (either opt-in or opt-out), but were to land work without this problem for 5.0, would there be appetite to backport this work to any of 4.0, 3.11 or 3.0? 


On 18/11/2020, 18:31, "Jeff Jirsa" <jj...@gmail.com> wrote:

    This is complicated and relatively few people on earth understand it, so
    having little feedback is mostly expected, unfortunately.

    My normal emotional response is "correctness is required, opt-in to
    performance improvements that sacrifice strict correctness", but I'm also
    sure this is going to surprise people, and would understand / accept #4
    (default to current, opt-in to correct).



Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Jeff Jirsa <jj...@gmail.com>.

This is complicated and relatively few people on earth understand it, so
having little feedback is mostly expected, unfortunately.

My normal emotional response is "correctness is required, opt-in to
performance improvements that sacrifice strict correctness", but I'm also
sure this is going to surprise people, and would understand / accept #4
(default to current, opt-in to correct).


On Wed, Nov 18, 2020 at 4:54 AM Benedict Elliott Smith <be...@apache.org>
wrote:

> It doesn't seem like there's much enthusiasm for any of the options
> available here...

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

It doesn't seem like there's much enthusiasm for any of the options available here...

On 12/11/2020, 14:37, "Benedict Elliott Smith" <be...@apache.org> wrote:

    > Is the new implementation a separate, distinctly modularized new body of work

    It’s primarily a distinct, modularised and new body of work, however there is some shared code that has been modified - namely PaxosState, in which legacy code is maintained but modified for compatibility, and the system.paxos table (which receives a new column, and slightly modified serialization code).  It is conceptually an optimised version of the existing algorithm.

    If there's a chance of being of value to 4.0, I can try to put up a patch next week alongside a high level description of the changes.

    > But a performance regression is a regression, I'm not shrugging it off.

    I don't want to give the impression I'm shrugging off the correctness issue either. It's a serious issue to fix, but since all successful updates to the database are linearizable, I think it's likely that many applications behave correctly with the present semantics, or at least encounter only transient errors. No doubt many also do not, but I have no idea of the ratio.

    The regression isn't itself a simple issue either - depending on the topology and message latencies it is not difficult to produce inescapable contention, i.e. guaranteed timeouts - that might persist as long as clients continue to retry. It could be quite a serious degradation of service to impose on our users.

    I don't pretend to know the correct way to make a decision balancing these considerations, but I am perhaps more concerned about imposing service outages than I am temporarily maintaining semantics our users have apparently accepted for years - though I absolutely share your embarrassment there.
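
The contention scenario quoted above (timeouts that depend on topology and
message latencies, and that persist while clients keep retrying) can be
illustrated with a toy model. This is only a sketch under strong assumptions:
a single acceptor instead of a quorum of replicas, integer "ticks" for
latency, no backoff, and a ballot that simply increases with every prepare.
The names `simulate`, `near` and `far` are made up for the example, and it
does not reproduce Cassandra's Paxos implementation.

```python
import heapq
from itertools import count


def simulate(latency, horizon):
    """Toy single-acceptor model of prepare/propose contention.

    latency maps a coordinator name to its one-way latency (in ticks) to the
    acceptor. Each coordinator runs rounds back to back: its prepare reaches
    the acceptor, and one round trip later its propose does. A propose is
    accepted only if no newer prepare arrived in between. Whatever the
    outcome, the coordinator starts a new round as soon as it hears back.
    Returns the number of accepted proposals per coordinator.
    """
    seq = count()        # tie-breaker so the heap never has to compare names
    ballots = count(1)   # assumption: every new prepare carries a higher ballot
    promised = 0         # highest ballot the acceptor has promised so far
    accepted = {name: 0 for name in latency}
    events = [(lat, next(seq), "prepare", name, 0) for name, lat in latency.items()]
    heapq.heapify(events)

    while events:
        now, _, kind, name, ballot = heapq.heappop(events)
        if now > horizon:
            break
        round_trip = 2 * latency[name]
        if kind == "prepare":
            promised = next(ballots)  # the newest prepare wins the promise
            heapq.heappush(events, (now + round_trip, next(seq), "propose", name, promised))
        else:
            if ballot == promised:    # nobody preempted this round
                accepted[name] += 1
            # The reply travels back, then the next prepare travels out.
            heapq.heappush(events, (now + round_trip, next(seq), "prepare", name, 0))
    return accepted


print(simulate({"near": 1, "far": 3}, horizon=1000))
print(simulate({"far": 3}, horizon=1000))
```

In this model the `far` coordinator never gets a proposal accepted while
`near` keeps issuing rounds, because a `near` prepare always lands inside the
longer window between `far`'s prepare and its propose. Run alone, `far`
succeeds on every round. Real systems use backoff and quorums, so this only
shows the shape of the problem, not its actual severity.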


    On 12/11/2020, 12:41, "Joshua McKenzie" <jm...@apache.org> wrote:

        Is the new implementation a separate, distinctly modularized new body of
        work or does it make substantial changes to existing implementation and
        subsume it?

        On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <le...@gmail.com> wrote:

        > Regarding option #4, I'll remark that experience tends to suggest users
        > don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
        > likely essentially mean "LWT has a correctness issue, but once it broke
        > your data enough that you'll notice, you'll be able to dig the proper flag
        > to fix it for next time". I guess it's better than nothing, of course, but
        > I'll admit that defaulting to "opt-in correctness", especially for a
        > feature (LWT) that exists uniquely to provide additional guarantees, is
        > something I have a hard rallying behind.
        >
        > But a performance regression is a regression, I'm not shrugging it off.
        > Still, I feel we shouldn't leave LWT with a fairly serious known
        > correctness bug and I frankly feel bad for "the project" that this has been
        > known for so long without action, so I'm a bit biased in wanting to get it
        > fixed asap.
        >
        > But maybe I'm overstating the urgency here, and maybe option #1 is a better
        > way forward.
        >
        > --
        > Sylvain
        >



    ---------------------------------------------------------------------
    To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
    For additional commands, e-mail: dev-help@cassandra.apache.org




Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

> Is the new implementation a separate, distinctly modularized new body of work

It’s primarily a distinct, modularised and new body of work; however, there is some shared code that has been modified, namely PaxosState (in which legacy code is maintained but modified for compatibility) and the system.paxos table (which receives a new column and slightly modified serialization code).  It is conceptually an optimised version of the existing algorithm.

If there's a chance of this being of value to 4.0, I can try to put up a patch next week alongside a high-level description of the changes.

> But a performance regression is a regression, I'm not shrugging it off.

I don't want to give the impression I'm shrugging off the correctness issue either. It's a serious issue to fix, but since all successful updates to the database are linearizable, I think it's likely that many applications behave correctly with the present semantics, or at least encounter only transient errors. No doubt many also do not, but I have no idea of the ratio.

The regression isn't itself a simple issue either - depending on the topology and message latencies, it is not difficult to produce inescapable contention (i.e. guaranteed timeouts) that might persist as long as clients continue to retry. It could be quite a serious degradation of service to impose on our users.
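
To make the contention scenario concrete, here is a minimal toy sketch of two Paxos proposers preempting each other. It is an illustration under stated assumptions, not Cassandra's implementation: the `Acceptor`, `Proposer`, `step` and `duel` names are hypothetical, ballots are plain integers, no messages are lost, and the two proposers take strictly alternating turns. With immediate retries neither proposer ever commits; giving one of them a short back-off lets the other finish.

```python
import itertools
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Acceptor:
    promised: int = 0  # highest ballot this acceptor has promised so far

    def prepare(self, ballot: int) -> bool:
        # Promise only to ballots newer than anything seen before.
        if ballot > self.promised:
            self.promised = ballot
            return True
        return False

    def propose(self, ballot: int) -> bool:
        # Accept only if no newer ballot has been promised in the meantime.
        return ballot >= self.promised


@dataclass
class Proposer:
    name: str
    backoff_turns: int            # turns to sit out after a rejected proposal
    ballot: Optional[int] = None  # ballot of the currently prepared proposal
    wait: int = 0
    attempts: int = 0             # number of rejected proposals so far


def step(p: Proposer, acceptors: List[Acceptor], ballots, quorum: int) -> bool:
    """Advance one proposer by one turn; return True if it committed."""
    if p.wait > 0:                # still backing off
        p.wait -= 1
        return False
    if p.ballot is not None:      # prepared earlier, now try to propose
        if sum(a.propose(p.ballot) for a in acceptors) >= quorum:
            return True
        p.ballot = None           # preempted by a newer ballot
        p.attempts += 1
        if p.backoff_turns:
            p.wait = p.backoff_turns
            return False
    fresh = next(ballots)         # (re-)prepare with a fresh, higher ballot
    if sum(a.prepare(fresh) for a in acceptors) >= quorum:
        p.ballot = fresh
    return False


def duel(backoff_b: int, max_attempts: int = 5) -> Optional[str]:
    """Return the name of the proposer that commits, or None on 'timeout'."""
    acceptors = [Acceptor() for _ in range(3)]
    ballots = itertools.count(1)
    a, b = Proposer("A", 0), Proposer("B", backoff_b)
    while a.attempts < max_attempts and b.attempts < max_attempts:
        for p in (a, b):
            if step(p, acceptors, ballots, quorum=2):
                return p.name
    return None
```

In this model `duel(backoff_b=0)` exhausts every attempt without a commit, which is the "guaranteed timeouts as long as clients keep retrying" shape, while `duel(backoff_b=2)` lets proposer A commit. Real deployments differ in many ways (randomised back-off, network jitter, more than two contenders), so this only shows why lock-step retries are dangerous, not how often they occur.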

I don't pretend to know the correct way to make a decision balancing these considerations, but I am perhaps more concerned about imposing service outages than I am about temporarily maintaining semantics our users have apparently accepted for years - though I absolutely share your embarrassment there.

On 12/11/2020, 12:41, "Joshua McKenzie" <jm...@apache.org> wrote:

    Is the new implementation a separate, distinctly modularized new body of
    work or does it make substantial changes to existing implementation and
    subsume it?

    On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <le...@gmail.com> wrote:

    > Regarding option #4, I'll remark that experience tends to suggest users
    > don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
    > likely essentially mean "LWT has a correctness issue, but once it has broken
    > your data enough for you to notice, you'll be able to dig up the proper flag
    > to fix it for next time". I guess it's better than nothing, of course, but
    > I'll admit that defaulting to "opt-in correctness", especially for a
    > feature (LWT) that exists solely to provide additional guarantees, is
    > something I have a hard time rallying behind.
    >
    > But a performance regression is a regression, I'm not shrugging it off.
    > Still, I feel we shouldn't leave LWT with a fairly serious known
    > correctness bug and I frankly feel bad for "the project" that this has been
    > known for so long without action, so I'm a bit biased in wanting to get it
    > fixed asap.
    >
    > But maybe I'm overstating the urgency here, and maybe option #1 is a better
    > way forward.
    >
    > --
    > Sylvain
    >

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Joshua McKenzie <jm...@apache.org>.

Is the new implementation a separate, distinctly modularized new body of
work or does it make substantial changes to existing implementation and
subsume it?

On Thu, Nov 12, 2020 at 3:56 AM Sylvain Lebresne <le...@gmail.com> wrote:

> Regarding option #4, I'll remark that experience tends to suggest users
> don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
> likely essentially mean "LWT has a correctness issue, but once it has broken
> your data enough for you to notice, you'll be able to dig up the proper flag
> to fix it for next time". I guess it's better than nothing, of course, but
> I'll admit that defaulting to "opt-in correctness", especially for a
> feature (LWT) that exists solely to provide additional guarantees, is
> something I have a hard time rallying behind.
>
> But a performance regression is a regression, I'm not shrugging it off.
> Still, I feel we shouldn't leave LWT with a fairly serious known
> correctness bug and I frankly feel bad for "the project" that this has been
> known for so long without action, so I'm a bit biased in wanting to get it
> fixed asap.
>
> But maybe I'm overstating the urgency here, and maybe option #1 is a better
> way forward.
>
> --
> Sylvain
>

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Sylvain Lebresne <le...@gmail.com>.

Regarding option #4, I'll remark that experience tends to suggest users
don't consistently read the `NEWS.txt` file on upgrade, so option #4 will
likely essentially mean "LWT has a correctness issue, but once it has broken
your data enough for you to notice, you'll be able to dig up the proper flag
to fix it for next time". I guess it's better than nothing, of course, but
I'll admit that defaulting to "opt-in correctness", especially for a
feature (LWT) that exists solely to provide additional guarantees, is
something I have a hard time rallying behind.
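
As a rough illustration of why the default matters more than the flag itself, the sketch below resolves a toggle for an operator who has configured nothing. The property name `lwt.linearizable_serial_reads` and the helper are hypothetical, chosen only for this example; they are not the names used by the actual patch.

```python
from typing import Mapping

# Hypothetical property name, used only for this illustration.
FLAG = "lwt.linearizable_serial_reads"


def linearizable_serial_reads(props: Mapping[str, str], default: bool) -> bool:
    """Resolve the flag, falling back to the release's default when unset."""
    raw = props.get(FLAG)
    if raw is None:
        return default
    return raw.strip().lower() == "true"


# An operator who never read NEWS.txt sets nothing at all.
untouched: Mapping[str, str] = {}

# Option #3: correct by default, opt out if the extra round trip hurts.
option_3 = linearizable_serial_reads(untouched, default=True)

# Option #4: legacy behaviour by default, correctness is opt-in.
option_4 = linearizable_serial_reads(untouched, default=False)
```

Under these assumptions the untouched configuration yields `True` for option #3 and `False` for option #4, so the two options differ only for the operators who never learn that the flag exists.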

But a performance regression is a regression, I'm not shrugging it off.
Still, I feel we shouldn't leave LWT with a fairly serious known
correctness bug and I frankly feel bad for "the project" that this has been
known for so long without action, so I'm a bit biased in wanting to get it
fixed asap.

But maybe I'm overstating the urgency here, and maybe option #1 is a better
way forward.

--
Sylvain

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Sumanth Pasupuleti <su...@gmail.com>.

Knowing there is a correctness issue in LWT, and given users use LWT
primarily for correctness, my opinion is we should commit the correctness
patch (which makes it one of #1, #3 or #4).

I agree we should not cause further delay to the 4.0 release (making it one
of #3 or #4).

A con for #3 is that applications may have to rework their (and their
downstreams') configuration(s) to accommodate the performance regression,
which may not be ideal for the seamless 4.0 upgrade that we expect users to
experience.

Now, given this correctness issue has existed since the beginning, existing
LWT users would potentially notice no new difference w.r.t. correctness,
since they may have already worked around this bug (if they noticed), so +1
to option #4.

On Wed, Nov 11, 2020 at 1:49 PM Benedict Elliott Smith <be...@apache.org>
wrote:

> In my opinion, a similar calculus should be applied to 3.0 and 3.11.  This
> is a(n arguably quite serious) bug, so whatever is not overly onerous to
> backport should be considered while they are supported. The work under
> discussion has two components: a replacement for the core consensus
> algorithm, and mechanisms to ensure safety across range movements. The
> latter might be more invasive for 3.x, but the former should be quite easy
> to backport and as such probably quite well justified.
>
> > can it also be pluggable (either opt-in or opt-out)?
>
> I think pluggable means something different to opt-in/opt-out, at least to
> me.  I'm all for more pluggability, and also for more optionality, but the
> decision is very sensitive to context. We need to be able to select between
> our options, which for consensus practically means supporting live
> migration - which is exceptionally challenging in any general sense (and
> perhaps inherently non-pluggable).
>
> As to future development for consensus, I personally hope the work we are
> discussing here will be a strong platform for it, but obviously that's for
> the community to decide later on. I think the work to take it forwards to
> something epaxos-like will not be that herculean, with some incremental
> milestones en route. But that's a totally different discussion for the
> future, and for either a CEP or a small intercollegiate working group.
>
>
> On 11/11/2020, 18:48, "Michael Semb Wever" <mc...@apache.org> wrote:
>
>
>     > Regarding CASSANDRA-12126 and 4.0 we are facing several options and
>     > Benedict, Sylvain and I wanted to get the community feedback on them.
>     >
>     > We can:
>     >
>     >    1. Try to use Benedict proposal for 4.0 if the community has the
>     >    appetite for it. The main issue there is some potential extra delay for 4.0
>     >    2. Do nothing for 4.0. Meaning do not commit the current patch. We have
>     >    lived a long time with that issue and we can probably wait a bit more for a
>     >    proper solution.
>     >    3. Commit the patch as such, fixing the correctness but introducing
>     >    potentially some performance issue until we release a better solution.
>     >    4. Changing the patch to default to the current behavior but allowing
>     >    people to enable the new one if the correctness is a problem for them.
>     >
>
>     If these options are for 4.0, is it then (4) that is getting applied to
>     3.0 and 3.11?
>
>     If that is the case then I would vote for also applying (4) to 4.0, given
>     we are now in front of beta4. Please let's not further delay 4.0.
>
>     Post 4.0, if (1) is as described "a parallel implementation of the same
>     underlying Paxos algorithm", can it also be pluggable (either opt-in or
>     opt-out)? And would/could EPaxos become pluggable too in a similar manner
>     (if it eventuates)? I'm in favour of providing more pluggable interfaces
>     into C*, along with the code quality improvements that will have to
>     accompany them.

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Benedict Elliott Smith <be...@apache.org>.

In my opinion, a similar calculus should be applied to 3.0 and 3.11.  This is a(n arguably quite serious) bug, so whatever is not overly onerous to backport should be considered while they are supported. The work under discussion has two components: a replacement for the core consensus algorithm, and mechanisms to ensure safety across range movements. The latter might be more invasive for 3.x, but the former should be quite easy to backport and as such probably quite well justified.

> can it also be pluggable (either opt-in or opt-out)?

I think pluggable means something different to opt-in/opt-out, at least to me.  I'm all for more pluggability, and also for more optionality, but the decision is very sensitive to context. We need to be able to select between our options, which for consensus practically means supporting live migration - which is exceptionally challenging in any general sense (and perhaps inherently non-pluggable).

As to future development for consensus, I personally hope the work we are discussing here will be a strong platform for it, but obviously that's for the community to decide later on. I think the work to take it forwards to something epaxos-like will not be that herculean, with some incremental milestones en route. But that's a totally different discussion for the future, and for either a CEP or a small intercollegiate working group.

On 11/11/2020, 18:48, "Michael Semb Wever" <mc...@apache.org> wrote:

    > Regarding CASSANDRA-12126 and 4.0 we are facing several options and
    > Benedict, Sylvain and I wanted to get the community feedback on them.
    > 
    > We can:
    > 
    >    1. Try to use Benedict proposal for 4.0 if the community has the
    >    appetite for it. The main issue there is some potential extra delay for 4.0
    >    2. Do nothing for 4.0. Meaning do not commit the current patch. We have
    >    lived a long time with that issue and we can probably wait a bit more for a
    >    proper solution.
    >    3. Commit the patch as such, fixing the correctness but introducing
    >    potentially some performance issue until we release a better solution.
    >    4. Changing the patch to default to the current behavior but allowing
    >    people to enable the new one if the correctness is a problem for them.
    > 

    If these options are for 4.0, is it then (4) that is getting applied to 3.0 and 3.11?

    If that is the case then I would vote for also applying (4) to 4.0, given we are now in front of beta4. Please let's not further delay 4.0.

    Post 4.0, if (1) is as described "a parallel implementation of the same underlying Paxos algorithm", can it also be pluggable (either opt-in or opt-out)? And would/could EPaxos become pluggable too in a similar manner (if it eventuates)? I'm in favour of providing more pluggable interfaces into C*, along with the code quality improvements that will have to accompany them.

Re: [DISCUSS] CASSANDRA-12126: LWTs correcteness and performance

Posted by Michael Semb Wever <mc...@apache.org>.

> Regarding CASSANDRA-12126 and 4.0 we are facing several options and
> Benedict, Sylvain and I wanted to get the community feedback on them.
> 
> We can:
> 
>    1. Try to use Benedict proposal for 4.0 if the community has the
>    appetite for it. The main issue there is some potential extra delay for 4.0
>    2. Do nothing for 4.0. Meaning do not commit the current patch. We have
>    lived a long time with that issue and we can probably wait a bit more for a
>    proper solution.
>    3. Commit the patch as such, fixing the correctness but introducing
>    potentially some performance issue until we release a better solution.
>    4. Changing the patch to default to the current behavior but allowing
>    people to enable the new one if the correctness is a problem for them.
> 


If these options are for 4.0, is it then (4) that is getting applied to 3.0 and 3.11?

If that is the case then I would vote for also applying (4) to 4.0, given we are now in front of beta4. Please let's not further delay 4.0.

Post 4.0, if (1) is as described "a parallel implementation of the same underlying Paxos algorithm", can it also be pluggable (either opt-in or opt-out)? And would/could EPaxos become pluggable too in a similar manner (if it eventuates)? I'm in favour of providing more pluggable interfaces into C*, along with the code quality improvements that will have to accompany them.


