Posted to dev@cassandra.apache.org by Marcus Eriksson <kr...@gmail.com> on 2017/10/01 10:12:27 UTC

Re: Proposal to retroactively mark materialized views experimental

I was just thinking that we should try really hard to avoid adding
experimental features - they are experimental due to lack of testing right?
There should be a clear path to making the feature non-experimental (or get
it removed) and having that path discussed on dev@ might give more
visibility to it.

I'm also struggling a bit to find good historic examples of "this would
have been better off as an experimental feature" - I used to think that it
would have been good to commit DTCS with some sort of experimental flag,
but that would not have made DTCS any better - it would have been better to
do more testing, realise that it does not work and then not commit it at
all of course.

Does anyone have good examples of features where it would have made sense
to commit them behind an experimental flag? SASI might be a good example,
but for MVs - if we knew how painful they would be, they really would not
have gotten committed at all, right?

/Marcus

On Sat, Sep 30, 2017 at 7:42 AM, Jeff Jirsa <jj...@gmail.com> wrote:

> Reviewers should be able to suggest when experimental is warranted, and
> conversation on dev+jira to justify when it’s transitioned from
> experimental to stable?
>
> We should remove the flag as soon as we’re (collectively) confident in a
> feature’s behavior - at least correctness, if not performance.
>
>
> > On Sep 29, 2017, at 10:31 PM, Marcus Eriksson <kr...@gmail.com> wrote:
> >
> > +1 on marking MVs experimental, but should there be some point in the
> > future where we consider removing them from the code base unless they
> have
> > gotten significant improvement as well?
> >
> > We probably need to enforce some kind of process for adding new
> > experimental features in the future - perhaps a mail like this one to
> dev@
> > motivating why it should be experimental?
> >
> > /Marcus
> >
> > On Sat, Sep 30, 2017 at 1:15 AM, Vinay Chella
> <vc...@netflix.com.invalid>
> > wrote:
> >
> >> We tried perf testing MVs internally here but did not see good results
> with
> >> it, hence paused its usage. +1 on tagging certain features which are not
> >> PROD ready or not stable enough.
> >>
> >> Regards,
> >> Vinay Chella
> >>
> >>> On Fri, Sep 29, 2017 at 7:22 PM, Ben Bromhead <be...@instaclustr.com>
> wrote:
> >>>
> >>> I'm a fan of introducing experimental flags in general as well, +1
> >>>
> >>>
> >>>
> >>>> On Fri, 29 Sep 2017 at 13:22 Jon Haddad <jo...@jonhaddad.com> wrote:
> >>>>
> >>>> I’m very much +1 on this, and to new features in general.
> >>>>
> >>>> I think having a clear line in which we classify something as
> >> production
> >>>> ready would be nice.  It would be great if committers were using the
> >>>> feature in prod and could vouch for its stability.
> >>>>
> >>>>> On Sep 29, 2017, at 1:09 PM, Blake Eggleston <be...@apple.com>
> >>>> wrote:
> >>>>>
> >>>>> Hi dev@,
> >>>>>
> >>>>> I’d like to propose that we retroactively classify materialized views
> >>> as
> >>>> an experimental feature, disable them by default, and require users to
> >>>> enable them through a config setting before using.
> >>>>>
> >>>>> Materialized views have several issues that make them (effectively)
> >>>> unusable in production. Some of the issues aren’t just implementation
> >>>> problems, but problems with the design that aren’t easily fixed. It’s
> >>>> unfair of us to make features available to users in this state without
> >>>> providing a clear warning that bad or unexpected things are likely to
> >>>> happen if they use it.
> >>>>>
> >>>>> Obviously, this isn’t great news for users that have already adopted
> >>>> MVs, and I don’t have a great answer for that. I think that’s sort of
> a
> >>>> sunk cost at this point. If they have any MV related problems, they’ll
> >>> have
> >>>> them whether they’re marked experimental or not. I would expect this
> to
> >>>> reduce the number of users adopting MVs in the future though, and if
> >> they
> >>>> do, it would be opt-in.
> >>>>>
> >>>>> Once MVs reach a point where they’re usable in production, we can
> >>> remove
> >>>> the flag. Specifics of how the experimental flag would work can be
> >>> hammered
> >>>> out in a forthcoming JIRA, but I’d imagine it would just prevent users
> >>> from
> >>>> creating new MVs, and maybe log warnings on startup for existing MVs
> if
> >>> the
> >>>> flag isn’t enabled.
> >>>>>
> >>>>> Let me know what you think.
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Blake
> >>>>
> >>>>
> >>>>
> >>>> --
> >>> Ben Bromhead
> >>> CTO | Instaclustr <https://www.instaclustr.com/>
> >>> +1 650 284 9692
> >>> Reliability at Scale
> >>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> >>>
> >>
>
>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Blake Eggleston <be...@apple.com>.

I think you're presenting a false dichotomy here. Yes there are people who are not interested in taking risks with C* and are still running 1.2, there are probably a few people who would put trunk in prod if we packaged it up for them, but there's a whole spectrum of users in between. Operator competence / sophistication has the same sort of spectrum.

I'd expect the amount of feedback on experimental features would be a function of the quality of the design / implementation and the amount of user interest. If you're not getting feedback on an experimental feature, it's probably poorly implemented, or no one's interested in it.

I don't think labelling features is going to kill the user <-> developer feedback loop. It will probably slow down the pace of feature development a bit, but it's been slowing down anyway, and that's a good thing imo.

On October 1, 2017 at 9:14:45 AM, DuyHai Doan (doanduyhai@gmail.com) wrote:

So basically we're saying that even with a lot of tests, you're never sure  
to cover all the possible edge cases and the real stamp for "production  
readiness" is only when the "experimental features" have been deployed in  
various clusters with various scenarios/use-cases, just re-phrasing Blake  
here. Totally +1 on the idea.  

Now I can foresee a problem with the "experimental" flag, that is nobody  
(in the community) will use it or even dare to play with it and thus the  
"experimental" features never get a chance to be tested and then we break  
the bug-reports/bug-fixes iterations ...  

How many times have I seen users on the ML asking which version of C* is  
the most fit for production and the answer was always at least 1 major  
version behind the current released major (2.1 was recommended when 3.x was  
released and so on ...)?

The fundamental issue here is that a lot of folks in the community do not  
want to take any risk and take a conservative approach for the production,  
which is fine and perfectly understandable. But it means that the implicit  
contract for OSS software, i.e. "you get the software for free; in exchange
you give feedback and bug reports to improve it", is completely
broken.  

Let's take the example of MV. MV was shipped with 3.0 --> considered not
stable --> nobody/few people use MV --> few bug reports --> bugs didn't
have a chance to get fixed --> the problem lasts until now

About SASI, how many people really played with it thoroughly apart from some
toy examples? Same causes, same consequences. And we can't even blame its
design because fundamentally the architecture is pretty solid, just a
question of usage and feedback.

I suspect that this broken community QA/feedback loop also partially
explains the failure of the tick-tock releases, but that's only my own
interpretation here.

So if we don't figure out how to restore the strong "new feature/community bug
report" feedback loop, we're going to face the same issues and the same
debate again in the future.


On Sun, Oct 1, 2017 at 5:30 PM, Blake Eggleston <be...@apple.com>  
wrote:  

> I'm not sure the main issue in the case of MVs is testing. In this case it  
> seems to be that there are some design issues and/or the design only
> works in some overly restrictive use cases. That MVs were committed knowing
> these were issues seems to be the real problem. So in the case of MVs, sure,
> I don't think they should have ever made it to an experimental stage.
>  
> Thinking of how an experimental flag fits in with the project going
> forward though, I disagree that we should avoid adding experimental  
> features. On the contrary, I think leaning towards classifying new features  
> as experimental would be better for users. Especially larger features and  
> changes.  
>  
> Even with well spec'd, well tested, and well designed features, there will  
> always be edge cases that you didn't think of, or you'll have made  
> assumptions about the other parts of C* it relies on that aren't 100%  
> correct. Small problems here can often affect correctness, or result in  
> data loss. So, I think it makes sense to avoid marking them as ready for  
> regular use until they've had time to bake in clusters where there are some  
> expert operators that are sophisticated enough to understand the  
> implications of running them, detect issues, and report bugs.  
>  
> Regarding historical examples, in hindsight I think committing 8099, or at  
> the very least, parts of it, behind an experimental flag would have been  
> the right thing to do. It was a huge change that we're still finding issues  
> with 2 years later.  
>  
> On October 1, 2017 at 6:08:50 AM, DuyHai Doan (doanduyhai@gmail.com)  
> wrote:  
>  
> How should we transition one feature from the "experimental" state to  
> "production ready" state ? On which criteria ?  

Re: Proposal to retroactively mark materialized views experimental

Posted by DuyHai Doan <do...@gmail.com>.

So basically we're saying that even with a lot of tests, you're never sure
to cover all the possible edge cases and the real stamp for "production
readiness" is only when the "experimental features" have been deployed in
various clusters with various scenarios/use-cases, just re-phrasing Blake
here. Totally +1 on the idea.

Now I can foresee a problem with the "experimental" flag, that is nobody
(in the community) will use it or even dare to play with it and thus the
"experimental" features never get a chance to be tested and then we break
the bug-reports/bug-fixes iterations ...

How many times have I seen users on the ML asking which version of C* is
the most fit for production and the answer was always at least 1 major
version behind the current released major (2.1 was recommended when 3.x was
released and so on ...)?

The fundamental issue here is that a lot of folks in the community do not
want to take any risk and take a conservative approach for the production,
which is fine and perfectly understandable. But it means that the implicit
contract for OSS software, i.e. "you get the software for free; in exchange
you give feedback and bug reports to improve it", is completely
broken.

Let's take the example of MV. MV was shipped with 3.0 --> considered not
stable --> nobody/few people use MV --> few bug reports --> bugs didn't
have a chance to get fixed --> the problem lasts until now

About SASI, how many people really played with it thoroughly apart from some
toy examples? Same causes, same consequences. And we can't even blame its
design because fundamentally the architecture is pretty solid, just a
question of usage and feedback.

I suspect that this broken community QA/feedback loop also partially
explains the failure of the tick-tock releases, but that's only my own
interpretation here.

So if we don't figure out how to restore the strong "new feature/community bug
report" feedback loop, we're going to face the same issues and the same
debate again in the future.


On Sun, Oct 1, 2017 at 5:30 PM, Blake Eggleston <be...@apple.com>
wrote:

> I'm not sure the main issue in the case of MVs is testing. In this case it
> seems to be that there are some design issues and/or the design only
> works in some overly restrictive use cases. That MVs were committed knowing
> these were issues seems to be the real problem. So in the case of MVs, sure,
> I don't think they should have ever made it to an experimental stage.
>
> Thinking of how an experimental flag fits in with the project going
> forward though, I disagree that we should avoid adding experimental
> features. On the contrary, I think leaning towards classifying new features
> as experimental would be better for users. Especially larger features and
> changes.
>
> Even with well spec'd, well tested, and well designed features, there will
> always be edge cases that you didn't think of, or you'll have made
> assumptions about the other parts of C* it relies on that aren't 100%
> correct. Small problems here can often affect correctness, or result in
> data loss. So, I think it makes sense to avoid marking them as ready for
> regular use until they've had time to bake in clusters where there are some
> expert operators that are sophisticated enough to understand the
> implications of running them, detect issues, and report bugs.
>
> Regarding historical examples, in hindsight I think committing 8099, or at
> the very least, parts of it, behind an experimental flag would have been
> the right thing to do. It was a huge change that we're still finding issues
> with 2 years later.
>
> On October 1, 2017 at 6:08:50 AM, DuyHai Doan (doanduyhai@gmail.com)
> wrote:
>
> How should we transition one feature from the "experimental" state to
> "production ready" state ? On which criteria ?

Re: Proposal to retroactively mark materialized views experimental

Posted by Blake Eggleston <be...@apple.com>.

I'm not sure the main issue in the case of MVs is testing. In this case it seems to be that there are some design issues and/or the design only works in some overly restrictive use cases. That MVs were committed knowing these were issues seems to be the real problem. So in the case of MVs, sure, I don't think they should have ever made it to an experimental stage.

Thinking of how an experimental flag fits in with the project going forward though, I disagree that we should avoid adding experimental features. On the contrary, I think leaning towards classifying new features as experimental would be better for users. Especially larger features and changes.

Even with well spec'd, well tested, and well designed features, there will always be edge cases that you didn't think of, or you'll have made assumptions about the other parts of C* it relies on that aren't 100% correct. Small problems here can often affect correctness, or result in data loss. So, I think it makes sense to avoid marking them as ready for regular use until they've had time to bake in clusters where there are some expert operators that are sophisticated enough to understand the implications of running them, detect issues, and report bugs.

Regarding historical examples, in hindsight I think committing 8099, or at the very least, parts of it, behind an experimental flag would have been the right thing to do. It was a huge change that we're still finding issues with 2 years later.

On October 1, 2017 at 6:08:50 AM, DuyHai Doan (doanduyhai@gmail.com) wrote:

How should we transition one feature from the "experimental" state to  
"production ready" state ? On which criteria ?  



On Sun, Oct 1, 2017 at 12:12 PM, Marcus Eriksson <kr...@gmail.com> wrote:  

> I was just thinking that we should try really hard to avoid adding  
> experimental features - they are experimental due to lack of testing right?  
> There should be a clear path to making the feature non-experimental (or get  
> it removed) and having that path discussed on dev@ might give more  
> visibility to it.  
>  
> I'm also struggling a bit to find good historic examples of "this would  
> have been better off as an experimental feature" - I used to think that it  
> would have been good to commit DTCS with some sort of experimental flag,  
> but that would not have made DTCS any better - it would have been better to  
> do more testing, realise that it does not work and then not commit it at  
> all of course.  
>  
> Does anyone have good examples of features where it would have made sense  
> to commit them behind an experimental flag? SASI might be a good example,  
> but for MVs - if we knew how painful they would be, they really would not  
> have gotten committed at all, right?  
>  
> /Marcus  
>  
> On Sat, Sep 30, 2017 at 7:42 AM, Jeff Jirsa <jj...@gmail.com> wrote:  
>  
> > Reviewers should be able to suggest when experimental is warranted, and  
> > conversation on dev+jira to justify when it’s transitioned from  
> > experimental to stable?  
> >  
> > We should remove the flag as soon as we’re (collectively) confident in a  
> > feature’s behavior - at least correctness, if not performance.  
> >  
> >  
> > > On Sep 29, 2017, at 10:31 PM, Marcus Eriksson <kr...@gmail.com>  
> wrote:  
> > >  
> > > +1 on marking MVs experimental, but should there be some point in the  
> > > future where we consider removing them from the code base unless they  
> > have  
> > > gotten significant improvement as well?  
> > >  
> > > We probably need to enforce some kind of process for adding new  
> > > experimental features in the future - perhaps a mail like this one to  
> > dev@  
> > > motivating why it should be experimental?  
> > >  
> > > /Marcus  
> > >  
> > > On Sat, Sep 30, 2017 at 1:15 AM, Vinay Chella  
> > <vc...@netflix.com.invalid>  
> > > wrote:  
> > >  
> > >> We tried perf testing MVs internally here but did not see good results  
> > with  
> > >> it, hence paused its usage. +1 on tagging certain features which are  
> not  
> > >> PROD ready or not stable enough.  
> > >>  
> > >> Regards,  
> > >> Vinay Chella  
> > >>  
> > >>> On Fri, Sep 29, 2017 at 7:22 PM, Ben Bromhead <be...@instaclustr.com>  
> > wrote:  
> > >>>  
> > >>> I'm a fan of introducing experimental flags in general as well, +1  
> > >>>  
> > >>>  
> > >>>  
> > >>>> On Fri, 29 Sep 2017 at 13:22 Jon Haddad <jo...@jonhaddad.com> wrote:  
> > >>>>  
> > >>>> I’m very much +1 on this, and to new features in general.  
> > >>>>  
> > >>>> I think having a clear line in which we classify something as  
> > >> production  
> > >>>> ready would be nice. It would be great if committers were using the  
> > >>>> feature in prod and could vouch for it’s stability.  
> > >>>>  
> > >>>>> On Sep 29, 2017, at 1:09 PM, Blake Eggleston <beggleston@apple.com  
> >  
> > >>>> wrote:  
> > >>>>>  
> > >>>>> Hi dev@,  
> > >>>>>  
> > >>>>> I’d like to propose that we retroactively classify materialized  
> views  
> > >>> as  
> > >>>> an experimental feature, disable them by default, and require users  
> to  
> > >>>> enable them through a config setting before using.  
> > >>>>>  
> > >>>>> Materialized views have several issues that make them (effectively)  
> > >>>> unusable in production. Some of the issues aren’t just  
> implementation  
> > >>>> problems, but problems with the design that aren’t easily fixed.  
> It’s  
> > >>>> unfair of us to make features available to users in this state  
> without  
> > >>>> providing a clear warning that bad or unexpected things are likely  
> to  
> > >>>> happen if they use it.  
> > >>>>>  
> > >>>>> Obviously, this isn’t great news for users that have already  
> adopted  
> > >>>> MVs, and I don’t have a great answer for that. I think that’s sort  
> of  
> > a  
> > >>>> sunk cost at this point. If they have any MV related problems,  
> they’ll  
> > >>> have  
> > >>>> them whether they’re marked experimental or not. I would expect this  
> > to  
> > >>>> reduce the number of users adopting MVs in the future though, and if  
> > >> they  
> > >>>> do, it would be opt-in.  
> > >>>>>  
> > >>>>> Once MVs reach a point where they’re usable in production, we can  
> > >>> remove  
> > >>>> the flag. Specifics of how the experimental flag would work can be  
> > >>> hammered  
> > >>>> out in a forthcoming JIRA, but I’d imagine it would just prevent  
> users  
> > >>> from  
> > >>>> creating new MVs, and maybe log warnings on startup for existing MVs  
> > if  
> > >>> the  
> > >>>> flag isn’t enabled.  
> > >>>>>  
> > >>>>> Let me know what you think.  
> > >>>>>  
> > >>>>> Thanks,  
> > >>>>>  
> > >>>>> Blake  
> > >>>>  
> > >>>>  
> > >>>> ------------------------------------------------------------  
> ---------  
> > >>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org  
> > >>>> For additional commands, e-mail: dev-help@cassandra.apache.org  
> > >>>>  
> > >>>> --  
> > >>> Ben Bromhead  
> > >>> CTO | Instaclustr <https://www.instaclustr.com/>  
> > >>> +1 650 284 9692  
> > >>> Reliability at Scale  
> > >>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer  
> > >>>  
> > >>  
> >  
> >  
> >  
>

Re: Proposal to retroactively mark materialized views experimental

Posted by DuyHai Doan <do...@gmail.com>.

How should we transition a feature from the "experimental" state to the
"production ready" state? On which criteria?



On Sun, Oct 1, 2017 at 12:12 PM, Marcus Eriksson <kr...@gmail.com> wrote:

> I was just thinking that we should try really hard to avoid adding
> experimental features - they are experimental due to lack of testing right?
> There should be a clear path to making the feature non-experimental (or get
> it removed) and having that path discussed on dev@ might give more
> visibility to it.
>
> I'm also struggling a bit to find good historic examples of "this would
> have been better off as an experimental feature" - I used to think that it
> would have been good to commit DTCS with some sort of experimental flag,
> but that would not have made DTCS any better - it would have been better to
> do more testing, realise that it does not work and then not commit it at
> all of course.
>
> Does anyone have good examples of features where it would have made sense
> to commit them behind an experimental flag? SASI might be a good example,
> but for MVs - if we knew how painful they would be, they really would not
> have gotten committed at all, right?
>
> /Marcus
>
> On Sat, Sep 30, 2017 at 7:42 AM, Jeff Jirsa <jj...@gmail.com> wrote:
>
> > Reviewers should be able to suggest when experimental is warranted, and
> > conversation on dev+jira to justify when it’s transitioned from
> > experimental to stable?
> >
> > We should remove the flag as soon as we’re (collectively) confident in a
> > feature’s behavior - at least correctness, if not performance.
> >
> >
> > > On Sep 29, 2017, at 10:31 PM, Marcus Eriksson <kr...@gmail.com>
> wrote:
> > >
> > > +1 on marking MVs experimental, but should there be some point in the
> > > future where we consider removing them from the code base unless they
> > have
> > > gotten significant improvement as well?
> > >
> > > We probably need to enforce some kind of process for adding new
> > > experimental features in the future - perhaps a mail like this one to
> > dev@
> > > motivating why it should be experimental?
> > >
> > > /Marcus
> > >
> > > On Sat, Sep 30, 2017 at 1:15 AM, Vinay Chella
> > <vc...@netflix.com.invalid>
> > > wrote:
> > >
> > >> We tried perf testing MVs internally here but did not see good results
> > with
> > >> it, hence paused its usage. +1 on tagging certain features which are
> not
> > >> PROD ready or not stable enough.
> > >>
> > >> Regards,
> > >> Vinay Chella
> > >>
> > >>> On Fri, Sep 29, 2017 at 7:22 PM, Ben Bromhead <be...@instaclustr.com>
> > wrote:
> > >>>
> > >>> I'm a fan of introducing experimental flags in general as well, +1
> > >>>
> > >>>
> > >>>
> > >>>> On Fri, 29 Sep 2017 at 13:22 Jon Haddad <jo...@jonhaddad.com> wrote:
> > >>>>
> > >>>> I’m very much +1 on this, and to new features in general.
> > >>>>
> > >>>> I think having a clear line in which we classify something as
> > >> production
> > >>>> ready would be nice.  It would be great if committers were using the
> > >>>> feature in prod and could vouch for its stability.
> > >>>>
> > >>>>> On Sep 29, 2017, at 1:09 PM, Blake Eggleston <beggleston@apple.com
> >
> > >>>> wrote:
> > >>>>>
> > >>>>> Hi dev@,
> > >>>>>
> > >>>>> I’d like to propose that we retroactively classify materialized
> views
> > >>> as
> > >>>> an experimental feature, disable them by default, and require users
> to
> > >>>> enable them through a config setting before using.
> > >>>>>
> > >>>>> Materialized views have several issues that make them (effectively)
> > >>>> unusable in production. Some of the issues aren’t just
> implementation
> > >>>> problems, but problems with the design that aren’t easily fixed.
> It’s
> > >>>> unfair of us to make features available to users in this state
> without
> > >>>> providing a clear warning that bad or unexpected things are likely
> to
> > >>>> happen if they use it.
> > >>>>>
> > >>>>> Obviously, this isn’t great news for users that have already
> adopted
> > >>>> MVs, and I don’t have a great answer for that. I think that’s sort
> of
> > a
> > >>>> sunk cost at this point. If they have any MV related problems,
> they’ll
> > >>> have
> > >>>> them whether they’re marked experimental or not. I would expect this
> > to
> > >>>> reduce the number of users adopting MVs in the future though, and if
> > >> they
> > >>>> do, it would be opt-in.
> > >>>>>
> > >>>>> Once MVs reach a point where they’re usable in production, we can
> > >>> remove
> > >>>> the flag. Specifics of how the experimental flag would work can be
> > >>> hammered
> > >>>> out in a forthcoming JIRA, but I’d imagine it would just prevent
> users
> > >>> from
> > >>>> creating new MVs, and maybe log warnings on startup for existing MVs
> > if
> > >>> the
> > >>>> flag isn’t enabled.
> > >>>>>
> > >>>>> Let me know what you think.
> > >>>>>
> > >>>>> Thanks,
> > >>>>>
> > >>>>> Blake
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>> Ben Bromhead
> > >>> CTO | Instaclustr <https://www.instaclustr.com/>
> > >>> +1 650 284 9692
> > >>> Reliability at Scale
> > >>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
> > >>>
> > >>
> >
> >
> >
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jon Haddad <jo...@jonhaddad.com>.

Having now helped a few folks that have put MVs into prod without realizing what they got themselves into, I’m +1 on a flag disabling the feature by default.  A WARN message would not have helped them.


> On Oct 2, 2017, at 10:56 AM, Blake Eggleston <be...@apple.com> wrote:
> 
> Yeah I’m not sure that just emitting a warning is enough. The point is to be super explicit that bad things will happen if you use MVs. I would (in a patch release) disable MV CREATE statements, and emit warnings for ALTER statements and on schema load if they’re not explicitly enabled. Only emitting a warning really reduces visibility where we need it: in the development process.
> 
> By only emitting a warning, we're just protecting users that don't run even rudimentary tests before upgrading their clusters. If an operator is going to blindly deploy a database update to prod without testing, they’re going to poke their eye out on something anyway, whether it’s an MV flag or something else. If we make this change clear in NEWS.txt, and on the user@ list, I think that’s the best thing to do.
> 
> 
> On October 2, 2017 at 10:18:52 AM, Jeremiah D Jordan (jeremiah.jordan@gmail.com) wrote:
> 
> Hindsight is 20/20. For 8099 this is the reason we cut the 2.2 release before 8099 got merged.  
> 
> But moving forward with where we are now, if we are going to start adding some experimental flags to things, then I would definitely put SASI on this list as well.  
> 
> For both SASI and MV I don’t know that adding a flag in the cassandra.yaml which prevents their use is the right way to go. I would propose that we emit WARN from the native protocol mechanism when a user does an ALTER/CREATE or whatever else tries to use an experimental feature, and probably in the system.log as well.  So someone who is starting new development using them will get a warning showing up in cqlsh “hey the thing you just used is experimental, proceed with caution” and also in their logs.
> 
> These things are live on clusters right now, and I would not want someone to upgrade their cluster to a new *patch* release and suddenly something that may have been working for them now does not function. Anyway, we need to be careful about how this gets put into practice if we are going to do it retroactively.  
> 
> -Jeremiah  
> 
> 
>> On Oct 1, 2017, at 5:36 PM, Josh McKenzie <jm...@apache.org> wrote:  
>> 
>>> 
>>> I think committing 8099, or at the very least, parts of it, behind an  
>>> experimental flag would have been the right thing to do.  
>> 
>> With a major refactor like that, it's a staggering amount of extra work to  
>> have a parallel re-write of core components of a storage engine accessible  
>> in parallel to the major based on an experimental flag in the same branch.  
>> I think the complexity in the code-base of having two such channels in  
>> parallel would be an altogether different kind of burden along with making  
>> the work take considerably longer. The argument of modularizing a change  
>> like that, however, is something I can get behind as a matter of general  
>> principle. As we discussed at NGCC, the amount of static state in the C*  
>> code-base makes this an aspirational goal rather than a reality all too  
>> often, unfortunately.  
>> 
>> Not looking to get into the discussion of the appropriateness of 8099 and  
>> other major refactors like it (nio MessagingService for instance) - but  
>> there's a difference between building out new features and shielding the  
>> code-base and users from their complexity and reliability and refactoring  
>> core components of the code-base to keep it relevant.  
>> 
>> On Sun, Oct 1, 2017 at 5:01 PM, Dave Brosius <db...@apache.org> wrote:  
>> 
>>> triggers  
>>> 
>>> 
>>> On 10/01/2017 11:25 AM, Jeff Jirsa wrote:  
>>> 
>>>> Historical examples are anything that you wouldn’t bet your job on for  
>>>> the first release:  
>>>> 
>>>> Udf/uda in 2.2  
>>>> Incremental repair - would have yanked the flag following 9143  
>>>> SASI - probably still experimental  
>>>> Counters - all sorts of correctness issues originally, no longer true  
>>>> since the rewrite in 2.1  
>>>> Vnodes - or at least shuffle  
>>>> CDC - is the API going to change or is it good as-is?  
>>>> CQL - we’re on v3, what’s that say about v1?  
>>>> 
>>>> Basically anything where we can’t definitively say “this feature is going  
>>>> to work for you, build your product on it” because companies around the  
>>>> world are trying to make that determination on their own, and they don’t  
>>>> have the same insight that the active committers have.  
>>>> 
>>>> The transition out we could define as a fixed number of releases or a dev@  
>>>> vote, I don’t think you’ll find something that applies to all experimental  
>>>> features, so being flexible is probably the best bet there  
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
> 
> 
> 



Re: Proposal to retroactively mark materialized views experimental

Posted by Blake Eggleston <be...@apple.com>.

Yeah I’m not sure that just emitting a warning is enough. The point is to be super explicit that bad things will happen if you use MVs. I would (in a patch release) disable MV CREATE statements, and emit warnings for ALTER statements and on schema load if they’re not explicitly enabled. Only emitting a warning really reduces visibility where we need it: in the development process.

By only emitting a warning, we're just protecting users that don't run even rudimentary tests before upgrading their clusters. If an operator is going to blindly deploy a database update to prod without testing, they’re going to poke their eye out on something anyway, whether it’s an MV flag or something else. If we make this change clear in NEWS.txt, and on the user@ list, I think that’s the best thing to do.
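
To make the proposed behaviour concrete, here is a minimal Java sketch of it, not actual Cassandra code. It assumes a single operator-controlled boolean (standing in for a yaml option) and shows the three cases described above: CREATE is refused when the flag is off, while ALTER and schema load still succeed but record a warning. The names MaterializedViewGate, FeatureDisabledException, viewsEnabled, and the message texts are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: class, field, and message names are illustrative and
// are not taken from the Cassandra code base.
final class MaterializedViewGate {
    /** Raised when a statement is refused because the feature is switched off. */
    static final class FeatureDisabledException extends RuntimeException {
        FeatureDisabledException(String message) {
            super(message);
        }
    }

    private final boolean viewsEnabled; // stands in for an operator-set yaml option
    private final List<String> warnings = new ArrayList<>();

    MaterializedViewGate(boolean viewsEnabled) {
        this.viewsEnabled = viewsEnabled;
    }

    /** CREATE is refused outright unless the operator has opted in. */
    void onCreateView(String viewName) {
        if (!viewsEnabled) {
            throw new FeatureDisabledException(
                "Materialized views are experimental and disabled; cannot create "
                + viewName + ". Enable them in the node configuration first.");
        }
    }

    /** ALTER on an existing view still succeeds, but warns when not opted in. */
    void onAlterView(String viewName) {
        warnIfNotEnabled("ALTER on experimental materialized view " + viewName);
    }

    /** Existing views keep loading at startup, with a warning when not opted in. */
    void onSchemaLoad(List<String> existingViews) {
        for (String viewName : existingViews) {
            warnIfNotEnabled("Loaded experimental materialized view " + viewName);
        }
    }

    private void warnIfNotEnabled(String message) {
        if (!viewsEnabled) {
            warnings.add("WARN: " + message);
        }
    }

    /** Warnings recorded so far, in emission order. */
    List<String> warnings() {
        return Collections.unmodifiableList(warnings);
    }
}
```

The point of the sketch is that existing views are never broken by an upgrade: only the creation path can fail, and it fails with a message that says how to opt in.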

On October 2, 2017 at 10:18:52 AM, Jeremiah D Jordan (jeremiah.jordan@gmail.com) wrote:

Hindsight is 20/20. For 8099 this is the reason we cut the 2.2 release before 8099 got merged.  

But moving forward with where we are now, if we are going to start adding some experimental flags to things, then I would definitely put SASI on this list as well.  

For both SASI and MV I don’t know that adding a flag in the cassandra.yaml which prevents their use is the right way to go. I would propose that we emit WARN from the native protocol mechanism when a user does an ALTER/CREATE or whatever else tries to use an experimental feature, and probably in the system.log as well.  So someone who is starting new development using them will get a warning showing up in cqlsh “hey the thing you just used is experimental, proceed with caution” and also in their logs.

These things are live on clusters right now, and I would not want someone to upgrade their cluster to a new *patch* release and suddenly something that may have been working for them now does not function. Anyway, we need to be careful about how this gets put into practice if we are going to do it retroactively.  

-Jeremiah  

> On Oct 1, 2017, at 5:36 PM, Josh McKenzie <jm...@apache.org> wrote:  
>  
>>  
>> I think committing 8099, or at the very least, parts of it, behind an  
>> experimental flag would have been the right thing to do.  
>  
> With a major refactor like that, it's a staggering amount of extra work to  
> have a parallel re-write of core components of a storage engine accessible  
> in parallel to the major based on an experimental flag in the same branch.  
> I think the complexity in the code-base of having two such channels in  
> parallel would be an altogether different kind of burden along with making  
> the work take considerably longer. The argument of modularizing a change  
> like that, however, is something I can get behind as a matter of general  
> principle. As we discussed at NGCC, the amount of static state in the C*  
> code-base makes this an aspirational goal rather than a reality all too  
> often, unfortunately.  
>  
> Not looking to get into the discussion of the appropriateness of 8099 and  
> other major refactors like it (nio MessagingService for instance) - but  
> there's a difference between building out new features and shielding the  
> code-base and users from their complexity and reliability and refactoring  
> core components of the code-base to keep it relevant.  
>  
> On Sun, Oct 1, 2017 at 5:01 PM, Dave Brosius <db...@apache.org> wrote:  
>  
>> triggers  
>>  
>>  
>> On 10/01/2017 11:25 AM, Jeff Jirsa wrote:  
>>  
>>> Historical examples are anything that you wouldn’t bet your job on for  
>>> the first release:  
>>>  
>>> Udf/uda in 2.2  
>>> Incremental repair - would have yanked the flag following 9143  
>>> SASI - probably still experimental  
>>> Counters - all sorts of correctness issues originally, no longer true  
>>> since the rewrite in 2.1  
>>> Vnodes - or at least shuffle  
>>> CDC - is the API going to change or is it good as-is?  
>>> CQL - we’re on v3, what’s that say about v1?  
>>>  
>>> Basically anything where we can’t definitively say “this feature is going  
>>> to work for you, build your product on it” because companies around the  
>>> world are trying to make that determination on their own, and they don’t  
>>> have the same insight that the active committers have.  
>>>  
>>> The transition out we could define as a fixed number of releases or a dev@  
>>> vote, I don’t think you’ll find something that applies to all experimental  
>>> features, so being flexible is probably the best bet there  
>>>  
>>>  
>>>  
>>  
>>  
>>  


Re: Proposal to retroactively mark materialized views experimental

Posted by Jon Haddad <jo...@jonhaddad.com>.

Developers are also not the only people that are able to make decisions.  Keeping it in the YAML means an operator can disable it vs a developer *maybe* seeing the warning.  Keep in mind not everyone creates tables through CQLSH.

> On Oct 2, 2017, at 2:05 PM, Blake Eggleston <be...@apple.com> wrote:
> 
> The message isn't materially different, but it will reach fewer people, later. People typically aren't as attentive to logs as they should be. Developers finding out about new warnings in the logs later than they could have, sometimes even after it's been deployed, is not uncommon. It's happened to me. Requiring a flag will reach everyone trying to use MVs as soon as they start developing against MVs. Logging a warning will reach a subset of users at some point, hopefully. The only downside I can think of for the flag is that it's not as polite.
> 
> On October 2, 2017 at 1:16:10 PM, Josh McKenzie (jmckenzie@apache.org) wrote:
> 
> "Nobody is talking about removing MVs."  
> Not precisely true for this email thread:  
> 
> "but should there be some point in the  
> future where we consider removing them from the code base unless they have  
> gotten significant improvement as well?"  
> 
> IMO a .yaml change requirement isn't materially different than barfing a  
> warning on someone's screen during the dev process when they use the DDL  
> for MV's. At the end of the day, it's just a question of how forceful you  
> want that messaging to be. If the cqlsh client prints 'THIS FEATURE IS NOT  
> READY' in big bold letters, that's not going to miscommunicate to a user  
> that 'feature X is ready' when it's not.  
> 
> Much like w/SASI, this is something that's in the code-base that for  
> certain use-cases apparently works just fine. Might be worth considering  
> the approach of making boundaries around those use-cases more rigid instead  
> of throwing the baby out with the bathwater.  
> 
> On Mon, Oct 2, 2017 at 3:32 PM, DuyHai Doan <do...@gmail.com> wrote:  
> 
>> Ok so IF there is a flag to enable MV (à-la UDA/UDF in cassandra.yaml) then  
>> I'm fine with it. I initially understood that we wanted to disable it  
>> definitively. Maybe we should then add an explicit error message when MV is  
>> disabled and someone tries to use it, something like:  
>> 
>> "MV has been disabled, to enable it, turn on the flag xxxx in  
>> cassandra.yaml" so users don't spend 3h searching around  
>> 
>> 
>> On Mon, Oct 2, 2017 at 9:07 PM, Jon Haddad <jo...@jonhaddad.com> wrote:  
>> 
>>> There’s a big difference between removal of a protocol that every single  
>>> C* user had to use and disabling a feature which is objectively broken  
>> and  
>>> almost nobody is using. Nobody is talking about removing MVs. If you  
>> want  
>>> to use them you can enable them very trivially, but it should be an  
>>> explicit option because they really aren’t ready for general use.  
>>> 
>>> Claiming disabling by default == removal is not helpful to the  
>>> conversation and is very misleading.  
>>> 
>>> Let’s be practical here. The people that are most likely to put MVs in  
>>> production right now are people new to Cassandra that don’t know any  
>>> better. The people that *should* be using MVs are the contributors to  
>> the  
>>> project. People that actually wrote Cassandra code that can do a patch  
>> and  
>>> push it into prod, and get it submitted upstream when they fix something.  
>>> Yes, a lot of this stuff requires production usage to shake out the bugs,  
>>> that’s fine, but we shouldn’t lie to people and say “feature X is ready”  
>>> when it’s not. That’s a great way to get a reputation as “unstable” or  
>>> “not fit for production."  
>>> 
>>> Jon  
>>> 
>>> 
>>>> On Oct 2, 2017, at 11:54 AM, DuyHai Doan <do...@gmail.com> wrote:  
>>>> 
>>>> "I would (in a patch release) disable MV CREATE statements, and emit  
>>>> warnings for ALTER statements and on schema load if they’re not  
>>> explicitly  
>>>> enabled"  
>>>> 
>>>> --> I find this pretty extreme. Now we have an existing feature sitting  
>>>> there in the base code but forbidden from version xxx onward.  
>>>> 
>>>> Since when do we start removing feature in a patch release ?  
>> (forbidding  
>>> to  
>>>> create new MV == removing the feature, defacto)  
>>>> 
>>>> Even the Thrift protocol has gone through a long process of deprecation  
>>> and  
>>>> will be removed on 4.0  
>>>> 
>>>> And if we start opening the Pandora box like this, what's next ?  
>>> Forbidding  
>>>> to create SASI index too ? Removing Vnodes ?  
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <  
>>> jeremiah.jordan@gmail.com  
>>>>> wrote:  
>>>> 
>>>>>> Only emitting a warning really reduces visibility where we need it:  
>> in  
>>>>> the development process.  
>>>>> 
>>>>> How does emitting a native protocol warning reduce visibility during  
>> the  
>>>>> development process? If you run CREATE MV and cqlsh then prints out a  
>>>>> giant warning statement about how it is an experimental feature I  
>> think  
>>>>> that is pretty visible during development?  
>>>>> 
>>>>> I guess I can see just blocking new ones without a flag set, but we  
>> need  
>>>>> to be careful here. We need to make sure we don’t cause a problem for  
>>>>> someone that is using them currently, even with all the edge cases  
>>> issues  
>>>>> they have now.  
>>>>> 
>>>>> -Jeremiah  
>>>>> 
>>>>> 
>>>>>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>  
>>>>> wrote:  
>>>>>> 
>>>>>> Yeah, I'm not proposing that we disable MVs in existing clusters.  
>>>>>> 
>>>>>> 
>>>>>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (  
>>> aleksey@apple.com)  
>>>>> wrote:  
>>>>>> 
>>>>>> The idea is to check the flag in CreateViewStatement, so creation of  
>>> new  
>>>>> MVs doesn’t succeed without that flag flipped.  
>>>>>> 
>>>>>> Obviously, just disabling existing MVs working in a minor would be  
>>> silly.  
>>>>>> 
>>>>>> As for the warning - yes, that should also be emitted.  
>> Unconditionally.  
>>>>>> 
>>>>>> —  
>>>>>> AY  
>>>>>> 
>>>>>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (  
>>>>> jeremiah.jordan@gmail.com) wrote:  
>>>>>> 
>>>>>> These things are live on clusters right now, and I would not want  
>>>>> someone to upgrade their cluster to a new *patch* release and suddenly  
>>>>> something that may have been working for them now does not function.  
>>>>> Anyway, we need to be careful about how this gets put into practice if  
>>> we  
>>>>> are going to do it retroactively.  
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>>> 
>>> 
>> 



Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

It is different because it allows the operators to set those boundaries rather than rely on users doing the right thing.

So ideally you’d have the flag (and given the state of MVs, both design and the implementation, I’d argue the default should be NOPE) - for the operators, and a warning in cqlsh - for the developers.

And no, we aren’t talking about throwing them out. But if we don’t manage to address their issues in a couple of releases, then we should at least consider removal eventually? At least https://issues.apache.org/jira/browse/CASSANDRA-10346 needs to be addressed IMO.

—
AY

On 2 October 2017 at 21:16:17, Josh McKenzie (jmckenzie@apache.org) wrote:

"Nobody is talking about removing MVs."  
Not precisely true for this email thread:  

"but should there be some point in the  
future where we consider removing them from the code base unless they have  
gotten significant improvement as well?"  

IMO a .yaml change requirement isn't materially different than barfing a  
warning on someone's screen during the dev process when they use the DDL  
for MV's. At the end of the day, it's just a question of how forceful you  
want that messaging to be. If the cqlsh client prints 'THIS FEATURE IS NOT  
READY' in big bold letters, that's not going to miscommunicate to a user  
that 'feature X is ready' when it's not.  

Much like w/SASI, this is something that's in the code-base that for  
certain use-cases apparently works just fine. Might be worth considering  
the approach of making boundaries around those use-cases more rigid instead  
of throwing the baby out with the bathwater.  

On Mon, Oct 2, 2017 at 3:32 PM, DuyHai Doan <do...@gmail.com> wrote:  

> Ok so IF there is a flag to enable MV (à-la UDA/UDF in cassandra.yaml) then  
> I'm fine with it. I initially understood that we wanted to disable it  
> definitively. Maybe we should then add an explicit error message when MV is  
> disabled and someone tries to use it, something like:  
>  
> "MV has been disabled, to enable it, turn on the flag xxxx in  
> cassandra.yaml" so users don't spend 3h searching around  
>  
>  
> On Mon, Oct 2, 2017 at 9:07 PM, Jon Haddad <jo...@jonhaddad.com> wrote:  
>  
> > There’s a big difference between removal of a protocol that every single  
> > C* user had to use and disabling a feature which is objectively broken  
> and  
> > almost nobody is using. Nobody is talking about removing MVs. If you  
> want  
> > to use them you can enable them very trivially, but it should be an  
> > explicit option because they really aren’t ready for general use.  
> >  
> > Claiming disabling by default == removal is not helpful to the  
> > conversation and is very misleading.  
> >  
> > Let’s be practical here. The people that are most likely to put MVs in  
> > production right now are people new to Cassandra that don’t know any  
> > better. The people that *should* be using MVs are the contributors to  
> the  
> > project. People that actually wrote Cassandra code that can do a patch  
> and  
> > push it into prod, and get it submitted upstream when they fix something.  
> > Yes, a lot of this stuff requires production usage to shake out the bugs,  
> > that’s fine, but we shouldn’t lie to people and say “feature X is ready”  
> > when it’s not. That’s a great way to get a reputation as “unstable” or  
> > “not fit for production."  
> >  
> > Jon  
> >  
> >  
> > > On Oct 2, 2017, at 11:54 AM, DuyHai Doan <do...@gmail.com> wrote:  
> > >  
> > > "I would (in a patch release) disable MV CREATE statements, and emit  
> > > warnings for ALTER statements and on schema load if they’re not  
> > explicitly  
> > > enabled"  
> > >  
> > > --> I find this pretty extreme. Now we have an existing feature sitting  
> > > there in the base code but forbidden from version xxx onward.  
> > >  
> > > Since when do we start removing feature in a patch release ?  
> (forbidding  
> > to  
> > > create new MV == removing the feature, defacto)  
> > >  
> > > Even the Thrift protocol has gone through a long process of deprecation  
> > and  
> > > will be removed on 4.0  
> > >  
> > > And if we start opening the Pandora box like this, what's next ?  
> > Forbidding  
> > > to create SASI index too ? Removing Vnodes ?  
> > >  
> > >  
> > >  
> > >  
> > > On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <  
> > jeremiah.jordan@gmail.com  
> > >> wrote:  
> > >  
> > >>> Only emitting a warning really reduces visibility where we need it:  
> in  
> > >> the development process.  
> > >>  
> > >> How does emitting a native protocol warning reduce visibility during  
> the  
> > >> development process? If you run CREATE MV and cqlsh then prints out a  
> > >> giant warning statement about how it is an experimental feature I  
> think  
> > >> that is pretty visible during development?  
> > >>  
> > >> I guess I can see just blocking new ones without a flag set, but we  
> need  
> > >> to be careful here. We need to make sure we don’t cause a problem for  
> > >> someone that is using them currently, even with all the edge cases  
> > issues  
> > >> they have now.  
> > >>  
> > >> -Jeremiah  
> > >>  
> > >>  
> > >>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>  
> > >> wrote:  
> > >>>  
> > >>> Yeah, I'm not proposing that we disable MVs in existing clusters.  
> > >>>  
> > >>>  
> > >>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (  
> > aleksey@apple.com)  
> > >> wrote:  
> > >>>  
> > >>> The idea is to check the flag in CreateViewStatement, so creation of  
> > new  
> > >> MVs doesn’t succeed without that flag flipped.  
> > >>>  
> > >>> Obviously, just disabling existing MVs working in a minor would be  
> > silly.  
> > >>>  
> > >>> As for the warning - yes, that should also be emitted.  
> Unconditionally.  
> > >>>  
> > >>> —  
> > >>> AY  
> > >>>  
> > >>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (  
> > >> jeremiah.jordan@gmail.com) wrote:  
> > >>>  
> > >>> These things are live on clusters right now, and I would not want  
> > >> someone to upgrade their cluster to a new *patch* release and suddenly  
> > >> something that may have been working for them now does not function.  
> > >> Anyway, we need to be careful about how this gets put into practice if  
> > we  
> > >> are going to do it retroactively.  
> > >>  
> > >>  
> > >>  
> > >>  
> >  
> >  
> >  
> >  
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

Yep. And that would be nice to have in addition to an opt-in flag in the yaml, which gives operators something stricter than a warning.

—
AY

On 2 October 2017 at 22:21:33, Jeremiah D Jordan (jeremiah@datastax.com) wrote:

We are not saying to just put something in the logs; we are talking about the warning actually showing up in cqlsh.
When the server issues a native protocol warning, cqlsh prints it on the console in front of you, in the results of the query.
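
The combination discussed here (an opt-in flag checked when a view is created, plus a warning that is always emitted on success) could be sketched roughly as follows. This is only an illustration in Python, not Cassandra code: the option name `experimental_views_enabled`, the `Config` and `Result` classes, and the message wording are all assumptions made for the example.

```python
from dataclasses import dataclass, field


class InvalidRequest(Exception):
    """Raised when a statement is rejected outright."""


@dataclass
class Config:
    # Hypothetical option name; the thread does not settle on one.
    experimental_views_enabled: bool = False


@dataclass
class Result:
    # Stand-in for warnings carried back to the client with the response.
    warnings: list = field(default_factory=list)


def create_view(config: Config, keyspace: str, view: str) -> Result:
    """Reject new views unless the operator opted in; always warn on success."""
    if not config.experimental_views_enabled:
        # An explicit message, so users do not have to search for the switch.
        raise InvalidRequest(
            "Materialized views are disabled. To enable them, set "
            "experimental_views_enabled: true in cassandra.yaml"
        )
    result = Result()
    # Emitted unconditionally, even when the flag is on.
    result.warnings.append(
        f"Materialized view '{keyspace}.{view}' created. "
        "Materialized views are experimental."
    )
    return result
```

Only the creation path is gated in this sketch; nothing here touches views that already exist, which matches the concern about not breaking clusters on upgrade.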

Re: Proposal to retroactively mark materialized views experimental

Posted by Blake Eggleston <be...@apple.com>.

Yes, I understand what you're saying. The points I'm making about logs still apply. It's possible for drivers and object mappers to handle queries and schema changes, and have developers rarely open cqlsh. It's also not uncommon for schema changes to be done by a different group than the developers writing the application.
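
A toy illustration of that point, in Python. Nothing here is a real driver or mapper API; `execute`, `Response` and `mapper_create_view` are made up for the example. It assumes a mapper that only returns rows: in that case a warning attached to the response never reaches the developer, while a rejected statement cannot be missed.

```python
from dataclasses import dataclass, field


class InvalidRequest(Exception):
    """Stand-in for a server-side rejection of a statement."""


@dataclass
class Response:
    rows: list = field(default_factory=list)
    warnings: list = field(default_factory=list)


def execute(statement: str, views_enabled: bool, warn_only: bool) -> Response:
    """Toy server: either warn about CREATE MATERIALIZED VIEW or reject it."""
    if statement.upper().startswith("CREATE MATERIALIZED VIEW"):
        if not warn_only and not views_enabled:
            raise InvalidRequest("materialized views are disabled")
        return Response(warnings=["materialized views are experimental"])
    return Response()


def mapper_create_view(statement: str, **server_opts) -> list:
    """Toy object mapper: returns rows and never looks at warnings."""
    return execute(statement, **server_opts).rows
```

With `warn_only=True` the call returns an empty list and the warning is silently dropped. With `warn_only=False` and `views_enabled=False` the same call raises `InvalidRequest`, so whoever runs the schema change finds out immediately.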

On October 2, 2017 at 2:21:38 PM, Jeremiah D Jordan (jeremiah@datastax.com) wrote:

Blake,  
We are not saying to just put something in the logs; we are talking about the warning actually showing up in cqlsh.
When the server issues a native protocol warning, cqlsh prints it on the console in front of you, in the results of the query.
https://issues.apache.org/jira/browse/CASSANDRA-8930

For example for SASI it would look something like:  


cqlsh:ks> CREATE CUSTOM INDEX ON sasi_table (c) USING 'org.apache.cassandra.index.sasi.SASIIndex';  

Warnings :  
A SASI index was enabled for 'ks.sasi_table'. SASI is still experimental, take extra caution when using it in production.

cqlsh:ks>  

-Jeremiah  


Re: Proposal to retroactively mark materialized views experimental

Posted by Ben Bromhead <be...@instaclustr.com>.

Experimental / feature flags in the yaml file are a far better choice for
operators.

Explicit opt-in for "dangerous" features would greatly help protect new
users as well. Plenty of users ignore batch size warnings, tombstone
warnings, etc. A soft, warnings-only approach would not achieve the goals of
the original proposal.


On Mon, 2 Oct 2017 at 15:25 Voytek Jarnot <vo...@gmail.com> wrote:

> If a user (vs Cassandra dev) perspective is welcome - I'd recommend
> similarly identifying experimental features in the DESCRIBE / DESC cqlsh
> output as well.
>
-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer

Re: Proposal to retroactively mark materialized views experimental

Posted by Voytek Jarnot <vo...@gmail.com>.

If a user (vs Cassandra dev) perspective is welcome - I'd recommend
similarly identifying experimental features in the DESCRIBE / DESC cqlsh
output as well.
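
As a rough idea of what that could look like, here is a small Python sketch that puts a CQL comment in front of the DDL for experimental object kinds. It is not how cqlsh renders DESCRIBE today; the `describe` helper, the `EXPERIMENTAL_KINDS` set and the comment wording are all hypothetical.

```python
# Object kinds treated as experimental; an assumption for this sketch.
EXPERIMENTAL_KINDS = {"MATERIALIZED VIEW"}


def describe(kind: str, name: str, ddl: str) -> str:
    """Return the DDL, preceded by a CQL comment for experimental kinds."""
    lines = []
    if kind.upper() in EXPERIMENTAL_KINDS:
        # "--" starts a line comment in CQL, so the output stays replayable.
        lines.append(
            f"-- WARNING: {name} is a {kind.lower()}, an experimental feature"
        )
    lines.append(ddl)
    return "\n".join(lines)
```

Keeping the marker as a comment means the DESCRIBE output could still be pasted back into cqlsh unchanged.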

On Mon, Oct 2, 2017 at 4:21 PM, Jeremiah D Jordan <je...@datastax.com>
wrote:

> Blake,
> We are not saying to just put something in the logs; we are talking about
> the warning actually showing up in cqlsh.
> When the server issues a native protocol warning, cqlsh prints it on the
> console in front of you, in the results of the query.
> https://issues.apache.org/jira/browse/CASSANDRA-8930
>
> For example for SASI it would look something like:
>
>
> cqlsh:ks> CREATE CUSTOM INDEX ON sasi_table (c) USING
> 'org.apache.cassandra.index.sasi.SASIIndex';
>
> Warnings :
> A SASI index was enabled for 'ks.sasi_table'. SASI is still experimental,
> take extra caution when using it in production.
>
> cqlsh:ks>
>
> -Jeremiah
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jeremiah D Jordan <je...@datastax.com>.

Blake,
We are not saying to just put something in the logs; we are talking about the warning actually showing up in cqlsh.
When the server issues a native protocol warning, cqlsh prints it on the console in front of you, in the results of the query.
https://issues.apache.org/jira/browse/CASSANDRA-8930

For example for SASI it would look something like:


cqlsh:ks> CREATE CUSTOM INDEX ON sasi_table (c) USING 'org.apache.cassandra.index.sasi.SASIIndex';

Warnings :
A SASI index was enabled for 'ks.sasi_table'. SASI is still experimental, take extra caution when using it in production.

cqlsh:ks>

-Jeremiah

> On Oct 2, 2017, at 5:05 PM, Blake Eggleston <be...@apple.com> wrote:
> 
> The message isn't materially different, but it will reach fewer people, later. People typically aren't as attentive to logs as they should be. Developers finding out about new warnings in the logs later than they could have, sometimes even after it's been deployed, is not uncommon. It's happened to me. Requiring a flag will reach everyone trying to use MVs as soon as they start developing against MVs. Logging a warning will reach a subset of users at some point, hopefully. The only downside I can think of for the flag is that it's not as polite.
> 
> On October 2, 2017 at 1:16:10 PM, Josh McKenzie (jmckenzie@apache.org) wrote:
> 
> "Nobody is talking about removing MVs."  
> Not precisely true for this email thread:  
> 
> "but should there be some point in the  
> future where we consider removing them from the code base unless they have  
> gotten significant improvement as well?"  
> 
> IMO a .yaml change requirement isn't materially different than barfing a  
> warning on someone's screen during the dev process when they use the DDL  
> for MV's. At the end of the day, it's just a question of how forceful you  
> want that messaging to be. If the cqlsh client prints 'THIS FEATURE IS NOT  
> READY' in big bold letters, that's not going to miscommunicate to a user  
> that 'feature X is ready' when it's not.  
> 
> Much like w/SASI, this is something that's in the code-base that for  
> certain use-cases apparently works just fine. Might be worth considering  
> the approach of making boundaries around those use-cases more rigid instead  
> of throwing the baby out with the bathwater.  
> 
> On Mon, Oct 2, 2017 at 3:32 PM, DuyHai Doan <do...@gmail.com> wrote:  
> 
>> Ok so IF there is a flag to enable MV (à-la UDA/UDF in cassandra.yaml) then  
>> I'm fine with it. I initially understood that we wanted to disable it  
>> definitively. Maybe we should then add an explicit error message when MV is  
>> disabled and someone tries to use it, something like:  
>> 
>> "MV has been disabled, to enable it, turn on the flag xxxx in  
>> cassandra.yaml" so users don't spend 3h searching around  
>> 
>> 
>> On Mon, Oct 2, 2017 at 9:07 PM, Jon Haddad <jo...@jonhaddad.com> wrote:  
>> 
>>> There’s a big difference between removal of a protocol that every single  
>>> C* user had to use and disabling a feature which is objectively broken  
>> and  
>>> almost nobody is using. Nobody is talking about removing MVs. If you  
>> want  
>>> to use them you can enable them very trivially, but it should be an  
>>> explicit option because they really aren’t ready for general use.  
>>> 
>>> Claiming disabling by default == removal is not helpful to the  
>>> conversation and is very misleading.  
>>> 
>>> Let’s be practical here. The people that are most likely to put MVs in  
>>> production right now are people new to Cassandra that don’t know any  
>>> better. The people that *should* be using MVs are the contributors to  
>> the  
>>> project. People that actually wrote Cassandra code that can do a patch  
>> and  
>>> push it into prod, and get it submitted upstream when they fix something.  
>>> Yes, a lot of this stuff requires production usage to shake out the bugs,  
>>> that’s fine, but we shouldn’t lie to people and say “feature X is ready”  
>>> when it’s not. That’s a great way to get a reputation as “unstable” or  
>>> “not fit for production."  
>>> 
>>> Jon  
>>> 
>>> 
>>>> On Oct 2, 2017, at 11:54 AM, DuyHai Doan <do...@gmail.com> wrote:  
>>>> 
>>>> "I would (in a patch release) disable MV CREATE statements, and emit  
>>>> warnings for ALTER statements and on schema load if they’re not  
>>> explicitly  
>>>> enabled"  
>>>> 
>>>> --> I find this pretty extreme. Now we have an existing feature sitting  
>>>> there in the base code but forbidden from version xxx onward.  
>>>> 
>>>> Since when do we start removing feature in a patch release ?  
>> (forbidding  
>>> to  
>>>> create new MV == removing the feature, defacto)  
>>>> 
>>>> Even the Thrift protocol has gone through a long process of deprecation  
>>> and  
>>>> will be removed on 4.0  
>>>> 
>>>> And if we start opening the Pandora box like this, what's next ?  
>>> Forbidding  
>>>> to create SASI index too ? Removing Vnodes ?  
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <  
>>> jeremiah.jordan@gmail.com  
>>>>> wrote:  
>>>> 
>>>>>> Only emitting a warning really reduces visibility where we need it:  
>> in  
>>>>> the development process.  
>>>>> 
>>>>> How does emitting a native protocol warning reduce visibility during  
>> the  
>>>>> development process? If you run CREATE MV and cqlsh then prints out a  
>>>>> giant warning statement about how it is an experimental feature I  
>> think  
>>>>> that is pretty visible during development?  
>>>>> 
>>>>> I guess I can see just blocking new ones without a flag set, but we  
>> need  
>>>>> to be careful here. We need to make sure we don’t cause a problem for  
>>>>> someone that is using them currently, even with all the edge cases  
>>> issues  
>>>>> they have now.  
>>>>> 
>>>>> -Jeremiah  
>>>>> 
>>>>> 
>>>>>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>  
>>>>> wrote:  
>>>>>> 
>>>>>> Yeah, I'm not proposing that we disable MVs in existing clusters.  
>>>>>> 
>>>>>> 
>>>>>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (  
>>> aleksey@apple.com)  
>>>>> wrote:  
>>>>>> 
>>>>>> The idea is to check the flag in CreateViewStatement, so creation of  
>>> new  
>>>>> MVs doesn’t succeed without that flag flipped.  
>>>>>> 
>>>>>> Obviously, just disabling existing MVs working in a minor would be  
>>> silly.  
>>>>>> 
>>>>>> As for the warning - yes, that should also be emitted.  
>> Unconditionally.  
>>>>>> 
>>>>>> —  
>>>>>> AY  
>>>>>> 
>>>>>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (  
>>>>> jeremiah.jordan@gmail.com) wrote:  
>>>>>> 
>>>>>> These things are live on clusters right now, and I would not want  
>>>>> someone to upgrade their cluster to a new *patch* release and suddenly  
>>>>> something that may have been working for them now does not function.  
>>>>> Anyway, we need to be careful about how this gets put into practice if  
>>> we  
>>>>> are going to do it retroactively.  
>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------  
>>>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org  
>>>>> For additional commands, e-mail: dev-help@cassandra.apache.org  
>>>>> 
>>>>> 
>>> 
>>> 
>>> 
>>> 
>>

Re: Proposal to retroactively mark materialized views experimental

Posted by Blake Eggleston <be...@apple.com>.

The message isn't materially different, but it will reach fewer people, later. People typically aren't as attentive to logs as they should be. It's not uncommon for developers to find out about new warnings in the logs later than they could have, sometimes even after their application has been deployed. It's happened to me. Requiring a flag will reach everyone trying to use MVs as soon as they start developing against MVs. Logging a warning will reach a subset of users at some point, hopefully. The only downside I can think of for the flag is that it's not as polite.

On October 2, 2017 at 1:16:10 PM, Josh McKenzie (jmckenzie@apache.org) wrote:

"Nobody is talking about removing MVs."  
Not precisely true for this email thread:  

"but should there be some point in the  
future where we consider removing them from the code base unless they have  
gotten significant improvement as well?"  

IMO a .yaml change requirement isn't materially different than barfing a  
warning on someone's screen during the dev process when they use the DDL  
for MV's. At the end of the day, it's just a question of how forceful you  
want that messaging to be. If the cqlsh client prints 'THIS FEATURE IS NOT  
READY' in big bold letters, that's not going to miscommunicate to a user  
that 'feature X is ready' when it's not.  

Much like w/SASI, this is something that's in the code-base that for  
certain use-cases apparently works just fine. Might be worth considering  
the approach of making boundaries around those use-cases more rigid instead  
of throwing the baby out with the bathwater.  

On Mon, Oct 2, 2017 at 3:32 PM, DuyHai Doan <do...@gmail.com> wrote:  

> Ok so IF there is a flag to enable MV (à-la UDA/UDF in cassandra.yaml) then  
> I'm fine with it. I initially understood that we wanted to disable it  
> definitively. Maybe we should then add an explicit error message when MV is  
> disabled and someone tries to use it, something like:  
>  
> "MV has been disabled, to enable it, turn on the flag xxxx in  
> cassandra.yaml" so users don't spend 3h searching around  
>  
>  
> On Mon, Oct 2, 2017 at 9:07 PM, Jon Haddad <jo...@jonhaddad.com> wrote:  
>  
> > There’s a big difference between removal of a protocol that every single  
> > C* user had to use and disabling a feature which is objectively broken  
> and  
> > almost nobody is using. Nobody is talking about removing MVs. If you  
> want  
> > to use them you can enable them very trivially, but it should be an  
> > explicit option because they really aren’t ready for general use.  
> >  
> > Claiming disabling by default == removal is not helpful to the  
> > conversation and is very misleading.  
> >  
> > Let’s be practical here. The people that are most likely to put MVs in  
> > production right now are people new to Cassandra that don’t know any  
> > better. The people that *should* be using MVs are the contributors to  
> the  
> > project. People that actually wrote Cassandra code that can do a patch  
> and  
> > push it into prod, and get it submitted upstream when they fix something.  
> > Yes, a lot of this stuff requires production usage to shake out the bugs,  
> > that’s fine, but we shouldn’t lie to people and say “feature X is ready”  
> > when it’s not. That’s a great way to get a reputation as “unstable” or  
> > “not fit for production."  
> >  
> > Jon  
> >  
> >  
> > > On Oct 2, 2017, at 11:54 AM, DuyHai Doan <do...@gmail.com> wrote:  
> > >  
> > > "I would (in a patch release) disable MV CREATE statements, and emit  
> > > warnings for ALTER statements and on schema load if they’re not  
> > explicitly  
> > > enabled"  
> > >  
> > > --> I find this pretty extreme. Now we have an existing feature sitting  
> > > there in the base code but forbidden from version xxx onward.  
> > >  
> > > Since when do we start removing feature in a patch release ?  
> (forbidding  
> > to  
> > > create new MV == removing the feature, defacto)  
> > >  
> > > Even the Thrift protocol has gone through a long process of deprecation  
> > and  
> > > will be removed on 4.0  
> > >  
> > > And if we start opening the Pandora box like this, what's next ?  
> > Forbidding  
> > > to create SASI index too ? Removing Vnodes ?  
> > >  
> > >  
> > >  
> > >  
> > > On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <  
> > jeremiah.jordan@gmail.com  
> > >> wrote:  
> > >  
> > >>> Only emitting a warning really reduces visibility where we need it:  
> in  
> > >> the development process.  
> > >>  
> > >> How does emitting a native protocol warning reduce visibility during  
> the  
> > >> development process? If you run CREATE MV and cqlsh then prints out a  
> > >> giant warning statement about how it is an experimental feature I  
> think  
> > >> that is pretty visible during development?  
> > >>  
> > >> I guess I can see just blocking new ones without a flag set, but we  
> need  
> > >> to be careful here. We need to make sure we don’t cause a problem for  
> > >> someone that is using them currently, even with all the edge cases  
> > issues  
> > >> they have now.  
> > >>  
> > >> -Jeremiah  
> > >>  
> > >>  
> > >>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>  
> > >> wrote:  
> > >>>  
> > >>> Yeah, I'm not proposing that we disable MVs in existing clusters.  
> > >>>  
> > >>>  
> > >>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (  
> > aleksey@apple.com)  
> > >> wrote:  
> > >>>  
> > >>> The idea is to check the flag in CreateViewStatement, so creation of  
> > new  
> > >> MVs doesn’t succeed without that flag flipped.  
> > >>>  
> > >>> Obviously, just disabling existing MVs working in a minor would be  
> > silly.  
> > >>>  
> > >>> As for the warning - yes, that should also be emitted.  
> Unconditionally.  
> > >>>  
> > >>> —  
> > >>> AY  
> > >>>  
> > >>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (  
> > >> jeremiah.jordan@gmail.com) wrote:  
> > >>>  
> > >>> These things are live on clusters right now, and I would not want  
> > >> someone to upgrade their cluster to a new *patch* release and suddenly  
> > >> something that may have been working for them now does not function.  
> > >> Anyway, we need to be careful about how this gets put into practice if  
> > we  
> > >> are going to do it retroactively.  
> > >>  
> > >>  
> > >>  
> > >>  
> >  
> >  
> >  
> >  
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

"Nobody is talking about removing MVs."
Not precisely true for this email thread:

"but should there be some point in the
future where we consider removing them from the code base unless they have
gotten significant improvement as well?"

IMO a .yaml change requirement isn't materially different than barfing a
warning on someone's screen during the dev process when they use the DDL
for MV's. At the end of the day, it's just a question of how forceful you
want that messaging to be. If the cqlsh client prints 'THIS FEATURE IS NOT
READY' in big bold letters, that's not going to miscommunicate to a user
that 'feature X is ready' when it's not.

Much like w/SASI, this is something that's in the code-base that for
certain use-cases apparently works just fine. Might be worth considering
the approach of making boundaries around those use-cases more rigid instead
of throwing the baby out with the bathwater.

On Mon, Oct 2, 2017 at 3:32 PM, DuyHai Doan <do...@gmail.com> wrote:

> Ok so IF there is a flag to enable MV (à-la UDA/UDF in cassandra.yaml) then
> I'm fine with it. I initially understood that we wanted to disable it
> definitively. Maybe we should then add an explicit error message when MV is
> disabled and someone tries to use it, something like:
>
> "MV has been disabled, to enable it, turn on the flag xxxx in
> cassandra.yaml" so users don't spend 3h searching around
>
>
> On Mon, Oct 2, 2017 at 9:07 PM, Jon Haddad <jo...@jonhaddad.com> wrote:
>
> > There’s a big difference between removal of a protocol that every single
> > C* user had to use and disabling a feature which is objectively broken
> and
> > almost nobody is using.  Nobody is talking about removing MVs.  If you
> want
> > to use them you can enable them very trivially, but it should be an
> > explicit option because they really aren’t ready for general use.
> >
> > Claiming disabling by default == removal is not helpful to the
> > conversation and is very misleading.
> >
> > Let’s be practical here.  The people that are most likely to put MVs in
> > production right now are people new to Cassandra that don’t know any
> > better.  The people that *should* be using MVs are the contributors to
> the
> > project.  People that actually wrote Cassandra code that can do a patch
> and
> > push it into prod, and get it submitted upstream when they fix something.
> > Yes, a lot of this stuff requires production usage to shake out the bugs,
> > that’s fine, but we shouldn’t lie to people and say “feature X is ready”
> > when it’s not.  That’s a great way to get a reputation as “unstable” or
> > “not fit for production."
> >
> > Jon
> >
> >
> > > On Oct 2, 2017, at 11:54 AM, DuyHai Doan <do...@gmail.com> wrote:
> > >
> > > "I would (in a patch release) disable MV CREATE statements, and emit
> > > warnings for ALTER statements and on schema load if they’re not
> > explicitly
> > > enabled"
> > >
> > > --> I find this pretty extreme. Now we have an existing feature sitting
> > > there in the base code but forbidden from version xxx onward.
> > >
> > > Since when do we start removing feature in a patch release ?
> (forbidding
> > to
> > > create new MV == removing the feature, defacto)
> > >
> > > Even the Thrift protocol has gone through a long process of deprecation
> > and
> > > will be removed on 4.0
> > >
> > > And if we start opening the Pandora box like this, what's next ?
> > Forbidding
> > > to create SASI index too ? Removing Vnodes ?
> > >
> > >
> > >
> > >
> > > On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <
> > jeremiah.jordan@gmail.com
> > >> wrote:
> > >
> > >>> Only emitting a warning really reduces visibility where we need it:
> in
> > >> the development process.
> > >>
> > >> How does emitting a native protocol warning reduce visibility during
> the
> > >> development process?  If you run CREATE MV and cqlsh then prints out a
> > >> giant warning statement about how it is an experimental feature I
> think
> > >> that is pretty visible during development?
> > >>
> > >> I guess I can see just blocking new ones without a flag set, but we
> need
> > >> to be careful here.  We need to make sure we don’t cause a problem for
> > >> someone that is using them currently, even with all the edge cases
> > issues
> > >> they have now.
> > >>
> > >> -Jeremiah
> > >>
> > >>
> > >>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>
> > >> wrote:
> > >>>
> > >>> Yeah, I'm not proposing that we disable MVs in existing clusters.
> > >>>
> > >>>
> > >>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (
> > aleksey@apple.com)
> > >> wrote:
> > >>>
> > >>> The idea is to check the flag in CreateViewStatement, so creation of
> > new
> > >> MVs doesn’t succeed without that flag flipped.
> > >>>
> > >>> Obviously, just disabling existing MVs working in a minor would be
> > silly.
> > >>>
> > >>> As for the warning - yes, that should also be emitted.
> Unconditionally.
> > >>>
> > >>> —
> > >>> AY
> > >>>
> > >>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (
> > >> jeremiah.jordan@gmail.com) wrote:
> > >>>
> > >>> These things are live on clusters right now, and I would not want
> > >> someone to upgrade their cluster to a new *patch* release and suddenly
> > >> something that may have been working for them now does not function.
> > >> Anyway, we need to be careful about how this gets put into practice if
> > we
> > >> are going to do it retroactively.
> > >>
> > >>
> > >>
> > >>
> >
> >
> >
> >
>

Re: Proposal to retroactively mark materialized views experimental

Posted by DuyHai Doan <do...@gmail.com>.

OK, so IF there is a flag to enable MVs (à la UDA/UDF in cassandra.yaml) then
I'm fine with it. I initially understood that we wanted to disable them
permanently. Maybe we should then add an explicit error message when MVs are
disabled and someone tries to use them, something like:

"MV has been disabled, to enable it, turn on the flag xxxx in
cassandra.yaml" so users don't spend 3h searching around.


On Mon, Oct 2, 2017 at 9:07 PM, Jon Haddad <jo...@jonhaddad.com> wrote:

> There’s a big difference between removal of a protocol that every single
> C* user had to use and disabling a feature which is objectively broken and
> almost nobody is using.  Nobody is talking about removing MVs.  If you want
> to use them you can enable them very trivially, but it should be an
> explicit option because they really aren’t ready for general use.
>
> Claiming disabling by default == removal is not helpful to the
> conversation and is very misleading.
>
> Let’s be practical here.  The people that are most likely to put MVs in
> production right now are people new to Cassandra that don’t know any
> better.  The people that *should* be using MVs are the contributors to the
> project.  People that actually wrote Cassandra code that can do a patch and
> push it into prod, and get it submitted upstream when they fix something.
> Yes, a lot of this stuff requires production usage to shake out the bugs,
> that’s fine, but we shouldn’t lie to people and say “feature X is ready”
> when it’s not.  That’s a great way to get a reputation as “unstable” or
> “not fit for production."
>
> Jon
>
>
> > On Oct 2, 2017, at 11:54 AM, DuyHai Doan <do...@gmail.com> wrote:
> >
> > "I would (in a patch release) disable MV CREATE statements, and emit
> > warnings for ALTER statements and on schema load if they’re not
> explicitly
> > enabled"
> >
> > --> I find this pretty extreme. Now we have an existing feature sitting
> > there in the base code but forbidden from version xxx onward.
> >
> > Since when do we start removing feature in a patch release ? (forbidding
> to
> > create new MV == removing the feature, defacto)
> >
> > Even the Thrift protocol has gone through a long process of deprecation
> and
> > will be removed on 4.0
> >
> > And if we start opening the Pandora box like this, what's next ?
> Forbidding
> > to create SASI index too ? Removing Vnodes ?
> >
> >
> >
> >
> > On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <
> jeremiah.jordan@gmail.com
> >> wrote:
> >
> >>> Only emitting a warning really reduces visibility where we need it: in
> >> the development process.
> >>
> >> How does emitting a native protocol warning reduce visibility during the
> >> development process?  If you run CREATE MV and cqlsh then prints out a
> >> giant warning statement about how it is an experimental feature I think
> >> that is pretty visible during development?
> >>
> >> I guess I can see just blocking new ones without a flag set, but we need
> >> to be careful here.  We need to make sure we don’t cause a problem for
> >> someone that is using them currently, even with all the edge cases
> issues
> >> they have now.
> >>
> >> -Jeremiah
> >>
> >>
> >>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>
> >> wrote:
> >>>
> >>> Yeah, I'm not proposing that we disable MVs in existing clusters.
> >>>
> >>>
> >>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (
> aleksey@apple.com)
> >> wrote:
> >>>
> >>> The idea is to check the flag in CreateViewStatement, so creation of
> new
> >> MVs doesn’t succeed without that flag flipped.
> >>>
> >>> Obviously, just disabling existing MVs working in a minor would be
> silly.
> >>>
> >>> As for the warning - yes, that should also be emitted. Unconditionally.
> >>>
> >>> —
> >>> AY
> >>>
> >>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (
> >> jeremiah.jordan@gmail.com) wrote:
> >>>
> >>> These things are live on clusters right now, and I would not want
> >> someone to upgrade their cluster to a new *patch* release and suddenly
> >> something that may have been working for them now does not function.
> >> Anyway, we need to be careful about how this gets put into practice if
> we
> >> are going to do it retroactively.
> >>
> >>
> >>
> >>
>
>
>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jon Haddad <jo...@jonhaddad.com>.

There’s a big difference between removal of a protocol that every single C* user had to use and disabling a feature which is objectively broken and almost nobody is using.  Nobody is talking about removing MVs.  If you want to use them you can enable them very trivially, but it should be an explicit option because they really aren’t ready for general use.

Claiming disabling by default == removal is not helpful to the conversation and is very misleading.  

Let’s be practical here. The people that are most likely to put MVs in production right now are people new to Cassandra that don’t know any better. The people that *should* be using MVs are the contributors to the project: people that actually wrote Cassandra code, who can do a patch, push it into prod, and get it submitted upstream when they fix something. Yes, a lot of this stuff requires production usage to shake out the bugs, and that’s fine, but we shouldn’t lie to people and say “feature X is ready” when it’s not. That’s a great way to get a reputation as “unstable” or “not fit for production.”

Jon


> On Oct 2, 2017, at 11:54 AM, DuyHai Doan <do...@gmail.com> wrote:
> 
> "I would (in a patch release) disable MV CREATE statements, and emit
> warnings for ALTER statements and on schema load if they’re not explicitly
> enabled"
> 
> --> I find this pretty extreme. Now we have an existing feature sitting
> there in the base code but forbidden from version xxx onward.
> 
> Since when do we start removing feature in a patch release ? (forbidding to
> create new MV == removing the feature, defacto)
> 
> Even the Thrift protocol has gone through a long process of deprecation and
> will be removed on 4.0
> 
> And if we start opening the Pandora box like this, what's next ? Forbidding
> to create SASI index too ? Removing Vnodes ?
> 
> 
> 
> 
> On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <jeremiah.jordan@gmail.com
>> wrote:
> 
>>> Only emitting a warning really reduces visibility where we need it: in
>> the development process.
>> 
>> How does emitting a native protocol warning reduce visibility during the
>> development process?  If you run CREATE MV and cqlsh then prints out a
>> giant warning statement about how it is an experimental feature I think
>> that is pretty visible during development?
>> 
>> I guess I can see just blocking new ones without a flag set, but we need
>> to be careful here.  We need to make sure we don’t cause a problem for
>> someone that is using them currently, even with all the edge cases issues
>> they have now.
>> 
>> -Jeremiah
>> 
>> 
>>> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>
>> wrote:
>>> 
>>> Yeah, I'm not proposing that we disable MVs in existing clusters.
>>> 
>>> 
>>> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (aleksey@apple.com)
>> wrote:
>>> 
>>> The idea is to check the flag in CreateViewStatement, so creation of new
>> MVs doesn’t succeed without that flag flipped.
>>> 
>>> Obviously, just disabling existing MVs working in a minor would be silly.
>>> 
>>> As for the warning - yes, that should also be emitted. Unconditionally.
>>> 
>>> —
>>> AY
>>> 
>>> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (
>> jeremiah.jordan@gmail.com) wrote:
>>> 
>>> These things are live on clusters right now, and I would not want
>> someone to upgrade their cluster to a new *patch* release and suddenly
>> something that may have been working for them now does not function.
>> Anyway, we need to be careful about how this gets put into practice if we
>> are going to do it retroactively.
>> 
>> 
>> 
>> 



Re: Proposal to retroactively mark materialized views experimental

Posted by DuyHai Doan <do...@gmail.com>.

"I would (in a patch release) disable MV CREATE statements, and emit
warnings for ALTER statements and on schema load if they’re not explicitly
enabled"

--> I find this pretty extreme. Now we have an existing feature sitting
there in the code base but forbidden from version xxx onward.

Since when do we remove features in a patch release? (Forbidding the
creation of new MVs == removing the feature, de facto.)

Even the Thrift protocol has gone through a long deprecation process and
will be removed in 4.0.

And if we open Pandora's box like this, what's next? Forbidding the
creation of SASI indexes too? Removing vnodes?
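
For reference, the proposal quoted at the top of this message treats three code paths differently when MVs have not been explicitly enabled. A minimal sketch of that decision, assuming a single boolean setting; the class, enum, and method names here are hypothetical and are not taken from the Cassandra code base:

```java
// Hypothetical sketch of the quoted proposal; all names are illustrative only.
final class ViewPolicy {
    enum Event { CREATE, ALTER, SCHEMA_LOAD }
    enum Outcome { ALLOW, WARN, REJECT }

    // Decides what happens to a view-related event given the (assumed) enable flag.
    static Outcome decide(Event event, boolean explicitlyEnabled) {
        if (explicitlyEnabled) {
            return Outcome.ALLOW;
        }
        // Only creation of new views is blocked; ALTER and schema load merely warn,
        // so views that already exist keep working after an upgrade.
        return event == Event.CREATE ? Outcome.REJECT : Outcome.WARN;
    }
}
```

Under this reading, an operator who upgrades without touching the configuration sees warnings for existing views but loses only the ability to create new ones.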




On Mon, Oct 2, 2017 at 8:16 PM, Jeremiah D Jordan <jeremiah.jordan@gmail.com
> wrote:

> > Only emitting a warning really reduces visibility where we need it: in
> the development process.
>
> How does emitting a native protocol warning reduce visibility during the
> development process?  If you run CREATE MV and cqlsh then prints out a
> giant warning statement about how it is an experimental feature I think
> that is pretty visible during development?
>
> I guess I can see just blocking new ones without a flag set, but we need
> to be careful here.  We need to make sure we don’t cause a problem for
> someone that is using them currently, even with all the edge cases issues
> they have now.
>
> -Jeremiah
>
>
> > On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com>
> wrote:
> >
> > Yeah, I'm not proposing that we disable MVs in existing clusters.
> >
> >
> > On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (aleksey@apple.com)
> wrote:
> >
> > The idea is to check the flag in CreateViewStatement, so creation of new
> MVs doesn’t succeed without that flag flipped.
> >
> > Obviously, just disabling existing MVs working in a minor would be silly.
> >
> > As for the warning - yes, that should also be emitted. Unconditionally.
> >
> > —
> > AY
> >
> > On 2 October 2017 at 18:18:52, Jeremiah D Jordan (
> jeremiah.jordan@gmail.com) wrote:
> >
> > These things are live on clusters right now, and I would not want
> someone to upgrade their cluster to a new *patch* release and suddenly
> something that may have been working for them now does not function.
> Anyway, we need to be careful about how this gets put into practice if we
> are going to do it retroactively.
>
>
>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jeremiah D Jordan <je...@gmail.com>.

> Only emitting a warning really reduces visibility where we need it: in the development process.

How does emitting a native protocol warning reduce visibility during the development process? If you run CREATE MV and cqlsh then prints out a giant warning statement about how it is an experimental feature, I think that is pretty visible during development?

I guess I can see just blocking new ones without a flag set, but we need to be careful here. We need to make sure we don’t cause a problem for someone that is using them currently, even with all the edge-case issues they have now.

-Jeremiah


> On Oct 2, 2017, at 2:01 PM, Blake Eggleston <be...@apple.com> wrote:
> 
> Yeah, I'm not proposing that we disable MVs in existing clusters.
> 
> 
> On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (aleksey@apple.com) wrote:
> 
> The idea is to check the flag in CreateViewStatement, so creation of new MVs doesn’t succeed without that flag flipped.  
> 
> Obviously, just disabling existing MVs working in a minor would be silly.  
> 
> As for the warning - yes, that should also be emitted. Unconditionally.  
> 
> —  
> AY  
> 
> On 2 October 2017 at 18:18:52, Jeremiah D Jordan (jeremiah.jordan@gmail.com) wrote:  
> 
> These things are live on clusters right now, and I would not want someone to upgrade their cluster to a new *patch* release and suddenly something that may have been working for them now does not function. Anyway, we need to be careful about how this gets put into practice if we are going to do it retroactively. 



Re: Proposal to retroactively mark materialized views experimental

Posted by Blake Eggleston <be...@apple.com>.

Yeah, I'm not proposing that we disable MVs in existing clusters.


On October 2, 2017 at 10:58:11 AM, Aleksey Yeshchenko (aleksey@apple.com) wrote:

The idea is to check the flag in CreateViewStatement, so creation of new MVs doesn’t succeed without that flag flipped.  

Obviously, just disabling existing MVs working in a minor would be silly.  

As for the warning - yes, that should also be emitted. Unconditionally.  

—  
AY  

On 2 October 2017 at 18:18:52, Jeremiah D Jordan (jeremiah.jordan@gmail.com) wrote:  

These things are live on clusters right now, and I would not want someone to upgrade their cluster to a new *patch* release and suddenly something that may have been working for them now does not function. Anyway, we need to be careful about how this gets put into practice if we are going to do it retroactively.
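
The behaviour discussed in this exchange could be sketched roughly as follows, assuming a boolean setting read from cassandra.yaml. The class name, method name, setting name, and message wording below are hypothetical illustrations (the thread does not name the flag), and this is not the actual CreateViewStatement code. Creation of a new view is rejected unless the flag is on, the rejection names the setting to change, and the experimental warning is returned on every successful creation even when the flag is on. Existing views are not consulted at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the real CreateViewStatement.
final class ViewCreationGuard {
    // Assumed name for the cassandra.yaml setting; the thread leaves it unspecified.
    static final String FLAG_NAME = "enable_materialized_views";
    static final String WARNING =
        "Materialized views are experimental and are not recommended for production use.";

    private final boolean viewsEnabled;

    ViewCreationGuard(boolean viewsEnabled) {
        this.viewsEnabled = viewsEnabled;
    }

    // Validates a request to create a new view.
    // Returns the client warnings to send back when creation is allowed;
    // throws with an actionable message when the flag is off.
    List<String> checkCreate(String viewName) {
        if (!viewsEnabled) {
            throw new IllegalStateException(
                "Cannot create view " + viewName + ": materialized views are disabled. "
                + "Set " + FLAG_NAME + ": true in cassandra.yaml to enable them.");
        }
        List<String> warnings = new ArrayList<>();
        warnings.add(WARNING); // emitted unconditionally once creation is permitted
        return warnings;
    }
}
```

Because the guard only runs on the creation path, a cluster upgraded to a patch release with the flag unset would keep serving its existing views.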

Re: Proposal to retroactively mark materialized views experimental

Posted by Benedict Elliott Smith <_...@belliottsmith.com>.

Oh, come on. You're being disingenuous.

I invented both algorithms, so I get some say in which is more complex.  I fully understand the behaviour of early reopen and can explain it to a lay person in around five minutes.  Last time I posted an analysis of MVs it took me several days to get it straight in my head just enough to be sure the novel problems I was pointing out existed - and in no way did I have confidence I had established all the problems.  It wasn't until well after it was completed we realised it had some hugely fundamental limitations around primary keys.  I would NOT be able to explain the algorithm or its implications to a lay person AT ALL.

That said, I would absolutely be comfortable marking incremental repair and SASI experimental if this is required to cover MVs with the moniker. The former is less complex than MVs, but it fits a similar category of complex distributed systems implications we hadn't properly modelled. It *has* now had extensive testing in the wild though. Conversely, SASI has had very little burn testing, but employs fairly well established approaches, and suffers from very little distributed systems complexity.

> On 4 Oct 2017, at 11:12, Josh McKenzie <jm...@apache.org> wrote:
> 
> I don't agree at face value that early re-open is in sum a lot simpler than
> MV, or that adding CQL and deprecating Thrift was a lot simpler, or the
> 8099 refactor, etc. Different types of complexity, certainly, and MV's are
> arguably harder to prove correct due to surface area of exposure to failure
> states. Definitions of complexity aside, I do agree with the general
> principle that MV's are very complex and, as with many other things in the
> DB, boundary conditions are insufficiently understood and tested at this
> time. There's also a recency bias to the defects and active work people are
> seeing with MV as there has been a recent focus on stabilizing that rather
> than with the long tail we've seen with other, more pervasive and
> foundational changes to the code-base over the course of the past few years.
> 
> MV's aren't the only thing in the DB that I think qualify for 'flagging as
> not-production-ready' by the criteria people are attempting to selectively
> apply to the feature here. If we go the route of flagging one already
> released feature experimental because we lack confidence in it, there are
> other things we similarly lack confidence in that should be treated
> similarly (incremental repair, SASI to name two that immediately come to
> mind). I personally don't think changing the qualification and user
> experience of features post-release sends a good message to said users; if
> we all agreed unanimously that these features were this failure-prone and
> high-risk, it would be more appropriate to make that change however that's
> obviously not the case here.
> 
> 
> On Wed, Oct 4, 2017 at 10:41 AM, Benedict Elliott Smith <_@belliottsmith.com> wrote:
> 
>> So, as the author of one of the disasters you mention (early re-open), I
>> would prefer to learn from the mistake and not repeat it.  Unfortunately we
>> seem to be in the habit of repeating it, and that feature was a lot *lot*
>> simpler.
>> 
>> Let’s not kid ourselves: MVs are by far and away the most complicated
>> feature we have ever delivered.  We do not fully understand it, even in
>> theory, let alone can we be sure we have the implementation right.
>> 
>> So, if we all agree our testing is ordinarily insufficient, can’t we agree
>> it is probably *really* insufficient here?
>> 
>> I don’t want to give the impression I’m shifting the goals.  I’ve been
>> against MV inclusion as they stand for some time, as were several others.
>> I think in the new world order of project/community structure, they
>> probably would have been rejected as they stand.
>> 
>> I’ve consistently listed my own requirements for considering them
>> production ready:  extensive modelling and simulation of the algorithm’s
>> properties (in lieu of formal proofs), *safe* default behaviour (rollback
>> CASSANDRA-10230, or make it a per-table option, and default to fast only
>> for existing tables to avoid surprise), tools for detecting and repairing
>> inconsistencies, and more extensive testing.
>> 
>> Many of these things were agreed as prerequisites for release of 3.0, but
>> ultimately they were not delivered.
>> 
>> I do, however, absolutely agree with Sylvain that we need to minimise
>> surprise in a patch version.
>> 
>> 
>> On 4 Oct 2017, at 08:58, Josh McKenzie <jm...@apache.org> wrote:
>> 
>>>> and providing a feature we don't fully understand, have not fully
>>> documented the caveats of, let alone discovered all the problems with nor
>>> had that knowledge percolate fully into the wider community.
>>> There appear to be varying levels of understanding of the implementation
>>> details of MV's (that seem to directly correlate with faith in the
>>> feature's correctness for the use-cases recommended) on this email thread
>>> so while I respect a sense of general wariness about the state of
>>> correctness testing with C*, I don't agree that the thoroughness of testing
>>> of MV's is any different than any other feature we've added to the
>>> code-base since the project's inception.
>>>
>>> That's not to say I think the current extent of our testing before GA on
>>> features is adequate; I don't, but I don't think it makes sense to draw an
>>> arbitrary line in the sand with already released features that are in use
>>> in production clusters, flagging said features as experimental after the
>>> fact, and thus eroding users' trust in our collective definition of done.
>>> What's to stop us from flagging other, seemingly arbitrary features people
>>> are relying on in production as experimental in the future? What does that
>>> mean for their faith in the project and their job security? SASI? LWT?
>>> Counters? Triggers? Repair and compaction due to (still arising) edge-cases
>>> and defects in early re-open and incremental repair? All of these features
>>> still have edge-cases due to the inherent complexity of the code-base and
>>> problem domain in which we work.
>>>
>>> Right now there appear to be the two camps of 'I can't clearly articulate
>>> what Good Enough is since it's Complicated, but I know we're not there' and
>>> 'if people are relying on it in production without issue it's by definition
>>> good enough for their use-case'. It's a compromise; nothing is ever perfect
>>> (as we all know). I'm all for us saying 'We need better testing of features
>>> going forward', 'We need better metrics for the coverage and branch testing
>>> of things in C*', etc, and definitely in favor of us spending some time to
>>> increase our coverage for existing features.
>>>
>>> I don't think MV's are any different than anything else in this code-base
>>> in terms of how well vetted the features are, for better or for worse.
>>>
>>> On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves <ku...@instaclustr.com> wrote:
>>>
>>>>>
>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does not
>>>>> imply "experimental" or "beta" or anything like that.
>>>>> It does make life easier for both operators and the C* developers.
>>>>
>>>> I would be all for a mv_enabled option, assuming it's enabled by default
>>>> for all existing branches. I don't think saying that you are meant to read
>>>> NEWS.txt before upgrading a patch is acceptable. Most people don't, and
>>>> expecting them to is a bit insane. Also Assuming that if they read it
>>>> they'd understand all implications is also a bit questionable. If deemed
>>>> suitable to turn it off that can be done in the next major/minor, but I
>>>> think that would be unlikely, as we should really require sufficient
>>>> evidence that it's dangerous which I just don't think we have. I'm still of
>>>> the opinion that MV in their current state are no worse off than a lot of
>>>> other features, and marking them as experimental and disabling now would
>>>> just be detrimental to their development and annoy users. Also if we give
>>>> them that treatment then there a whole load of other defaults we should
>>>> change and disable which is just not acceptable in a patch release. It's
>>>> not really necessary anyway, we don't have anyone crying bloody murder on
>>>> the mailing list about how everything went to hell because they used
>>>> feature x.
>>>>
>>>> No one has really provided any counter evidence yet that MV's are in some
>>>> awful state and they are going to shoot users. There are a few existing
>>>> issues that I've brought up already, but they are really quite minor,
>>>> nothing comparable to "lol you can't repair if you use vnodes, sorry". I
>>>> think we really need some real examples/evidence before making calls like
>>>> "lets disable this feature in a patch release and mark it experimental"
>>>>
>>>>> I personally believe it is better to offer the feature as experimental
>>>>> until we iron out all of the problems
>>>>
>>>> What problems are you referring to, and how exactly will we know when all
>>>> of them have been sufficiently ironed? If we mark it as experimental how
>>>> exactly are we going to get people to use said feature to find issues?
>>>> 
>>>> 
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

I don't agree at face value that early re-open is in sum a lot simpler than
MV, or that adding CQL and deprecating Thrift was a lot simpler, or the
8099 refactor, etc. Different types of complexity, certainly, and MV's are
arguably harder to prove correct due to surface area of exposure to failure
states. Definitions of complexity aside, I do agree with the general
principle that MV's are very complex and, as with many other things in the
DB, boundary conditions are insufficiently understood and tested at this
time. There's also a recency bias to the defects and active work people are
seeing with MV as there has been a recent focus on stabilizing that rather
than with the long tail we've seen with other, more pervasive and
foundational changes to the code-base over the course of the past few years.

MV's aren't the only thing in the DB that I think qualifies for 'flagging as
not-production-ready' by the criteria people are attempting to selectively
apply to the feature here. If we go the route of flagging one already
released feature experimental because we lack confidence in it, there are
other things we similarly lack confidence in that should be treated
similarly (incremental repair and SASI, to name two that immediately come to
mind). I personally don't think changing the qualification and user
experience of features post-release sends a good message to said users; if
we all agreed unanimously that these features were this failure-prone and
high-risk, it would be more appropriate to make that change; however, that's
obviously not the case here.


On Wed, Oct 4, 2017 at 10:41 AM, Benedict Elliott Smith <_@belliottsmith.com> wrote:

> So, as the author of one of the disasters you mention (early re-open), I
> would prefer to learn from the mistake and not repeat it.  Unfortunately we
> seem to be in the habit of repeating it, and that feature was a lot *lot*
> simpler.
>
> Let’s not kid ourselves: MVs are by far and away the most complicated
> feature we have ever delivered.  We do not fully understand it, even in
> theory, let alone can we be sure we have the implementation right.
>
> So, if we all agree our testing is ordinarily insufficient, can’t we agree
> it is probably *really* insufficient here?
>
> I don’t want to give the impression I’m shifting the goals.  I’ve been
> against MV inclusion as they stand for some time, as were several others.
> I think in the new world order of project/community structure, they
> probably would have been rejected as they stand.
>
> I’ve consistently listed my own requirements for considering them
> production ready:  extensive modelling and simulation of the algorithm’s
> properties (in lieu of formal proofs), *safe* default behaviour (rollback
> CASSANDRA-10230, or make it a per-table option, and default to fast only
> for existing tables to avoid surprise), tools for detecting and repairing
> inconsistencies, and more extensive testing.
>
> Many of these things were agreed as prerequisites for release of 3.0, but
> ultimately they were not delivered.
>
> I do, however, absolutely agree with Sylvain that we need to minimise
> surprise in a patch version.
>
>
> On 4 Oct 2017, at 08:58, Josh McKenzie <jm...@apache.org> wrote:
>
> >> and providing a feature we don't fully understand, have not fully
> > documented the caveats of, let alone discovered all the problems with nor
> > had that knowledge percolate fully into the wider community.
> > There appear to be varying levels of understanding of the implementation
> > details of MV's (that seem to directly correlate with faith in the
> > feature's correctness for the use-cases recommended) on this email thread
> > so while I respect a sense of general wariness about the state of
> > correctness testing with C*, I don't agree that the thoroughness of testing
> > of MV's is any different than any other feature we've added to the
> > code-base since the project's inception.
> >
> > That's not to say I think the current extent of our testing before GA on
> > features is adequate; I don't, but I don't think it makes sense to draw an
> > arbitrary line in the sand with already released features that are in use
> > in production clusters, flagging said features as experimental after the
> > fact, and thus eroding users' trust in our collective definition of done.
> > What's to stop us from flagging other, seemingly arbitrary features people
> > are relying on in production as experimental in the future? What does that
> > mean for their faith in the project and their job security? SASI? LWT?
> > Counters? Triggers? Repair and compaction due to (still arising) edge-cases
> > and defects in early re-open and incremental repair? All of these features
> > still have edge-cases due to the inherent complexity of the code-base and
> > problem domain in which we work.
> >
> > Right now there appear to be the two camps of 'I can't clearly articulate
> > what Good Enough is since it's Complicated, but I know we're not there' and
> > 'if people are relying on it in production without issue it's by definition
> > good enough for their use-case'. It's a compromise; nothing is ever perfect
> > (as we all know). I'm all for us saying 'We need better testing of features
> > going forward', 'We need better metrics for the coverage and branch testing
> > of things in C*', etc, and definitely in favor of us spending some time to
> > increase our coverage for existing features.
> >
> > I don't think MV's are any different than anything else in this code-base
> > in terms of how well vetted the features are, for better or for worse.
> >
> > On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves <ku...@instaclustr.com> wrote:
> >
> >>>
> >>> The flag name `cdc_enabled` is simple and, without adjectives, does not
> >>> imply "experimental" or "beta" or anything like that.
> >>> It does make life easier for both operators and the C* developers.
> >>
> >> I would be all for a mv_enabled option, assuming it's enabled by default
> >> for all existing branches. I don't think saying that you are meant to read
> >> NEWS.txt before upgrading a patch is acceptable. Most people don't, and
> >> expecting them to is a bit insane. Also Assuming that if they read it
> >> they'd understand all implications is also a bit questionable. If deemed
> >> suitable to turn it off that can be done in the next major/minor, but I
> >> think that would be unlikely, as we should really require sufficient
> >> evidence that it's dangerous which I just don't think we have. I'm still of
> >> the opinion that MV in their current state are no worse off than a lot of
> >> other features, and marking them as experimental and disabling now would
> >> just be detrimental to their development and annoy users. Also if we give
> >> them that treatment then there a whole load of other defaults we should
> >> change and disable which is just not acceptable in a patch release. It's
> >> not really necessary anyway, we don't have anyone crying bloody murder on
> >> the mailing list about how everything went to hell because they used
> >> feature x.
> >>
> >> No one has really provided any counter evidence yet that MV's are in some
> >> awful state and they are going to shoot users. There are a few existing
> >> issues that I've brought up already, but they are really quite minor,
> >> nothing comparable to "lol you can't repair if you use vnodes, sorry". I
> >> think we really need some real examples/evidence before making calls like
> >> "lets disable this feature in a patch release and mark it experimental"
> >>
> >>> I personally believe it is better to offer the feature as experimental
> >>> until we iron out all of the problems
> >>
> >> What problems are you referring to, and how exactly will we know when all
> >> of them have been sufficiently ironed? If we mark it as experimental how
> >> exactly are we going to get people to use said feature to find issues?
> >> 
> >>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Benedict Elliott Smith <_...@belliottsmith.com>.

So, as the author of one of the disasters you mention (early re-open), I would prefer to learn from the mistake and not repeat it.  Unfortunately we seem to be in the habit of repeating it, and that feature was a lot *lot* simpler.

Let’s not kid ourselves: MVs are by far and away the most complicated feature we have ever delivered.  We do not fully understand it, even in theory, let alone can we be sure we have the implementation right.

So, if we all agree our testing is ordinarily insufficient, can’t we agree it is probably *really* insufficient here?

I don’t want to give the impression I’m shifting the goals.  I’ve been against MV inclusion as they stand for some time, as were several others.  I think in the new world order of project/community structure, they probably would have been rejected as they stand.

I’ve consistently listed my own requirements for considering them production ready:  extensive modelling and simulation of the algorithm’s properties (in lieu of formal proofs), *safe* default behaviour (rollback CASSANDRA-10230, or make it a per-table option, and default to fast only for existing tables to avoid surprise), tools for detecting and repairing inconsistencies, and more extensive testing.

Many of these things were agreed as prerequisites for release of 3.0, but ultimately they were not delivered.

I do, however, absolutely agree with Sylvain that we need to minimise surprise in a patch version.


On 4 Oct 2017, at 08:58, Josh McKenzie <jm...@apache.org> wrote:

>> and providing a feature we don't fully understand, have not fully
> documented the caveats of, let alone discovered all the problems with nor
> had that knowledge percolate fully into the wider community.
> There appear to be varying levels of understanding of the implementation
> details of MV's (that seem to directly correlate with faith in the
> feature's correctness for the use-cases recommended) on this email thread
> so while I respect a sense of general wariness about the state of
> correctness testing with C*, I don't agree that the thoroughness of testing
> of MV's is any different than any other feature we've added to the
> code-base since the project's inception.
> 
> That's not to say I think the current extent of our testing before GA on
> features is adequate; I don't, but I don't think it makes sense to draw an
> arbitrary line in the sand with already released features that are in use
> in production clusters, flagging said features as experimental after the
> fact, and thus eroding users' trust in our collective definition of done.
> What's to stop us from flagging other, seemingly arbitrary features people
> are relying on in production as experimental in the future? What does that
> mean for their faith in the project and their job security? SASI? LWT?
> Counters? Triggers? Repair and compaction due to (still arising) edge-cases
> and defects in early re-open and incremental repair? All of these features
> still have edge-cases due to the inherent complexity of the code-base and
> problem domain in which we work.
> 
> Right now there appear to be the two camps of 'I can't clearly articulate
> what Good Enough is since it's Complicated, but I know we're not there' and
> 'if people are relying on it in production without issue it's by definition
> good enough for their use-case'. It's a compromise; nothing is ever perfect
> (as we all know). I'm all for us saying 'We need better testing of features
> going forward', 'We need better metrics for the coverage and branch testing
> of things in C*', etc, and definitely in favor of us spending some time to
> increase our coverage for existing features.
> 
> I don't think MV's are any different than anything else in this code-base
> in terms of how well vetted the features are, for better or for worse.
> 
> On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves <ku...@instaclustr.com> wrote:
> 
>>> 
>>> The flag name `cdc_enabled` is simple and, without adjectives, does not
>>> imply "experimental" or "beta" or anything like that.
>>> It does make life easier for both operators and the C* developers.
>> 
>> I would be all for a mv_enabled option, assuming it's enabled by default
>> for all existing branches. I don't think saying that you are meant to read
>> NEWS.txt before upgrading a patch is acceptable. Most people don't, and
>> expecting them to is a bit insane. Also Assuming that if they read it
>> they'd understand all implications is also a bit questionable. If deemed
>> suitable to turn it off that can be done in the next major/minor, but I
>> think that would be unlikely, as we should really require sufficient
>> evidence that it's dangerous which I just don't think we have. I'm still of
>> the opinion that MV in their current state are no worse off than a lot of
>> other features, and marking them as experimental and disabling now would
>> just be detrimental to their development and annoy users. Also if we give
>> them that treatment then there a whole load of other defaults we should
>> change and disable which is just not acceptable in a patch release. It's
>> not really necessary anyway, we don't have anyone crying bloody murder on
>> the mailing list about how everything went to hell because they used
>> feature x.
>> 
>> No one has really provided any counter evidence yet that MV's are in some
>> awful state and they are going to shoot users. There are a few existing
>> issues that I've brought up already, but they are really quite minor,
>> nothing comparable to "lol you can't repair if you use vnodes, sorry". I
>> think we really need some real examples/evidence before making calls like
>> "lets disable this feature in a patch release and mark it experimental"
>> 
>>> I personally believe it is better to offer the feature as experimental
>>> until we iron out all of the problems
>> 
>> What problems are you referring to, and how exactly will we know when all
>> of them have been sufficiently ironed? If we mark it as experimental how
>> exactly are we going to get people to use said feature to find issues?
>> 
>>

Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

> and providing a feature we don't fully understand, have not fully
> documented the caveats of, let alone discovered all the problems with nor
> had that knowledge percolate fully into the wider community.

There appear to be varying levels of understanding of the implementation
details of MV's (that seem to directly correlate with faith in the
feature's correctness for the use-cases recommended) on this email thread,
so while I respect a sense of general wariness about the state of
correctness testing with C*, I don't agree that the thoroughness of testing
of MV's is any different than any other feature we've added to the
code-base since the project's inception.

That's not to say I think the current extent of our testing before GA on
features is adequate; I don't, but I don't think it makes sense to draw an
arbitrary line in the sand with already released features that are in use
in production clusters, flagging said features as experimental after the
fact, and thus eroding users' trust in our collective definition of done.
What's to stop us from flagging other, seemingly arbitrary features people
are relying on in production as experimental in the future? What does that
mean for their faith in the project and their job security? SASI? LWT?
Counters? Triggers? Repair and compaction due to (still arising) edge-cases
and defects in early re-open and incremental repair? All of these features
still have edge-cases due to the inherent complexity of the code-base and
problem domain in which we work.

Right now there appear to be the two camps of 'I can't clearly articulate
what Good Enough is since it's Complicated, but I know we're not there' and
'if people are relying on it in production without issue it's by definition
good enough for their use-case'. It's a compromise; nothing is ever perfect
(as we all know). I'm all for us saying 'We need better testing of features
going forward', 'We need better metrics for the coverage and branch testing
of things in C*', etc, and definitely in favor of us spending some time to
increase our coverage for existing features.

I don't think MV's are any different than anything else in this code-base
in terms of how well vetted the features are, for better or for worse.

On Wed, Oct 4, 2017 at 5:21 AM, kurt greaves <ku...@instaclustr.com> wrote:

> >
> > The flag name `cdc_enabled` is simple and, without adjectives, does not
> > imply "experimental" or "beta" or anything like that.
> > It does make life easier for both operators and the C* developers.
>
> I would be all for a mv_enabled option, assuming it's enabled by default
> for all existing branches. I don't think saying that you are meant to read
> NEWS.txt before upgrading a patch is acceptable. Most people don't, and
> expecting them to is a bit insane. Also Assuming that if they read it
> they'd understand all implications is also a bit questionable. If deemed
> suitable to turn it off that can be done in the next major/minor, but I
> think that would be unlikely, as we should really require sufficient
> evidence that it's dangerous which I just don't think we have. I'm still of
> the opinion that MV in their current state are no worse off than a lot of
> other features, and marking them as experimental and disabling now would
> just be detrimental to their development and annoy users. Also if we give
> them that treatment then there a whole load of other defaults we should
> change and disable which is just not acceptable in a patch release. It's
> not really necessary anyway, we don't have anyone crying bloody murder on
> the mailing list about how everything went to hell because they used
> feature x.
>
> No one has really provided any counter evidence yet that MV's are in some
> awful state and they are going to shoot users. There are a few existing
> issues that I've brought up already, but they are really quite minor,
> nothing comparable to "lol you can't repair if you use vnodes, sorry". I
> think we really need some real examples/evidence before making calls like
> "lets disable this feature in a patch release and mark it experimental"
>
> >  I personally believe it is better to offer the feature as experimental
> > until we iron out all of the problems
>
> What problems are you referring to, and how exactly will we know when all
> of them have been sufficiently ironed? If we mark it as experimental how
> exactly are we going to get people to use said feature to find issues?
> 
>

Re: Proposal to retroactively mark materialized views experimental

Posted by kurt greaves <ku...@instaclustr.com>.

>
> The flag name `cdc_enabled` is simple and, without adjectives, does not
> imply "experimental" or "beta" or anything like that.
> It does make life easier for both operators and the C* developers.

I would be all for an mv_enabled option, assuming it's enabled by default
for all existing branches. I don't think saying that you are meant to read
NEWS.txt before upgrading a patch is acceptable. Most people don't, and
expecting them to is a bit insane. Assuming that they'd understand all the
implications if they did read it is also a bit questionable. If deemed
suitable to turn it off, that can be done in the next major/minor, but I
think that would be unlikely, as we should really require sufficient
evidence that it's dangerous, which I just don't think we have. I'm still of
the opinion that MVs in their current state are no worse off than a lot of
other features, and marking them as experimental and disabling them now would
just be detrimental to their development and annoy users. Also, if we give
them that treatment then there are a whole load of other defaults we should
change and disable, which is just not acceptable in a patch release. It's
not really necessary anyway; we don't have anyone crying bloody murder on
the mailing list about how everything went to hell because they used
feature x.

No one has really provided any counter-evidence yet that MV's are in some
awful state and that they are going to shoot users. There are a few existing
issues that I've brought up already, but they are really quite minor,
nothing comparable to "lol you can't repair if you use vnodes, sorry". I
think we really need some real examples/evidence before making calls like
"let's disable this feature in a patch release and mark it experimental".

>  I personally believe it is better to offer the feature as experimental
> until we iron out all of the problems

What problems are you referring to, and how exactly will we know when all
of them have been sufficiently ironed out? If we mark it as experimental, how
exactly are we going to get people to use said feature to find issues?
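
For readers following the flag discussion, the opt-out behaviour debated in
this thread can be sketched roughly as below. This is only an illustrative
sketch in Python, not Cassandra code: `FeatureFlags`, `check_create_view` and
`FeatureDisabledError` are hypothetical names, and `mv_enabled` is the option
name floated above rather than a confirmed setting. It assumes the flag
defaults to enabled (so a patch upgrade changes nothing), gates only the
creation of new views, and logs a warning whenever a view is created.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("feature_flags")


class FeatureDisabledError(Exception):
    """Raised when a statement needs a feature the operator has switched off."""


@dataclass(frozen=True)
class FeatureFlags:
    # Hypothetical options, named after the `cdc_enabled` / `mv_enabled`
    # style used in this thread. Defaults preserve existing behaviour.
    mv_enabled: bool = True
    cdc_enabled: bool = False


def check_create_view(flags: FeatureFlags, view_name: str) -> None:
    """Gate creation of new views only; existing views are not touched."""
    if not flags.mv_enabled:
        raise FeatureDisabledError(
            f"Cannot create view {view_name}: materialized views are disabled"
        )
    # Feature is enabled: allow the statement, but warn the operator.
    log.warning(
        "Materialized views are considered experimental (view %s)", view_name
    )
```

With the default flags, `check_create_view(FeatureFlags(), "ks.by_email")`
only logs a warning; with `FeatureFlags(mv_enabled=False)` it raises, which
is the "prevent some CQL from executing" behaviour discussed in the thread.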

Re: Proposal to retroactively mark materialized views experimental

Posted by Jonathan Haddad <jo...@jonhaddad.com>.

I agree with Aleksey on all points here. I'd add that we should update the
docs with warnings about the potential issues with correctness.

On Wed, Oct 4, 2017 at 8:25 AM Aleksey Yeshchenko <al...@apple.com> wrote:

> We already have those for UDFs and CDC.
>
> We should have more: for triggers, SASI, and MVs, at least. Operators need
> a way to disable features they haven’t validated.
>
> We already have sufficient consensus to introduce the flags, and we
> should. There also seems to be sufficient consensus on emitting warnings.
>
> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree
> with Sylvain that flipping the default in a minor would be invasive. We
> shouldn’t do that.
>
> For trunk, though, I think we should default to off. When it comes to
> releasing 4.0 we can collectively decide if there is sufficient trust in
> MVs at the time to warrant flipping the default to true. Ultimately we can
> decide this in a PMC vote. If I misread the consensus regarding the default
> for 4.0, then we might as well vote on that. What I see is sufficient
> distrust coming from core committers, including the author of the v1
> design, to warrant opt-in for MVs.
>
> If we don’t trust in them as developers, we shouldn’t be cavalier with the
> users, either. Not until that trust is gained/regained.
>
> —
> AY
>
> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org) wrote:
>
> Introducing feature flags for enabling or disabling different code paths
> is not sustainable in the long run. It's hard enough to keep up with
> integration testing with the couple of Jenkins jobs that we have.
> Running jobs for all permutations of flags that we keep around, would
> turn out impractical. But if we don't, I'm pretty sure something will
> fall off the radar and it won't take long until someone reports that
> enabling feature X after the latest upgrade will simply not work anymore.
>
> There may also be some more subtle assumptions and cross dependencies
> between features that may cause side effects by disabling a feature (or
> parts of it), even if it's just e.g. a metric value that suddenly won't
> get updated anymore, but is used somewhere else. We'll also have to
> consider migration paths for turning a feature on and off again without
> causing any downtime. If I was to turn on e.g. MVs on a single node in
> my cluster, then this should not cause any issues on the other nodes
> that still have MV code paths disabled. Again, this would need to be
> tested.
>
> So to be clear, my point is that any flags should be implemented in a
> really non-invasive way on the user facing side only, e.g. by emitting a
> log message or cqlsh error. At this point, I'm not really sure if it
> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
> that eventually they will be used to change the behaviour of our code,
> beside printing a log message.
>
>
> On 04.10.17 10:03, Mick Semb Wever wrote:
> >>> CDC sounds like it is in the same basket, but it already has the
> >>> `cdc_enabled` yaml flag which defaults false.
> >> I went this route because I was incredibly wary of changing the CL
> >> code and wanted to shield non-CDC users from any and all risk I
> >> reasonably could.
> >
> > This approach so far is my favourite. (Thanks Josh.)
> >
> > The flag name `cdc_enabled` is simple and, without adjectives, does not
> > imply "experimental" or "beta" or anything like that.
> > It does make life easier for both operators and the C* developers.
> >
> > I'm also fond of how Apache projects often vote both on the release as
> > well as its stability flag: Alpha|Beta|GA (General Availability).
> > https://httpd.apache.org/dev/release.html
> > http://www.apache.org/legal/release-policy.html#release-types
> >
> > Given the importance of The Database, i'd be keen to see attached such
> > community-agreed quality references. And going further, not just to the
> > releases but also to substantial new features (those yet to reach GA).
> > Then the downloads page could provide a table something like
> > https://paste.apache.org/FzrQ
> >
> > It's just one idea to throw out there, and while it hijacks the thread a
> > bit, it could even with just the quality tag on releases go a long way
> > with user trust. Especially if we really are humble about it and use GA
> > appropriately. For example I'm perfectly happy using a beta in production
> > if I see the community otherwise has good processes in place and there's
> > strong testing and staging resources to take advantage of. And as Kurt
> > has implied many users are indeed smart and wise enough to know how to
> > safely test and cautiously use even alpha features in production.
> >
> > Anyway, with or without the above idea, yaml flag names that don't
> > use adjectives could address Kurt's concerns about pulling the rug from
> > under the feet of existing users. Such a flag is but a small improvement
> > suitable for a minor release (you must read the NEWS.txt before even a
> > patch upgrade), and the documentation is only making explicit what should
> > have been all along. Users shouldn't feel that we're returning features
> > into "alpha|beta" mode when what we're actually doing is improving the
> > community's quality assurance documentation.
> >
> > Mick
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

Yep. Almost!

Changing the default in a minor version might break some scripts/tooling that manipulate schema and potentially create new MVs (as dangerous as it currently is - manipulating schema in that way), and that would still not be very nice.

Introducing a flag and leaving it at false in a minor is harmless.

—
AY

On 4 October 2017 at 18:24:16, Stefan Podkowinski (spod@apache.org) wrote:

If "disabling a feature" is just about preventing some CQL from 
execution along with a warning log message, I'm fine with that. But if 
that's the case, I don't really understand why making this change
in a minor version would be a problem, since existing MVs wouldn't be 
affected anyways and should just work as before, even with the enabled 
flag set to false.

Re: Proposal to retroactively mark materialized views experimental

Posted by Stefan Podkowinski <sp...@apache.org>.

If "disabling a feature" is just about preventing some CQL from
execution along with a warning log message, I'm fine with that. But if
that's the case, I don't really understand why making this change
in a minor version would be a problem, since existing MVs wouldn't be
affected anyways and should just work as before, even with the enabled
flag set to false.


On 04.10.17 17:24, Aleksey Yeshchenko wrote:
> We already have those for UDFs and CDC.
>
> We should have more: for triggers, SASI, and MVs, at least. Operators need a way to disable features they haven’t validated.
>
> We already have sufficient consensus to introduce the flags, and we should. There also seems to be sufficient consensus on emitting warnings.
>
> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree with Sylvain that flipping the default in a minor would be invasive. We shouldn’t do that.
>
> For trunk, though, I think we should default to off. When it comes to releasing 4.0 we can collectively decide if there is sufficient trust in MVs at the time to warrant flipping the default to true. Ultimately we can decide this in a PMC vote. If I misread the consensus regarding the default for 4.0, then we might as well vote on that. What I see is sufficient distrust coming from core committers, including the author of the v1 design, to warrant opt-in for MVs.
>
> If we don’t trust in them as developers, we shouldn’t be cavalier with the users, either. Not until that trust is gained/regained.
>
> —
> AY
>
> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org) wrote:
>
> Introducing feature flags for enabling or disabling different code paths  
> is not sustainable in the long run. It's hard enough to keep up with  
> integration testing with the couple of Jenkins jobs that we have.  
> Running jobs for all permutations of flags that we keep around, would  
> turn out impractical. But if we don't, I'm pretty sure something will  
> fall off the radar and it won't take long until someone reports that  
> enabling feature X after the latest upgrade will simply not work anymore.  
>
> There may also be some more subtle assumptions and cross dependencies  
> between features that may cause side effects by disabling a feature (or  
> parts of it), even if it's just e.g. a metric value that suddenly won't  
> get updated anymore, but is used somewhere else. We'll also have to  
> consider migration paths for turning a feature on and off again without  
> causing any downtime. If I was to turn on e.g. MVs on a single node in  
> my cluster, then this should not cause any issues on the other nodes  
> that still have MV code paths disabled. Again, this would need to be tested.  
>
> So to be clear, my point is that any flags should be implemented in a  
> really non-invasive way on the user facing side only, e.g. by emitting a  
> log message or cqlsh error. At this point, I'm not really sure if it  
> would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
> that eventually they will be used to change the behaviour of our code,  
> besides printing a log message.
>
>
> On 04.10.17 10:03, Mick Semb Wever wrote:  
>>>> CDC sounds like it is in the same basket, but it already has the  
>>>> `cdc_enabled` yaml flag which defaults false.  
>>> I went this route because I was incredibly wary of changing the CL  
>>> code and wanted to shield non-CDC users from any and all risk I  
>>> reasonably could.  
>>  
>> This approach so far is my favourite. (Thanks Josh.)  
>>  
>> The flag name `cdc_enabled` is simple and, without adjectives, does not  
>> imply "experimental" or "beta" or anything like that.  
>> It does make life easier for both operators and the C* developers.  
>>  
>> I'm also fond of how Apache projects often vote both on the release as well  
>> as its stability flag: Alpha|Beta|GA (General Availability).  
>> https://httpd.apache.org/dev/release.html  
>> http://www.apache.org/legal/release-policy.html#release-types  
>>  
>> Given the importance of The Database, i'd be keen to see attached such  
>> community-agreed quality references. And going further, not just to the  
>> releases but also to substantial new features (those yet to reach GA). Then  
>> the downloads page could provide a table something like  
>> https://paste.apache.org/FzrQ  
>>  
>> It's just one idea to throw out there, and while it hijacks the thread a  
>> bit, it could even with just the quality tag on releases go a long way with  
>> user trust. Especially if we really are humble about it and use GA  
>> appropriately. For example I'm perfectly happy using a beta in production  
>> if I see the community otherwise has good processes in place and there's  
>> strong testing and staging resources to take advantage of. And as Kurt has  
>> implied many users are indeed smart and wise enough to know how to safely  
>> test and cautiously use even alpha features in production.  
>>  
>> Anyway, with or without the above idea, yaml flag names that don't  
>> use adjectives could address Kurt's concerns about pulling the rug from  
>> under the feet of existing users. Such a flag is but a small improvement  
>> suitable for a minor release (you must read the NEWS.txt before even a  
>> patch upgrade), and the documentation is only making explicit what should  
>> have been all along. Users shouldn't feel that we're returning features  
>> into "alpha|beta" mode when what we're actually doing is improving the  
>> community's quality assurance documentation.  
>>  
>> Mick  
>>  
>
>
>



Re: Proposal to retroactively mark materialized views experimental

Posted by kurt greaves <ku...@instaclustr.com>.

>
> My concerns have consistently been more fundamental - the basic properties
> of a theoretically bug-free MV are simultaneously problematic and unknown.
> How can we say we have a working feature, when our definition of
> ‘working’ is unknown?

Well, we know to some extent that MV's are theoretically possible, because
they work as is. There might be limitations to the design, as I know you've
raised before and were ignored, if that's what you're referring to here.
But for the most part they have been working just fine bar implementation
bugs. I wouldn't go as far as saying our definition of working is unknown,
that's a tad over the top. As I've already said, if you're aware of some
serious fundamental flaws, it shouldn't be too difficult to point them out.
I'd suspect they would have already been encountered so evidence shouldn't
be hard to find.

> We have unsafe defaults

Yeah we do, but don't be changing them in a patch release. For the record,
MV's aren't something that is automatically created that everyone has to
deal with, it's still opt-in. I think incremental repairs and vnodes are
far worse defaults, and cause much more damage.

> Nobody in the wild is even using MVs in the way they were designed, because
> they were too slow.  In no other endeavour have we gone “well, it’s too
> slow, so let’s just accept data loss by default”

How exactly can you use MVs in a way different from how they were
designed? I don't understand, all the use cases I've seen have been
perfectly legitimate. I mean, you can use them with a bad data model, but I
wouldn't say that's not the way they were designed. That's possible for
normal tables as well.

> In no other endeavour have we gone “well, it’s too slow, so let’s just
> accept data loss by default”

Most people we've talked to prefer the simplicity of using MV's over the
speed. I don't know what the sentiment was at the beginning, but in the
past several months we've only cared about correctness, not speed.


Anyway, if there are fundamental flaws then they need to be addressed. As
soon as possible. I'm still going to tell you that telling existing users
who have already gone down this path that the feature is no longer
supported in production is a terrible idea. You are telling users "actually
you need to stop using that and re-write all your applications and do it
all yourself because we screwed up". It doesn't magically remove any fault
on our part for putting it in in the first place. The correct solution for
us and the users is to fix these issues, as soon as possible, and alert
everyone to the fact that these problems can occur. If we can't fix it,
then we can turn it off in the next major, otherwise we should just note
the flaws and restrict any functionality that just isn't going to work.

Re: Proposal to retroactively mark materialized views experimental

Posted by Benedict Elliott Smith <_...@belliottsmith.com>.

Kurt, we seem to be talking past each other.  While I am concerned about implementation bugs - which I am certain still exist - I have not at any point raised this issue.

My concerns have consistently been more fundamental - the basic properties of a theoretically bug-free MV are simultaneously problematic and unknown.  How can we say we have a working feature, when our definition of ‘working’ is unknown?

We have unsafe defaults. Nobody in the wild is even using MVs in the way they were designed, because they were too slow.  In no other endeavour have we gone “well, it’s too slow, so let’s just accept data loss by default”
The only analysis that I know of to be done on the properties of MVs is my own, and I declared it insufficient after finding multiple surprising limitations in the design that most operators would not expect.  We have not even conveyed these known limitations to them.
It was agreed that further analysis would be done.  It has not been done, and might have uncovered some of the surprising MV timestamp behaviours.

We should without question be restoring safe default behaviour.  This would likely discourage everyone from using MVs, though, because of the performance.  So nobody wants to do it.  This is just unacceptable to me, and I’ve yet to hear anybody engage with this.

We should without question know the properties of the feature so we can document them.  This is hard, so nobody wants to do it.  Similarly, I’ve yet to hear a compelling response.

> On 5 Oct 2017, at 04:40, Aleksey Yeshchenko <al...@apple.com> wrote:
> 
> Auth/roles has a non-negligible impact on performance, and isn’t otherwise required by most users. And for some operators, getting involved at the CQL layer isn’t really viable, for example if you have hundreds or thousands of clusters and only control the yaml and configs.
> 
> Adding another flag is something we can do today, and in a minor. The capability framework work seems to be abandoned at the moment, and in the best case scenario will only happen in 4.0, if at all.
> 
> —
> AY
> 
> On 5 October 2017 at 01:19:25, kurt greaves (kurt@instaclustr.com) wrote:
> 
> Operators do need a 
> way to disable features, but it makes a lot more sense to have that as part 
> of the auth/roles system rather than yaml properties.

Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

Auth/roles has a non-negligible impact on performance, and isn’t otherwise required by most users. And for some operators, getting involved at the CQL layer isn’t really viable, for example if you have hundreds or thousands of clusters and only control the yaml and configs.

Adding another flag is something we can do today, and in a minor. The capability framework work seems to be abandoned at the moment, and in the best case scenario will only happen in 4.0, if at all.

—
AY

On 5 October 2017 at 01:19:25, kurt greaves (kurt@instaclustr.com) wrote:

Operators do need a 
way to disable features, but it makes a lot more sense to have that as part 
of the auth/roles system rather than yaml properties.

Re: Proposal to retroactively mark materialized views experimental

Posted by kurt greaves <ku...@instaclustr.com>.

>
> So you’d rather continue to lie to users about the stability of the
> feature rather than admitting it was merged in prematurely?

It was merged prematurely, but a lot has changed since then and a lot of
fixes have been made, and now it's really no more awful than any other
component of Cassandra. A lot of the commentary in here is coming from
people who have had no part of the recent changes to MV's, and as 3.11.1 is
not even out yet I doubt anyone really has any idea how stable they
currently are. In fact I doubt many people have even operated or consulted
on clusters with the most recent versions of 3.0 or 3.11. It's kind of
really annoying that this has come up now, after we've already done a lot
of the work to fix the known issues, and everyone seems to just be saying
"they are so broken" but no one can really provide any evidence why.

> Ideally, we’d follow a process that looks a lot more like this:
> 1. New feature is built with an opt in flag.  Unknowns are documented, the
> risk of using the feature is known to the end user.
> 2. People test and use the feature that know what they’re doing.  They are
> able to read the code, submit patches, and help flush out the issues.  They
> do so in low risk environments.  In the case of MVs, they can afford to
> drop and rebuild the view over a week, or rebuild the cluster altogether.
> We may not even need to worry as much about backwards compatibility.
> 3. The feature matures.  More tests are written.  More people become aware
> of how to contribute to the feature’s stability.
> 4. After a while, we vote on removing the feature flag and declare it
> stable for general usage.


No I don't think this works very well for Cassandra because features are
often heavily intertwined with other components of Cassandra, and often a
new feature relies on making changes to other components of Cassandra. At
least this is true for any feature that is large enough to justify having
an opt-in flag. This will lead down the path of "oh it's only
experimental/opt-in so we don't need to worry about testing every single
component", which is wrong.
We provide a database, and users expect stability from all aspects of the
database, at all times. We should be working to fix bugs early so we can
have confidence in the entire database from very early in the release
branch. We shouldn't provide a database that people will say "don't use it
in production until at least the .15 patch release".

> What I see is sufficient distrust coming from core committers, including
> the author of the v1 design, to warrant opt-in for MVs.

Core committers who have had almost nothing to do with MV for quite some
time now.  Also I'm skeptical of how much first hand experience these core
committers have with MV's.

> We already have those for UDFs and CDC.
> We should have more: for triggers, SASI, and MVs, at least. Operators need
> a way to disable features they haven’t validated.

After a bit more thought I've changed my mind on this. Operators do need a
way to disable features, but it makes a lot more sense to have that as part
of the auth/roles system rather than yaml properties. Plus as previously
noted, I'm not of the opinion we should release features (even in a
beta/experimental form) at all, and we should be reasonably confident in
the entire system and any new features being introduced prior to releasing
them. We should also better practice incremental release of features,
starting with a bare minimum, or subset of what we want the end product to
be, rather than releasing a massive change and then calling it experimental
for years until we can somehow deduce that it is stable enough. This could
have been done for MV's by starting with an append only use case, and then
moving onto the more complex transactional use case.

Re: Proposal to retroactively mark materialized views experimental

Posted by Pavel Yaskevich <po...@gmail.com>.

On Wed, Oct 4, 2017 at 12:23 PM, Jon Haddad <jo...@jonhaddad.com> wrote:

> With the "made the default" part, I was referring to incremental repair.
>
> SASI still has a pretty fatal issue where nodes OOM:
> https://issues.apache.org/jira/browse/CASSANDRA-12662
>

If you read the comments in the issue, the originator of the problem states
that "Cassandra fairly quickly crashes with OOM, a glance over hprof shows
4Gb of PartitionUpdates.", which to me doesn't seem like a SASI issue
but more an issue of the underlying storage which SASI uses.



>
>
> > On Oct 4, 2017, at 12:21 PM, Pavel Yaskevich <po...@gmail.com> wrote:
> >
> > On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad <jon@jonhaddad.com> wrote:
> >
> >> MVs work fine for *some use cases*, not the general use case.  That’s why
> >> there should be a flag.  To opt into the feature when the behavior is only
> >> known to be correct under a certain set of circumstances.  Nobody is saying
> >> the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”,
> >> or something ridiculous like that.  It’s not an attack against the work
> >> done by anyone, the level of effort put in, or minimizing the complexity of
> >> the problem.  “enable_materialized_views” would be just fine.
> >>
> >> We should be honest to people about what they’re getting into.  You may
> >> not be aware of this, but a lot of people still believe Cassandra isn’t a
> >> DB that you should put in prod.  It’s because features like SASI, MVs, or
> >> incremental repair get merged in prematurely (or even made the default),
> >> without having been thoroughly tested, understood and vetted by trusted
> >> community members.  New users hit the snags because they deploy the
> >> bleeding edge code and hit the bugs.
> >>
> >
> > I beg to differ in case of SASI, it has been tested and vetted and ported
> > to different versions. I'm pretty sure it still has better test coverage
> > than most of the project does, it's not a "default" and you actually have
> > to opt-in to it by creating a custom index, how is that premature or
> > misleading to users?
> >
> >
> >>
> >> That’s not how the process should work.
> >>
> >> Ideally, we’d follow a process that looks a lot more like this:
> >>
> >> 1. New feature is built with an opt in flag.  Unknowns are documented, the
> >> risk of using the feature is known to the end user.
> >> 2. People test and use the feature that know what they’re doing.  They are
> >> able to read the code, submit patches, and help flush out the issues.  They
> >> do so in low risk environments.  In the case of MVs, they can afford to
> >> drop and rebuild the view over a week, or rebuild the cluster altogether.
> >> We may not even need to worry as much about backwards compatibility.
> >> 3. The feature matures.  More tests are written.  More people become aware
> >> of how to contribute to the feature’s stability.
> >> 4. After a while, we vote on removing the feature flag and declare it
> >> stable for general usage.
> >>
> >> If nobody actually cares about a feature (why was it written in the
> >> first place?), then it would never get to 2, 3, 4.  It would take a while
> >> for big features like MVs to be marked stable, and that’s fine, because it
> >> takes a long time to actually stabilize them.  I think we can all agree
> >> they are really, really hard problems to solve, and maybe it takes a while.
> >>
> >> Jon
> >>
> >>
> >>
> >>> On Oct 4, 2017, at 11:44 AM, Josh McKenzie <jm...@apache.org> wrote:
> >>>
> >>>>
> >>>> So you’d rather continue to lie to users about the stability of the
> >>>> feature rather than admitting it was merged in prematurely?
> >>>
> >>>
> >>> Much like w/SASI, this is something that's in the code-base that for
> >>>> certain use-cases apparently works just fine.
> >>>
> >>> I don't know of any outstanding issues with the feature,
> >>>
> >>> There appear to be varying levels of understanding of the implementation
> >>>> details of MV's (that seem to directly correlate with faith in the
> >>>> feature's correctness for the use-cases recommended)
> >>>
> >>> We have users in the wild relying on MV's with apparent success (same holds
> >>>> true of all the other punching bags that have come up in this thread)
> >>>
> >>> You're right, Jon. That's clearly exactly what I'm saying.
> >>>
> >>>
> >>> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <jo...@jonhaddad.com> wrote:
> >>>
> >>>> So you’d rather continue to lie to users about the stability of the
> >>>> feature rather than admitting it was merged in prematurely?  I’d rather
> >>>> come clean and avoid future problems, and give people the opportunity to
> >>>> stop using MVs rather than let them keep taking risks they’re unaware of.
> >>>> This is incredibly irresponsible in my opinion.
> >>>>
> >>>>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jm...@apache.org> wrote:
> >>>>>
> >>>>>>
> >>>>>> Oh, come on. You're being disingenuous.
> >>>>>
> >>>>> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
> >>>>> have a history of other changes that are much more broadly and higher
> >>>>> impact risk-wise across the code-base.
> >>>>>
> >>>>> If I were an operator and built a critical part of my business on a
> >>>>> released feature that developers then decided to default-disable as
> >>>>> 'experimental' post-hoc, I'd think long and hard about using any new
> >>>>> features in that project in the future (and revisit my confidence in all
> >>>>> other features I relied on, and the software as a whole). We have users in
> >>>>> the wild relying on MV's with apparent success (same holds true of all the
> >>>>> other punching bags that have come up in this thread) and I'd hate to see
> >>>>> us alienate them by being over-aggressive in the way we handle this.
> >>>>>
> >>>>> I'd much rather we continue to aggressively improve and continue to analyze
> >>>>> MV's stability before a 4.0 release and then use the experimental flag in
> >>>>> the future, if at all possible.
> >>>>>
> >>>>> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@belliottsmith.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Can't we promote these behavioural flags to keyspace properties (with
> >>>>>> suitable permissions to edit necessary)?
> >>>>>>
> >>>>>> I agree that enabling/disabling features shouldn't require a rolling
> >>>>>> restart, and nor should switching their consistency safety level.
> >>>>>>
> >>>>>> I think this would be the most suitable equivalent to ALLOW FILTERING for
> >>>>>> MVs.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Not to detract from the discussion about whether or not to classify X or
> >>>>>>> Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303
> >>>>>>> was originally about operators preventing users from abusing features
> >>>>>>> (e.g. allow filtering).  Could that concept be extended to features like
> >>>>>>> MVs or SASI or anything else?  On the one hand it is nice to be able to
> >>>>>>> set those things dynamically without a rolling restart as well as by
> >>>>>>> user.  On the other it’s less clear about defaults.  There could be a
> >>>>>>> property file or just in the yaml, the operator could specify the default
> >>>>>>> features that are enabled for users and then it could be overridden
> >>>>>>> within that framework.
> >>>>>>>
> >>>>>>>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <aleksey@apple.com> wrote:
> >>>>>>>>
> >>>>>>>> We already have those for UDFs and CDC.
> >>>>>>>>
> >>>>>>>> We should have more: for triggers, SASI, and MVs, at least. Operators
> >>>>>>>> need a way to disable features they haven’t validated.
> >>>>>>>>
> >>>>>>>> We already have sufficient consensus to introduce the flags, and we
> >>>>>>>> should. There also seems to be sufficient consensus on emitting warnings.
> >>>>>>>>
> >>>>>>>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
> >>>>>>>> agree with Sylvain that flipping the default in a minor would be invasive.
> >>>>>>>> We shouldn’t do that.
> >>>>>>>>
> >>>>>>>> For trunk, though, I think we should default to off. When it comes to
> >>>>>>>> releasing 4.0 we can collectively decide if there is sufficient trust in
> >>>>>>>> MVs at the time to warrant flipping the default to true. Ultimately we can
> >>>>>>>> decide this in a PMC vote. If I misread the consensus regarding the default
> >>>>>>>> for 4.0, then we might as well vote on that. What I see is sufficient
> >>>>>>>> distrust coming from core committers, including the author of the v1
> >>>>>>>> design, to warrant opt-in for MVs.
> >>>>>>>>
> >>>>>>>> If we don’t trust in them as developers, we shouldn’t be cavalier with
> >>>>>>>> the users, either. Not until that trust is gained/regained.
> >>>>>>>>
> >>>>>>>> —
> >>>>>>>> AY
> >>>>>>>>
> >>>>>>>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org) wrote:
> >>>>>>>>
> >>>>>>>> Introducing feature flags for enabling or disabling different code paths
> >>>>>>>> is not sustainable in the long run. It's hard enough to keep up with
> >>>>>>>> integration testing with the couple of Jenkins jobs that we have.
> >>>>>>>> Running jobs for all permutations of flags that we keep around, would
> >>>>>>>> turn out impractical. But if we don't, I'm pretty sure something will
> >>>>>>>> fall off the radar and it won't take long until someone reports that
> >>>>>>>> enabling feature X after the latest upgrade will simply not work anymore.
> >>>>>>>>
> >>>>>>>> There may also be some more subtle assumptions and cross dependencies
> >>>>>>>> between features that may cause side effects by disabling a feature (or
> >>>>>>>> parts of it), even if it's just e.g. a metric value that suddenly won't
> >>>>>>>> get updated anymore, but is used somewhere else. We'll also have to
> >>>>>>>> consider migration paths for turning a feature on and off again without
> >>>>>>>> causing any downtime. If I was to turn on e.g. MVs on a single node in
> >>>>>>>> my cluster, then this should not cause any issues on the other nodes
> >>>>>>>> that still have MV code paths disabled. Again, this would need to be tested.
> >>>>>>>>
> >>>>>>>> So to be clear, my point is that any flags should be implemented in a
> >>>>>>>> really non-invasive way on the user facing side only, e.g. by emitting a
> >>>>>>>> log message or cqlsh error. At this point, I'm not really sure if it
> >>>>>>>> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
> >>>>>>>> that eventually they will be used to change the behaviour of our code,
> >>>>>>>> besides printing a log message.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 04.10.17 10:03, Mick Semb Wever wrote:
> >>>>>>>>>>> CDC sounds like it is in the same basket, but it already has the
> >>>>>>>>>>> `cdc_enabled` yaml flag which defaults false.
> >>>>>>>>>> I went this route because I was incredibly wary of changing the CL
> >>>>>>>>>> code and wanted to shield non-CDC users from any and all risk I
> >>>>>>>>>> reasonably could.
> >>>>>>>>>
> >>>>>>>>> This approach so far is my favourite. (Thanks Josh.)
> >>>>>>>>>
> >>>>>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does not
> >>>>>>>>> imply "experimental" or "beta" or anything like that.
> >>>>>>>>> It does make life easier for both operators and the C* developers.
> >>>>>>>>>
> >>>>>>>>> I'm also fond of how Apache projects often vote both on the release as well
> >>>>>>>>> as its stability flag: Alpha|Beta|GA (General Availability).
> >>>>>>>>> https://httpd.apache.org/dev/release.html
> >>>>>>>>> http://www.apache.org/legal/release-policy.html#release-types
> >>>>>>>>>
> >>>>>>>>> Given the importance of The Database, i'd be keen to see attached such
> >>>>>>>>> community-agreed quality references. And going further, not just to the
> >>>>>>>>> releases but also to substantial new features (those yet to reach GA). Then
> >>>>>>>>> the downloads page could provide a table something like
> >>>>>>>>> https://paste.apache.org/FzrQ
> >>>>>>>>>
> >>>>>>>>> It's just one idea to throw out there, and while it hijacks the thread a
> >>>>>>>>> bit, it could even with just the quality tag on releases go a long way with
> >>>>>>>>> user trust. Especially if we really are humble about it and use GA
> >>>>>>>>> appropriately. For example I'm perfectly happy using a beta in production
> >>>>>>>>> if I see the community otherwise has good processes in place and there's
> >>>>>>>>> strong testing and staging resources to take advantage of. And as Kurt has
> >>>>>>>>> implied many users are indeed smart and wise enough to know how to safely
> >>>>>>>>> test and cautiously use even alpha features in production.
> >>>>>>>>>
> >>>>>>>>> Anyway, with or without the above idea, yaml flag names that don't
> >>>>>>>>> use adjectives could address Kurt's concerns about pulling the rug from
> >>>>>>>>> under the feet of existing users. Such a flag is but a small improvement
> >>>>>>>>> suitable for a minor release (you must read the NEWS.txt before even a
> >>>>>>>>> patch upgrade), and the documentation is only making explicit what should
> >>>>>>>>> have been all along. Users shouldn't feel that we're returning features
> >>>>>>>>> into "alpha|beta" mode when what we're actually doing is improving the
> >>>>>>>>> community's quality assurance documentation.
> >>>>>>>>> Mick
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ---------------------------------------------------------------------
> >>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> >>>>>>>> For additional commands, e-mail: dev-help@cassandra.apache.org
> >>>>>>>>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jon Haddad <jo...@jonhaddad.com>.

With the "made the default" part, I was referring to incremental repair.

SASI still has a pretty fatal issue where nodes OOM: https://issues.apache.org/jira/browse/CASSANDRA-12662



> On Oct 4, 2017, at 12:21 PM, Pavel Yaskevich <po...@gmail.com> wrote:
> 
> On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad <jon@jonhaddad.com> wrote:
> 
>> MVs work fine for *some use cases*, not the general use case.  That’s why
>> there should be a flag.  To opt into the feature when the behavior is only
>> known to be correct under a certain set of circumstances.  Nobody is saying
>> the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”,
>> or something ridiculous like that.  It’s not an attack against the work
>> done by anyone, the level of effort put in, or minimizing the complexity of
>> the problem.  “enable_materialized_views” would be just fine.
>> 
>> We should be honest to people about what they’re getting into.  You may
>> not be aware of this, but a lot of people still believe Cassandra isn’t a
>> DB that you should put in prod.  It’s because features like SASI, MVs,  or
>> incremental repair get merged in prematurely (or even made the default),
>> without having been thoroughly tested, understood and vetted by trusted
>> community members.  New users hit the snags because they deploy the
>> bleeding edge code and hit the bugs.
>> 
> 
> I beg to differ in the case of SASI: it has been tested, vetted, and ported
> to different versions. I'm pretty sure it still has better test coverage
> than most of the project does. It's not a "default", and you actually have
> to opt in to it by creating a custom index, so how is that premature or
> misleading to users?
> 
> 
>> 
>> That’s not how the process should work.
>> 
>> Ideally, we’d follow a process that looks a lot more like this:
>> 
>> 1. New feature is built with an opt in flag.  Unknowns are documented, the
>> risk of using the feature is known to the end user.
>> 2. People who know what they’re doing test and use the feature.  They are
>> able to read the code, submit patches, and help flush out the issues.  They
>> do so in low risk environments.  In the case of MVs, they can afford to
>> drop and rebuild the view over a week, or rebuild the cluster altogether.
>> We may not even need to worry as much about backwards compatibility.
>> 3. The feature matures.  More tests are written.  More people become aware
>> of how to contribute to the feature’s stability.
>> 4. After a while, we vote on removing the feature flag and declare it
>> stable for general usage.
>> 
>> If nobody actually cares about a feature (why was it written in the
>> first place?), then it would never get to 2, 3, 4.  It would take a while
>> for big features like MVs to be marked stable, and that’s fine, because it
>> takes a long time to actually stabilize them.  I think we can all agree
>> they are really, really hard problems to solve, and maybe it takes a while.
>> 
>> Jon
>> 
>> 
>> 
>>> On Oct 4, 2017, at 11:44 AM, Josh McKenzie <jm...@apache.org> wrote:
>>> 
>>>> 
>>>> So you’d rather continue to lie to users about the stability of the
>>>> feature rather than admitting it was merged in prematurely?
>>> 
>>> 
>>> Much like w/SASI, this is something that's in the code-base that for
>>>> certain use-cases apparently works just fine.
>>> 
>>> I don't know of any outstanding issues with the feature,
>>> 
>>> There appear to be varying levels of understanding of the implementation
>>>> details of MV's (that seem to directly correlate with faith in the
>>>> feature's correctness for the use-cases recommended)
>>> 
>>> We have users in the wild relying on MV's with apparent success (same
>> holds
>>>> true of all the other punching bags that have come up in this thread)
>>> 
>>> You're right, Jon. That's clearly exactly what I'm saying.
>>> 
>>> 
>>> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <jo...@jonhaddad.com> wrote:
>>> 
>>>> So you’d rather continue to lie to users about the stability of the
>>>> feature rather than admitting it was merged in prematurely?  I’d rather
>>>> come clean and avoid future problems, and give people the opportunity to
>>>> stop using MVs rather than let them keep taking risks they’re unaware
>> of.
>>>> This is incredibly irresponsible in my opinion.
>>>> 
>>>>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jm...@apache.org>
>> wrote:
>>>>> 
>>>>>> 
>>>>>> Oh, come on. You're being disingenuous.
>>>>> 
>>>>> Not my intent. MV's (and SASI, for example) are fairly well isolated;
>> we
>>>>> have a history of other changes that are much more broadly and higher
>>>>> impact risk-wise across the code-base.
>>>>> 
>>>>> If I were an operator and built a critical part of my business on a
>>>>> released feature that developers then decided to default-disable as
>>>>> 'experimental' post-hoc, I'd think long and hard about using any new
>>>>> features in that project in the future (and revisit my confidence in
>> all
>>>>> other features I relied on, and the software as a whole). We have users
>>>> in
>>>>> the wild relying on MV's with apparent success (same holds true of all
>>>> the
>>>>> other punching bags that have come up in this thread) and I'd hate to
>> see
>>>>> us alienate them by being over-aggressive in the way we handle this.
>>>>> 
>>>>> I'd much rather we continue to aggressively improve and continue to
>>>> analyze
>>>>> MV's stability before a 4.0 release and then use the experimental flag
>> in
>>>>> the future, if at all possible.
>>>>> 
>>>>> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@
>>>> belliottsmith.com>
>>>>> wrote:
>>>>> 
>>>>>> Can't we promote these behavioural flags to keyspace properties (with
>>>>>> suitable permissions to edit necessary)?
>>>>>> 
>>>>>> I agree that enabling/disabling features shouldn't require a rolling
>>>>>> restart, and nor should switching their consistency safety level.
>>>>>> 
>>>>>> I think this would be the most suitable equivalent to ALLOW FILTERING
>>>> for
>>>>>> MVs.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>> Not to detract from the discussion about whether or not to classify X or
>>>>>>> Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303
>>>>>>> was originally about operators preventing users from abusing features
>>>>>>> (e.g. allow filtering).  Could that concept be extended to features like
>>>>>>> MVs or SASI or anything else?  On the one hand it is nice to be able to
>>>>>>> set those things dynamically without a rolling restart as well as by user.
>>>>>>> On the other it’s less clear about defaults.  There could be a property
>>>>>>> file or just in the yaml, the operator could specify the default features
>>>>>>> that are enabled for users and then it could be overridden within that
>>>>>>> framework.
>>>>>>> 
>>>>>>>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com>
>>>>>> wrote:
>>>>>>>> 
>>>>>>>> We already have those for UDFs and CDC.
>>>>>>>> 
>>>>>>>> We should have more: for triggers, SASI, and MVs, at least. Operators
>>>>>>>> need a way to disable features they haven’t validated.
>>>>>>>> 
>>>>>>>> We already have sufficient consensus to introduce the flags, and we
>>>>>>>> should. There also seems to be sufficient consensus on emitting warnings.
>>>>>>>> 
>>>>>>>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
>>>>>>>> agree with Sylvain that flipping the default in a minor would be invasive.
>>>>>>>> We shouldn’t do that.
>>>>>>>> 
>>>>>>>> For trunk, though, I think we should default to off. When it comes to
>>>>>>>> releasing 4.0 we can collectively decide if there is sufficient trust in
>>>>>>>> MVs at the time to warrant flipping the default to true. Ultimately we can
>>>>>>>> decide this in a PMC vote. If I misread the consensus regarding the default
>>>>>>>> for 4.0, then we might as well vote on that. What I see is sufficient
>>>>>>>> distrust coming from core committers, including the author of the v1
>>>>>>>> design, to warrant opt-in for MVs.
>>>>>>>> 
>>>>>>>> If we don’t trust in them as developers, we shouldn’t be cavalier with
>>>>>>>> the users, either. Not until that trust is gained/regained.
>>>>>>>> 
>>>>>>>> —
>>>>>>>> AY
>>>>>>>> 
>>>>>>>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org)
>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Introducing feature flags for enabling or disabling different code
>>>> paths
>>>>>>>> is not sustainable in the long run. It's hard enough to keep up with
>>>>>>>> integration testing with the couple of Jenkins jobs that we have.
>>>>>>>> Running jobs for all permutations of flags that we keep around,
>> would
>>>>>>>> turn out impractical. But if we don't, I'm pretty sure something
>> will
>>>>>>>> fall off the radar and it won't take long until someone reports that
>>>>>>>> enabling feature X after the latest upgrade will simply not work
>>>>>> anymore.
>>>>>>>> 
>>>>>>>> There may also be some more subtle assumptions and cross
>> dependencies
>>>>>>>> between features that may cause side effects by disabling a feature
>>>> (or
>>>>>>>> parts of it), even if it's just e.g. a metric value that suddenly
>>>> won't
>>>>>>>> get updated anymore, but is used somewhere else. We'll also have to
>>>>>>>> consider migration paths for turning a feature on and off again
>>>> without
>>>>>>>> causing any downtime. If I was to turn on e.g. MVs on a single node
>> in
>>>>>>>> my cluster, then this should not cause any issues on the other nodes
>>>>>>>> that still have MV code paths disabled. Again, this would need to be
>>>>>> tested.
>>>>>>>> 
>>>>>>>> So to be clear, my point is that any flags should be implemented in
>> a
>>>>>>>> really non-invasive way on the user facing side only, e.g. by
>>>> emitting a
>>>>>>>> log message or cqlsh error. At this point, I'm not really sure if it
>>>>>>>> would be a good idea to add them to cassandra.yaml, as I'm pretty
>> sure
>>>>>>>> that eventually they will be used to change the behaviour of our
>> code,
>>>>>>>> beside printing a log message.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 04.10.17 10:03, Mick Semb Wever wrote:
>>>>>>>>>>> CDC sounds like it is in the same basket, but it already has the
>>>>>>>>>>> `cdc_enabled` yaml flag which defaults false.
>>>>>>>>>> I went this route because I was incredibly wary of changing the CL
>>>>>>>>>> code and wanted to shield non-CDC users from any and all risk I
>>>>>>>>>> reasonably could.
>>>>>>>>> 
>>>>>>>>> This approach so far is my favourite. (Thanks Josh.)
>>>>>>>>> 
>>>>>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does
>>>> not
>>>>>>>>> imply "experimental" or "beta" or anything like that.
>>>>>>>>> It does make life easier for both operators and the C* developers.
>>>>>>>>> 
>>>>>>>>> I'm also fond of how Apache projects often vote both on the release
>>>> as
>>>>>> well
>>>>>>>>> as its stability flag: Alpha|Beta|GA (General Availability).
>>>>>>>>> https://httpd.apache.org/dev/release.html
>>>>>>>>> http://www.apache.org/legal/release-policy.html#release-types
>>>>>>>>> 
>>>>>>>>> Given the importance of The Database, i'd be keen to see attached
>>>> such
>>>>>>>>> community-agreed quality references. And going further, not just to
>>>> the
>>>>>>>>> releases but also to substantial new features (those yet to reach
>>>> GA).
>>>>>> Then
>>>>>>>>> the downloads page could provide a table something like
>>>>>>>>> https://paste.apache.org/FzrQ
>>>>>>>>> 
>>>>>>>>> It's just one idea to throw out there, and while it hijacks the
>>>> thread
>>>>>> a
>>>>>>>>> bit, it could even with just the quality tag on releases go a long
>>>> way
>>>>>> with
>>>>>>>>> user trust. Especially if we really are humble about it and use GA
>>>>>>>>> appropriately. For example I'm perfectly happy using a beta in
>>>>>> production
>>>>>>>>> if I see the community otherwise has good processes in place and
>>>>>> there's
>>>>>>>>> strong testing and staging resources to take advantage of. And as
>>>> Kurt
>>>>>> has
>>>>>>>>> implied many users are indeed smart and wise enough to know how to
>>>>>> safely
>>>>>>>>> test and cautiously use even alpha features in production.
>>>>>>>>> 
>>>>>>>>> Anyway, with or without the above idea, yaml flag names that don't
>>>>>>>>> use adjectives could address Kurt's concerns about pulling the rug
>>>> from
>>>>>>>>> under the feet of existing users. Such a flag is but a small
>>>>>> improvement
>>>>>>>>> suitable for a minor release (you must read the NEWS.txt before
>> even
>>>> a
>>>>>>>>> patch upgrade), and the documentation is only making explicit what
>>>>>> should
>>>>>>>>> have been all along. Users shouldn't feel that we're returning
>>>> features
>>>>>>>>> into "alpha|beta" mode when what we're actually doing is improving
>>>> the
>>>>>>>>> community's quality assurance documentation.
>>>>>>>>> 
>>>>>>>>> Mick
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 

Re: Proposal to retroactively mark materialized views experimental

Posted by Pavel Yaskevich <po...@gmail.com>.

On Wed, Oct 4, 2017 at 12:09 PM, Jon Haddad <jo...@jonhaddad.com> wrote:

> MVs work fine for *some use cases*, not the general use case.  That’s why
> there should be a flag.  To opt into the feature when the behavior is only
> known to be correct under a certain set of circumstances.  Nobody is saying
> the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”,
> or something ridiculous like that.  It’s not an attack against the work
> done by anyone, the level of effort put in, or minimizing the complexity of
> the problem.  “enable_materialized_views” would be just fine.
>
> We should be honest to people about what they’re getting into.  You may
> not be aware of this, but a lot of people still believe Cassandra isn’t a
> DB that you should put in prod.  It’s because features like SASI, MVs,  or
> incremental repair get merged in prematurely (or even made the default),
> without having been thoroughly tested, understood and vetted by trusted
> community members.  New users hit the snags because they deploy the
> bleeding edge code and hit the bugs.
>

I beg to differ in the case of SASI: it has been tested, vetted, and ported
to different versions. I'm pretty sure it still has better test coverage
than most of the project does. It's not a "default", and you actually have
to opt in to it by creating a custom index, so how is that premature or
misleading to users?


>
> That’s not how the process should work.
>
> Ideally, we’d follow a process that looks a lot more like this:
>
> 1. New feature is built with an opt in flag.  Unknowns are documented, the
> risk of using the feature is known to the end user.
> 2. People who know what they’re doing test and use the feature.  They are
> able to read the code, submit patches, and help flush out the issues.  They
> do so in low risk environments.  In the case of MVs, they can afford to
> drop and rebuild the view over a week, or rebuild the cluster altogether.
> We may not even need to worry as much about backwards compatibility.
> 3. The feature matures.  More tests are written.  More people become aware
> of how to contribute to the feature’s stability.
> 4. After a while, we vote on removing the feature flag and declare it
> stable for general usage.
>
> If nobody actually cares about a feature (why was it written in the
> first place?), then it would never get to 2, 3, 4.  It would take a while
> for big features like MVs to be marked stable, and that’s fine, because it
> takes a long time to actually stabilize them.  I think we can all agree
> they are really, really hard problems to solve, and maybe it takes a while.
>
> Jon
>
>
>
> > On Oct 4, 2017, at 11:44 AM, Josh McKenzie <jm...@apache.org> wrote:
> >
> >>
> >> So you’d rather continue to lie to users about the stability of the
> >> feature rather than admitting it was merged in prematurely?
> >
> >
> > Much like w/SASI, this is something that's in the code-base that for
> >> certain use-cases apparently works just fine.
> >
> > I don't know of any outstanding issues with the feature,
> >
> > There appear to be varying levels of understanding of the implementation
> >> details of MV's (that seem to directly correlate with faith in the
> >> feature's correctness for the use-cases recommended)
> >
> > We have users in the wild relying on MV's with apparent success (same
> holds
> >> true of all the other punching bags that have come up in this thread)
> >
> > You're right, Jon. That's clearly exactly what I'm saying.
> >
> >
> > On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <jo...@jonhaddad.com> wrote:
> >
> >> So you’d rather continue to lie to users about the stability of the
> >> feature rather than admitting it was merged in prematurely?  I’d rather
> >> come clean and avoid future problems, and give people the opportunity to
> >> stop using MVs rather than let them keep taking risks they’re unaware
> of.
> >> This is incredibly irresponsible in my opinion.
> >>
> >>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jm...@apache.org>
> wrote:
> >>>
> >>>>
> >>>> Oh, come on. You're being disingenuous.
> >>>
> >>> Not my intent. MV's (and SASI, for example) are fairly well isolated;
> we
> >>> have a history of other changes that are much more broadly and higher
> >>> impact risk-wise across the code-base.
> >>>
> >>> If I were an operator and built a critical part of my business on a
> >>> released feature that developers then decided to default-disable as
> >>> 'experimental' post-hoc, I'd think long and hard about using any new
> >>> features in that project in the future (and revisit my confidence in
> all
> >>> other features I relied on, and the software as a whole). We have users
> >> in
> >>> the wild relying on MV's with apparent success (same holds true of all
> >> the
> >>> other punching bags that have come up in this thread) and I'd hate to
> see
> >>> us alienate them by being over-aggressive in the way we handle this.
> >>>
> >>> I'd much rather we continue to aggressively improve and continue to
> >> analyze
> >>> MV's stability before a 4.0 release and then use the experimental flag
> in
> >>> the future, if at all possible.
> >>>
> >>> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@
> >> belliottsmith.com>
> >>> wrote:
> >>>
> >>>> Can't we promote these behavioural flags to keyspace properties (with
> >>>> suitable permissions to edit necessary)?
> >>>>
> >>>> I agree that enabling/disabling features shouldn't require a rolling
> >>>> restart, and nor should switching their consistency safety level.
> >>>>
> >>>> I think this would be the most suitable equivalent to ALLOW FILTERING
> >> for
> >>>> MVs.
> >>>>
> >>>>
> >>>>
> >>>>> On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>> Not to detract from the discussion about whether or not to classify X
> >> or
> >>>> Y as experimental but https://issues.apache.org/
> >> jira/browse/CASSANDRA-8303
> >>>> <https://issues.apache.org/jira/browse/CASSANDRA-8303> was originally
> >>>> about operators preventing users from abusing features (e.g. allow
> >>>> filtering).  Could that concept be extended to features like MVs or
> >> SASI or
> >>>> anything else?  On the one hand it is nice to be able to set those
> >> things
> >>>> dynamically without a rolling restart as well as by user.  On the
> other
> >>>> it’s less clear about defaults.  There could be a property file or
> just
> >> in
> >>>> the yaml, the operator could specify the default features that are
> >> enabled
> >>>> for users and then it could be overridden within that framework.
> >>>>>
> >>>>>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com>
> >>>> wrote:
> >>>>>>
> >>>>>> We already have those for UDFs and CDC.
> >>>>>>
> >>>>>> We should have more: for triggers, SASI, and MVs, at least.
> Operators
> >>>> need a way to disable features they haven’t validated.
> >>>>>>
> >>>>>> We already have sufficient consensus to introduce the flags, and we
> >>>> should. There also seems to be sufficient consensus on emitting
> >> warnings.
> >>>>>>
> >>>>>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
> >>>> agree with Sylvain that flipping the default in a minor would be
> >> invasive.
> >>>> We shouldn’t do that.
> >>>>>>
> >>>>>> For trunk, though, I think we should default to off. When it comes
> to
> >>>> releasing 4.0 we can collectively decide if there is sufficient trust
> in
> >>>> MVs at the time to warrant flipping the default to true. Ultimately we
> >> can
> >>>> decide this in a PMC vote. If I misread the consensus regarding the
> >> default
> >>>> for 4.0, then we might as well vote on that. What I see is sufficient
> >>>> distrust coming from core committers, including the author of the v1
> >>>> design, to warrant opt-in for MVs.
> >>>>>>
> >>>>>> If we don’t trust in them as developers, we shouldn’t be cavalier
> with
> >>>> the users, either. Not until that trust is gained/regained.
> >>>>>>
> >>>>>> —
> >>>>>> AY
> >>>>>>
> >>>>>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org)
> >>>> wrote:
> >>>>>>
> >>>>>> Introducing feature flags for enabling or disabling different code
> >> paths
> >>>>>> is not sustainable in the long run. It's hard enough to keep up with
> >>>>>> integration testing with the couple of Jenkins jobs that we have.
> >>>>>> Running jobs for all permutations of flags that we keep around,
> would
> >>>>>> turn out impractical. But if we don't, I'm pretty sure something
> will
> >>>>>> fall off the radar and it won't take long until someone reports that
> >>>>>> enabling feature X after the latest upgrade will simply not work
> >>>> anymore.
> >>>>>>
> >>>>>> There may also be some more subtle assumptions and cross
> dependencies
> >>>>>> between features that may cause side effects by disabling a feature
> >> (or
> >>>>>> parts of it), even if it's just e.g. a metric value that suddenly
> >> won't
> >>>>>> get updated anymore, but is used somewhere else. We'll also have to
> >>>>>> consider migration paths for turning a feature on and off again
> >> without
> >>>>>> causing any downtime. If I was to turn on e.g. MVs on a single node
> in
> >>>>>> my cluster, then this should not cause any issues on the other nodes
> >>>>>> that still have MV code paths disabled. Again, this would need to be
> >>>> tested.
> >>>>>>
> >>>>>> So to be clear, my point is that any flags should be implemented in
> a
> >>>>>> really non-invasive way on the user facing side only, e.g. by
> >> emitting a
> >>>>>> log message or cqlsh error. At this point, I'm not really sure if it
> >>>>>> would be a good idea to add them to cassandra.yaml, as I'm pretty
> sure
> >>>>>> that eventually they will be used to change the behaviour of our
> code,
> >>>>>> beside printing a log message.
> >>>>>>
> >>>>>>
> >>>>>> On 04.10.17 10:03, Mick Semb Wever wrote:
> >>>>>>>>> CDC sounds like it is in the same basket, but it already has the
> >>>>>>>>> `cdc_enabled` yaml flag which defaults false.
> >>>>>>>> I went this route because I was incredibly wary of changing the CL
> >>>>>>>> code and wanted to shield non-CDC users from any and all risk I
> >>>>>>>> reasonably could.
> >>>>>>>
> >>>>>>> This approach so far is my favourite. (Thanks Josh.)
> >>>>>>>
> >>>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does
> >> not
> >>>>>>> imply "experimental" or "beta" or anything like that.
> >>>>>>> It does make life easier for both operators and the C* developers.
> >>>>>>>
> >>>>>>> I'm also fond of how Apache projects often vote both on the release
> >> as
> >>>> well
> >>>>>>> as its stability flag: Alpha|Beta|GA (General Availability).
> >>>>>>> https://httpd.apache.org/dev/release.html
> >>>>>>> http://www.apache.org/legal/release-policy.html#release-types
> >>>>>>>
> >>>>>>> Given the importance of The Database, i'd be keen to see attached
> >> such
> >>>>>>> community-agreed quality references. And going further, not just to
> >> the
> >>>>>>> releases but also to substantial new features (those yet to reach
> >> GA).
> >>>> Then
> >>>>>>> the downloads page could provide a table something like
> >>>>>>> https://paste.apache.org/FzrQ
> >>>>>>>
> >>>>>>> It's just one idea to throw out there, and while it hijacks the
> >> thread
> >>>> a
> >>>>>>> bit, it could even with just the quality tag on releases go a long
> >> way
> >>>> with
> >>>>>>> user trust. Especially if we really are humble about it and use GA
> >>>>>>> appropriately. For example I'm perfectly happy using a beta in
> >>>> production
> >>>>>>> if I see the community otherwise has good processes in place and
> >>>> there's
> >>>>>>> strong testing and staging resources to take advantage of. And as
> >> Kurt
> >>>> has
> >>>>>>> implied many users are indeed smart and wise enough to know how to
> >>>> safely
> >>>>>>> test and cautiously use even alpha features in production.
> >>>>>>>
> >>>>>>> Anyway, with or without the above idea, yaml flag names that don't
> >>>>>>> use adjectives could address Kurt's concerns about pulling the rug
> >> from
> >>>>>>> under the feet of existing users. Such a flag is but a small
> >>>> improvement
> >>>>>>> suitable for a minor release (you must read the NEWS.txt before
> even
> >> a
> >>>>>>> patch upgrade), and the documentation is only making explicit what
> >>>> should
> >>>>>>> have been all along. Users shouldn't feel that we're returning
> >> features
> >>>>>>> into "alpha|beta" mode when what we're actually doing is improving
> >> the
> >>>>>>> community's quality assurance documentation.
> >>>>>>>
> >>>>>>> Mick
> >>>>>>>
> >>>>>>
> >>>>>>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jon Haddad <jo...@jonhaddad.com>.

MVs work fine for *some use cases*, not the general use case.  That’s why there should be a flag: to opt into the feature when the behavior is only known to be correct under a certain set of circumstances.  Nobody is saying the flag should be “enable_terrible_feature_nobody_tested_and_we_all_hate”, or something ridiculous like that.  It’s not an attack on anyone’s work or the level of effort put in, and it doesn’t minimize the complexity of the problem.  “enable_materialized_views” would be just fine.

We should be honest with people about what they’re getting into.  You may not be aware of this, but a lot of people still believe Cassandra isn’t a DB that you should put in prod.  That’s because features like SASI, MVs, or incremental repair get merged in prematurely (or even made the default) without having been thoroughly tested, understood, and vetted by trusted community members.  New users deploy the bleeding-edge code and hit the bugs.

That’s not how the process should work.  

Ideally, we’d follow a process that looks a lot more like this:

1. New feature is built with an opt-in flag.  Unknowns are documented, and the risk of using the feature is known to the end user.
2. People who know what they’re doing test and use the feature.  They are able to read the code, submit patches, and help flush out the issues.  They do so in low-risk environments.  In the case of MVs, they can afford to drop and rebuild the view over a week, or rebuild the cluster altogether.  We may not even need to worry as much about backwards compatibility.
3. The feature matures.  More tests are written.  More people become aware of how to contribute to the feature’s stability.
4. After a while, we vote on removing the feature flag and declare it stable for general usage.
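
To make step 1 concrete, here is a minimal sketch of what an opt-in gate could look like, assuming a boolean setting named after the “enable_materialized_views” suggestion above.  The class and method names are hypothetical and are not Cassandra's actual code; the sketch only shows the shape of the idea: reject the feature when the flag is off, and record a warning when it is on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: these names are illustrative, not Cassandra's actual classes.
final class MaterializedViewGate {
    // Stands in for a yaml-style boolean such as enable_materialized_views (assumed name).
    private final boolean enableMaterializedViews;
    // Warnings that would be surfaced to the operator or client.
    private final List<String> warnings = new ArrayList<>();

    MaterializedViewGate(boolean enableMaterializedViews) {
        this.enableMaterializedViews = enableMaterializedViews;
    }

    // Intended to run at the point where a view creation request is validated.
    void checkCreateView(String viewName) {
        if (!enableMaterializedViews) {
            // Flag is off: refuse the request and say how to opt in.
            throw new IllegalStateException(
                "Materialized views are disabled; set enable_materialized_views to true to create "
                    + viewName);
        }
        // Flag is on: allow the request but make the known risk explicit.
        warnings.add("Materialized view " + viewName
            + " created; behavior is only known to be correct for some use cases");
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        MaterializedViewGate optedIn = new MaterializedViewGate(true);
        optedIn.checkCreateView("users_by_email");
        System.out.println(optedIn.warnings());

        MaterializedViewGate defaultOff = new MaterializedViewGate(false);
        try {
            defaultOff.checkCreateView("users_by_email");
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Keeping the check at the request boundary, rather than threading the flag through internal code paths, is one way to keep such a gate small and easy to remove once the feature is voted stable.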

If nobody actually cares about a feature (why was it written in the first place?), then it would never get to 2, 3, 4.  It would take a while for big features like MVs to be marked stable, and that’s fine, because it takes a long time to actually stabilize them.  I think we can all agree they are really, really hard problems to solve, and maybe it takes a while.

Jon



> On Oct 4, 2017, at 11:44 AM, Josh McKenzie <jm...@apache.org> wrote:
> 
>> 
>> So you’d rather continue to lie to users about the stability of the
>> feature rather than admitting it was merged in prematurely?
> 
> 
> Much like w/SASI, this is something that's in the code-base that for
>> certain use-cases apparently works just fine.
> 
> I don't know of any outstanding issues with the feature,
> 
> There appear to be varying levels of understanding of the implementation
>> details of MV's (that seem to directly correlate with faith in the
>> feature's correctness for the use-cases recommended)
> 
> We have users in the wild relying on MV's with apparent success (same holds
>> true of all the other punching bags that have come up in this thread)
> 
> You're right, Jon. That's clearly exactly what I'm saying.
> 
> 
> On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <jo...@jonhaddad.com> wrote:
> 
>> So you’d rather continue to lie to users about the stability of the
>> feature rather than admitting it was merged in prematurely?  I’d rather
>> come clean and avoid future problems, and give people the opportunity to
>> stop using MVs rather than let them keep taking risks they’re unaware of.
>> This is incredibly irresponsible in my opinion.
>> 
>>> On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jm...@apache.org> wrote:
>>> 
>>>> 
>>>> Oh, come on. You're being disingenuous.
>>> 
>>> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
>>> have a history of other changes that are much more broadly and higher
>>> impact risk-wise across the code-base.
>>> 
>>> If I were an operator and built a critical part of my business on a
>>> released feature that developers then decided to default-disable as
>>> 'experimental' post-hoc, I'd think long and hard about using any new
>>> features in that project in the future (and revisit my confidence in all
>>> other features I relied on, and the software as a whole). We have users
>> in
>>> the wild relying on MV's with apparent success (same holds true of all
>> the
>>> other punching bags that have come up in this thread) and I'd hate to see
>>> us alienate them by being over-aggressive in the way we handle this.
>>> 
>>> I'd much rather we continue to aggressively improve and continue to
>> analyze
>>> MV's stability before a 4.0 release and then use the experimental flag in
>>> the future, if at all possible.
>>> 
>>> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@
>> belliottsmith.com>
>>> wrote:
>>> 
>>>> Can't we promote these behavioural flags to keyspace properties (with
>>>> suitable permissions to edit necessary)?
>>>> 
>>>> I agree that enabling/disabling features shouldn't require a rolling
>>>> restart, and nor should switching their consistency safety level.
>>>> 
>>>> I think this would be the most suitable equivalent to ALLOW FILTERING
>> for
>>>> MVs.
>>>> 
>>>> 
>>>> 
>>>>> On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com>
>>>> wrote:
>>>>> 
>>>>> Not to detract from the discussion about whether or not to classify X
>> or
>>>> Y as experimental but https://issues.apache.org/
>> jira/browse/CASSANDRA-8303
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-8303> was originally
>>>> about operators preventing users from abusing features (e.g. allow
>>>> filtering).  Could that concept be extended to features like MVs or
>> SASI or
>>>> anything else?  On the one hand it is nice to be able to set those
>> things
>>>> dynamically without a rolling restart as well as by user.  On the other
>>>> it’s less clear about defaults.  There could be a property file or just
>> in
>>>> the yaml, the operator could specify the default features that are
>> enabled
>>>> for users and then it could be overridden within that framework.
>>>>> 
>>>>>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com>
>>>> wrote:
>>>>>> 
>>>>>> We already have those for UDFs and CDC.
>>>>>> 
>>>>>> We should have more: for triggers, SASI, and MVs, at least. Operators
>>>> need a way to disable features they haven’t validated.
>>>>>> 
>>>>>> We already have sufficient consensus to introduce the flags, and we
>>>> should. There also seems to be sufficient consensus on emitting
>> warnings.
>>>>>> 
>>>>>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
>>>> agree with Sylvain that flipping the default in a minor would be
>> invasive.
>>>> We shouldn’t do that.
>>>>>> 
>>>>>> For trunk, though, I think we should default to off. When it comes to
>>>> releasing 4.0 we can collectively decide if there is sufficient trust in
>>>> MVs at the time to warrant flipping the default to true. Ultimately we
>> can
>>>> decide this in a PMC vote. If I misread the consensus regarding the
>> default
>>>> for 4.0, then we might as well vote on that. What I see is sufficient
>>>> distrust coming from core committers, including the author of the v1
>>>> design, to warrant opt-in for MVs.
>>>>>> 
>>>>>> If we don’t trust in them as developers, we shouldn’t be cavalier with
>>>> the users, either. Not until that trust is gained/regained.
>>>>>> 
>>>>>> —
>>>>>> AY
>>>>>> 
>>>>>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org)
>>>> wrote:
>>>>>> 
>>>>>> Introducing feature flags for enabling or disabling different code
>> paths
>>>>>> is not sustainable in the long run. It's hard enough to keep up with
>>>>>> integration testing with the couple of Jenkins jobs that we have.
>>>>>> Running jobs for all permutations of flags that we keep around, would
>>>>>> turn out impractical. But if we don't, I'm pretty sure something will
>>>>>> fall off the radar and it won't take long until someone reports that
>>>>>> enabling feature X after the latest upgrade will simply not work
>>>> anymore.
>>>>>> 
>>>>>> There may also be some more subtle assumptions and cross dependencies
>>>>>> between features that may cause side effects by disabling a feature
>> (or
>>>>>> parts of it), even if it's just e.g. a metric value that suddenly
>> won't
>>>>>> get updated anymore, but is used somewhere else. We'll also have to
>>>>>> consider migration paths for turning a feature on and off again
>> without
>>>>>> causing any downtime. If I was to turn on e.g. MVs on a single node in
>>>>>> my cluster, then this should not cause any issues on the other nodes
>>>>>> that still have MV code paths disabled. Again, this would need to be
>>>> tested.
>>>>>> 
>>>>>> So to be clear, my point is that any flags should be implemented in a
>>>>>> really non-invasive way on the user facing side only, e.g. by
>> emitting a
>>>>>> log message or cqlsh error. At this point, I'm not really sure if it
>>>>>> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
>>>>>> that eventually they will be used to change the behaviour of our code,
>>>>>> beside printing a log message.
>>>>>> 
>>>>>> 
>>>>>> On 04.10.17 10:03, Mick Semb Wever wrote:
>>>>>>>>> CDC sounds like it is in the same basket, but it already has the
>>>>>>>>> `cdc_enabled` yaml flag which defaults false.
>>>>>>>> I went this route because I was incredibly wary of changing the CL
>>>>>>>> code and wanted to shield non-CDC users from any and all risk I
>>>>>>>> reasonably could.
>>>>>>> 
>>>>>>> This approach so far is my favourite. (Thanks Josh.)
>>>>>>> 
>>>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does
>> not
>>>>>>> imply "experimental" or "beta" or anything like that.
>>>>>>> It does make life easier for both operators and the C* developers.
>>>>>>> 
>>>>>>> I'm also fond of how Apache projects often vote both on the release
>> as
>>>> well
>>>>>>> as its stability flag: Alpha|Beta|GA (General Availability).
>>>>>>> https://httpd.apache.org/dev/release.html
>>>>>>> http://www.apache.org/legal/release-policy.html#release-types
>>>>>>> 
>>>>>>> Given the importance of The Database, i'd be keen to see attached
>> such
>>>>>>> community-agreed quality references. And going further, not just to
>> the
>>>>>>> releases but also to substantial new features (those yet to reach
>> GA).
>>>> Then
>>>>>>> the downloads page could provide a table something like
>>>>>>> https://paste.apache.org/FzrQ
>>>>>>> 
>>>>>>> It's just one idea to throw out there, and while it hijacks the
>> thread
>>>> a
>>>>>>> bit, it could even with just the quality tag on releases go a long
>> way
>>>> with
>>>>>>> user trust. Especially if we really are humble about it and use GA
>>>>>>> appropriately. For example I'm perfectly happy using a beta in
>>>> production
>>>>>>> if I see the community otherwise has good processes in place and
>>>> there's
>>>>>>> strong testing and staging resources to take advantage of. And as
>> Kurt
>>>> has
>>>>>>> implied many users are indeed smart and wise enough to know how to
>>>> safely
>>>>>>> test and cautiously use even alpha features in production.
>>>>>>> 
>>>>>>> Anyway, with or without the above idea, yaml flag names that don't
>>>>>>> use adjectives could address Kurt's concerns about pulling the rug
>> from
>>>>>>> under the feet of existing users. Such a flag is but a small
>>>> improvement
>>>>>>> suitable for a minor release (you must read the NEWS.txt before even
>> a
>>>>>>> patch upgrade), and the documentation is only making explicit what
>>>> should
>>>>>>> have been all along. Users shouldn't feel that we're returning
>> features
>>>>>>> into "alpha|beta" mode when what we're actually doing is improving
>> the
>>>>>>> community's quality assurance documentation.
>>>>>>> 
>>>>>>> Mick
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>>>>>> For additional commands, e-mail: dev-help@cassandra.apache.org
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 
>> 



Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

> So you’d rather continue to lie to users about the stability of the
> feature rather than admitting it was merged in prematurely?

> Much like w/SASI, this is something that's in the code-base that for
> certain use-cases apparently works just fine.

> I don't know of any outstanding issues with the feature,

> There appear to be varying levels of understanding of the implementation
> details of MV's (that seem to directly correlate with faith in the
> feature's correctness for the use-cases recommended)

> We have users in the wild relying on MV's with apparent success (same holds
> true of all the other punching bags that have come up in this thread)

You're right, Jon. That's clearly exactly what I'm saying.


On Wed, Oct 4, 2017 at 2:39 PM, Jon Haddad <jo...@jonhaddad.com> wrote:

> So you’d rather continue to lie to users about the stability of the
> feature rather than admitting it was merged in prematurely?  I’d rather
> come clean and avoid future problems, and give people the opportunity to
> stop using MVs rather than let them keep taking risks they’re unaware of.
> This is incredibly irresponsible in my opinion.
>
> > On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jm...@apache.org> wrote:
> >
> >>
> >> Oh, come on. You're being disingenuous.
> >
> > Not my intent. MV's (and SASI, for example) are fairly well isolated; we
> > have a history of other changes that are much more broadly and higher
> > impact risk-wise across the code-base.
> >
> > If I were an operator and built a critical part of my business on a
> > released feature that developers then decided to default-disable as
> > 'experimental' post-hoc, I'd think long and hard about using any new
> > features in that project in the future (and revisit my confidence in all
> > other features I relied on, and the software as a whole). We have users
> in
> > the wild relying on MV's with apparent success (same holds true of all
> the
> > other punching bags that have come up in this thread) and I'd hate to see
> > us alienate them by being over-aggressive in the way we handle this.
> >
> > I'd much rather we continue to aggressively improve and continue to
> analyze
> > MV's stability before a 4.0 release and then use the experimental flag in
> > the future, if at all possible.
> >
> > On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_@
> belliottsmith.com>
> > wrote:
> >
> >> Can't we promote these behavioural flags to keyspace properties (with
> >> suitable permissions to edit necessary)?
> >>
> >> I agree that enabling/disabling features shouldn't require a rolling
> >> restart, and nor should switching their consistency safety level.
> >>
> >> I think this would be the most suitable equivalent to ALLOW FILTERING
> for
> >> MVs.
> >>
> >>
> >>
> >>> On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com>
> >> wrote:
> >>>
> >>> Not to detract from the discussion about whether or not to classify X
> or
> >> Y as experimental but https://issues.apache.org/
> jira/browse/CASSANDRA-8303
> >> <https://issues.apache.org/jira/browse/CASSANDRA-8303> was originally
> >> about operators preventing users from abusing features (e.g. allow
> >> filtering).  Could that concept be extended to features like MVs or
> SASI or
> >> anything else?  On the one hand it is nice to be able to set those
> things
> >> dynamically without a rolling restart as well as by user.  On the other
> >> it’s less clear about defaults.  There could be a property file or just
> in
> >> the yaml, the operator could specify the default features that are
> enabled
> >> for users and then it could be overridden within that framework.
> >>>
> >>>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com>
> >> wrote:
> >>>>
> >>>> We already have those for UDFs and CDC.
> >>>>
> >>>> We should have more: for triggers, SASI, and MVs, at least. Operators
> >> need a way to disable features they haven’t validated.
> >>>>
> >>>> We already have sufficient consensus to introduce the flags, and we
> >> should. There also seems to be sufficient consensus on emitting
> warnings.
> >>>>
> >>>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
> >> agree with Sylvain that flipping the default in a minor would be
> invasive.
> >> We shouldn’t do that.
> >>>>
> >>>> For trunk, though, I think we should default to off. When it comes to
> >> releasing 4.0 we can collectively decide if there is sufficient trust in
> >> MVs at the time to warrant flipping the default to true. Ultimately we
> can
> >> decide this in a PMC vote. If I misread the consensus regarding the
> default
> >> for 4.0, then we might as well vote on that. What I see is sufficient
> >> distrust coming from core committers, including the author of the v1
> >> design, to warrant opt-in for MVs.
> >>>>
> >>>> If we don’t trust in them as developers, we shouldn’t be cavalier with
> >> the users, either. Not until that trust is gained/regained.
> >>>>
> >>>> —
> >>>> AY
> >>>>
> >>>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org)
> >> wrote:
> >>>>
> >>>> Introducing feature flags for enabling or disabling different code
> paths
> >>>> is not sustainable in the long run. It's hard enough to keep up with
> >>>> integration testing with the couple of Jenkins jobs that we have.
> >>>> Running jobs for all permutations of flags that we keep around, would
> >>>> turn out impractical. But if we don't, I'm pretty sure something will
> >>>> fall off the radar and it won't take long until someone reports that
> >>>> enabling feature X after the latest upgrade will simply not work
> >> anymore.
> >>>>
> >>>> There may also be some more subtle assumptions and cross dependencies
> >>>> between features that may cause side effects by disabling a feature
> (or
> >>>> parts of it), even if it's just e.g. a metric value that suddenly
> won't
> >>>> get updated anymore, but is used somewhere else. We'll also have to
> >>>> consider migration paths for turning a feature on and off again
> without
> >>>> causing any downtime. If I was to turn on e.g. MVs on a single node in
> >>>> my cluster, then this should not cause any issues on the other nodes
> >>>> that still have MV code paths disabled. Again, this would need to be
> >> tested.
> >>>>
> >>>> So to be clear, my point is that any flags should be implemented in a
> >>>> really non-invasive way on the user facing side only, e.g. by
> emitting a
> >>>> log message or cqlsh error. At this point, I'm not really sure if it
> >>>> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
> >>>> that eventually they will be used to change the behaviour of our code,
> >>>> beside printing a log message.
> >>>>
> >>>>
> >>>> On 04.10.17 10:03, Mick Semb Wever wrote:
> >>>>>>> CDC sounds like it is in the same basket, but it already has the
> >>>>>>> `cdc_enabled` yaml flag which defaults false.
> >>>>>> I went this route because I was incredibly wary of changing the CL
> >>>>>> code and wanted to shield non-CDC users from any and all risk I
> >>>>>> reasonably could.
> >>>>>
> >>>>> This approach so far is my favourite. (Thanks Josh.)
> >>>>>
> >>>>> The flag name `cdc_enabled` is simple and, without adjectives, does
> not
> >>>>> imply "experimental" or "beta" or anything like that.
> >>>>> It does make life easier for both operators and the C* developers.
> >>>>>
> >>>>> I'm also fond of how Apache projects often vote both on the release
> as
> >> well
> >>>>> as its stability flag: Alpha|Beta|GA (General Availability).
> >>>>> https://httpd.apache.org/dev/release.html
> >>>>> http://www.apache.org/legal/release-policy.html#release-types
> >>>>>
> >>>>> Given the importance of The Database, i'd be keen to see attached
> such
> >>>>> community-agreed quality references. And going further, not just to
> the
> >>>>> releases but also to substantial new features (those yet to reach
> GA).
> >> Then
> >>>>> the downloads page could provide a table something like
> >>>>> https://paste.apache.org/FzrQ
> >>>>>
> >>>>> It's just one idea to throw out there, and while it hijacks the
> thread
> >> a
> >>>>> bit, it could even with just the quality tag on releases go a long
> way
> >> with
> >>>>> user trust. Especially if we really are humble about it and use GA
> >>>>> appropriately. For example I'm perfectly happy using a beta in
> >> production
> >>>>> if I see the community otherwise has good processes in place and
> >> there's
> >>>>> strong testing and staging resources to take advantage of. And as
> Kurt
> >> has
> >>>>> implied many users are indeed smart and wise enough to know how to
> >> safely
> >>>>> test and cautiously use even alpha features in production.
> >>>>>
> >>>>> Anyway, with or without the above idea, yaml flag names that don't
> >>>>> use adjectives could address Kurt's concerns about pulling the rug
> from
> >>>>> under the feet of existing users. Such a flag is but a small
> >> improvement
> >>>>> suitable for a minor release (you must read the NEWS.txt before even
> a
> >>>>> patch upgrade), and the documentation is only making explicit what
> >> should
> >>>>> have been all along. Users shouldn't feel that we're returning
> features
> >>>>> into "alpha|beta" mode when what we're actually doing is improving
> the
> >>>>> community's quality assurance documentation.
> >>>>>
> >>>>> Mick
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >>
>
>
>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jon Haddad <jo...@jonhaddad.com>.

So you’d rather continue to lie to users about the stability of the feature rather than admitting it was merged in prematurely?  I’d rather come clean and avoid future problems, and give people the opportunity to stop using MVs rather than let them keep taking risks they’re unaware of.  This is incredibly irresponsible in my opinion.  

> On Oct 4, 2017, at 11:26 AM, Josh McKenzie <jm...@apache.org> wrote:
> 
>> 
>> Oh, come on. You're being disingenuous.
> 
> Not my intent. MV's (and SASI, for example) are fairly well isolated; we
> have a history of other changes that are much more broadly and higher
> impact risk-wise across the code-base.
> 
> If I were an operator and built a critical part of my business on a
> released feature that developers then decided to default-disable as
> 'experimental' post-hoc, I'd think long and hard about using any new
> features in that project in the future (and revisit my confidence in all
> other features I relied on, and the software as a whole). We have users in
> the wild relying on MV's with apparent success (same holds true of all the
> other punching bags that have come up in this thread) and I'd hate to see
> us alienate them by being over-aggressive in the way we handle this.
> 
> I'd much rather we continue to aggressively improve and continue to analyze
> MV's stability before a 4.0 release and then use the experimental flag in
> the future, if at all possible.
> 
> On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_...@belliottsmith.com>
> wrote:
> 
>> Can't we promote these behavioural flags to keyspace properties (with
>> suitable permissions to edit necessary)?
>> 
>> I agree that enabling/disabling features shouldn't require a rolling
>> restart, and nor should switching their consistency safety level.
>> 
>> I think this would be the most suitable equivalent to ALLOW FILTERING for
>> MVs.
>> 
>> 
>> 
>>> On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com>
>> wrote:
>>> 
>>> Not to detract from the discussion about whether or not to classify X or
>> Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303
>> <https://issues.apache.org/jira/browse/CASSANDRA-8303> was originally
>> about operators preventing users from abusing features (e.g. allow
>> filtering).  Could that concept be extended to features like MVs or SASI or
>> anything else?  On the one hand it is nice to be able to set those things
>> dynamically without a rolling restart as well as by user.  On the other
>> it’s less clear about defaults.  There could be a property file or just in
>> the yaml, the operator could specify the default features that are enabled
>> for users and then it could be overridden within that framework.
>>> 
>>>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com>
>> wrote:
>>>> 
>>>> We already have those for UDFs and CDC.
>>>> 
>>>> We should have more: for triggers, SASI, and MVs, at least. Operators
>> need a way to disable features they haven’t validated.
>>>> 
>>>> We already have sufficient consensus to introduce the flags, and we
>> should. There also seems to be sufficient consensus on emitting warnings.
>>>> 
>>>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
>> agree with Sylvain that flipping the default in a minor would be invasive.
>> We shouldn’t do that.
>>>> 
>>>> For trunk, though, I think we should default to off. When it comes to
>> releasing 4.0 we can collectively decide if there is sufficient trust in
>> MVs at the time to warrant flipping the default to true. Ultimately we can
>> decide this in a PMC vote. If I misread the consensus regarding the default
>> for 4.0, then we might as well vote on that. What I see is sufficient
>> distrust coming from core committers, including the author of the v1
>> design, to warrant opt-in for MVs.
>>>> 
>>>> If we don’t trust in them as developers, we shouldn’t be cavalier with
>> the users, either. Not until that trust is gained/regained.
>>>> 
>>>> —
>>>> AY
>>>> 
>>>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org)
>> wrote:
>>>> 
>>>> Introducing feature flags for enabling or disabling different code paths
>>>> is not sustainable in the long run. It's hard enough to keep up with
>>>> integration testing with the couple of Jenkins jobs that we have.
>>>> Running jobs for all permutations of flags that we keep around, would
>>>> turn out impractical. But if we don't, I'm pretty sure something will
>>>> fall off the radar and it won't take long until someone reports that
>>>> enabling feature X after the latest upgrade will simply not work
>> anymore.
>>>> 
>>>> There may also be some more subtle assumptions and cross dependencies
>>>> between features that may cause side effects by disabling a feature (or
>>>> parts of it), even if it's just e.g. a metric value that suddenly won't
>>>> get updated anymore, but is used somewhere else. We'll also have to
>>>> consider migration paths for turning a feature on and off again without
>>>> causing any downtime. If I was to turn on e.g. MVs on a single node in
>>>> my cluster, then this should not cause any issues on the other nodes
>>>> that still have MV code paths disabled. Again, this would need to be
>> tested.
>>>> 
>>>> So to be clear, my point is that any flags should be implemented in a
>>>> really non-invasive way on the user facing side only, e.g. by emitting a
>>>> log message or cqlsh error. At this point, I'm not really sure if it
>>>> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
>>>> that eventually they will be used to change the behaviour of our code,
>>>> beside printing a log message.
>>>> 
>>>> 
>>>> On 04.10.17 10:03, Mick Semb Wever wrote:
>>>>>>> CDC sounds like it is in the same basket, but it already has the
>>>>>>> `cdc_enabled` yaml flag which defaults false.
>>>>>> I went this route because I was incredibly wary of changing the CL
>>>>>> code and wanted to shield non-CDC users from any and all risk I
>>>>>> reasonably could.
>>>>> 
>>>>> This approach so far is my favourite. (Thanks Josh.)
>>>>> 
>>>>> The flag name `cdc_enabled` is simple and, without adjectives, does not
>>>>> imply "experimental" or "beta" or anything like that.
>>>>> It does make life easier for both operators and the C* developers.
>>>>> 
>>>>> I'm also fond of how Apache projects often vote both on the release as
>> well
>>>>> as its stability flag: Alpha|Beta|GA (General Availability).
>>>>> https://httpd.apache.org/dev/release.html
>>>>> http://www.apache.org/legal/release-policy.html#release-types
>>>>> 
>>>>> Given the importance of The Database, i'd be keen to see attached such
>>>>> community-agreed quality references. And going further, not just to the
>>>>> releases but also to substantial new features (those yet to reach GA).
>> Then
>>>>> the downloads page could provide a table something like
>>>>> https://paste.apache.org/FzrQ
>>>>> 
>>>>> It's just one idea to throw out there, and while it hijacks the thread
>> a
>>>>> bit, it could even with just the quality tag on releases go a long way
>> with
>>>>> user trust. Especially if we really are humble about it and use GA
>>>>> appropriately. For example I'm perfectly happy using a beta in
>> production
>>>>> if I see the community otherwise has good processes in place and
>> there's
>>>>> strong testing and staging resources to take advantage of. And as Kurt
>> has
>>>>> implied many users are indeed smart and wise enough to know how to
>> safely
>>>>> test and cautiously use even alpha features in production.
>>>>> 
>>>>> Anyway, with or without the above idea, yaml flag names that don't
>>>>> use adjectives could address Kurt's concerns about pulling the rug from
>>>>> under the feet of existing users. Such a flag is but a small
>> improvement
>>>>> suitable for a minor release (you must read the NEWS.txt before even a
>>>>> patch upgrade), and the documentation is only making explicit what
>> should
>>>>> have been all along. Users shouldn't feel that we're returning features
>>>>> into "alpha|beta" mode when what we're actually doing is improving the
>>>>> community's quality assurance documentation.
>>>>> 
>>>>> Mick
>>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> 



Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

Strongly disagree with the “MVs are isolated” part.

You can feel the touch of the MVs in the read path, write path, metadata handling, whether you use them or not. And comparing any of those before/after MVs were introduced makes me sad every time I face any of it. It made our codebase objectively worse.

On 4 October 2017 at 19:26:43, Josh McKenzie (jmckenzie@apache.org) wrote:

MV's (and SASI, for example) are fairly well isolated


Well, if the developers keep pushing untested complex features onto the project, then refuse to admit their mistakes, then as an operator you *should* think long and hard and you *should* revisit your confidence. Or else you are a shitty operator.



On 4 October 2017 at 19:26:43, Josh McKenzie (jmckenzie@apache.org) wrote:

If I were an operator and built a critical part of my business on a 
released feature that developers then decided to default-disable as 
'experimental' post-hoc, I'd think long and hard about using any new 
features in that project in the future (and revisit my confidence in all 
other features I relied on, and the software as a whole).


—

AY

Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

>
> Oh, come on. You're being disingenuous.

Not my intent. MV's (and SASI, for example) are fairly well isolated; we
have a history of other changes that are much broader and higher-impact,
risk-wise, across the code-base.

If I were an operator and built a critical part of my business on a
released feature that developers then decided to default-disable as
'experimental' post-hoc, I'd think long and hard about using any new
features in that project in the future (and revisit my confidence in all
other features I relied on, and the software as a whole). We have users in
the wild relying on MV's with apparent success (same holds true of all the
other punching bags that have come up in this thread) and I'd hate to see
us alienate them by being over-aggressive in the way we handle this.

I'd much rather we continue to aggressively improve and continue to analyze
MV's stability before a 4.0 release and then use the experimental flag in
the future, if at all possible.

On Wed, Oct 4, 2017 at 2:01 PM, Benedict Elliott Smith <_...@belliottsmith.com>
wrote:

> Can't we promote these behavioural flags to keyspace properties (with
> suitable permissions to edit necessary)?
>
> I agree that enabling/disabling features shouldn't require a rolling
> restart, and nor should switching their consistency safety level.
>
> I think this would be the most suitable equivalent to ALLOW FILTERING for
> MVs.
>
>
>
> > On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com> wrote:
> >
> > Not to detract from the discussion about whether or not to classify X or
> > Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303
> > was originally about operators preventing users from abusing features
> > (e.g. allow filtering).  Could that concept be extended to features like
> > MVs or SASI or anything else?  On the one hand it is nice to be able to
> > set those things dynamically without a rolling restart as well as by
> > user.  On the other it’s less clear about defaults.  There could be a
> > property file or just in the yaml, the operator could specify the default
> > features that are enabled for users and then it could be overridden
> > within that framework.
> >
> >> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com> wrote:
> >>
> >> We already have those for UDFs and CDC.
> >>
> >> We should have more: for triggers, SASI, and MVs, at least. Operators
> >> need a way to disable features they haven’t validated.
> >>
> >> We already have sufficient consensus to introduce the flags, and we
> >> should. There also seems to be sufficient consensus on emitting warnings.
> >>
> >> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I
> >> agree with Sylvain that flipping the default in a minor would be
> >> invasive. We shouldn’t do that.
> >>
> >> For trunk, though, I think we should default to off. When it comes to
> >> releasing 4.0 we can collectively decide if there is sufficient trust in
> >> MVs at the time to warrant flipping the default to true. Ultimately we
> >> can decide this in a PMC vote. If I misread the consensus regarding the
> >> default for 4.0, then we might as well vote on that. What I see is
> >> sufficient distrust coming from core committers, including the author of
> >> the v1 design, to warrant opt-in for MVs.
> >>
> >> If we don’t trust in them as developers, we shouldn’t be cavalier with
> >> the users, either. Not until that trust is gained/regained.
> >>
> >> —
> >> AY
> >>
> >> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org) wrote:
> >>
> >> Introducing feature flags for enabling or disabling different code paths
> >> is not sustainable in the long run. It's hard enough to keep up with
> >> integration testing with the couple of Jenkins jobs that we have.
> >> Running jobs for all permutations of flags that we keep around would
> >> turn out impractical. But if we don't, I'm pretty sure something will
> >> fall off the radar and it won't take long until someone reports that
> >> enabling feature X after the latest upgrade will simply not work
> >> anymore.
> >>
> >> There may also be some more subtle assumptions and cross dependencies
> >> between features that may cause side effects by disabling a feature (or
> >> parts of it), even if it's just e.g. a metric value that suddenly won't
> >> get updated anymore, but is used somewhere else. We'll also have to
> >> consider migration paths for turning a feature on and off again without
> >> causing any downtime. If I was to turn on e.g. MVs on a single node in
> >> my cluster, then this should not cause any issues on the other nodes
> >> that still have MV code paths disabled. Again, this would need to be
> >> tested.
> >>
> >> So to be clear, my point is that any flags should be implemented in a
> >> really non-invasive way on the user facing side only, e.g. by emitting a
> >> log message or cqlsh error. At this point, I'm not really sure if it
> >> would be a good idea to add them to cassandra.yaml, as I'm pretty sure
> >> that eventually they will be used to change the behaviour of our code,
> >> beside printing a log message.
> >>
> >>
> >> On 04.10.17 10:03, Mick Semb Wever wrote:
> >>>>> CDC sounds like it is in the same basket, but it already has the
> >>>>> `cdc_enabled` yaml flag which defaults false.
> >>>> I went this route because I was incredibly wary of changing the CL
> >>>> code and wanted to shield non-CDC users from any and all risk I
> >>>> reasonably could.
> >>>
> >>> This approach so far is my favourite. (Thanks Josh.)
> >>>
> >>> The flag name `cdc_enabled` is simple and, without adjectives, does not
> >>> imply "experimental" or "beta" or anything like that.
> >>> It does make life easier for both operators and the C* developers.
> >>>
> >>> I'm also fond of how Apache projects often vote both on the release as
> >>> well as its stability flag: Alpha|Beta|GA (General Availability).
> >>> https://httpd.apache.org/dev/release.html
> >>> http://www.apache.org/legal/release-policy.html#release-types
> >>>
> >>> Given the importance of The Database, I'd be keen to see attached such
> >>> community-agreed quality references. And going further, not just to the
> >>> releases but also to substantial new features (those yet to reach GA).
> >>> Then the downloads page could provide a table something like
> >>> https://paste.apache.org/FzrQ
> >>>
> >>> It's just one idea to throw out there, and while it hijacks the thread
> >>> a bit, it could even with just the quality tag on releases go a long way
> >>> with user trust. Especially if we really are humble about it and use GA
> >>> appropriately. For example I'm perfectly happy using a beta in
> >>> production if I see the community otherwise has good processes in place
> >>> and there's strong testing and staging resources to take advantage of.
> >>> And as Kurt has implied many users are indeed smart and wise enough to
> >>> know how to safely test and cautiously use even alpha features in
> >>> production.
> >>>
> >>> Anyway, with or without the above idea, yaml flag names that don't
> >>> use adjectives could address Kurt's concerns about pulling the rug from
> >>> under the feet of existing users. Such a flag is but a small
> >>> improvement suitable for a minor release (you must read the NEWS.txt
> >>> before even a patch upgrade), and the documentation is only making
> >>> explicit what should have been all along. Users shouldn't feel that
> >>> we're returning features into "alpha|beta" mode when what we're
> >>> actually doing is improving the community's quality assurance
> >>> documentation.
> >>>
> >>> Mick
> >>>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> >> For additional commands, e-mail: dev-help@cassandra.apache.org
> >>
> >
>
>
>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Benedict Elliott Smith <_...@belliottsmith.com>.

Can't we promote these behavioural flags to keyspace properties (with suitable permissions required to edit them)?

I agree that enabling/disabling features shouldn't require a rolling restart, and nor should switching their consistency safety level.

I think this would be the most suitable equivalent to ALLOW FILTERING for MVs.
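
As a rough illustration of the keyspace-property idea above, here is a minimal Python sketch. Everything in it is hypothetical: the property name `materialized_views_enabled`, the `ALTER_FEATURES` permission, and the helper functions are invented for the example and are not Cassandra APIs. It only shows the shape of the proposal: the flag lives on the keyspace, editing it requires a permission, and a change takes effect without any restart.

```python
class PermissionDenied(Exception):
    """Raised when a role lacks the (hypothetical) permission to edit flags."""


class Keyspace:
    def __init__(self, name, properties=None):
        self.name = name
        # Per-keyspace behavioural flags; empty means "use the cluster default".
        self.properties = dict(properties or {})


def alter_keyspace_property(keyspace, role_permissions, key, value):
    # Hypothetical permission name; a real system would use its own model.
    if "ALTER_FEATURES" not in role_permissions:
        raise PermissionDenied(f"role may not edit {key!r} on {keyspace.name}")
    keyspace.properties[key] = value  # takes effect immediately, no restart


def mv_allowed(keyspace, cluster_default=False):
    # The keyspace property wins; otherwise fall back to the cluster default.
    return keyspace.properties.get("materialized_views_enabled", cluster_default)


ks = Keyspace("orders")
print(mv_allowed(ks))  # False: falls back to the cluster default
alter_keyspace_property(ks, {"ALTER_FEATURES"}, "materialized_views_enabled", True)
print(mv_allowed(ks))  # True: changed at runtime
```

The fallback to a cluster-wide default is one possible design choice; the thread leaves open where that default would come from.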



> On 4 Oct 2017, at 12:31, Jeremy Hanna <je...@gmail.com> wrote:
> 
> Not to detract from the discussion about whether or not to classify X or Y as experimental but https://issues.apache.org/jira/browse/CASSANDRA-8303 was originally about operators preventing users from abusing features (e.g. allow filtering).  Could that concept be extended to features like MVs or SASI or anything else?  On the one hand it is nice to be able to set those things dynamically without a rolling restart as well as by user.  On the other it’s less clear about defaults.  There could be a property file or just in the yaml, the operator could specify the default features that are enabled for users and then it could be overridden within that framework.
> 
>> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com> wrote:
>> 
>> We already have those for UDFs and CDC.
>> 
>> We should have more: for triggers, SASI, and MVs, at least. Operators need a way to disable features they haven’t validated.
>> 
>> We already have sufficient consensus to introduce the flags, and we should. There also seems to be sufficient consensus on emitting warnings.
>> 
>> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree with Sylvain that flipping the default in a minor would be invasive. We shouldn’t do that.
>> 
>> For trunk, though, I think we should default to off. When it comes to releasing 4.0 we can collectively decide if there is sufficient trust in MVs at the time to warrant flipping the default to true. Ultimately we can decide this in a PMC vote. If I misread the consensus regarding the default for 4.0, then we might as well vote on that. What I see is sufficient distrust coming from core committers, including the author of the v1 design, to warrant opt-in for MVs.
>> 
>> If we don’t trust in them as developers, we shouldn’t be cavalier with the users, either. Not until that trust is gained/regained.
>> 
>> —
>> AY
>> 
>> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org) wrote:
>> 
>> Introducing feature flags for enabling or disabling different code paths  
>> is not sustainable in the long run. It's hard enough to keep up with  
>> integration testing with the couple of Jenkins jobs that we have.  
>> Running jobs for all permutations of flags that we keep around, would  
>> turn out impractical. But if we don't, I'm pretty sure something will  
>> fall off the radar and it won't take long until someone reports that  
>> enabling feature X after the latest upgrade will simply not work anymore.  
>> 
>> There may also be some more subtle assumptions and cross dependencies  
>> between features that may cause side effects by disabling a feature (or  
>> parts of it), even if it's just e.g. a metric value that suddenly won't  
>> get updated anymore, but is used somewhere else. We'll also have to  
>> consider migration paths for turning a feature on and off again without  
>> causing any downtime. If I was to turn on e.g. MVs on a single node in  
>> my cluster, then this should not cause any issues on the other nodes  
>> that still have MV code paths disabled. Again, this would need to be tested.  
>> 
>> So to be clear, my point is that any flags should be implemented in a  
>> really non-invasive way on the user facing side only, e.g. by emitting a  
>> log message or cqlsh error. At this point, I'm not really sure if it  
>> would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
>> that eventually they will be used to change the behaviour of our code,  
>> beside printing a log message.  
>> 
>> 
>> On 04.10.17 10:03, Mick Semb Wever wrote:  
>>>>> CDC sounds like it is in the same basket, but it already has the  
>>>>> `cdc_enabled` yaml flag which defaults false.  
>>>> I went this route because I was incredibly wary of changing the CL  
>>>> code and wanted to shield non-CDC users from any and all risk I  
>>>> reasonably could.  
>>> 
>>> This approach so far is my favourite. (Thanks Josh.)  
>>> 
>>> The flag name `cdc_enabled` is simple and, without adjectives, does not  
>>> imply "experimental" or "beta" or anything like that.  
>>> It does make life easier for both operators and the C* developers.  
>>> 
>>> I'm also fond of how Apache projects often vote both on the release as well  
>>> as its stability flag: Alpha|Beta|GA (General Availability).  
>>> https://httpd.apache.org/dev/release.html  
>>> http://www.apache.org/legal/release-policy.html#release-types  
>>> 
>>> Given the importance of The Database, i'd be keen to see attached such  
>>> community-agreed quality references. And going further, not just to the  
>>> releases but also to substantial new features (those yet to reach GA). Then  
>>> the downloads page could provide a table something like  
>>> https://paste.apache.org/FzrQ  
>>> 
>>> It's just one idea to throw out there, and while it hijacks the thread a  
>>> bit, it could even with just the quality tag on releases go a long way with  
>>> user trust. Especially if we really are humble about it and use GA  
>>> appropriately. For example I'm perfectly happy using a beta in production  
>>> if I see the community otherwise has good processes in place and there's  
>>> strong testing and staging resources to take advantage of. And as Kurt has  
>>> implied many users are indeed smart and wise enough to know how to safely  
>>> test and cautiously use even alpha features in production.  
>>> 
>>> Anyway, with or without the above idea, yaml flag names that don't  
>>> use adjectives could address Kurt's concerns about pulling the rug from  
>>> under the feet of existing users. Such a flag is but a small improvement  
>>> suitable for a minor release (you must read the NEWS.txt before even a  
>>> patch upgrade), and the documentation is only making explicit what should  
>>> have been all along. Users shouldn't feel that we're returning features  
>>> into "alpha|beta" mode when what we're actually doing is improving the  
>>> community's quality assurance documentation.  
>>> 
>>> Mick  
>>> 
>> 
>> 
>> 
> 



Re: Proposal to retroactively mark materialized views experimental

Posted by Jeremy Hanna <je...@gmail.com>.

Not to detract from the discussion about whether or not to classify X or Y as experimental, but https://issues.apache.org/jira/browse/CASSANDRA-8303 was originally about operators preventing users from abusing features (e.g. allow filtering).  Could that concept be extended to features like MVs or SASI or anything else?  On the one hand, it is nice to be able to set those things dynamically, without a rolling restart, and per user.  On the other hand, it’s less clear how defaults would work.  They could live in a property file or just in the yaml: the operator could specify the default features that are enabled for users, and those defaults could then be overridden within that framework.
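
To make the "operator defaults, overridden per user" idea concrete, here is a small Python sketch. It is an assumption-laden illustration, not how CASSANDRA-8303 is or would be implemented: the `FeaturePolicy` class, its method names, and the feature keys are all made up for the example. The defaults dictionary stands in for whatever the operator would put in a property file or the yaml.

```python
from dataclasses import dataclass, field


@dataclass
class FeaturePolicy:
    # Operator-set defaults, e.g. as they might be loaded from a yaml file.
    defaults: dict
    # Runtime overrides keyed by (user, feature); no restart needed to change.
    overrides: dict = field(default_factory=dict)

    def set_override(self, user, feature, enabled):
        if feature not in self.defaults:
            raise KeyError(f"unknown feature: {feature}")
        self.overrides[(user, feature)] = enabled

    def is_enabled(self, user, feature):
        # A per-user override wins; otherwise the operator default applies.
        return self.overrides.get((user, feature), self.defaults[feature])


policy = FeaturePolicy(defaults={"materialized_views": False, "sasi": False})
policy.set_override("alice", "materialized_views", True)
print(policy.is_enabled("alice", "materialized_views"))  # True
print(policy.is_enabled("bob", "materialized_views"))    # False
```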

> On Oct 4, 2017, at 10:24 AM, Aleksey Yeshchenko <al...@apple.com> wrote:
> 
> We already have those for UDFs and CDC.
> 
> We should have more: for triggers, SASI, and MVs, at least. Operators need a way to disable features they haven’t validated.
> 
> We already have sufficient consensus to introduce the flags, and we should. There also seems to be sufficient consensus on emitting warnings.
> 
> The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree with Sylvain that flipping the default in a minor would be invasive. We shouldn’t do that.
> 
> For trunk, though, I think we should default to off. When it comes to releasing 4.0 we can collectively decide if there is sufficient trust in MVs at the time to warrant flipping the default to true. Ultimately we can decide this in a PMC vote. If I misread the consensus regarding the default for 4.0, then we might as well vote on that. What I see is sufficient distrust coming from core committers, including the author of the v1 design, to warrant opt-in for MVs.
> 
> If we don’t trust in them as developers, we shouldn’t be cavalier with the users, either. Not until that trust is gained/regained.
> 
> —
> AY
> 
> On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org) wrote:
> 
> Introducing feature flags for enabling or disabling different code paths  
> is not sustainable in the long run. It's hard enough to keep up with  
> integration testing with the couple of Jenkins jobs that we have.  
> Running jobs for all permutations of flags that we keep around, would  
> turn out impractical. But if we don't, I'm pretty sure something will  
> fall off the radar and it won't take long until someone reports that  
> enabling feature X after the latest upgrade will simply not work anymore.  
> 
> There may also be some more subtle assumptions and cross dependencies  
> between features that may cause side effects by disabling a feature (or  
> parts of it), even if it's just e.g. a metric value that suddenly won't  
> get updated anymore, but is used somewhere else. We'll also have to  
> consider migration paths for turning a feature on and off again without  
> causing any downtime. If I was to turn on e.g. MVs on a single node in  
> my cluster, then this should not cause any issues on the other nodes  
> that still have MV code paths disabled. Again, this would need to be tested.  
> 
> So to be clear, my point is that any flags should be implemented in a  
> really non-invasive way on the user facing side only, e.g. by emitting a  
> log message or cqlsh error. At this point, I'm not really sure if it  
> would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
> that eventually they will be used to change the behaviour of our code,  
> beside printing a log message.  
> 
> 
> On 04.10.17 10:03, Mick Semb Wever wrote:  
>>>> CDC sounds like it is in the same basket, but it already has the  
>>>> `cdc_enabled` yaml flag which defaults false.  
>>> I went this route because I was incredibly wary of changing the CL  
>>> code and wanted to shield non-CDC users from any and all risk I  
>>> reasonably could.  
>> 
>> This approach so far is my favourite. (Thanks Josh.)  
>> 
>> The flag name `cdc_enabled` is simple and, without adjectives, does not  
>> imply "experimental" or "beta" or anything like that.  
>> It does make life easier for both operators and the C* developers.  
>> 
>> I'm also fond of how Apache projects often vote both on the release as well  
>> as its stability flag: Alpha|Beta|GA (General Availability).  
>> https://httpd.apache.org/dev/release.html  
>> http://www.apache.org/legal/release-policy.html#release-types  
>> 
>> Given the importance of The Database, i'd be keen to see attached such  
>> community-agreed quality references. And going further, not just to the  
>> releases but also to substantial new features (those yet to reach GA). Then  
>> the downloads page could provide a table something like  
>> https://paste.apache.org/FzrQ  
>> 
>> It's just one idea to throw out there, and while it hijacks the thread a  
>> bit, it could even with just the quality tag on releases go a long way with  
>> user trust. Especially if we really are humble about it and use GA  
>> appropriately. For example I'm perfectly happy using a beta in production  
>> if I see the community otherwise has good processes in place and there's  
>> strong testing and staging resources to take advantage of. And as Kurt has  
>> implied many users are indeed smart and wise enough to know how to safely  
>> test and cautiously use even alpha features in production.  
>> 
>> Anyway, with or without the above idea, yaml flag names that don't  
>> use adjectives could address Kurt's concerns about pulling the rug from  
>> under the feet of existing users. Such a flag is but a small improvement  
>> suitable for a minor release (you must read the NEWS.txt before even a  
>> patch upgrade), and the documentation is only making explicit what should  
>> have been all along. Users shouldn't feel that we're returning features  
>> into "alpha|beta" mode when what we're actually doing is improving the  
>> community's quality assurance documentation.  
>> 
>> Mick  
>> 
> 
> 
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

We already have those for UDFs and CDC.

We should have more: for triggers, SASI, and MVs, at least. Operators need a way to disable features they haven’t validated.

We already have sufficient consensus to introduce the flags, and we should. There also seems to be sufficient consensus on emitting warnings.

The debate is now on their defaults for MVs in 3.0, 3.11, and 4.0. I agree with Sylvain that flipping the default in a minor would be invasive. We shouldn’t do that.

For trunk, though, I think we should default to off. When it comes to releasing 4.0 we can collectively decide if there is sufficient trust in MVs at the time to warrant flipping the default to true. Ultimately we can decide this in a PMC vote. If I misread the consensus regarding the default for 4.0, then we might as well vote on that. What I see is sufficient distrust coming from core committers, including the author of the v1 design, to warrant opt-in for MVs.

If we don’t trust in them as developers, we shouldn’t be cavalier with the users, either. Not until that trust is gained/regained.
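
A minimal sketch of the opt-in behaviour described above, written in Python rather than Cassandra's Java and using invented names throughout (`materialized_views_enabled`, `create_view`, `FeatureDisabledError` are assumptions for illustration only). It shows one way the pieces could fit: the flag defaults to off, only the creation of new views is gated, existing views are left alone, and the warning is emitted regardless of the flag.

```python
import logging

log = logging.getLogger("feature_flags")

# Hypothetical operator-facing flags; not real cassandra.yaml keys.
DEFAULT_FLAGS = {"materialized_views_enabled": False}

MV_WARNING = "Materialized views are experimental; validate before production use."


class FeatureDisabledError(Exception):
    """Raised when a gated feature is used without the operator opting in."""


def create_view(name, existing_views, flags=DEFAULT_FLAGS):
    # The warning is emitted unconditionally, whether or not creation succeeds.
    log.warning(MV_WARNING)
    # Only creation is gated; views already in existing_views keep working.
    if not flags.get("materialized_views_enabled", False):
        raise FeatureDisabledError("materialized views are disabled by default")
    existing_views.add(name)
    return [MV_WARNING]  # warnings that would also be returned to the client
```

With the default flags, `create_view("new_view", {"old_view"})` raises `FeatureDisabledError` and leaves `old_view` in place; passing `{"materialized_views_enabled": True}` lets the creation through.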

—
AY

On 4 October 2017 at 13:26:10, Stefan Podkowinski (spod@apache.org) wrote:

Introducing feature flags for enabling or disabling different code paths  
is not sustainable in the long run. It's hard enough to keep up with  
integration testing with the couple of Jenkins jobs that we have.  
Running jobs for all permutations of flags that we keep around, would  
turn out impractical. But if we don't, I'm pretty sure something will  
fall off the radar and it won't take long until someone reports that  
enabling feature X after the latest upgrade will simply not work anymore.  

There may also be some more subtle assumptions and cross dependencies  
between features that may cause side effects by disabling a feature (or  
parts of it), even if it's just e.g. a metric value that suddenly won't  
get updated anymore, but is used somewhere else. We'll also have to  
consider migration paths for turning a feature on and off again without  
causing any downtime. If I was to turn on e.g. MVs on a single node in  
my cluster, then this should not cause any issues on the other nodes  
that still have MV code paths disabled. Again, this would need to be tested.  

So to be clear, my point is that any flags should be implemented in a  
really non-invasive way on the user facing side only, e.g. by emitting a  
log message or cqlsh error. At this point, I'm not really sure if it  
would be a good idea to add them to cassandra.yaml, as I'm pretty sure  
that eventually they will be used to change the behaviour of our code,  
beside printing a log message.  

On 04.10.17 10:03, Mick Semb Wever wrote:  
>>> CDC sounds like it is in the same basket, but it already has the  
>>> `cdc_enabled` yaml flag which defaults false.  
>> I went this route because I was incredibly wary of changing the CL  
>> code and wanted to shield non-CDC users from any and all risk I  
>> reasonably could.  
>  
> This approach so far is my favourite. (Thanks Josh.)  
>  
> The flag name `cdc_enabled` is simple and, without adjectives, does not  
> imply "experimental" or "beta" or anything like that.  
> It does make life easier for both operators and the C* developers.  
>  
> I'm also fond of how Apache projects often vote both on the release as well  
> as its stability flag: Alpha|Beta|GA (General Availability).  
> https://httpd.apache.org/dev/release.html  
> http://www.apache.org/legal/release-policy.html#release-types  
>  
> Given the importance of The Database, i'd be keen to see attached such  
> community-agreed quality references. And going further, not just to the  
> releases but also to substantial new features (those yet to reach GA). Then  
> the downloads page could provide a table something like  
> https://paste.apache.org/FzrQ  
>  
> It's just one idea to throw out there, and while it hijacks the thread a  
> bit, it could even with just the quality tag on releases go a long way with  
> user trust. Especially if we really are humble about it and use GA  
> appropriately. For example I'm perfectly happy using a beta in production  
> if I see the community otherwise has good processes in place and there's  
> strong testing and staging resources to take advantage of. And as Kurt has  
> implied many users are indeed smart and wise enough to know how to safely  
> test and cautiously use even alpha features in production.  
>  
> Anyway, with or without the above idea, yaml flag names that don't  
> use adjectives could address Kurt's concerns about pulling the rug from  
> under the feet of existing users. Such a flag is but a small improvement  
> suitable for a minor release (you must read the NEWS.txt before even a  
> patch upgrade), and the documentation is only making explicit what should  
> have been all along. Users shouldn't feel that we're returning features  
> into "alpha|beta" mode when what we're actually doing is improving the  
> community's quality assurance documentation.  
>  
> Mick  
>  


Re: Proposal to retroactively mark materialized views experimental

Posted by Stefan Podkowinski <sp...@apache.org>.

Introducing feature flags for enabling or disabling different code paths
is not sustainable in the long run. It's hard enough to keep up with
integration testing with the couple of Jenkins jobs that we have.
Running jobs for all permutations of flags that we keep around would
turn out impractical. But if we don't, I'm pretty sure something will
fall off the radar and it won't take long until someone reports that
enabling feature X after the latest upgrade will simply not work anymore.
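
To put a number on the permutation concern: with n independent boolean flags there are 2^n configurations to test. The Python sketch below enumerates them for the five features named elsewhere in this thread (UDFs, CDC, triggers, SASI, MVs); the flag names themselves are illustrative, not real configuration keys.

```python
from itertools import product


def flag_permutations(flag_names):
    """Return every on/off combination of the given boolean flags."""
    return [
        dict(zip(flag_names, values))
        for values in product([False, True], repeat=len(flag_names))
    ]


flags = ["udf", "cdc", "triggers", "sasi", "mv"]
combos = flag_permutations(flags)
print(len(combos))  # 32, i.e. 2**5 configurations for only five flags
```

Each additional flag doubles the count, which is why running a CI job per combination quickly becomes impractical.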

There may also be some more subtle assumptions and cross dependencies
between features that may cause side effects by disabling a feature (or
parts of it), even if it's just e.g. a metric value that suddenly won't
get updated anymore, but is used somewhere else. We'll also have to
consider migration paths for turning a feature on and off again without
causing any downtime. If I was to turn on e.g. MVs on a single node in
my cluster, then this should not cause any issues on the other nodes
that still have MV code paths disabled. Again, this would need to be tested.

So to be clear, my point is that any flags should be implemented in a
really non-invasive way on the user facing side only, e.g. by emitting a
log message or cqlsh error. At this point, I'm not really sure if it
would be a good idea to add them to cassandra.yaml, as I'm pretty sure
that eventually they will be used to change the behaviour of our code,
besides printing a log message.

On 04.10.17 10:03, Mick Semb Wever wrote:
>>> CDC sounds like it is in the same basket, but it already has the
>>> `cdc_enabled` yaml flag which defaults false.
>> I went this route because I was incredibly wary of changing the CL
>> code and wanted to shield non-CDC users from any and all risk I
>> reasonably could.
>
> This approach so far is my favourite. (Thanks Josh.)
>
> The flag name `cdc_enabled` is simple and, without adjectives, does not
> imply "experimental" or "beta" or anything like that.
> It does make life easier for both operators and the C* developers.
>
> I'm also fond of how Apache projects often vote both on the release as well
> as its stability flag: Alpha|Beta|GA (General Availability).
>     https://httpd.apache.org/dev/release.html
>     http://www.apache.org/legal/release-policy.html#release-types
>
> Given the importance of The Database, i'd be keen to see attached such
> community-agreed quality references. And going further, not just to the
> releases but also to substantial new features (those yet to reach GA). Then
> the downloads page could provide a table something like
> https://paste.apache.org/FzrQ
>
> It's just one idea to throw out there, and while it hijacks the thread a
> bit, it could even with just the quality tag on releases go a long way with
> user trust. Especially if we really are humble about it and use GA
> appropriately. For example I'm perfectly happy using a beta in production
> if I see the community otherwise has good processes in place and there's
> strong testing and staging resources to take advantage of. And as Kurt has
> implied many users are indeed smart and wise enough to know how to safely
> test and cautiously use even alpha features in production.
>
> Anyway, with or without the above idea, yaml flag names that don't
> use adjectives could address Kurt's concerns about pulling the rug from
> under the feet of existing users. Such a flag is but a small improvement
> suitable for a minor release (you must read the NEWS.txt before even a
> patch upgrade), and the documentation is only making explicit what should
> have been all along. Users shouldn't feel that we're returning features
> into "alpha|beta" mode when what we're actually doing is improving the
> community's quality assurance documentation.
>
> Mick
>


Re: Proposal to retroactively mark materialized views experimental

Posted by Mick Semb Wever <mi...@thelastpickle.com>.

> > CDC sounds like it is in the same basket, but it already has the
> > `cdc_enabled` yaml flag which defaults false.
>
> I went this route because I was incredibly wary of changing the CL
> code and wanted to shield non-CDC users from any and all risk I
> reasonably could.


This approach so far is my favourite. (Thanks Josh.)

The flag name `cdc_enabled` is simple and, without adjectives, does not
imply "experimental" or "beta" or anything like that.
It does make life easier for both operators and the C* developers.

I'm also fond of how Apache projects often vote on both the release and
its stability flag: Alpha|Beta|GA (General Availability).
    https://httpd.apache.org/dev/release.html
    http://www.apache.org/legal/release-policy.html#release-types

Given the importance of The Database, I'd be keen to see such
community-agreed quality references attached. And going further, not just to
the releases but also to substantial new features (those yet to reach GA). Then
the downloads page could provide a table something like
https://paste.apache.org/FzrQ

It's just one idea to throw out there, and while it hijacks the thread a
bit, even just the quality tag on releases could go a long way toward
user trust. Especially if we really are humble about it and use GA
appropriately. For example I'm perfectly happy using a beta in production
if I see the community otherwise has good processes in place and there's
strong testing and staging resources to take advantage of. And as Kurt has
implied many users are indeed smart and wise enough to know how to safely
test and cautiously use even alpha features in production.

Anyway, with or without the above idea, yaml flag names that don't
use adjectives could address Kurt's concerns about pulling the rug from
under the feet of existing users. Such a flag is but a small improvement
suitable for a minor release (you must read the NEWS.txt before even a
patch upgrade), and the documentation is only making explicit what should
have been explicit all along. Users shouldn't feel that we're returning
features to "alpha|beta" mode when what we're actually doing is improving the
community's quality assurance documentation.

Mick

Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

> CDC sounds like it is in the same basket, but it already has the
> `cdc_enabled` yaml flag which defaults false.

I went this route because I was incredibly wary of changing the CL
code and wanted to shield non-CDC users from any and all risk I
reasonably could. I don't know of any outstanding issues with the
feature, and this calls into question how we distinguish between 'new
feature, we consider this stable (scope is constrained, testing
coverage, etc)', and 'new feature, this is admittedly experimental',
as well as determining how long something remains 'experimental' and
at what point we remove that .yaml gate-keeping and warning.

For instance, SASI hasn't seen that much development in a long while,
so at what point do we address Marcus' question of 'when do we
consider an experimental feature atrophied and remove it'? What does
this mean for users that took the plunge and are using these features
in production if they're stable for their use-case?

We're also going to introduce complexity and risk into the code-base
w/experimental features we later pull out. With as much static state
as we have in this project (and a lot of the design precedent in the
code-base, inheritance coupling, etc), you can't exactly add a
completely isolated, cleanly abstracted feature into the code-base and
remove it risk-free later.

On Mon, Oct 2, 2017 at 7:16 PM, Mick Semb Wever <mi...@thelastpickle.com> wrote:
> On 3 October 2017 at 04:57, Aleksey Yeshchenko <al...@apple.com> wrote:
>
>> The idea is to check the flag in CreateViewStatement, so creation of new
>> MVs doesn’t succeed without that flag flipped.
>> Obviously, just disabling existing MVs working in a minor would be silly.
>> As for the warning - yes, that should also be emitted. Unconditionally.
>>
>
>
> Thanks Aleksey, this was the best read in this thread imo of what should be
> done with MVs.
> (With these warnings being emitted in both logs and cqlsh).
>
> Hopefully similar "creation flags" and log+cqlsh warnings can be added to:
> triggers, SASI, and incremental repair (<4.0).
>
> CDC sounds like it is in the same basket, but it already has the
> `cdc_enabled` yaml flag which defaults false.
>
> Mick
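
The "creation flag" idea quoted above (refuse to create new MVs unless a flag is flipped, leave existing MVs alone, and always emit a warning) can be sketched roughly as follows. This is only an illustrative model in Python, not Cassandra code: the flag name `enable_materialized_views`, the `create_view` helper, the `FeatureDisabledError` type and the warning text are all assumptions made for the example.

```python
import warnings


class FeatureDisabledError(Exception):
    """Raised when a statement needs a feature whose flag is off."""


def create_view(config, views, name):
    """Gate creation only; views already present in `views` are never touched."""
    # The warning is emitted unconditionally, whether or not the flag is set.
    warnings.warn("Materialized views are experimental", UserWarning)
    # `enable_materialized_views` is a hypothetical flag name for this sketch.
    if not config.get("enable_materialized_views", False):
        raise FeatureDisabledError("materialized views are disabled in the config")
    views.add(name)
    return name
```

Because the check sits only on the creation path, a node upgraded with the flag left at its default keeps serving the views it already has; only new CREATE statements are refused.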

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: Proposal to retroactively mark materialized views experimental

Posted by Jake Luciani <ja...@gmail.com>.

>
> the default behavioural unsafety we opt for by not writing through the
> batch log


We always write to the local batchlog (unless the MV replica is the local
node)...  Most of the bugs have been around tombstones and ttls AFAIK.

On Tue, Oct 3, 2017 at 3:23 PM, Benedict Elliott Smith <_...@belliottsmith.com>
wrote:

> This link is a helpful segue to another problem with MVs and defaults -
> the default behavioural unsafety we opt for by not writing through the
> batch log, opening far more windows for data inconsistency than the
> algorithm otherwise permits.  Without a way to detect or repair these
> inconsistencies this seems cavalier as a default, and it's even more
> pressing to change in my opinion, however we resolve the experimental
> status (I am in favour of marking them experimental also, ftr)
>
> As the originator of the broad-strokes algorithm, I'd more generally like
> to offer my 2c that we do not sufficiently understand the algorithm's
> qualities to recommend it in production, or perhaps at all.  We had agreed
> over IRC that we would model/simulate the algorithm under a multiplicity of
> cluster scenarios (by which I mean something more complete than dtests)
> before we released, but this unfortunately never materialised.
>
> The documentation at present also doesn't highlight the problems we *know*
> can occur, such as with availability-loss, and the fuzzy-application of
> consistency levels with MVs.
>
> Certainly it may be said that it was harder to achieve this functionality
> by application maintainers, but I do think we have a duty of care to fully
> understand and explain the new tools we provide, as it may be that we offer
> fewer guarantees than many users might be able to readily achieve, and they
> don't know this.  Even for those users we can offer better guarantees, it
> is in my opinion fundamentally a problem to offer a tool we do not fully
> understand or fully explain/caveat the behaviour of.
>
>
> On 3 Oct 2017, at 15:02, Jake Luciani <ja...@gmail.com> wrote:
>
> >>
> >> The remaining issues are:
> >>
> >> * There's no way to determine if a view is out of sync with the base
> table.
> >> * If you do determine that a view is out of sync, the only way to fix it
> >> is to drop and rebuild the view.
> >> * There are liveness issues with updates being reflected in the view.
> >>
> >
> > I just want to mention that manual de-normalization has all the same
> issues
> > as the list of above.  If you write to multiple tables with batch logs
> when
> > do you know the data is consistent?
> > In fact, manual de-normalization is worse because you can't manually
> handle
> > updates to existing data due to the lack of synchronization on read
> before
> > write.
> >
> > I think a lot of you have lost sight on what MV was intended for, as a
> way
> > to keep developers from manually maintaining a consistent view of data
> > across tables.
> > There is still the fundamental problem of managing multiple views of data
> > even if you remove the MV feature, you just make it someone else's
> problem.
> >
> > I'll re-post this blog from back when MVs first came out to hopefully
> clear
> > questions up on the goals of MV.
> >
> > https://www.datastax.com/dev/blog/understanding-materialized-views
> >
> > -Jake
> >
> >
> > On Tue, Oct 3, 2017 at 2:50 PM, Aleksey Yeshchenko <al...@apple.com>
> > wrote:
> >
> >> Indeed. Paulo and Zhao did a lot of good work to make the situation less
> >> bad. You did some as well. Even I retouched small parts of it - metadata
> >> related. I’m sorry if I came off as disrespectful - I didn’t mean to.
> I’ve
> >> seen and I appreciate every commit that went into it.
> >>
> >> It is however my opinion that we started at a very low point, for a
> >> variety of reasons, and climbing out of that initial poor state, to the
> >> level that power users start having trust in MVs and overcome the
> initial
> >> deservedly poor impression, will probably take even more work. And
> when/if
> >> we get there, maybe we won’t need the switch anymore.
> >>
> >> —
> >> AY
> >>
> >> On 3 October 2017 at 17:00:31, Sylvain Lebresne (sylvain@datastax.com)
> >> wrote:
> >>
> >> You're giving little credit to the hard work that people have put into
> >> getting MV in a usable state.
> >>
> >
> >
> >
> > --
> > http://twitter.com/tjake
>
>
>
>


-- 
http://twitter.com/tjake

Re: Proposal to retroactively mark materialized views experimental

Posted by Benedict Elliott Smith <_...@belliottsmith.com>.

This link is a helpful segue to another problem with MVs and defaults - the default behavioural unsafety we opt for by not writing through the batch log, opening far more windows for data inconsistency than the algorithm otherwise permits.  Without a way to detect or repair these inconsistencies this seems cavalier as a default, and it's even more pressing to change in my opinion, however we resolve the experimental status (I am in favour of marking them experimental also, ftr)

As the originator of the broad-strokes algorithm, I'd more generally like to offer my 2c that we do not sufficiently understand the algorithm's qualities to recommend it in production, or perhaps at all.  We had agreed over IRC that we would model/simulate the algorithm under a multiplicity of cluster scenarios (by which I mean something more complete than dtests) before we released, but this unfortunately never materialised.  

The documentation at present also doesn't highlight the problems we *know* can occur, such as with availability-loss, and the fuzzy-application of consistency levels with MVs.

Certainly it may be said that it was harder for application maintainers to achieve this functionality, but I do think we have a duty of care to fully understand and explain the new tools we provide, as it may be that we offer fewer guarantees than many users might be able to readily achieve, and they don't know this.  Even for those users to whom we can offer better guarantees, it is in my opinion fundamentally a problem to offer a tool we do not fully understand or fully explain/caveat the behaviour of.

On 3 Oct 2017, at 15:02, Jake Luciani <ja...@gmail.com> wrote:

>> 
>> The remaining issues are:
>> 
>> * There's no way to determine if a view is out of sync with the base table.
>> * If you do determine that a view is out of sync, the only way to fix it
>> is to drop and rebuild the view.
>> * There are liveness issues with updates being reflected in the view.
>> 
> 
> I just want to mention that manual de-normalization has all the same issues
> as the list of above.  If you write to multiple tables with batch logs when
> do you know the data is consistent?
> In fact, manual de-normalization is worse because you can't manually handle
> updates to existing data due to the lack of synchronization on read before
> write.
> 
> I think a lot of you have lost sight on what MV was intended for, as a way
> to keep developers from manually maintaining a consistent view of data
> across tables.
> There is still the fundamental problem of managing multiple views of data
> even if you remove the MV feature, you just make it someone else's problem.
> 
> I'll re-post this blog from back when MVs first came out to hopefully clear
> questions up on the goals of MV.
> 
> https://www.datastax.com/dev/blog/understanding-materialized-views
> 
> -Jake
> 
> 
> On Tue, Oct 3, 2017 at 2:50 PM, Aleksey Yeshchenko <al...@apple.com>
> wrote:
> 
>> Indeed. Paulo and Zhao did a lot of good work to make the situation less
>> bad. You did some as well. Even I retouched small parts of it - metadata
>> related. I’m sorry if I came off as disrespectful - I didn’t mean to. I’ve
>> seen and I appreciate every commit that went into it.
>> 
>> It is however my opinion that we started at a very low point, for a
>> variety of reasons, and climbing out of that initial poor state, to the
>> level that power users start having trust in MVs and overcome the initial
>> deservedly poor impression, will probably take even more work. And when/if
>> we get there, maybe we won’t need the switch anymore.
>> 
>> —
>> AY
>> 
>> On 3 October 2017 at 17:00:31, Sylvain Lebresne (sylvain@datastax.com)
>> wrote:
>> 
>> You're giving little credit to the hard work that people have put into
>> getting MV in a usable state.
>> 
> 
> 
> 
> -- 
> http://twitter.com/tjake


Re: Proposal to retroactively mark materialized views experimental

Posted by Sylvain Lebresne <sy...@datastax.com>.

For the record, in case I was unclear, it was never my intention to
suggest that we shouldn't warn about MVs: I would agree that we still
should and I'm happy that we do. I would also agree that the remaining
caveats and limitations should be more clearly documented.

But, I kind of got the feeling that people were trying to justify
taking what I consider somewhat drastic measures (disabling MVs by
default _in a patch release_) by piling on about how bad MVs were and how
impossible it was for anyone ever to use them without dying a
horrible death. This, to me, felt a bit unfair to the hard work that
has gone into fixing the more blatant problems.

Tl;dr, MVs are certainly not perfect (spoiler alert: they probably
will never be) but they are now imo in a state where some users can
use them productively, so it's OK to warn about their remaining
problems/limitations, but not OK for me to risk breaking existing users
in a patch release.


On Wed, Oct 4, 2017 at 3:31 AM, Benedict Elliott Smith
<_...@belliottsmith.com> wrote:
> So, I'm of the opinion there's a difference between users misusing a well understood feature whose shortcomings are widely discussed in the community, and providing a feature we don't fully understand, have not fully documented the caveats of, let alone discovered all the problems with nor had that knowledge percolate fully into the wider community.
>
> I also think there's a huge difference between users shooting themselves in the foot, and us shooting them in the foot.
>
> There's a degree of trust - undeserved - that goes with being a database.  People assume you're smarter than them, and that it Just Works.  Given this, and that squandering this trust is a bad thing, I personally believe it is better to offer the feature as experimental until we iron out all of the problems, fully understand it, and have a wider community knowledge base around it.
>
> We can still encourage users that can tolerate problems to use it, but we won't be giving any false assurances to those that don't.  Doesn't that seem like a win-win?
>
>
>
>> On 3 Oct 2017, at 21:07, Jeremiah D Jordan <je...@gmail.com> wrote:
>>
>> So for some perspective here, how do users who do not get the guarantees of MVs implement this on their own?  They use logged batches.
>>
>> Pseudo CQL here, but you should get the picture:
>>
>> If they don’t ever update data, they do it like so, and it is pretty safe:
>> BEGIN BATCH
>> INSERT tablea blah
>> INSERT tableb blahview
>> END BATCH
>>
>> If they do update data, they likely do it like so, and get it wrong in the face of concurrency:
>> SELECT * from tablea WHERE blah;
>>
>> BEGIN BATCH
>> INSERT tablea blah
>> INSERT tableb blahview
>> DELETE tableb oldblahview
>> END BATCH
>>
>> A sophisticated user that understands the concurrency issues may well try to implement it like so:
>>
>> SELECT key, col1, col2 FROM tablea WHERE key=blah;
>>
>> BEGIN BATCH
>> UPDATE tablea col1=new1, col2=new2 WHERE key=blah IF col1=old1 and col2=old2
>> UPDATE tableb viewc1=new2, viewc2=blah WHERE key=new1
>> DELETE tableb WHERE key=old1
>> END BATCH
>>
>> And it wouldn’t work because you can only use LWT in a BATCH if all updates have the same partition key value, and the whole point of a view most of the time is that it doesn't (and there are other issues with this, like most likely needing to use uuid’s or something else to distinguish between concurrent updates, that are not realized until it is too late).
>>
>> A user who does not dig in and understand how MVs work most likely also does not dig in to understand the trade-offs and drawbacks of logged batches to multiple tables across different partition keys, or even necessarily of read-before-write, concurrent updates, and the races inherent in them.  I would guess that using MVs, even as they are today, is *safer* for these users than rolling their own.  I have seen these patterns implemented by people many times, including the “broken in the face of concurrency” version.  So let's please not try to argue that a casual user who does not dig in to the specifics of feature A is going to dig in and understand the specifics of any other feature.  So yes, I would prefer my bank to use MVs as they are today over rolling their own and getting it even more wrong.
>>
>> Now, even given all that, if we want to warn users of the pitfalls of using MVs, then let's do that.  But let's keep some perspective on how things actually get used.
>>
>> -Jeremiah
>>
>>> On Oct 3, 2017, at 8:12 PM, Benedict Elliott Smith <_...@belliottsmith.com> wrote:
>>>
>>> While many users may apparently be using MVs successfully, the problem is how few (if any) know what guarantees they are getting.  Since we aren’t even absolutely certain ourselves, it cannot be many.  Most of the shortcomings we are aware of are complicated, concern failure scenarios and aren’t fully explained; i.e. if you’re lucky they’ll never be a problem, but some users must surely be bitten, and they won’t have had fair warning.  The same goes for as-yet undiscovered edge cases.
>>>
>>> It is my humble opinion that averting problems like this for just a handful of users, that cannot readily tolerate corruption, offsets any inconvenience we might cause to those who can.
>>>
>>> For the record, while it’s true that detecting inconsistencies is as much of a problem for user-rolled solutions, it’s worth remembering that the inconsistencies themselves are not equally likely:
>>>
>>> In cases where C* is not the database of record, it is quite easy to provide very good consistency guarantees when rolling your own
>>> Conversely, a global-CAS with synchronous QUORUM updates that are retried until success, while much slower, also doesn’t easily suffer these consistency problems, and is the naive approach a user might take if C* were the database of record
>>>
>>> Given our approach isn’t uniformly superior, I think we should be very cautious about how it is made available until we’re very confident in it, and we and the community fully understand it.
>>>
>>>
>>>> On 3 Oct 2017, at 18:51, kurt greaves <ku...@instaclustr.com> wrote:
>>>>
>>>> Lots of users are already using MV's, believe it or not in some cases quite
>>>> effectively and also on older versions which were still exposed to a lot of
>>>> the bugs that cause inconsistencies. 3.11.1 has come a long way since then
>>>> and I think with a bit more documentation around the current issues marking
>>>> MV's as experimental is unnecessary and likely annoying for current users.
>>>> On that note we've already had complaints about changing defaults and
>>>> behaviours willy nilly across majors and minors, I can't see this helping
>>>> our cause. Sure, you can make it "seamless" from an upgrade perspective,
>>>> but that doesn't account for every single way operators do things. I'm sure
>>>> someone will express surprise when they run up a new cluster or datacenter
>>>> for testing with default config and find out that they have to enable MV's.
>>>> Meanwhile they've been using them the whole time and haven't had any major
>>>> issues because they didn't touch the edge cases.
>>>>
>>>> I'd like to point out that introducing "experimental" features sets a
>>>> precedent for future releases, and will likely result in using the
>>>> "experimental" tag to push out features that are not ready (again). In fact
>>>> we already routinely say >=3 isn't production ready yet, so why don't we
>>>> just mark 3+ as "experimental" as well? I don't think experimental is the
>>>> right approach for a database. The better solution, as I said, is more
>>>> verification and testing during the release process (by users!). A lot of
>>>> other projects take this approach, and it certainly makes sense. It could
>>>> also be coupled with beta releases, so people can start getting
>>>> verification of their new features at an earlier date. Granted this is
>>>> similar to experimental features, but applied to the whole release rather
>>>> than just individual features.
>>>>
>>>> * There's no way to determine if a view is out of sync with the base table.
>>>>>
>>>> As already pointed out by Jake, this is still true when you don't use
>>>> MV's. We should document this. I think it's entirely fair to say that
>>>> users *should not* expect this to be done for them. There is also no way for a user to
>>>> determine they have inconsistencies short of their own verification. And
>>>> also a lot of the synchronisation problems have been resolved, undoubtedly
>>>> there are more unknowns out there but what MV's have is still better than
>>>> managing your own.
>>>>
>>>>> * If you do determine that a view is out of sync, the only way to fix it
>>>>> is to drop and rebuild the view.
>>>>>
>>>> This is undoubtedly a problem, but also no worse than managing your own
>>>> views. Also at least there is still a way to fix your view. It certainly
>>>> shouldn't be as common in 3.11.1/3.0.15, and we have enough insight now to
>>>> be able to tell when out of sync will actually occur, so we can document
>>>> those cases.
>>>>
>>>>> * There are liveness issues with updates being reflected in the view.
>>>>
>>>> What specific issues are you referring to here? The only one I'm aware of
>>>> is deletion of unselected columns in the view affecting out of order
>>>> updates. If we deem this a major problem we can document it or at least put
>>>> a restriction in place until it's fixed in CASSANDRA-13826
>>>> <https://issues.apache.org/jira/browse/CASSANDRA-13826>
>>>>
>>>>
>>>> In this case, 'out of sync' means 'you lost data', since the current design
>>>>> + repair should keep things eventually consistent right?
>>>>
>>>> I'd like Zhao or Paulo to confirm here but I believe the only way you can
>>>> really "lose data" (that can't be repaired) here would be partition
>>>> deletions on massively wide rows in the view that will not fit in the
>>>> batchlog (256mb/max value size) as it currently stands. Frankly this is
>>>> probably an anti-pattern for MV's at the moment anyway and one we should
>>>> advise against.
>>
>>

Re: Proposal to retroactively mark materialized views experimental

Posted by Benedict Elliott Smith <_...@belliottsmith.com>.

So, I'm of the opinion there's a difference between users misusing a well understood feature whose shortcomings are widely discussed in the community, and providing a feature we don't fully understand, have not fully documented the caveats of, let alone discovered all the problems with nor had that knowledge percolate fully into the wider community.

I also think there's a huge difference between users shooting themselves in the foot, and us shooting them in the foot.  

There's a degree of trust - undeserved - that goes with being a database.  People assume you're smarter than them, and that it Just Works.  Given this, and that squandering this trust is a bad thing, I personally believe it is better to offer the feature as experimental until we iron out all of the problems, fully understand it, and have a wider community knowledge base around it.

We can still encourage users that can tolerate problems to use it, but we won't be giving any false assurances to those that don't.  Doesn't that seem like a win-win?



> On 3 Oct 2017, at 21:07, Jeremiah D Jordan <je...@gmail.com> wrote:
> 
> So for some perspective here, how do users who do not get the guarantees of MVs implement this on their own?  They use logged batches.
> 
> Pseudo CQL here, but you should get the picture:
> 
> If they don’t ever update data, they do it like so, and it is pretty safe:
> BEGIN BATCH
> INSERT tablea blah
> INSERT tableb blahview
> END BATCH
> 
> If they do update data, they likely do it like so, and get it wrong in the face of concurrency:
> SELECT * from tablea WHERE blah;
> 
> BEGIN BATCH
> INSERT tablea blah
> INSERT tableb blahview
> DELETE tableb oldblahview
> END BATCH
> 
> A sophisticated user that understands the concurrency issues may well try to implement it like so:
> 
> SELECT key, col1, col2 FROM tablea WHERE key=blah;
> 
> BEGIN BATCH
> UPDATE tablea col1=new1, col2=new2 WHERE key=blah IF col1=old1 and col2=old2
> UPDATE tableb viewc1=new2, viewc2=blah WHERE key=new1
> DELETE tableb WHERE key=old1
> END BATCH
> 
> And it wouldn’t work because you can only use LWT in a BATCH if all updates have the same partition key value, and the whole point of a view most of the time is that it doesn't (and there are other issues with this, like most likely needing to use uuid’s or something else to distinguish between concurrent updates, that are not realized until it is too late).
> 
> A user who does not dig in and understand how MVs work most likely also does not dig in to understand the trade-offs and drawbacks of logged batches to multiple tables across different partition keys, or even necessarily of read-before-write, concurrent updates, and the races inherent in them.  I would guess that using MVs, even as they are today, is *safer* for these users than rolling their own.  I have seen these patterns implemented by people many times, including the “broken in the face of concurrency” version.  So let's please not try to argue that a casual user who does not dig in to the specifics of feature A is going to dig in and understand the specifics of any other feature.  So yes, I would prefer my bank to use MVs as they are today over rolling their own and getting it even more wrong.
> 
> Now, even given all that, if we want to warn users of the pitfalls of using MVs, then let's do that.  But let's keep some perspective on how things actually get used.
> 
> -Jeremiah
> 
>> On Oct 3, 2017, at 8:12 PM, Benedict Elliott Smith <_...@belliottsmith.com> wrote:
>> 
>> While many users may apparently be using MVs successfully, the problem is how few (if any) know what guarantees they are getting.  Since we aren’t even absolutely certain ourselves, it cannot be many.  Most of the shortcomings we are aware of are complicated, concern failure scenarios and aren’t fully explained; i.e. if you’re lucky they’ll never be a problem, but some users must surely be bitten, and they won’t have had fair warning.  The same goes for as-yet undiscovered edge cases.
>> 
>> It is my humble opinion that averting problems like this for just a handful of users, that cannot readily tolerate corruption, offsets any inconvenience we might cause to those who can.
>> 
>> For the record, while it’s true that detecting inconsistencies is as much of a problem for user-rolled solutions, it’s worth remembering that the inconsistencies themselves are not equally likely:
>> 
>> In cases where C* is not the database of record, it is quite easy to provide very good consistency guarantees when rolling your own
>> Conversely, a global-CAS with synchronous QUORUM updates that are retried until success, while much slower, also doesn’t easily suffer these consistency problems, and is the naive approach a user might take if C* were the database of record
>> 
>> Given our approach isn’t uniformly superior, I think we should be very cautious about how it is made available until we’re very confident in it, and we and the community fully understand it.
>> 
>> 
>>> On 3 Oct 2017, at 18:51, kurt greaves <ku...@instaclustr.com> wrote:
>>> 
>>> Lots of users are already using MV's, believe it or not in some cases quite
>>> effectively and also on older versions which were still exposed to a lot of
>>> the bugs that cause inconsistencies. 3.11.1 has come a long way since then
>>> and I think with a bit more documentation around the current issues marking
>>> MV's as experimental is unnecessary and likely annoying for current users.
>>> On that note we've already had complaints about changing defaults and
>>> behaviours willy nilly across majors and minors, I can't see this helping
>>> our cause. Sure, you can make it "seamless" from an upgrade perspective,
>>> but that doesn't account for every single way operators do things. I'm sure
>>> someone will express surprise when they run up a new cluster or datacenter
>>> for testing with default config and find out that they have to enable MV's.
>>> Meanwhile they've been using them the whole time and haven't had any major
>>> issues because they didn't touch the edge cases.
>>> 
>>> I'd like to point out that introducing "experimental" features sets a
>>> precedent for future releases, and will likely result in using the
>>> "experimental" tag to push out features that are not ready (again). In fact
>>> we already routinely say >=3 isn't production ready yet, so why don't we
>>> just mark 3+ as "experimental" as well? I don't think experimental is the
>>> right approach for a database. The better solution, as I said, is more
>>> verification and testing during the release process (by users!). A lot of
>>> other projects take this approach, and it certainly makes sense. It could
>>> also be coupled with beta releases, so people can start getting
>>> verification of their new features at an earlier date. Granted this is
>>> similar to experimental features, but applied to the whole release rather
>>> than just individual features.
>>> 
>>> * There's no way to determine if a view is out of sync with the base table.
>>>> 
>>> As already pointed out by Jake, this is still true when you don't use
>>> MV's. We should document this. I think it's entirely fair to say that
>>> users *should not* expect this to be done for them. There is also no way for a user to
>>> determine they have inconsistencies short of their own verification. And
>>> also a lot of the synchronisation problems have been resolved, undoubtedly
>>> there are more unknowns out there but what MV's have is still better than
>>> managing your own.
>>> 
>>>> * If you do determine that a view is out of sync, the only way to fix it
>>>> is to drop and rebuild the view.
>>>> 
>>> This is undoubtedly a problem, but also no worse than managing your own
>>> views. Also at least there is still a way to fix your view. It certainly
>>> shouldn't be as common in 3.11.1/3.0.15, and we have enough insight now to
>>> be able to tell when out of sync will actually occur, so we can document
>>> those cases.
>>> 
>>>> * There are liveness issues with updates being reflected in the view.
>>> 
>>> What specific issues are you referring to here? The only one I'm aware of
>>> is deletion of unselected columns in the view affecting out of order
>>> updates. If we deem this a major problem we can document it or at least put
>>> a restriction in place until it's fixed in CASSANDRA-13826
>>> <https://issues.apache.org/jira/browse/CASSANDRA-13826>
>>> 
>>> 
>>> In this case, 'out of sync' means 'you lost data', since the current design
>>>> + repair should keep things eventually consistent right?
>>> 
>>> I'd like Zhao or Paulo to confirm here but I believe the only way you can
>>> really "lose data" (that can't be repaired) here would be partition
>>> deletions on massively wide rows in the view that will not fit in the
>>> batchlog (256mb/max value size) as it currently stands. Frankly this is
>>> probably an anti-pattern for MV's at the moment anyway and one we should
>>> advise against.
> 
> 

Re: Proposal to retroactively mark materialized views experimental

Posted by Jeremiah D Jordan <je...@gmail.com>.

So for some perspective here, how do users who do not get the guarantees of MVs implement this on their own?  They use logged batches.

Pseudo CQL here, but you should get the picture:

If they don’t ever update data, they do it like so, and it is pretty safe:
BEGIN BATCH
INSERT tablea blah
INSERT tableb blahview
END BATCH

If they do update data, they likely do it like so, and get it wrong in the face of concurrency:
SELECT * from tablea WHERE blah;

BEGIN BATCH
  INSERT INTO tablea blah;
  INSERT INTO tableb blahview;
  DELETE FROM tableb oldblahview;
APPLY BATCH;

A sophisticated user that understands the concurrency issues may well try to implement it like so:

SELECT key, col1, col2 FROM tablea WHERE key=blah;

BEGIN BATCH
  UPDATE tablea SET col1=new1, col2=new2 WHERE key=blah IF col1=old1 AND col2=old2;
  UPDATE tableb SET viewc1=new2, viewc2=blah WHERE key=new1;
  DELETE FROM tableb WHERE key=old1;
APPLY BATCH;

And it wouldn’t work, because you can only use LWT in a BATCH if all updates have the same partition key value, and the whole point of a view most of the time is that they don’t (and there are other issues with this, like most likely needing to use UUIDs or something else to distinguish between concurrent updates, that are not realized until it is too late).
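
To make the “get it wrong in the face of concurrency” case concrete, here is a rough sketch of the second pattern above (read, then batch insert/insert/delete). It is only an illustration under stated assumptions: two plain Python dicts stand in for tablea and tableb, the function names are made up for this example, and Cassandra’s timestamps, replicas and batchlog are not modelled at all. It only shows the ordering problem when two clients read before either of them writes.

```python
def read_base(base, key):
    """Stand-in for 'SELECT ... FROM tablea WHERE key=...'."""
    return base.get(key)


def apply_batch(base, view, key, new_col1, old_col1):
    """Stand-in for the logged batch: write the base row, write the new
    view row, and delete the view row the client *believes* is current."""
    base[key] = new_col1
    view[new_col1] = key
    if old_col1 is not None and old_col1 != new_col1:
        view.pop(old_col1, None)


def stale_view_rows(base, view):
    """View entries whose base row no longer carries that value."""
    return sorted(v for v, k in view.items() if base.get(k) != v)


def simulate_race():
    base = {"k": "old"}   # hypothetical tablea: key -> col1
    view = {"old": "k"}   # hypothetical tableb: col1 -> key

    # Both clients read before either one writes.
    seen_by_a = read_base(base, "k")
    seen_by_b = read_base(base, "k")

    # Client A applies its batch, then client B applies its own.
    apply_batch(base, view, "k", "a1", seen_by_a)
    apply_batch(base, view, "k", "b1", seen_by_b)
    return base, view


if __name__ == "__main__":
    base, view = simulate_race()
    print("base:", base)
    print("view:", view)
    print("stale view rows:", stale_view_rows(base, view))
```

Client B deletes the view row for the value it read, not the row client A just wrote, so A’s view row is left behind pointing at a base row that no longer has that value. Nothing in the batch itself can detect this.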

A user who does not dig in and understand how MVs work most likely also does not dig in to understand the trade-offs and drawbacks of logged batches to multiple tables across different partition keys.  Or even necessarily of read-before-write, concurrent updates, and the races inherent in them.  I would guess that using MVs, even as they are today, is *safer* for these users than rolling their own.  I have seen these patterns implemented by people many times, including the “broken in the face of concurrency” version.  So let’s please not try to argue that a casual user who does not dig in to the specifics of feature A is going to dig in and understand the specifics of any other feature.  So yes, I would prefer my bank to use MVs as they are today over rolling their own and getting it even more wrong.

Now, even given all that, if we want to warn users of the pitfalls of using MVs, then let’s do that.  But let’s keep some perspective on how things actually get used.

-Jeremiah

> On Oct 3, 2017, at 8:12 PM, Benedict Elliott Smith <_...@belliottsmith.com> wrote:
> 
> While many users may apparently be using MVs successfully, the problem is how few (if any) know what guarantees they are getting.  Since we aren’t even absolutely certain ourselves, it cannot be many.  Most of the shortcomings we are aware of are complicated, concern failure scenarios and aren’t fully explained; i.e. if you’re lucky they’ll never be a problem, but some users must surely be bitten, and they won’t have had fair warning.  The same goes for as-yet undiscovered edge cases.
> 
> It is my humble opinion that averting problems like this for just a handful of users, that cannot readily tolerate corruption, offsets any inconvenience we might cause to those who can.
> 
> For the record, while it’s true that detecting inconsistencies is as much of a problem for user-rolled solutions, it’s worth remembering that the inconsistencies themselves are not equally likely:
> 
> In cases where C* is not the database of record, it is quite easy to provide very good consistency guarantees when rolling your own
> Conversely, a global-CAS with synchronous QUORUM updates that are retried until success, while much slower, also doesn’t easily suffer these consistency problems, and is the naive approach a user might take if C* were the database of record
> 
> Given our approach isn’t uniformly superior, I think we should be very cautious about how it is made available until we’re very confident in it, and we and the community fully understand it.
> 
> 
>> On 3 Oct 2017, at 18:51, kurt greaves <ku...@instaclustr.com> wrote:
>> 
>> Lots of users are already using MV's, believe it or not in some cases quite
>> effectively and also on older versions which were still exposed to a lot of
>> the bugs that cause inconsistencies. 3.11.1 has come a long way since then
>> and I think with a bit more documentation around the current issues marking
>> MV's as experimental is unnecessary and likely annoying for current users.
>> On that note we've already had complaints about changing defaults and
>> behaviours willy nilly across majors and minors, I can't see this helping
>> our cause. Sure, you can make it "seamless" from an upgrade perspective,
>> but that doesn't account for every single way operators do things. I'm sure
>> someone will express surprise when they run up a new cluster or datacenter
>> for testing with default config and find out that they have to enable MV's.
>> Meanwhile they've been using them the whole time and haven't had any major
>> issues because they didn't touch the edge cases.
>> 
>> I'd like to point out that introducing "experimental" features sets a
>> precedent for future releases, and will likely result in using the
>> "experimental" tag to push out features that are not ready (again). In fact
>> we already routinely say >=3 isn't production ready yet, so why don't we
>> just mark 3+ as "experimental" as well? I don't think experimental is the
>> right approach for a database. The better solution, as I said, is more
>> verification and testing during the release process (by users!). A lot of
>> other projects take this approach, and it certainly makes sense. It could
>> also be coupled with beta releases, so people can start getting
>> verification of their new features at an earlier date. Granted this is
>> similar to experimental features, but applied to the whole release rather
>> than just individual features.
>> 
>> * There's no way to determine if a view is out of sync with the base table.
>>> 
>> As already pointed out by Jake, this is still true when you don't use
>> MV's. We should document this. I think it's entirely fair to say that
>> users *should
>> not *expect this to be done for them. There is also no way for a user to
>> determine they have inconsistencies short of their own verification. And
>> also a lot of the synchronisation problems have been resolved, undoubtedly
>> there are more unknowns out there but what MV's have is still better than
>> managing your own.
>> 
>>> * If you do determine that a view is out of sync, the only way to fix it
>>> is to drop and rebuild the view.
>>> 
>> This is undoubtedly a problem, but also no worse than managing your own
>> views. Also at least there is still a way to fix your view. It certainly
>> shouldn't be as common in 3.11.1/3.0.15, and we have enough insight now to
>> be able to tell when out of sync will actually occur, so we can document
>> those cases.
>> 
>>> * There are liveness issues with updates being reflected in the view.
>> 
>> What specific issues are you referring to here? The only one I'm aware of
>> is deletion of unselected columns in the view affecting out of order
>> updates. If we deem this a major problem we can document it or at least put
>> a restriction in place until it's fixed in CASSANDRA-13826
>> <https://issues.apache.org/jira/browse/CASSANDRA-13826>
>> 
>> 
>> In this case, 'out of sync' means 'you lost data', since the current design
>>> + repair should keep things eventually consistent right?
>> 
>> I'd like Zhao or Paulo to confirm here but I believe the only way you can
>> really "lose data" (that can't be repaired) here would be partition
>> deletions on massively wide rows in the view that will not fit in the
>> batchlog (256mb/max value size) as it currently stands. Frankly this is
>> probably an anti-pattern for MV's at the moment anyway and one we should
>> advise against.



Re: Proposal to retroactively mark materialized views experimental

Posted by Benedict Elliott Smith <_...@belliottsmith.com>.

While many users may apparently be using MVs successfully, the problem is how few (if any) know what guarantees they are getting.  Since we aren’t even absolutely certain ourselves, it cannot be many.  Most of the shortcomings we are aware of are complicated, concern failure scenarios and aren’t fully explained; i.e. if you’re lucky they’ll never be a problem, but some users must surely be bitten, and they won’t have had fair warning.  The same goes for as-yet undiscovered edge cases.

It is my humble opinion that averting problems like this for just a handful of users who cannot readily tolerate corruption offsets any inconvenience we might cause to those who can.

For the record, while it’s true that detecting inconsistencies is as much of a problem for user-rolled solutions, it’s worth remembering that the inconsistencies themselves are not equally likely:

* In cases where C* is not the database of record, it is quite easy to provide very good consistency guarantees when rolling your own.
* Conversely, a global-CAS with synchronous QUORUM updates that are retried until success, while much slower, also doesn’t easily suffer these consistency problems, and is the naive approach a user might take if C* were the database of record.

Given our approach isn’t uniformly superior, I think we should be very cautious about how it is made available until we’re very confident in it, and we and the community fully understand it.


> On 3 Oct 2017, at 18:51, kurt greaves <ku...@instaclustr.com> wrote:
> 
> Lots of users are already using MV's, believe it or not in some cases quite
> effectively and also on older versions which were still exposed to a lot of
> the bugs that cause inconsistencies. 3.11.1 has come a long way since then
> and I think with a bit more documentation around the current issues marking
> MV's as experimental is unnecessary and likely annoying for current users.
> On that note we've already had complaints about changing defaults and
> behaviours willy nilly across majors and minors, I can't see this helping
> our cause. Sure, you can make it "seamless" from an upgrade perspective,
> but that doesn't account for every single way operators do things. I'm sure
> someone will express surprise when they run up a new cluster or datacenter
> for testing with default config and find out that they have to enable MV's.
> Meanwhile they've been using them the whole time and haven't had any major
> issues because they didn't touch the edge cases.
> 
> I'd like to point out that introducing "experimental" features sets a
> precedent for future releases, and will likely result in using the
> "experimental" tag to push out features that are not ready (again). In fact
> we already routinely say >=3 isn't production ready yet, so why don't we
> just mark 3+ as "experimental" as well? I don't think experimental is the
> right approach for a database. The better solution, as I said, is more
> verification and testing during the release process (by users!). A lot of
> other projects take this approach, and it certainly makes sense. It could
> also be coupled with beta releases, so people can start getting
> verification of their new features at an earlier date. Granted this is
> similar to experimental features, but applied to the whole release rather
> than just individual features.
> 
> * There's no way to determine if a view is out of sync with the base table.
>> 
> As already pointed out by Jake, this is still true when you don't use
> MV's. We should document this. I think it's entirely fair to say that
> users *should
> not *expect this to be done for them. There is also no way for a user to
> determine they have inconsistencies short of their own verification. And
> also a lot of the synchronisation problems have been resolved, undoubtedly
> there are more unknowns out there but what MV's have is still better than
> managing your own.
> 
>> * If you do determine that a view is out of sync, the only way to fix it
>> is to drop and rebuild the view.
>> 
> This is undoubtedly a problem, but also no worse than managing your own
> views. Also at least there is still a way to fix your view. It certainly
> shouldn't be as common in 3.11.1/3.0.15, and we have enough insight now to
> be able to tell when out of sync will actually occur, so we can document
> those cases.
> 
>> * There are liveness issues with updates being reflected in the view.
> 
> What specific issues are you referring to here? The only one I'm aware of
> is deletion of unselected columns in the view affecting out of order
> updates. If we deem this a major problem we can document it or at least put
> a restriction in place until it's fixed in CASSANDRA-13826
> <https://issues.apache.org/jira/browse/CASSANDRA-13826>
> 
> 
> In this case, 'out of sync' means 'you lost data', since the current design
>> + repair should keep things eventually consistent right?
> 
> I'd like Zhao or Paulo to confirm here but I believe the only way you can
> really "lose data" (that can't be repaired) here would be partition
> deletions on massively wide rows in the view that will not fit in the
> batchlog (256mb/max value size) as it currently stands. Frankly this is
> probably an anti-pattern for MV's at the moment anyway and one we should
> advise against.

Re: Proposal to retroactively mark materialized views experimental

Posted by kurt greaves <ku...@instaclustr.com>.

Lots of users are already using MVs, believe it or not in some cases quite
effectively, and also on older versions which were still exposed to a lot of
the bugs that cause inconsistencies. 3.11.1 has come a long way since then,
and I think that with a bit more documentation around the current issues,
marking MVs as experimental is unnecessary and likely annoying for current
users. On that note, we've already had complaints about changing defaults
and behaviours willy-nilly across majors and minors; I can't see this
helping our cause. Sure, you can make it "seamless" from an upgrade perspective,
but that doesn't account for every single way operators do things. I'm sure
someone will express surprise when they run up a new cluster or datacenter
for testing with default config and find out that they have to enable MV's.
Meanwhile they've been using them the whole time and haven't had any major
issues because they didn't touch the edge cases.

I'd like to point out that introducing "experimental" features sets a
precedent for future releases, and will likely result in using the
"experimental" tag to push out features that are not ready (again). In fact
we already routinely say >=3 isn't production ready yet, so why don't we
just mark 3+ as "experimental" as well? I don't think experimental is the
right approach for a database. The better solution, as I said, is more
verification and testing during the release process (by users!). A lot of
other projects take this approach, and it certainly makes sense. It could
also be coupled with beta releases, so people can start getting
verification of their new features at an earlier date. Granted this is
similar to experimental features, but applied to the whole release rather
than just individual features.

> * There's no way to determine if a view is out of sync with the base table.
>
As already pointed out by Jake, this is still true when you don't use
MVs. We should document this. I think it's entirely fair to say that
users *should not* expect this to be done for them. There is also no way
for a user to determine they have inconsistencies short of their own
verification. Also, a lot of the synchronisation problems have been
resolved; undoubtedly there are more unknowns out there, but what MVs have
is still better than managing your own.

> * If you do determine that a view is out of sync, the only way to fix it
> is to drop and rebuild the view.
>
This is undoubtedly a problem, but also no worse than managing your own
views. Also at least there is still a way to fix your view. It certainly
shouldn't be as common in 3.11.1/3.0.15, and we have enough insight now to
be able to tell when out of sync will actually occur, so we can document
those cases.

> * There are liveness issues with updates being reflected in the view.

 What specific issues are you referring to here? The only one I'm aware of
is deletion of unselected columns in the view affecting out of order
updates. If we deem this a major problem we can document it or at least put
a restriction in place until it's fixed in CASSANDRA-13826
<https://issues.apache.org/jira/browse/CASSANDRA-13826>


> In this case, 'out of sync' means 'you lost data', since the current design
> + repair should keep things eventually consistent right?

I'd like Zhao or Paulo to confirm here but I believe the only way you can
really "lose data" (that can't be repaired) here would be partition
deletions on massively wide rows in the view that will not fit in the
batchlog (256mb/max value size) as it currently stands. Frankly this is
probably an anti-pattern for MV's at the moment anyway and one we should
advise against.

Re: Proposal to retroactively mark materialized views experimental

Posted by Jake Luciani <ja...@gmail.com>.

I never used the word easier.  I think it's a hard problem, but it should be
our problem if we want people to use our database.

I have little opinion on whether MVs should be made experimental or opt-in.
I'm simply discussing the need for this feature (as opposed to ripping it out).

Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

To clarify:
>
> * There's no way to determine if a view is out of sync with the base table.
> * If you do determine that a view is out of sync, the only way to fix it
> is to drop and rebuild the view.


In this case, 'out of sync' means 'you lost data', since the current design
+ repair should keep things eventually consistent, right? Not saying that's
ideal, and not saying having no visibility into holes like that from data
loss is acceptable either, just trying to make sure we're all on the same
page here.

On Tue, Oct 3, 2017 at 3:09 PM, Jeff Jirsa <jj...@gmail.com> wrote:

> Nobody debates that it's easier, the debate is over whether or not it's
> correct (and more importantly, whether or not people realize it's not
> strictly correct in all edge cases).
>
> Users expect correct results. People are literally betting their jobs on
> it. When you have to manually manage sync between two tables, you at least
> become (painfully) aware that correctness is difficult and you can't count
> on it (maybe you need an app level re-sync or similar).
>
> When you use a feature that is built in, people will assume it's correct,
> and there's no way for the average user to know that's not the case right
> now.
>
> Put another way:
>
> If your bank decided to use MVs right now for your personal bank/investment
> accounts, would you be ok with that?
> If not, then we need a way to stop the banks (and all other cassandra
> users) from doing it without realizing that it's not OK.
>
>
>
>
> On Tue, Oct 3, 2017 at 12:02 PM, Jake Luciani <ja...@gmail.com> wrote:
>
> > >
> > > The remaining issues are:
> > >
> > > * There's no way to determine if a view is out of sync with the base
> > table.
> > > * If you do determine that a view is out of sync, the only way to fix
> it
> > > is to drop and rebuild the view.
> > > * There are liveness issues with updates being reflected in the view.
> > >
> >
> > I just want to mention that manual de-normalization has all the same
> issues
> > as the list of above.  If you write to multiple tables with batch logs
> when
> > do you know the data is consistent?
> > In fact, manual de-normalization is worse because you can't manually
> handle
> > updates to existing data due to the lack of synchronization on read
> before
> > write.
> >
> > I think a lot of you have lost sight on what MV was intended for, as a
> way
> > to keep developers from manually maintaining a consistent view of data
> > across tables.
> > There is still the fundamental problem of managing multiple views of data
> > even if you remove the MV feature, you just make it someone else's
> problem.
> >
> > I'll re-post this blog from back when MVs first came out to hopefully
> clear
> > questions up on the goals of MV.
> >
> > https://www.datastax.com/dev/blog/understanding-materialized-views
> >
> > -Jake
> >
> >
> > On Tue, Oct 3, 2017 at 2:50 PM, Aleksey Yeshchenko <al...@apple.com>
> > wrote:
> >
> > > Indeed. Paulo and Zhao did a lot of good work to make the situation
> less
> > > bad. You did some as well. Even I retouched small parts of it -
> metadata
> > > related. I’m sorry if I came off as disrespectful - I didn’t mean to.
> > I’ve
> > > seen and I appreciate every commit that went into it.
> > >
> > > It is however my opinion that we started at a very low point, for a
> > > variety of reasons, and climbing out of that initial poor state, to the
> > > level that power users start having trust in MVs and overcome the
> initial
> > > deservedly poor impression, will probably take even more work. And
> > when/if
> > > we get there, maybe we won’t need the switch anymore.
> > >
> > > —
> > > AY
> > >
> > > On 3 October 2017 at 17:00:31, Sylvain Lebresne (sylvain@datastax.com)
> > > wrote:
> > >
> > > You're giving little credit to the hard work that people have put into
> > > getting MV in a usable state.
> > >
> >
> >
> >
> > --
> > http://twitter.com/tjake
> >
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jeff Jirsa <jj...@gmail.com>.

Nobody debates that it's easier; the debate is over whether or not it's
correct (and more importantly, whether or not people realize it's not
strictly correct in all edge cases).

Users expect correct results. People are literally betting their jobs on
it. When you have to manually manage sync between two tables, you at least
become (painfully) aware that correctness is difficult and you can't count
on it (maybe you need an app level re-sync or similar).

When you use a feature that is built in, people will assume it's correct,
and there's no way for the average user to know that's not the case right
now.

Put another way:

If your bank decided to use MVs right now for your personal bank/investment
accounts, would you be OK with that?
If not, then we need a way to stop the banks (and all other Cassandra
users) from doing it without realizing that it's not OK.




On Tue, Oct 3, 2017 at 12:02 PM, Jake Luciani <ja...@gmail.com> wrote:

> >
> > The remaining issues are:
> >
> > * There's no way to determine if a view is out of sync with the base
> table.
> > * If you do determine that a view is out of sync, the only way to fix it
> > is to drop and rebuild the view.
> > * There are liveness issues with updates being reflected in the view.
> >
>
> I just want to mention that manual de-normalization has all the same issues
> as the list of above.  If you write to multiple tables with batch logs when
> do you know the data is consistent?
> In fact, manual de-normalization is worse because you can't manually handle
> updates to existing data due to the lack of synchronization on read before
> write.
>
> I think a lot of you have lost sight on what MV was intended for, as a way
> to keep developers from manually maintaining a consistent view of data
> across tables.
> There is still the fundamental problem of managing multiple views of data
> even if you remove the MV feature, you just make it someone else's problem.
>
> I'll re-post this blog from back when MVs first came out to hopefully clear
> questions up on the goals of MV.
>
> https://www.datastax.com/dev/blog/understanding-materialized-views
>
> -Jake
>
>
> On Tue, Oct 3, 2017 at 2:50 PM, Aleksey Yeshchenko <al...@apple.com>
> wrote:
>
> > Indeed. Paulo and Zhao did a lot of good work to make the situation less
> > bad. You did some as well. Even I retouched small parts of it - metadata
> > related. I’m sorry if I came off as disrespectful - I didn’t mean to.
> I’ve
> > seen and I appreciate every commit that went into it.
> >
> > It is however my opinion that we started at a very low point, for a
> > variety of reasons, and climbing out of that initial poor state, to the
> > level that power users start having trust in MVs and overcome the initial
> > deservedly poor impression, will probably take even more work. And
> when/if
> > we get there, maybe we won’t need the switch anymore.
> >
> > —
> > AY
> >
> > On 3 October 2017 at 17:00:31, Sylvain Lebresne (sylvain@datastax.com)
> > wrote:
> >
> > You're giving little credit to the hard work that people have put into
> > getting MV in a usable state.
> >
>
>
>
> --
> http://twitter.com/tjake
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Jake Luciani <ja...@gmail.com>.

>
> The remaining issues are:
>
> * There's no way to determine if a view is out of sync with the base table.
> * If you do determine that a view is out of sync, the only way to fix it
> is to drop and rebuild the view.
> * There are liveness issues with updates being reflected in the view.
>

I just want to mention that manual de-normalization has all the same issues
as the list above.  If you write to multiple tables with batch logs, when
do you know the data is consistent?
In fact, manual de-normalization is worse because you can't manually handle
updates to existing data due to the lack of synchronization on read before
write.

I think a lot of you have lost sight of what MVs were intended for: a way
to keep developers from manually maintaining a consistent view of data
across tables.
There is still the fundamental problem of managing multiple views of data
even if you remove the MV feature; you just make it someone else's problem.

I'll re-post this blog from back when MVs first came out to hopefully clear
questions up on the goals of MV.

https://www.datastax.com/dev/blog/understanding-materialized-views

-Jake

On Tue, Oct 3, 2017 at 2:50 PM, Aleksey Yeshchenko <al...@apple.com>
wrote:

> Indeed. Paulo and Zhao did a lot of good work to make the situation less
> bad. You did some as well. Even I retouched small parts of it - metadata
> related. I’m sorry if I came off as disrespectful - I didn’t mean to. I’ve
> seen and I appreciate every commit that went into it.
>
> It is however my opinion that we started at a very low point, for a
> variety of reasons, and climbing out of that initial poor state, to the
> level that power users start having trust in MVs and overcome the initial
> deservedly poor impression, will probably take even more work. And when/if
> we get there, maybe we won’t need the switch anymore.
>
> —
> AY
>
> On 3 October 2017 at 17:00:31, Sylvain Lebresne (sylvain@datastax.com)
> wrote:
>
> You're giving little credit to the hard work that people have put into
> getting MV in a usable state.
>

-- 
http://twitter.com/tjake

Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

Indeed. Paulo and Zhao did a lot of good work to make the situation less bad. You did some as well. Even I retouched small parts of it - metadata related. I’m sorry if I came off as disrespectful - I didn’t mean to. I’ve seen and I appreciate every commit that went into it.

It is however my opinion that we started at a very low point, for a variety of reasons, and climbing out of that initial poor state, to the level that power users start having trust in MVs and overcome the initial deservedly poor impression, will probably take even more work. And when/if we get there, maybe we won’t need the switch anymore.

—
AY

On 3 October 2017 at 17:00:31, Sylvain Lebresne (sylvain@datastax.com) wrote:

You're giving little credit to the hard work that people have put into 
getting MV in a usable state.

Re: Proposal to retroactively mark materialized views experimental

Posted by Ben Bromhead <be...@instaclustr.com>.

Lots of hard work has been done by folks on MVs, and I don't think this
proposal is a commentary or reflection on that. What it is about is
signalling to users that this feature has more edge cases and caveats than
other tried and true features (like all new features).

MVs are still a feature in a "stable" release, and if they solve the end
user's problem while making them more aware of the edge cases because it is
an explicit opt-in, I think that would be quite beneficial. However, this
argument is more about user behaviour and guiding first-time adopters so
they have a better first-time experience, which is a more nebulous concept
with many approaches.

The other side of the proposal touches on the idea of feature flags that
operators can enable and disable depending on their organisational
requirements and risk appetite. This is strongly related to the first point
about guiding user behaviour; however, it allows an organisation or operator
to make that decision independently of their own end users.

Whilst personally I would advocate for off by default for experimental /
dangerous features (even retroactively doing it as suggested in the
proposal), I do see the other side of the argument, and we do need to give
the quality control processes in place a chance to bear fruit. I think the
compromise suggested by Aleksey is fair.

+1 to either A) or B)
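
For what it's worth, the difference between the two options quoted below
comes down to where the default lives: in the shipped yaml, or in the code
that reads it. Here is a rough sketch of that resolution logic. It is only
an illustration: the setting name enable_experimental_features is the one
floated in this thread (with its spelling corrected), not a confirmed
Cassandra option, and the tiny "key: value" parser is a stand-in for a real
YAML loader.

```python
# Hypothetical setting name, taken from the proposal in this thread.
FLAG = "enable_experimental_features"


def parse_flat_yaml(text):
    """Parse simple 'key: value' lines, ignoring blanks and '#' comments.
    A stand-in for a real YAML parser, good enough for this sketch."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


def experimental_enabled(yaml_text, code_default=True):
    """Return the effective flag value.

    If the yaml does not mention the flag at all (e.g. an operator kept
    their old yaml across a minor upgrade), fall back to the in-code
    default; otherwise the yaml wins.
    """
    raw = parse_flat_yaml(yaml_text).get(FLAG)
    if raw is None:
        return code_default
    return raw.lower() == "true"


if __name__ == "__main__":
    old_yaml = "cluster_name: test\n"   # upgraded node, yaml untouched
    new_yaml = FLAG + ": false\n"       # freshly shipped yaml, option (a)
    print(experimental_enabled(old_yaml))   # falls back to the code default
    print(experimental_enabled(new_yaml))   # explicit yaml value wins
```

Under option (a), an upgraded node with an untouched yaml keeps the feature
on, while a fresh install using the shipped yaml has it off. Under option
(b), the same lookup applies, but the shipped yaml says 'true' in 3.0/3.11
and 'false' in 4.0.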

On Tue, 3 Oct 2017 at 09:29 Blake Eggleston <be...@apple.com> wrote:

> The remaining issues are:
>
> * There's no way to determine if a view is out of sync with the base table.
> * If you do determine that a view is out of sync, the only way to fix it
> is to drop and rebuild the view.
> * There are liveness issues with updates being reflected in the view.
>
> On October 3, 2017 at 9:00:32 AM, Sylvain Lebresne (sylvain@datastax.com)
> wrote:
>
> On Tue, Oct 3, 2017 at 5:54 PM, Aleksey Yeshchenko <al...@apple.com>
> wrote:
> > There are a couple compromise options here:
> >
> > a) Introduce the flag (enable_experimental_features, or maybe one per
> experimental feature), set it to ‘false’ in the yaml, but have the default
> be ‘true’. So that if you are upgrading from a previous minor to the next
> without updating the yaml, you notice nothing.
> >
> > b) Introduce the flag in the minor, and set it to ‘true’ in the yaml in
> 3.0 and 3.11, but to ‘false’ in 4.0. So the operators and in general people
> who know better can still disable it with one flip, but nobody would be
> affected by it in a minor otherwise.
> >
> > B might be more correct, and I’m okay with it
>
> Does feel more correct to me as well
>
> > although I do feel that we are behaving irresponsibly as developers by
> allowing MV creation by default in their current state
>
> You're giving little credit to the hard work that people have put into
> getting MV in a usable state. To quote Kurt's email:
>
> > And finally, back onto the original topic. I'm not convinced that MV's
> need
> > this treatment now. Zhao and Paulo (and others+reviewers) have made
> quite a
> > lot of fixes, granted there are still some outstanding bugs but the
> > majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining
> > bugs mostly only affect views with a poor data model. Plus we've already
> > required the known broken components require a flag to be turned on.
>
>
> --
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer

Re: Proposal to retroactively mark materialized views experimental

Posted by Blake Eggleston <be...@apple.com>.

The remaining issues are:

* There's no way to determine if a view is out of sync with the base table.
* If you do determine that a view is out of sync, the only way to fix it is to drop and rebuild the view.
* There are liveness issues with updates being reflected in the view.

On October 3, 2017 at 9:00:32 AM, Sylvain Lebresne (sylvain@datastax.com) wrote:

On Tue, Oct 3, 2017 at 5:54 PM, Aleksey Yeshchenko <al...@apple.com> wrote:  
> There are a couple compromise options here:  
>  
> a) Introduce the flag (enable_experimental_features, or maybe one per experimental feature), set it to ‘false’ in the yaml, but have the default be ‘true’. So that if you are upgrading from a previous minor to the next without updating the yaml, you notice nothing.  
>  
> b) Introduce the flag in the minor, and set it to ‘true’ in the yaml in 3.0 and 3.11, but to ‘false’ in 4.0. So the operators and in general people who know better can still disable it with one flip, but nobody would be affected by it in a minor otherwise.  
>  
> B might be more correct, and I’m okay with it  

Does feel more correct to me as well  

> although I do feel that we are behaving irresponsibly as developers by allowing MV creation by default in their current state  

You're giving little credit to the hard work that people have put into  
getting MV in a usable state. To quote Kurt's email:  

> And finally, back onto the original topic. I'm not convinced that MV's need  
> this treatment now. Zhao and Paulo (and others+reviewers) have made quite a  
> lot of fixes, granted there are still some outstanding bugs but the  
> majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining  
> bugs mostly only affect views with a poor data model. Plus we've already  
> required the known broken components require a flag to be turned on.  


Re: Proposal to retroactively mark materialized views experimental

Posted by Sylvain Lebresne <sy...@datastax.com>.

On Tue, Oct 3, 2017 at 5:54 PM, Aleksey Yeshchenko <al...@apple.com> wrote:
> There are a couple compromise options here:
>
> a) Introduce the flag (enable_experimental_features, or maybe one per experimental feature), set it to ‘false’ in the yaml, but have the default be ‘true’. So that if you are upgrading from a previous minor to the next without updating the yaml, you notice nothing.
>
> b) Introduce the flag in the minor, and set it to ‘true’ in the yaml in 3.0 and 3.11, but to ‘false’ in 4.0. So the operators and in general people who know better can still disable it with one flip, but nobody would be affected by it in a minor otherwise.
>
> B might be more correct, and I’m okay with it

Does feel more correct to me as well

> although I do feel that we are behaving irresponsibly as developers by allowing MV creation by default in their current state

You're giving little credit to the hard work that people have put into
getting MV in a usable state. To quote Kurt's email:

> And finally, back onto the original topic. I'm not convinced that MV's need
> this treatment now. Zhao and Paulo (and others+reviewers) have made quite a
> lot of fixes, granted there are still some outstanding bugs but the
> majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining
> bugs mostly only affect views with a poor data model. Plus we've already
> required the known broken components require a flag to be turned on.


Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

There are a couple compromise options here:

a) Introduce the flag (enable_experimental_features, or maybe one per experimental feature), set it to ‘false’ in the yaml, but have the default be ‘true’. So that if you are upgrading from a previous minor to the next without updating the yaml, you notice nothing.

b) Introduce the flag in the minor, and set it to ‘true’ in the yaml in 3.0 and 3.11, but to ‘false’ in 4.0. So the operators and in general people who know better can still disable it with one flip, but nobody would be affected by it in a minor otherwise.

B might be more correct, and I’m okay with it, although I do feel that we are behaving irresponsibly as developers by allowing MV creation by default in their current state and that it’s better to correct a mistake late than later.
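To make the difference between the yaml value and the in-code default concrete, here is a minimal sketch of option (a). It is only an illustration under assumptions: the class and method names are hypothetical, the yaml is modelled as a plain map, and the key simply follows the flag name floated above. None of it is taken from the Cassandra code base.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these names do not come from the Cassandra code base.
final class ExperimentalFlagSketch {
    // Option (a): the in-code default stays 'true', so a yaml file that predates
    // the flag (and therefore never mentions it) keeps the old behaviour.
    static final boolean CODE_DEFAULT = true;

    // Returns the yaml value when the key is present, otherwise the code default.
    static boolean effectiveValue(Map<String, Object> yaml) {
        Object value = yaml.get("enable_experimental_features");
        return value == null ? CODE_DEFAULT : Boolean.parseBoolean(value.toString());
    }

    public static void main(String[] args) {
        // An operator who upgraded without touching the yaml: key is absent.
        Map<String, Object> upgradedYaml = new HashMap<>();
        // A fresh install using the newly shipped yaml, which sets the flag to false.
        Map<String, Object> freshYaml = new HashMap<>();
        freshYaml.put("enable_experimental_features", false);

        System.out.println("upgraded node: " + effectiveValue(upgradedYaml));
        System.out.println("fresh install: " + effectiveValue(freshYaml));
    }
}
```

Under option (b) the lookup would be the same; only the value written into the shipped yaml would differ per branch (‘true’ in 3.0 and 3.11, ‘false’ in 4.0).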

—
AY

On 3 October 2017 at 10:27:58, Sylvain Lebresne (sylvain@datastax.com) wrote:

Just one more data point, but I personally don't feel that disabling 
new MV creation (or new SASI index creation for that matter) by default 
_in a patch release_ is terribly nice.

Re: Proposal to retroactively mark materialized views experimental

Posted by Sylvain Lebresne <sy...@datastax.com>.

Just one more data point, but I personally don't feel that disabling
new MV creation (or new SASI index creation for that matter) by default
_in a patch release_ is terribly nice. There can absolutely be code
out there that creates MV/SASI indexes somewhat automatically on some
events and it would break those. That doesn't feel appropriate to me
in a patch release.

It's probably true that such disabling by default would be a tad more
efficient in raising awareness than sticking only to warnings, but I
don't feel like this would make such a big difference in practice that
it justifies breaking (some) users in a patch release. I'm fine being
thorough in our warnings however: on top of a cqlsh/client warning, we
can warn in the log both on startup (if we detect any MVs) _and_ at MV
creation time. We should obviously also put a clear warning in the
doc.
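As a rough illustration of the warnings-only approach described above, the sketch below warns in the log at startup when views already exist, and warns in both the log and the client response at creation time, without blocking anything. All names and message texts here are hypothetical; this is not how Cassandra actually wires its logging or client warnings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; class, method, and message names are illustrative only.
final class ViewWarningSketch {
    static final String NOTICE =
        "Materialized views are experimental and are not recommended for production use.";

    // Startup check: log a warning only when at least one view already exists.
    static List<String> startupWarnings(List<String> existingViews) {
        List<String> log = new ArrayList<>();
        if (!existingViews.isEmpty()) {
            log.add("LOG WARN: found " + existingViews.size() + " materialized view(s). " + NOTICE);
        }
        return log;
    }

    // Creation-time check: creation still succeeds; the notice goes to the server
    // log and back to the client (for example, to be displayed by cqlsh).
    static List<String> creationWarnings(String viewName) {
        List<String> out = new ArrayList<>();
        out.add("LOG WARN: creating view " + viewName + ". " + NOTICE);
        out.add("CLIENT WARN: " + NOTICE);
        return out;
    }

    public static void main(String[] args) {
        startupWarnings(Arrays.asList("users_by_email")).forEach(System.out::println);
        creationWarnings("users_by_city").forEach(System.out::println);
    }
}
```

Because nothing is rejected, code that creates views automatically keeps working in a patch release; the cost is that the warning can be ignored.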



On Tue, Oct 3, 2017 at 5:07 AM, Jeremy Hanna <je...@gmail.com> wrote:
> At the risk of sounding redundant it sounds like for MVs at this point we want to preserve current functionality for existing users.  We would have a flag in the yaml to disable it by default for new MV creation with an error message. In addition we want a warning in the log and in cqlsh upon creation and usage even with it enabled.  We could be very explicit with a message to the user list and a clear NEWS item and upgrade note.
>
> For SASI, it seems like a similar pattern could be used.
>
> For either or both of these features, a Jira epic could be created to make a path to making it a full-fledged non-experimental feature.  That could include a ticket for realistic testing within the boundaries of the target use cases - that way people can point to what testing has been done and what the target use cases were.  An epic and test ticket would give people a pathway to make it no longer experimental so it isn’t lingering halfway in the codebase indefinitely and for interested parties something to contribute towards.  Granted if issues are found, they’d need to be added to the epic.
>
>>> On Oct 2, 2017, at 6:16 PM, Mick Semb Wever <mi...@thelastpickle.com> wrote:
>>>
>>> On 3 October 2017 at 04:57, Aleksey Yeshchenko <al...@apple.com> wrote:
>>>
>>> The idea is to check the flag in CreateViewStatement, so creation of new
>>> MVs doesn’t succeed without that flag flipped.
>>> Obviously, just disabling existing MVs working in a minor would be silly.
>>> As for the warning - yes, that should also be emitted. Unconditionally.
>>
>>
>> Thanks Aleksey, this was the best read in this thread imo of what should be
>> done with MVs.
>> (With these warnings being emitted in both logs and cqlsh).
>>
>> Hopefully similar "creation flags" and log+cqlsh warnings can be added to:
>> triggers, SASI, and incremental repair (<4.0).
>>
>> CDC sounds like it is in the same basket, but it already has the
>> `cdc_enabled` yaml flag which defaults false.
>>
>> Mick
>


Re: Proposal to retroactively mark materialized views experimental

Posted by Jeremiah D Jordan <je...@datastax.com>.

Thanks for bringing this up Kurt, it is a fair point.  Given the work that Paulo and Zhao have done to get MV’s in shape, what are the outstanding issues that would warrant making them experimental?



> On Oct 3, 2017, at 5:56 AM, kurt greaves <ku...@instaclustr.com> wrote:
> 
> And finally, back onto the original topic. I'm not convinced that MV's need
> this treatment now. Zhao and Paulo (and others+reviewers) have made quite a
> lot of fixes, granted there are still some outstanding bugs but the
> majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining
> bugs mostly only affect views with a poor data model. Plus we've already
> required the known broken components require a flag to be turned on. Also
> at this point it's not worth making them experimental because a lot of
> users are already using them, it's a bit late to go and do that. We should
> just continue to try and fix them, or where not possible clearly document
> use cases that should be avoided.

Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

> there was "some" reason that even major changes had to be
> squeezed into 3.0 before it was released
The TL;DR is: having One Version to Rule Them All forces a slew of
changes into majors only, since bumping the MessagingService Version
has far-reaching impacts. Reference:
https://issues.apache.org/jira/browse/CASSANDRA-12042

With this setup, it doesn't matter what arbitrary date we have on a
calendar for a release; there's always going to be a bunch of things
in flight that end up cut in scope to try and get in since it's a
12-15 month delay to get out the door otherwise as they're blocked by
protocol bumps. In part, tick-tock was an effort to try and ease some
of that 'release infrequently and you are pressured to get things into
a release since protocol version changes are infrequent', though we
never got so far as to iron out 12042 and fully close the loop on that
approach.

On Tue, Oct 3, 2017 at 5:56 AM, kurt greaves <ku...@instaclustr.com> wrote:
> Well this is all terribly interesting. I was actually going to get some
> discussion going about this during my talk, which unfortunately didn't
> happen, but I'll take this opportunity to push my agenda. My 99 cents:
>
> *tl;dr: we should probably just focus on not releasing completely broken
> features in the first place, and we should do that through user
> engagement/testing wooo!*
>
> Some context to begin with, because I think this needs to be spelled out.
> Cassandra is a database. People treat databases as their prize possession.
> It stores all their sweet sweet data, and undoubtedly that data is the most
> important component in their system. Without it, there is no point in
> having a system. Users expect their databases to be the most stable
> component of their system, and generally they won't upgrade them without
> being absolutely positively sure that a new version will work at least
> exactly as the old one has. All our users treat their database in exactly
> this same way. Change happens slowly in the database world, and generally
> this is true both for the database and the users of the database. "C* 3.0.0
> is out tomorrow! let's upgrade!" - said no one ever.
>
> Anyway, with that out of the way, back to the crux of the issue. This may
> get long and unwieldy, and derail the actual thread, but in this case I
> think for good reason. Either way it's all relevant to the actual topic.
>
> I think it's worth taking a step back and looking at the actual situation
> and what brought us here, rather than just proposing a solution that's
> really just a band-aid on the real issue. These are the problems I've seen
> that have caused a lot of the pain with new features, and an indication
> that we need to change the way we manage our releases and major changes.
>
>    1. We pushed out large feature sets with minimal testing of said
>    features. At this stage we had no requirement for clean passing tests on
>    commit, and overall we didn't have a strong commitment to writing tests
>    either. In 3.10 this changed, where we put forth that dtests and utests
>    needed to pass, and new tests needed to be written for each change. Any
>    change prior to 3.10 was subject to many flaky tests with minimal coverage.
>    Many features only went partially tested and were committed anyway.
>
>    2. We rushed features to meet deadlines, or simply didn't give them
>    enough time + thought in the conception phase because of deadlines.
>    I've never met an arbitrary deadline that made things better. From
>    looking at lots of old tickets, there was "some" reason that even major
>    changes had to be squeezed into 3.0 before it was released, which resulted
>    in a lack of attention and testing for these features. We didn't just wait
>    until things were ready before committing them, we just cut scope so it
>    would fit. I honestly don't know how this could ever make sense for a
>    volunteer driven project. In fact I don't really know how it works well for
>    any software project. It generally just ends in bad software. It might make
>    sense for a business pushing the feature agenda for $$, or where a project's
>    users don't care about stability (lol), but it still results in bad
>    software. It definitely doesn't make sense for an open source project.
>
>    3. We didn't do any system-wide verification/integration testing of
>    features. We essentially relied on dtests and unit tests. Touched on this
>    in 1, but we don't have much system testing. dtests kind of covers it, but
>    not really well. cstar is also used in some cases but is also limited in
>    scope (performance only, really). We're lucky that we can cover a lot of
>    cases with dtests, but it seems to me that we don't capture a lot of the
>    cases where feature X affects feature Y. E.g: the effect of repairs against
>    everything ever, but mostly vnodes. We really need a proper testing cluster
>    with each version we put out, and to test new and existing features
>    extensively to measure their worth. Instaclustr is looking at this but
>    we're still a ways off having something up and running.
>    On this note we also changed defaults prematurely, but we wouldn't know
>    it was premature until we did so, as if we didn't change the default they
>    probably wouldn't have received much usage.
>
>    4. Our community is made up of mostly power users, and most of these are
>    still on older versions (2.0, 2.1). There is little reason for these users
>    to upgrade to newer versions, and little reason to use the new features
>    (even if they were the ones developing them). This is actually great, that
>    the power users have been adding functionality to Cassandra for new users,
>    however we haven't really engaged with these users to go and verify this
>    functionality, and we did a pretty half-arsed job of testing them
>    ourselves. We essentially just rolled it out and waited for the bug reports.
>    IMO this is where the "experimental flag" comes in. We rolled out a
>    bunch of stuff, a year later some people started using it and realised it
>    didn't quite work but they had already invested a lot of time into it, all
>    of a sudden there is a world of issues and we realise we never should have
>    rolled it out in the first place. It's tempting to just say "let's put in
>    an experimental flag so this doesn't happen again and we'll be all G", but
>    that won't actually fix the problem, it's much like the changing the
>    defaults problem.
>
> Now, in a perfect world we would have the testing in place to not need an
> "experimental" flag, which I think is what we should actually aim for. In
> the mean time an experimental flag *may* be necessary, but so far I'm not
> really convinced. If we just mark a feature as experimental it will scare a
> lot of users off, and these new features will have a lot less coverage.
> Albeit there will be a lot less problems, but only because less people are
> using it. Especially with no indication of when it will actually be
> production ready. On that note, how do we even decide when it is production
> ready? It's bound to be something arbitrary like "we haven't seen a
> horrible bug in 6 months", which is no better than what we currently have.
> This sort of thing detracts from the usefulness of Cassandra, and gives
> nice big opportunities for someone to come along and do it better than us.
>
> I actually think a better solution here is more user engagement/testing in
> the release process. If there are users actually out there who want these
> features, they should be willing to help us test them prior to release. If
> each feature can get exposed to a few different use cases on real
> *staging* clusters,
> we could verify functionality a lot easier. This would have been cake with
> MV's, as there are many users managing their own views that could have just
> replaced them with MV's in their staging environment. This can be applied
> to a lot of other features as well (incremental repairs replace full
> repairs, SASI replace SI or even Solr), it just requires some buy-in from
> the userbase, which I'm sure we'd find, because if we didn't there would be
> no reason to write the feature in the first place. This would put us in a
> lot better position than an experimental flag, which would essentially
> require us to do this exact same thing in order to make a feature
> "production ready", however those experimental features may never end up
> getting the attention they need to become production ready. You could argue
> that if someone really wanted it then they'd push to get it out of an
> experimental state, but I think you'd find that most users will only
> consider what's readily available to them.
>
> And finally, back onto the original topic. I'm not convinced that MV's need
> this treatment now. Zhao and Paulo (and others+reviewers) have made quite a
> lot of fixes, granted there are still some outstanding bugs but the
> majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining
> bugs mostly only affect views with a poor data model. Plus we've already
> required the known broken components require a flag to be turned on. Also
> at this point it's not worth making them experimental because a lot of
> users are already using them, it's a bit late to go and do that. We should
> just continue to try and fix them, or where not possible clearly document
> use cases that should be avoided.
>
> Frankly, marking features experimental that loads of users have already
> invested in feels to me a bit like a kick in the teeth to said users.
> Almost like telling them "we're actually not going to support this,
> surprise". If it's a big deal, we should probably just fix the issues. If
> anyone knows some really pressing issues I'm unaware of, feel free to fill
> me in. The only issue raised in this thread so far is a tool to repair
> consistency between view and base. While I think this is necessary, it
> really shouldn't be a major problem on the latest releases, and really, if
> the view loses consistency with the base, waiting for some kind of repair
> to fix it isn't much better than just rebuilding it from scratch. This is
> one case where we should document the possible causes of an inconsistent
> view, and the way to fix it (which is essentially, you had an outage, now
> you need to rebuild it), along with a warning about this in the docs.
>
> And to bring it all back to my initial comment about slow-moving databases
> and change and things... We've literally only just got stricter w.r.t
> testing in 3.10. We've hardly given 3.11 a go before coming along and
> saying "we need to make everything experimental so no one gets hurt!".
> Change is and should be slow in a database world, and science should be
> applied. At the very least, before we get too crazy, we should see if the
> changes to how we do testing have a positive effect on future features.
> This also comes back to the deadline situation I mentioned earlier. While
> we haven't formally changed how releases are scheduled/managed, we've
> informally moved to a strategy of "we'll have these problems solved before
> we do the next release". I think this will also be a huge improvement to
> the stability/production readiness of new features in 4.0. (ps: we should
> formalise that but that's a whole 'nother wall of text)
>
> Anyway, I have lots more to say on this and related topics but I see Josh
> is already raising one of my points against experimental flags now, and
> this is probably enough words for one email.


Re: Proposal to retroactively mark materialized views experimental

Posted by kurt greaves <ku...@instaclustr.com>.

Well this is all terribly interesting. I was actually going to get some
discussion going about this during my talk, which unfortunately didn't
happen, but I'll take this opportunity to push my agenda. My 99 cents:

*tl;dr: we should probably just focus on not releasing completely broken
features in the first place, and we should do that through user
engagement/testing wooo!*

Some context to begin with, because I think this needs to be spelled out.
Cassandra is a database. People treat databases as their prize possession.
It stores all their sweet sweet data, and undoubtedly that data is the most
important component in their system. Without it, there is no point in
having a system. Users expect their databases to be the most stable
component of their system, and generally they won't upgrade them without
being absolutely positively sure that a new version will work at least
exactly as the old one has. All our users treat their database in exactly
this same way. Change happens slowly in the database world, and generally
this is true both for the database and the users of the database. "C* 3.0.0
is out tomorrow! let's upgrade!" - said no one ever.

Anyway, with that out of the way, back to the crux of the issue. This may
get long and unwieldy, and derail the actual thread, but in this case I
think for good reason. Either way it's all relevant to the actual topic.

I think it's worth taking a step back and looking at the actual situation
and what brought us here, rather than just proposing a solution that's
really just a band-aid on the real issue. These are the problems I've seen
that have caused a lot of the pain with new features, and an indication
that we need to change the way we manage our releases and major changes.

   1. We pushed out large feature sets with minimal testing of said
   features. At this stage we had no requirement for clean passing tests on
   commit, and overall we didn't have a strong commitment to writing tests
   either. In 3.10 this changed, where we put forth that dtests and utests
   needed to pass, and new tests needed to be written for each change. Any
   change prior to 3.10 was subject to many flaky tests with minimal coverage.
   Many features only went partially tested and were committed anyway.

   2. We rushed features to meet deadlines, or simply didn't give them
   enough time + thought in the conception phase because of deadlines.
   I've never met an arbitrary deadline that made things better. From
   looking at lots of old tickets, there was "some" reason that even major
   changes had to be squeezed into 3.0 before it was released, which resulted
   in a lack of attention and testing for these features. We didn't just wait
   until things were ready before committing them, we just cut scope so it
   would fit. I honestly don't know how this could ever make sense for a
   volunteer driven project. In fact I don't really know how it works well for
   any software project. It generally just ends in bad software. It might make
   sense for a business pushing the feature agenda for $$, or where a project's
   users don't care about stability (lol), but it still results in bad
   software. It definitely doesn't make sense for an open source project.

   3. We didn't do any system-wide verification/integration testing of
   features. We essentially relied on dtests and unit tests. Touched on this
   in 1, but we don't have much system testing. dtests kind of covers it, but
   not really well. cstar is also used in some cases but is also limited in
   scope (performance only, really). We're lucky that we can cover a lot of
   cases with dtests, but it seems to me that we don't capture a lot of the
   cases where feature X affects feature Y. E.g: the effect of repairs against
   everything ever, but mostly vnodes. We really need a proper testing cluster
   with each version we put out, and to test new and existing features
   extensively to measure their worth. Instaclustr is looking at this but
   we're still a ways off having something up and running.
   On this note we also changed defaults prematurely, but we wouldn't know
   it was premature until we did so, as if we didn't change the default they
   probably wouldn't have received much usage.

   4. Our community is made up of mostly power users, and most of these are
   still on older versions (2.0, 2.1). There is little reason for these users
   to upgrade to newer versions, and little reason to use the new features
   (even if they were the ones developing them). This is actually great, that
   the power users have been adding functionality to Cassandra for new users,
   however we haven't really engaged with these users to go and verify this
   functionality, and we did a pretty half-arsed job of testing them
   ourselves. We essentially just rolled it out and waited for the bug reports.
   IMO this is where the "experimental flag" comes in. We rolled out a
   bunch of stuff, a year later some people started using it and realised it
   didn't quite work but they had already invested a lot of time into it, all
   of a sudden there is a world of issues and we realise we never should have
   rolled it out in the first place. It's tempting to just say "let's put in
   an experimental flag so this doesn't happen again and we'll be all G", but
   that won't actually fix the problem, it's much like the changing the
   defaults problem.

Now, in a perfect world we would have the testing in place to not need an
"experimental" flag, which I think is what we should actually aim for. In
the mean time an experimental flag *may* be necessary, but so far I'm not
really convinced. If we just mark a feature as experimental it will scare a
lot of users off, and these new features will have a lot less coverage.
Albeit there will be a lot less problems, but only because less people are
using it. Especially with no indication of when it will actually be
production ready. On that note, how do we even decide when it is production
ready? It's bound to be something arbitrary like "we haven't seen a
horrible bug in 6 months", which is no better than what we currently have.
This sort of thing detracts from the usefulness of Cassandra, and gives
nice big opportunities for someone to come along and do it better than us.

I actually think a better solution here is more user engagement/testing in
the release process. If there are users actually out there who want these
features, they should be willing to help us test them prior to release. If
each feature can get exposed to a few different use cases on real
*staging* clusters,
we could verify functionality a lot easier. This would have been cake with
MV's, as there are many users managing their own views that could have just
replaced them with MV's in their staging environment. This can be applied
to a lot of other features as well (incremental repairs replace full
repairs, SASI replace SI or even Solr), it just requires some buy-in from
the userbase, which I'm sure we'd find, because if we didn't there would be
no reason to write the feature in the first place. This would put us in a
lot better position than an experimental flag, which would essentially
require us to do this exact same thing in order to make a feature
"production ready", however those experimental features may never end up
getting the attention they need to become production ready. You could argue
that if someone really wanted it then they'd push to get it out of an
experimental state, but I think you'd find that most users will only
consider what's readily available to them.

And finally, back onto the original topic. I'm not convinced that MV's need
this treatment now. Zhao and Paulo (and others+reviewers) have made quite a
lot of fixes, granted there are still some outstanding bugs but the
majority of bad ones have been fixed in 3.11.1 and 3.0.15, the remaining
bugs mostly only affect views with a poor data model. Plus we've already
required the known broken components require a flag to be turned on. Also
at this point it's not worth making them experimental because a lot of
users are already using them, it's a bit late to go and do that. We should
just continue to try and fix them, or where not possible clearly document
use cases that should be avoided.

Frankly, marking features experimental that loads of users have already
invested in feels to me a bit like a kick in the teeth to said users.
Almost like telling them "we're actually not going to support this,
surprise". If it's a big deal, we should probably just fix the issues. If
anyone knows some really pressing issues I'm unaware of, feel free to fill
me in. The only issue raised in this thread so far is a tool to repair
consistency between view and base. While I think this is necessary, it
really shouldn't be a major problem on the latest releases, and really, if
the view loses consistency with the base, waiting for some kind of repair
to fix it isn't much better than just rebuilding it from scratch. This is
one case where we should document the possible causes of an inconsistent
view, and the way to fix it (which is essentially, you had an outage, now
you need to rebuild it), along with a warning about this in the docs.

And to bring it all back to my initial comment about slow-moving databases
and change and things... We've literally only just got stricter w.r.t
testing in 3.10. We've hardly given 3.11 a go before coming along and
saying "we need to make everything experimental so no one gets hurt!".
Change is and should be slow in a database world, and science should be
applied. At the very least, before we get too crazy, we should see if the
changes to how we do testing have a positive effect on future features.
This also comes back to the deadline situation I mentioned earlier. While
we haven't formally changed how releases are scheduled/managed, we've
informally moved to a strategy of "we'll have these problems solved before
we do the next release". I think this will also be a huge improvement to
the stability/production readiness of new features in 4.0. (ps: we should
formalise that but that's a whole 'nother wall of text)

Anyway, I have lots more to say on this and related topics but I see Josh
is already raising one of my points against experimental flags now, and
this is probably enough words for one email.

Re: Proposal to retroactively mark materialized views experimental

Posted by Jeremy Hanna <je...@gmail.com>.

At the risk of sounding redundant it sounds like for MVs at this point we want to preserve current functionality for existing users.  We would have a flag in the yaml to disable it by default for new MV creation with an error message. In addition we want a warning in the log and in cqlsh upon creation and usage even with it enabled.  We could be very explicit with a message to the user list and a clear NEWS item and upgrade note.

For SASI, it seems like a similar pattern could be used.

For either or both of these features, a Jira epic could be created to make a path to making it a full-fledged non-experimental feature.  That could include a ticket for realistic testing within the boundaries of the target use cases - that way people can point to what testing has been done and what the target use cases were.  An epic and test ticket would give people a pathway to make it no longer experimental so it isn’t lingering halfway in the codebase indefinitely and for interested parties something to contribute towards.  Granted if issues are found, they’d need to be added to the epic.

>> On Oct 2, 2017, at 6:16 PM, Mick Semb Wever <mi...@thelastpickle.com> wrote:
>> 
>> On 3 October 2017 at 04:57, Aleksey Yeshchenko <al...@apple.com> wrote:
>> 
>> The idea is to check the flag in CreateViewStatement, so creation of new
>> MVs doesn’t succeed without that flag flipped.
>> Obviously, just disabling existing MVs working in a minor would be silly.
>> As for the warning - yes, that should also be emitted. Unconditionally.
> 
> 
> Thanks Aleksey, this was the best read in this thread imo of what should be
> done with MVs.
> (With these warnings being emitted in both logs and cqlsh).
> 
> Hopefully similar "creation flags" and log+cqlsh warnings can be added to:
> triggers, SASI, and incremental repair (<4.0).
> 
> CDC sounds like it is in the same basket, but it already has the
> `cdc_enabled` yaml flag which defaults false.
> 
> Mick

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org

Re: Proposal to retroactively mark materialized views experimental

Posted by Mick Semb Wever <mi...@thelastpickle.com>.

On 3 October 2017 at 04:57, Aleksey Yeshchenko <al...@apple.com> wrote:

> The idea is to check the flag in CreateViewStatement, so creation of new
> MVs doesn’t succeed without that flag flipped.
> Obviously, just disabling existing MVs working in a minor would be silly.
> As for the warning - yes, that should also be emitted. Unconditionally.
>

Thanks Aleksey, this was the best read in this thread imo of what should be
done with MVs.
(With these warnings being emitted in both logs and cqlsh).

Hopefully similar "creation flags" and log+cqlsh warnings can be added to:
triggers, SASI, and incremental repair (<4.0).

CDC sounds like it is in the same basket, but it already has the
`cdc_enabled` yaml flag which defaults false.
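
As a rough illustration of what per-feature "creation flags" could look like, here is a minimal sketch. It is not Cassandra code: the class CreationFlags and every yaml key other than `cdc_enabled` are hypothetical names chosen for the example, and defaulting every flag to false is an assumption modelled on how `cdc_enabled` is described above.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: one yaml-style flag per experimental feature. */
final class CreationFlags {
    enum Feature {
        MATERIALIZED_VIEWS("enable_materialized_views"), // key name is an assumption
        SASI_INDEXES("enable_sasi_indexes"),             // key name is an assumption
        TRIGGERS("enable_triggers"),                     // key name is an assumption
        CDC("cdc_enabled");                              // key named in this thread

        final String yamlKey;

        Feature(String yamlKey) {
            this.yamlKey = yamlKey;
        }
    }

    private final Map<Feature, Boolean> enabled = new EnumMap<>(Feature.class);

    /** Reads each flag from the parsed yaml; a missing key is treated as false. */
    CreationFlags(Map<String, Boolean> yaml) {
        for (Feature f : Feature.values()) {
            enabled.put(f, yaml.getOrDefault(f.yamlKey, Boolean.FALSE));
        }
    }

    /** True if statements that create new uses of the feature may proceed. */
    boolean mayCreate(Feature f) {
        return enabled.get(f);
    }

    public static void main(String[] args) {
        Map<String, Boolean> yaml = new HashMap<>();
        yaml.put("enable_materialized_views", true);
        CreationFlags flags = new CreationFlags(yaml);
        for (Feature f : Feature.values()) {
            System.out.println(f.yamlKey + " -> " + flags.mayCreate(f));
        }
    }
}
```

Only creation is gated in this sketch; whatever already exists keeps working regardless of the flag values.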

Mick

Re: Proposal to retroactively mark materialized views experimental

Posted by Aleksey Yeshchenko <al...@apple.com>.

The idea is to check the flag in CreateViewStatement, so creation of new MVs doesn’t succeed without that flag flipped.

Obviously, just disabling existing MVs working in a minor would be silly.

As for the warning - yes, that should also be emitted. Unconditionally.
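
A minimal sketch of that behaviour, under stated assumptions: ViewCreationGate is a hypothetical stand-in for the check described above, not the real CreateViewStatement; the boolean stands in for a yaml setting whose name has not been decided; and the warnings list stands in for whatever ends up in the log and in cqlsh.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of gating new MV creation behind a config flag. */
final class ViewCreationGate {
    /** Stand-in for a yaml setting; the real name is not settled in this thread. */
    private final boolean materializedViewsEnabled;

    /** Stand-in for warnings sent to the log and back to the client. */
    private final List<String> warnings = new ArrayList<>();

    ViewCreationGate(boolean materializedViewsEnabled) {
        this.materializedViewsEnabled = materializedViewsEnabled;
    }

    /** Runs only for new CREATE statements, so existing views are untouched. */
    void createView(String viewName) {
        // The warning is emitted unconditionally, before the flag is consulted.
        warnings.add("Materialized views are experimental: " + viewName);
        if (!materializedViewsEnabled) {
            throw new IllegalStateException(
                "Creating materialized views is disabled; enable the flag to create " + viewName);
        }
        // Actual view creation would happen here.
    }

    List<String> warnings() {
        return warnings;
    }

    public static void main(String[] args) {
        ViewCreationGate gate = new ViewCreationGate(false);
        try {
            gate.createView("ks.my_view");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("warnings: " + gate.warnings());
    }
}
```

Because the check sits only on the creation path, views that already exist keep working after an upgrade whether or not the flag is flipped.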

—
AY

On 2 October 2017 at 18:18:52, Jeremiah D Jordan (jeremiah.jordan@gmail.com) wrote:

These things are live on clusters right now, and I would not want someone to upgrade their cluster to a new *patch* release and suddenly something that may have been working for them now does not function. Anyway, we need to be careful about how this gets put into practice if we are going to do it retroactively.

Re: Proposal to retroactively mark materialized views experimental

Posted by Jeremiah D Jordan <je...@gmail.com>.

Hindsight is 20/20.  For 8099 this is the reason we cut the 2.2 release before 8099 got merged.

But moving forward with where we are now, if we are going to start adding some experimental flags to things, then I would definitely put SASI on this list as well.

For both SASI and MV I don’t know that adding a flag in the cassandra.yaml which prevents their use is the right way to go.  I would propose that we emit a WARN through the native protocol mechanism when a user does an ALTER/CREATE/whatever that tries to use an experimental feature, and probably in the system.log as well.  So someone who is starting new development using them will get a warning showing up in cqlsh “hey the thing you just used is experimental, proceed with caution” and also in their logs.
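
To make the warn-only option concrete, here is a small sketch. Everything in it is an assumption for illustration: ExperimentalFeatureWarner is a hypothetical class, java.util.logging stands in for the system.log, and the returned list stands in for warnings carried back to the client over the native protocol.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical sketch: warn about experimental features without blocking them. */
final class ExperimentalFeatureWarner {
    private static final Logger LOG = Logger.getLogger("system");

    enum Feature { MATERIALIZED_VIEW, SASI_INDEX }

    /**
     * Logs a warning and returns the same text so the caller can attach it
     * to the client response. The statement itself is never rejected.
     */
    static List<String> onSchemaChange(Feature feature, String statementKind) {
        String message = feature + " is experimental, proceed with caution (statement: "
                + statementKind + ")";
        LOG.warning(message);
        List<String> clientWarnings = new ArrayList<>();
        clientWarnings.add(message);
        return clientWarnings;
    }

    public static void main(String[] args) {
        List<String> warnings = onSchemaChange(Feature.SASI_INDEX, "CREATE");
        System.out.println("client would see: " + warnings);
    }
}
```

The difference from a yaml gate is that nothing here can fail a statement, so a patch upgrade cannot break a schema change that worked before.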

These things are live on clusters right now, and I would not want someone to upgrade their cluster to a new *patch* release and suddenly something that may have been working for them now does not function.  Anyway, we need to be careful about how this gets put into practice if we are going to do it retroactively.

-Jeremiah


> On Oct 1, 2017, at 5:36 PM, Josh McKenzie <jm...@apache.org> wrote:
> 
>> 
>> I think committing 8099, or at the very least, parts of it, behind an
>> experimental flag would have been the right thing to do.
> 
> With a major refactor like that, it's a staggering amount of extra work to
> have a parallel re-write of core components of a storage engine accessible
> in parallel to the major based on an experimental flag in the same branch.
> I think the complexity in the code-base of having two such channels in
> parallel would be an altogether different kind of burden along with making
> the work take considerably longer. The argument of modularizing a change
> like that, however, is something I can get behind as a matter of general
> principle. As we discussed at NGCC, the amount of static state in the C*
> code-base makes this an aspirational goal rather than a reality all too
> often, unfortunately.
> 
> Not looking to get into the discussion of the appropriateness of 8099 and
> other major refactors like it (nio MessagingService for instance) - but
> there's a difference between building out new features and shielding the
> code-base and users from their complexity and reliability and refactoring
> core components of the code-base to keep it relevant.
> 
> On Sun, Oct 1, 2017 at 5:01 PM, Dave Brosius <db...@apache.org> wrote:
> 
>> triggers
>> 
>> 
>> On 10/01/2017 11:25 AM, Jeff Jirsa wrote:
>> 
>>> Historical examples are anything that you wouldn’t bet your job on for
>>> the first release:
>>> 
>>> Udf/uda in 2.2
>>> Incremental repair - would have yanked the flag following 9143
>>> SASI - probably still experimental
>>> Counters - all sorts of correctness issues originally, no longer true
>>> since the rewrite in 2.1
>>> Vnodes - or at least shuffle
>>> CDC - is the API going to change or is it good as-is?
>>> CQL - we’re on v3, what’s that say about v1?
>>> 
>>> Basically anything where we can’t definitively say “this feature is going
>>> to work for you, build your product on it” because companies around the
>>> world are trying to make that determination on their own, and they don’t
>>> have the same insight that the active committers have.
>>> 
>>> The transition out we could define as a fixed number of releases or a dev@
>>> vote, I don’t think you’ll find something that applies to all experimental
>>> features, so being flexible is probably the best bet there
>>> 
>>> 
>>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>> For additional commands, e-mail: dev-help@cassandra.apache.org
>> 
>> 



Re: Proposal to retroactively mark materialized views experimental

Posted by Josh McKenzie <jm...@apache.org>.

>
> I think committing 8099, or at the very least, parts of it, behind an
> experimental flag would have been the right thing to do.

With a major refactor like that, it's a staggering amount of extra work to
have a parallel re-write of core components of a storage engine accessible
in parallel to the major based on an experimental flag in the same branch.
I think the complexity in the code-base of having two such channels in
parallel would be an altogether different kind of burden along with making
the work take considerably longer. The argument of modularizing a change
like that, however, is something I can get behind as a matter of general
principle. As we discussed at NGCC, the amount of static state in the C*
code-base makes this an aspirational goal rather than a reality all too
often, unfortunately.

Not looking to get into the discussion of the appropriateness of 8099 and
other major refactors like it (nio MessagingService for instance) - but
there's a difference between building out new features and shielding the
code-base and users from their complexity and reliability and refactoring
core components of the code-base to keep it relevant.

On Sun, Oct 1, 2017 at 5:01 PM, Dave Brosius <db...@apache.org> wrote:

> triggers
>
>
> On 10/01/2017 11:25 AM, Jeff Jirsa wrote:
>
>> Historical examples are anything that you wouldn’t bet your job on for
>> the first release:
>>
>> Udf/uda in 2.2
>> Incremental repair - would have yanked the flag following 9143
>> SASI - probably still experimental
>> Counters - all sorts of correctness issues originally, no longer true
>> since the rewrite in 2.1
>> Vnodes - or at least shuffle
>> CDC - is the API going to change or is it good as-is?
>> CQL - we’re on v3, what’s that say about v1?
>>
>> Basically anything where we can’t definitively say “this feature is going
>> to work for you, build your product on it” because companies around the
>> world are trying to make that determination on their own, and they don’t
>> have the same insight that the active committers have.
>>
>> The transition out we could define as a fixed number of releases or a dev@
>> vote, I don’t think you’ll find something that applies to all experimental
>> features, so being flexible is probably the best bet there
>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

Re: Proposal to retroactively mark materialized views experimental

Posted by Dave Brosius <db...@apache.org>.

triggers


On 10/01/2017 11:25 AM, Jeff Jirsa wrote:
> Historical examples are anything that you wouldn’t bet your job on for the first release:
>
> Udf/uda in 2.2
> Incremental repair - would have yanked the flag following 9143
> SASI - probably still experimental
> Counters - all sorts of correctness issues originally, no longer true since the rewrite in 2.1
> Vnodes - or at least shuffle
> CDC - is the API going to change or is it good as-is?
> CQL - we’re on v3, what’s that say about v1?
>
> Basically anything where we can’t definitively say “this feature is going to work for you, build your product on it” because companies around the world are trying to make that determination on their own, and they don’t have the same insight that the active committers have.
>
> The transition out we could define as a fixed number of releases or a dev@ vote, I don’t think you’ll find something that applies to all experimental features, so being flexible is probably the best bet there
>
>



Re: Proposal to retroactively mark materialized views experimental

Posted by Jeff Jirsa <jj...@gmail.com>.

Historical examples are anything that you wouldn’t bet your job on for the first release:

Udf/uda in 2.2
Incremental repair - would have yanked the flag following 9143
SASI - probably still experimental 
Counters - all sorts of correctness issues originally, no longer true since the rewrite in 2.1
Vnodes - or at least shuffle
CDC - is the API going to change or is it good as-is? 
CQL - we’re on v3, what’s that say about v1?

Basically anything where we can’t definitively say “this feature is going to work for you, build your product on it” because companies around the world are trying to make that determination on their own, and they don’t have the same insight that the active committers have.

The transition out we could define as a fixed number of releases or a dev@ vote. I don’t think you’ll find something that applies to all experimental features, so being flexible is probably the best bet there.


-- 
Jeff Jirsa


> On Oct 1, 2017, at 3:12 AM, Marcus Eriksson <kr...@gmail.com> wrote:
> 
> I was just thinking that we should try really hard to avoid adding
> experimental features - they are experimental due to lack of testing right?
> There should be a clear path to making the feature non-experimental (or get
> it removed) and having that path discussed on dev@ might give more
> visibility to it.
> 
> I'm also struggling a bit to find good historic examples of "this would
> have been better off as an experimental feature" - I used to think that it
> would have been good to commit DTCS with some sort of experimental flag,
> but that would not have made DTCS any better - it would have been better to
> do more testing, realise that it does not work and then not commit it at
> all of course.
> 
> Does anyone have good examples of features where it would have made sense
> to commit them behind an experimental flag? SASI might be a good example,
> but for MVs - if we knew how painful they would be, they really would not
> have gotten committed at all, right?
> 
> /Marcus
> 
>> On Sat, Sep 30, 2017 at 7:42 AM, Jeff Jirsa <jj...@gmail.com> wrote:
>> 
>> Reviewers should be able to suggest when experimental is warranted, and
>> conversation on dev+jira to justify when it’s transitioned from
>> experimental to stable?
>> 
>> We should remove the flag as soon as we’re (collectively) confident in a
>> feature’s behavior - at least correctness, if not performance.
>> 
>> 
>>> On Sep 29, 2017, at 10:31 PM, Marcus Eriksson <kr...@gmail.com> wrote:
>>> 
>>> +1 on marking MVs experimental, but should there be some point in the
>>> future where we consider removing them from the code base unless they
>> have
>>> gotten significant improvement as well?
>>> 
>>> We probably need to enforce some kind of process for adding new
>>> experimental features in the future - perhaps a mail like this one to
>> dev@
>>> motivating why it should be experimental?
>>> 
>>> /Marcus
>>> 
>>> On Sat, Sep 30, 2017 at 1:15 AM, Vinay Chella
>> <vc...@netflix.com.invalid>
>>> wrote:
>>> 
>>>> We tried perf testing MVs internally here but did not see good results
>> with
>>>> it, hence paused its usage. +1 on tagging certain features which are not
>>>> PROD ready or not stable enough.
>>>> 
>>>> Regards,
>>>> Vinay Chella
>>>> 
>>>>> On Fri, Sep 29, 2017 at 7:22 PM, Ben Bromhead <be...@instaclustr.com>
>> wrote:
>>>>> 
>>>>> I'm a fan of introducing experimental flags in general as well, +1
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Fri, 29 Sep 2017 at 13:22 Jon Haddad <jo...@jonhaddad.com> wrote:
>>>>>> 
>>>>>> I’m very much +1 on this, and to new features in general.
>>>>>> 
>>>>>> I think having a clear line in which we classify something as
>>>> production
>>>>>> ready would be nice.  It would be great if committers were using the
>>>>>> feature in prod and could vouch for its stability.
>>>>>> 
>>>>>>> On Sep 29, 2017, at 1:09 PM, Blake Eggleston <be...@apple.com>
>>>>>> wrote:
>>>>>>> 
>>>>>>> Hi dev@,
>>>>>>> 
>>>>>>> I’d like to propose that we retroactively classify materialized views
>>>>> as
>>>>>> an experimental feature, disable them by default, and require users to
>>>>>> enable them through a config setting before using.
>>>>>>> 
>>>>>>> Materialized views have several issues that make them (effectively)
>>>>>> unusable in production. Some of the issues aren’t just implementation
>>>>>> problems, but problems with the design that aren’t easily fixed. It’s
>>>>>> unfair of us to make features available to users in this state without
>>>>>> providing a clear warning that bad or unexpected things are likely to
>>>>>> happen if they use it.
>>>>>>> 
>>>>>>> Obviously, this isn’t great news for users that have already adopted
>>>>>> MVs, and I don’t have a great answer for that. I think that’s sort of
>> a
>>>>>> sunk cost at this point. If they have any MV related problems, they’ll
>>>>> have
>>>>>> them whether they’re marked experimental or not. I would expect this
>> to
>>>>>> reduce the number of users adopting MVs in the future though, and if
>>>> they
>>>>>> do, it would be opt-in.
>>>>>>> 
>>>>>>> Once MVs reach a point where they’re usable in production, we can
>>>>> remove
>>>>>> the flag. Specifics of how the experimental flag would work can be
>>>>> hammered
>>>>>> out in a forthcoming JIRA, but I’d imagine it would just prevent users
>>>>> from
>>>>>> creating new MVs, and maybe log warnings on startup for existing MVs
>> if
>>>>> the
>>>>>> flag isn’t enabled.
>>>>>>> 
>>>>>>> Let me know what you think.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Blake
>>>>>> 
>>>>>> 
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>>>>>> For additional commands, e-mail: dev-help@cassandra.apache.org
>>>>>> 
>>>>>> --
>>>>> Ben Bromhead
>>>>> CTO | Instaclustr <https://www.instaclustr.com/>
>>>>> +1 650 284 9692
>>>>> Reliability at Scale
>>>>> Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
>>>>> 
>>>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>> For additional commands, e-mail: dev-help@cassandra.apache.org
>> 
>> 
