Posted to dev@cassandra.apache.org by kurt greaves <ku...@instaclustr.com> on 2017/07/17 04:22:48 UTC

State of Materialized Views

wall of text inc.
*tl;dr: *Aiming to come to some conclusions about what we are doing with
MV's and how we are going to make them stable in production. But really
just trying to raise awareness/involvement for MV's.

It seems we've got an excess of MV bugs that pretty much make them
completely unusable in production, or at least incredibly risky and also
limited. It also appears that we don't have many people totally across MV's
either (or at least a lack of people currently looking at them). To avoid
us "forgetting" about MV's I'd like to raise the current issues and get
opinions on the direction we should go with MV's. I know historically there
was a lot of discussion about this, but it seems a lot of the people
originally involved are currently less involved, and thus before making wild
changes to MV's it might be worth going back to the start and thinking
through the original requirements and implementation.

Probably worth summarising the original goals of MV's:

   - Maintain eventual consistency between base table and view tables
   - Provide mechanisms to repair consistency between base and views
   - Aim to keep convergence between base and view fast without sacrificing
   availability (low MTTR)

Goals that weren't explicitly mentioned but more or less implied:

   - Performance must be at least good enough to justify using them over
   rolling-your-own. (we haven't really tried to measure this yet - only
   measured in comparison to not-a-MV)
   - Allow a user to redefine their partitioning key

And also a quick summary of *some* of the limitations in our implementation
(there are more, but the majority of our current problems revolve around
these):

   1. Primary key of the base table must be included in the view;
   optionally, one non-primary key column can be included in the view primary
   key.
   2. All columns in the view primary key must be declared NOT NULL (in CQL,
   an IS NOT NULL restriction in the view's WHERE clause).
   3. Base tables and views are one-to-one. That is, a *primary key* in a
   base maps to exactly one *primary key* in the view. Therefore you should
   never expect multiple rows in the view for a partition with multiple rows
   in the base.
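
To make the first limitation concrete, here is a minimal sketch of the check
it implies. It is not Cassandra's actual validation code; the function name
and the column names in the usage note are made up for illustration.

```python
def check_view_primary_key(base_pk, view_pk):
    """Return a list of rule violations; an empty list means the key is acceptable.

    Hypothetical helper that only models limitation 1 from the list above:
    every base primary key column must appear in the view primary key, and at
    most one column from outside the base primary key may be added.
    """
    problems = []

    # Every base primary key column has to be part of the view primary key.
    missing = [col for col in base_pk if col not in view_pk]
    if missing:
        problems.append(f"missing base primary key columns: {missing}")

    # At most one non-primary-key base column may join the view primary key.
    extra = [col for col in view_pk if col not in base_pk]
    if len(extra) > 1:
        problems.append(f"more than one non-primary-key column: {extra}")

    return problems
```

For a base table keyed on (id), a view keyed on (a, id) passes, while
(a, b, id) and (a) alone are both rejected.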


I've summarised the bulk of the outstanding bugs below (may have missed
some), but notably it would be useful to get some decision-making happening
on them. Fixing these bugs is a bit more involved and there are likely a few
possible solutions and implications. Also they all pretty much touch the
same parts of the code, so there needs to be some collaboration across the
patches (part of the reason I'm trying to bring more attention to them).

CASSANDRA-13657 <https://issues.apache.org/jira/browse/CASSANDRA-13657> -
Using a non-PK column in the view PK means that you can TTL that column in
the base without TTLing the resulting view row. Potential solution is to
change the definition of liveness info for view rows. This would probably
work but makes moving away from the NOT NULL requirement on view PK's
harder. Need to decide if that's what we want to do or if we pursue a
different solution.

CASSANDRA-13127 <https://issues.apache.org/jira/browse/CASSANDRA-13127> -
Inserting a key with a TTL and then updating the TTL on a base column that
doesn't exist in the view doesn't update the liveness of the row in the MV,
and thus the MV row expires before the base row. The current proposed
solution should work but will increase the number of cases where we need to
read the existing data. Needs some reviewing and wouldn't hurt to benchmark
the changes.
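
A toy model may help picture the expiry mismatch described for
CASSANDRA-13127. This is a deliberately simplified sketch, not Cassandra's
storage engine: the Row class, the "row lives until its latest expiry" rule
and the column names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    """Toy row: a primary-key liveness expiry plus one expiry time per cell."""
    marker_expiry: int
    cell_expiry: dict = field(default_factory=dict)

    def expires_at(self) -> int:
        # Assumption: the row stays visible until its latest expiry passes.
        return max([self.marker_expiry, *self.cell_expiry.values()])


def project_to_view(base: Row, view_columns: set) -> Row:
    """Derive the view row from only the columns the view selects."""
    cells = {c: e for c, e in base.cell_expiry.items() if c in view_columns}
    return Row(base.marker_expiry, cells)


# t=0: insert the base row with a TTL of 10; the view only selects column "a".
base = Row(marker_expiry=10, cell_expiry={"a": 10, "b": 10})

# t=5: update column "b" (not in the view) with a TTL of 100.
base.cell_expiry["b"] = 5 + 100
view = project_to_view(base, {"a"})

base_expiry = base.expires_at()  # 105: kept alive by the updated cell
view_expiry = view.expires_at()  # 10: the view never saw the longer TTL
```

In this model the view row disappears at t=10 while the base row is still
readable until t=105, which is the kind of base/view divergence the ticket
describes.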

CASSANDRA-13547 <https://issues.apache.org/jira/browse/CASSANDRA-13547> -
Being able to leave a column out of your SELECT but include it in the
view filters causes some serious issues. Proposed fix is to force the user to
select all columns that are also included in the WHERE clause. This will
potentially be a compatibility issue but *should* be fine as it is only
checked on MV creation - so people upgrading shouldn't be affected (needs
reviewing). Also another issue is addressed in the patch regarding
timestamps - choice of timestamps led to rows not being deleted in the view.
This comes back to the fact that we allow a non-PK column in the view PK.
Needs more reviewing. Also related somewhat to 11500.

CASSANDRA-13409 <https://issues.apache.org/jira/browse/CASSANDRA-13409> -
Issues with shadowable tombstones. Has a patch but not sure if resolved
based on Zhao's last comment. Another case of bringing data back in the
view and thus making base and view inconsistent. Needs reviewing.

CASSANDRA-11500 <https://issues.apache.org/jira/browse/CASSANDRA-11500>
CASSANDRA-10965 <https://issues.apache.org/jira/browse/CASSANDRA-10965> -
Both these appear to be instances of the same issue. Got a couple of
potential solutions. Back to that problem of shadowable tombstones and
timestamps. Pretty involved and would require an in depth review as
decisions could greatly impact the complexity/usefulness of MV's.

CASSANDRA-13069 <https://issues.apache.org/jira/browse/CASSANDRA-13069> -
Node movements can cause inconsistencies. Paulo has written a patch but
Sylvain has raised some concerns about our use of the local batchlog.
Haven't confirmed myself but belief is that our eventual consistency
guarantee is broken... :/ needs reviewing...

CASSANDRA-12888 <https://issues.apache.org/jira/browse/CASSANDRA-12888> -
Most people are probably aware of this one. Losing the repaired_at status
for all MV streams as they are replayed through the write path. Has a
potential solution in place for 4.x, but we need to commit to a workaround
for 3.11.x at least.

CASSANDRA-12730 <https://issues.apache.org/jira/browse/CASSANDRA-12730> -
This touches on some very common repair issues that we should probably look
at, but I don't think it directly relates to MV's anymore. Might be worth
removing the Materialized View component. (but this ticket probably still
deserves a bit of attention).

If anyone has been working on any of these tickets and no longer is able
to, either update the ticket or let me know and I'll either take over/find
some other poor soul to have a stab at it.
It would also be nice to get some volunteers who are familiar with MV's to
review the above tickets.

Another thing I'm not sure of: we are aiming to guarantee eventual
consistency between base and view, but my understanding is that even with
the batchlog we can't achieve this without some tool to synchronise the
base with the view. I don't think this tool currently exists, and it
seems like CASSANDRA-10346
<https://issues.apache.org/jira/browse/CASSANDRA-10346> agrees... Can
anyone clarify if this is actually a requirement for eventual consistency?

My general advice these days is for users to steer clear of MV's for the
moment, however we have no clear plan for when these will really be stable.
I think as some of the changes to fix MV's may potentially require a major
version change, we should at least aim to get all those in for 4.0
(although still need to figure out what exactly these issues are).
Interested to hear people's thoughts.

Re: State of Materialized Views

Posted by kurt greaves <ku...@instaclustr.com>.
Thanks for the input Benjamin. Sounds like you've come to a lot of the same
conclusions I have. I'm certainly keen on fixing up MV's and I don't really
see a way we could avoid it, as I know they are already widely being used
in production. I think we would have had a much easier time if we went with
a basic implementation (append only) first, but y'know, hindsight.
Unfortunately I'd say we're kind of stuck with fixing what we've got or
have a really angry userbase that jumps ship.

> *What I miss is a central consensus about "MV business rules" + a central
> set of proofs and or tests that support these rules and proof or falsify
> assumptions in a reproducible way.*
>
From what I gathered from JIRA the goals in my original post are the ones
outlined during initial development of MV's. The general design and goals
were also documented here
<https://docs.google.com/document/d/1sK96wsE3uwFqzrLQju_spya6rOTxojOKR9N7W-rPwdw/edit#heading=h.c88v9p3byo75>,
however it doesn't completely cover the current state of MV's.
I'm with you that we certainly need a set of proofs/tests to support these
rules. At the moment a lot of the open tickets have patches that contribute
good tests that cover many cases however we're almost kind of defining
rules as we go (granted it is difficult when we need to test every possible
write you could make in Cassandra).

In regards to your "tickler", a colleague has been working on something
similar however we haven't deemed it quite production ready yet so we
haven't released it to the public. It may be useful to compare notes if
you're interested!

Re: State of Materialized Views

Posted by benjamin roth <br...@gmail.com>.
Hi Kurt,

First of all thanks for this elaborate post.

At this moment, I don't want to come up with a solution for all MV issues
but I would like to point out why I was quite active some time ago and why
I pulled back.

As you also mentioned in different words, it seems to me that MVs are an
orphan in CS. They started out as a shiny and promising feature, but ... .
When I came to CS, MVs were one of the reasons why I gave CS in general and
3.0 in particular a try. But when I started to work with MVs in production -
willing to overcome the "little obstacles" and the fact they are "not quite
stable" - I started to realize that there is almost no support from the
community. The initial contributors turned their back on MVs. All that
remained is a 95% ready feature, a lot of public documentation but no
disclaimer that says "Please Do Not Use MVs". And every time when a
discussion pops up around MVs the bottom line is:

- All or most of the people involved don't have much experience with MVs
- Original contributors are not involved
- It seems to me, discussions are more based on assumptions or superficial
knowledge than on real knowledge/experience/research/proofs
- Bringing in code changes is difficult for the same reasons. Nobody likes
to take over the "old heritage" or take over responsibility for it. And it
seems that nobody feels confident enough to bring in critical changes
- I don't want to touch this critical part in the code path, I know we have
tests but ...

Initially I was very eager to contribute and to help MVs mature, but
over time it turned out to be very cumbersome and frustrating. Additionally
I have very little time left in my daily routine to work on CS. So I
decided to work on a solution that solved our specific problems with CS and
MVs. I am not really happy with it but it actually works quite well.

To be honest, I also had in the back of my head to write a post similar
to yours. I would really like to contribute and bring MVs forward, but not
at all costs. I see many problems with MVs, even some that haven't even
been mentioned yet. But I do not want to come up with half-baked
assumptions. What MVs really lack is a reproducible, code-based proof of
what works and what does not. One example is the question "Why can I add
only a single column to an MV PK". I have read arguments which I think
are not quite right or "somehow incomplete". There are a lot of
arguments and discussions that are totally scattered across JIRA and it
seems to me that every contributor knows a little bit of this and a little
bit of that and remembers this post or that post. I was already thinking of
setting up a super-reduced "storage mock" to prove / find edge cases in MV
fail-and-repair scenarios to answer questions like these with code instead
of sentences like "I think that... " or "I can remember a comment of ...".
Unfortunately dtests are super painful for things like that because a) they
are f***** slow b) it is super complicated to simulate a certain situation. I
also did not see a simple way to do this with the CS unit test suite as I
didn't see a way to boot and control multiple storages there.

*What I miss is a central consensus about "MV business rules" + a central
set of proofs and or tests that support these rules and prove or falsify
assumptions in a reproducible way.*

The reason why I did not already come up with something like that:
- Time
- Frustration

If I can see that there are more people who feel like that and are willing
to work together to find a solid solution, my level of frustration could
turn into motivation again.

--
Last but not least for those who care:
One of the solutions I created was to implement our own version of Tickler
(full table scans with CL_ALL to enforce read repair) to get rid of these
damned built-in repairs which simply don't work well (especially) for MVs.
To only name a few numbers:
- We could bring down the repair time of a KS with RF=5 from 5 hours to 5
minutes. Really. I could not believe it.
- No more "compaction storms" or piling up compaction queues or compactions
falling behind
- No more SSTables piling up. Before it was normal that the number of
SSTables went up from 300-400 to 5000 and more. After: No noticeable
change. (Btw that was the reason for CASSANDRA-12730. This isn't even bound
to MVs, they maybe only amplify the impact of the underlying design)
- We now repair the whole cluster in 16h (10 nodes, 400-450 GB load each,
14 KS). Before we had single keyspaces that took more than a day to finish.
Sometimes they even took 3 days with reaper because of "Too many
compactions"
- It showed us problems in our model. We had data that was not readable at
all due to massive tombstones + read timeouts
... if someone is interested in more details, just ping me.
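
For readers who have not seen the Tickler idea before, here is a rough sketch
of its scanning half: split the token ring into subranges and build one
full-scan query per subrange. It is not the tool described above. It assumes
the Murmur3Partitioner token range, the keyspace, table and column names are
placeholders, and actually running each statement at consistency level ALL
through a driver is left out.

```python
MIN_TOKEN = -(2 ** 63)     # lowest Murmur3Partitioner token
MAX_TOKEN = 2 ** 63 - 1    # highest Murmur3Partitioner token


def split_token_ring(n_ranges):
    """Split the whole token ring into n_ranges contiguous (start, end) pairs."""
    if n_ranges < 1:
        raise ValueError("n_ranges must be at least 1")
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // n_ranges for i in range(n_ranges + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def scan_statements(keyspace, table, pk, n_ranges):
    """Build one SELECT per token subrange; together they cover the table."""
    statements = []
    for i, (start, end) in enumerate(split_token_ring(n_ranges)):
        # The first range includes its lower bound so no token is skipped.
        lower = ">=" if i == 0 else ">"
        statements.append(
            f"SELECT * FROM {keyspace}.{table} "
            f"WHERE token({pk}) {lower} {start} AND token({pk}) <= {end}"
        )
    return statements


# Placeholder names; each statement would be executed at consistency level ALL.
for statement in scan_statements("my_ks", "my_table", "id", 4):
    print(statement)
```

Reading every subrange at ALL makes the coordinator compare all replicas for
the rows it returns, which is what lets read repair fix any mismatches it
finds. Smaller subranges keep individual queries short at the cost of more
round trips.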

- Benjamin



Re: State of Materialized Views

Posted by Carlos Rolo <ro...@pythian.com>.
We have a couple of big deployments with MV in production, I will try to
get some help in the form of testing and validation. Will do my best to try
and contribute to the codebase too.



Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
*linkedin.com/in/carlosjuzarterolo
<http://linkedin.com/in/carlosjuzarterolo>*
Mobile: +351 918 918 100
www.pythian.com


Re: State of Materialized Views

Posted by Josh McKenzie <jm...@apache.org>.
>
> Who is "we" in this case?


Initial contributors (myself + Jake, Carl's no longer active on the
project), Zhao, Andres, Paulo, Sylvain, etc. The people who are publicly,
actively working on MV issues atm.


Re: State of Materialized Views

Posted by benjamin roth <br...@gmail.com>.
Hi Josh,

Who is "we" in this case?

Best,
Ben

2017-07-24 15:41 GMT+02:00 Josh McKenzie <jm...@apache.org>:

> >
> > The initial contributors turned their back on MVs
>
>
> We're working on the following MV-related issues in the 4.0 time-frame:
>     CASSANDRA-13162
>     CASSANDRA-13547
>     CASSANDRA-13127
>     CASSANDRA-13409
>     CASSANDRA-12952
>     CASSANDRA-13069
>     CASSANDRA-12888
>
> We're also keeping our eye on CASSANDRA-13657
>
> This is by no means an exhaustive list, but we're hoping it'll help take
> care of some of the more pressing / critical issues with the feature.
> Automated de-normalization on a Dynamo EC architecture is a Hard Problem.
>
>
> On Thu, Jul 20, 2017 at 9:56 PM, kurt greaves <ku...@instaclustr.com>
> wrote:
>
> > I'm going to do my best to review all the changes Zhao is making under
> > CASSANDRA-11500 <https://issues.apache.org/jira/browse/CASSANDRA-11500>,
> > but yeah definitely need a committer nominee as well. On that note, Zhao
> > is going to try address a lot of the current issues I listed above in
> > #11500.
> > Thanks Zhao!
> >
>

Re: State of Materialized Views

Posted by Aleksey Yeshchenko <al...@apple.com>.
It’s the only remaining one on my radar. But we should be good next week - so long as nothing else pops up, and I’m not missing any other JIRAs.

—
AY

On 22 September 2017 at 19:18:12, Michael Shuler (michael@pbandjelly.org) wrote:

I asked the same question on IRC a couple days ago and Aleksey asked if  
I could hold for CASSANDRA-13595.  



Re: State of Materialized Views

Posted by Michael Shuler <mi...@pbandjelly.org>.
I asked the same question on IRC a couple days ago and Aleksey asked if
I could hold for CASSANDRA-13595.

-- 
Kind regards,
Michael

On 09/22/2017 01:02 PM, Ben Bromhead wrote:
> Just saw that https://issues.apache.org/jira/browse/CASSANDRA-11500 got
> committed 4 days ago, awesome stuff and a huge thank you to everyone who
> worked on it!
> 
> Looking forward to what happens in
> https://issues.apache.org/jira/browse/CASSANDRA-13826 :)
> 
> I don't know if we are waiting on anything other than
> https://issues.apache.org/jira/browse/CASSANDRA-13808 for 3.11.1 ?


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
For additional commands, e-mail: dev-help@cassandra.apache.org


Re: State of Materialized Views

Posted by Ben Bromhead <be...@instaclustr.com>.
Just saw that https://issues.apache.org/jira/browse/CASSANDRA-11500 got
committed 4 days ago, awesome stuff and a huge thank you to everyone who
worked on it!

Looking forward to what happens in
https://issues.apache.org/jira/browse/CASSANDRA-13826 :)

I don't know if we are waiting on anything other than
https://issues.apache.org/jira/browse/CASSANDRA-13808 for 3.11.1 ?

-- 
Ben Bromhead
CTO | Instaclustr <https://www.instaclustr.com/>
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer

Re: State of Materialized Views

Posted by Josh McKenzie <jm...@apache.org>.
Status of above is on our collective radars. As always, interleaving
reviews with other work is a challenge.

On Mon, Jul 24, 2017 at 7:05 PM, Nate McCall <zz...@gmail.com> wrote:

> >
> > We're working on the following MV-related issues in the 4.0 time-frame:
> >     CASSANDRA-13162
> >     CASSANDRA-13547
> Patch Available
>
> >     CASSANDRA-13127
> Patch Available
>
> >     CASSANDRA-13409
> Patch Available
>
> >     CASSANDRA-12952
> Patch Available
>
> >     CASSANDRA-13069
> >     CASSANDRA-12888
> >
>
> Josh - want to make sure folks are not duplicating effort here, is the
> status of the above on your radar? Regardless, I appreciate the
> communication. Thanks for that!
>
>
>

Re: State of Materialized Views

Posted by Nate McCall <zz...@gmail.com>.
>
> We're working on the following MV-related issues in the 4.0 time-frame:
>     CASSANDRA-13162
>     CASSANDRA-13547
Patch Available

>     CASSANDRA-13127
Patch Available

>     CASSANDRA-13409
Patch Available

>     CASSANDRA-12952
Patch Available

>     CASSANDRA-13069
>     CASSANDRA-12888
>

Josh - want to make sure folks are not duplicating effort here, is the
status of the above on your radar? Regardless, I appreciate the
communication. Thanks for that!



Re: State of Materialized Views

Posted by Josh McKenzie <jm...@apache.org>.
>
> The initial contributors turned their back on MVs


We're working on the following MV-related issues in the 4.0 time-frame:
    CASSANDRA-13162
    CASSANDRA-13547
    CASSANDRA-13127
    CASSANDRA-13409
    CASSANDRA-12952
    CASSANDRA-13069
    CASSANDRA-12888

We're also keeping our eye on CASSANDRA-13657

This is by no means an exhaustive list, but we're hoping it'll help take
care of some of the more pressing / critical issues with the feature.
Automated de-normalization on a Dynamo EC architecture is a Hard Problem.


On Thu, Jul 20, 2017 at 9:56 PM, kurt greaves <ku...@instaclustr.com> wrote:

> I'm going to do my best to review all the changes Zhao is making under
> CASSANDRA-11500 <https://issues.apache.org/jira/browse/CASSANDRA-11500>,
> but yeah definitely need a committer nominee as well. On that note, Zhao is
> going to try to address a lot of the current issues I listed above in #11500.
> Thanks Zhao!
>

Re: State of Materialized Views

Posted by kurt greaves <ku...@instaclustr.com>.
I'm going to do my best to review all the changes Zhao is making under
CASSANDRA-11500 <https://issues.apache.org/jira/browse/CASSANDRA-11500>,
but yeah definitely need a committer nominee as well. On that note, Zhao is
going to try to address a lot of the current issues I listed above in #11500.
Thanks Zhao!

Re: State of Materialized Views

Posted by Nate McCall <zz...@gmail.com>.
>
>  so perhaps the real solution is we need to be more aggressive about nominating and electing committers who are willing to spend some attention on MVs.
>

I am very much +1 on this solution.

Huge thanks to Kurt for the excellent summarization and to Benjamin
and ZhaoYang for all their recent development efforts.



Re: State of Materialized Views

Posted by Jeff Jirsa <jj...@apache.org>.

On 2017-07-16 21:22 (-0700), kurt greaves <ku...@instaclustr.com> wrote: 
> wall of text inc.
> *tl;dr: *Aiming to come to some conclusions about what we are doing with
> MV's and how we are going to make them stable in production. But really
> just trying to raise awareness/involvement for MV's.
> 

I share your frustration, for what it's worth. And Ben's, too. That doesn't necessarily count for much, I'm afraid, but I sympathize.

> It seems we've got an excess of MV bugs that pretty much make them
> completely unusable in production, or at least incredibly risky and also
> limited. It also appears that we don't have many people totally across MV's
> either (or at least a lack of people currently looking at them). To avoid
> us "forgetting" about MV's I'd like to raise the current issues and get
> opinions on the direction we should go with MV's. I know historically there
> was a lot of discussion about this, but it seems a lot of the originally
> involved are currently less involved, and thus before making wild changes
> to MV's it might be worth going back to the start and think through the
> original requirements and implementation.
> 
> 
> If anyone has been working on any of these tickets and no longer is able
> to, either update the ticket or let me know and I'll either take over/find
> some other poor soul to have a stab at it.
> It would also be nice to get some volunteers who are familiar with MV's to
> review the above tickets.

Anyone want to admit to running them in prod? Any committers with an MV install base? Any non-trivial use cases? 

> 
> 
> My general advice these days is for users to steer clear of MV's for the
> moment, however we have no clear plan for when these will really be stable.
> I think as some of the changes to fix MV's may potentially require a major
> version change, we should at least aim to get all those in for 4.0
> (although still need to figure out what exactly these issues are).
> Interested to hear people's thoughts.

I think you're probably right on here. I think they may work for people with suitably simple use cases (append-only, no deletes, writes with strong consistency, and a single token or few tokens per node).

I think the clearer point is that we need people willing to step up and help fix it. I don't use them in prod, and I don't actually know anyone who does (though clearly a few folks do, including the three or four folks who seem to actually be working on the tickets), so perhaps the real solution is we need to be more aggressive about nominating and electing committers who are willing to spend some attention on MVs.
