Posted to dev@cassandra.apache.org by Jonathan Haddad <jo...@jonhaddad.com> on 2018/08/28 00:36:55 UTC

Reaper as cassandra-admin

Hey folks,

Mick brought this up in the sidecar thread, but I wanted to have a clear /
separate discussion about what we're thinking with regard to contributing
Reaper to the C* project.  In my mind, starting with Reaper is a great way
of having an admin tool right now, one that we know works well at the kind of
scale we need.  We've worked with a lot of companies putting Reaper in prod (at
least 50), running on several hundred clusters.  The codebase has evolved
as a direct result of production usage, and we feel it would be great to
pair it with the 4.0 release.  There was a LOT of work done on the repair
logic to make things work across every supported version of Cassandra, with
a great deal of documentation as well.

In case folks aren't aware, in addition to one-off and scheduled repairs,
Reaper also does cluster-wide snapshots, exposes thread pool stats, and
visualizes streaming (in trunk).

We're hoping to get some feedback on our side if that's something people
are interested in.  We've gone back and forth privately on our own
preferences, hopes, dreams, etc, but I feel like a public discussion would
be healthy at this point.  Does anyone share the view of using Reaper as a
starting point?  What concerns do people have?
-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade

Re: Reaper as cassandra-admin

Posted by Sankalp Kohli <ko...@gmail.com>.
We can wait for a week post-freeze so everyone can participate; however, we need to decide after that so we can make progress.
I am leaning towards the piecemeal approach so we can review the code and pick the best of all 3 options.

> On Aug 29, 2018, at 00:26, kurt greaves <ku...@instaclustr.com> wrote:
> 
> 2c: There's a lot to think about here, and as Blake already mentioned most
> people don't have time to dedicate a lot of thought to this at the moment.
> There appear to be a lot of voices missing from the discussion, and I think
> it's pretty clear this isn't super tied to the freeze, so maybe we should
> leave this discussion until next week when everyone can take part? This
> kind of goes for every sidecar related discussion going on at the moment
> IMO.
> 
> On 29 August 2018 at 16:44, Vinay Chella <vc...@netflix.com.invalid>
> wrote:
> 
>>> I haven’t settled on a position yet (will have more time think about
>> things after the 9/1 freeze), but I wanted to point out that the argument
>> that something new should be written because an existing project has tech
>> debt, and we'll do it the right way this time, is a pretty common software
>> engineering mistake. The thing you’re replacing usually needs to have some
>> really serious problems to make it worth replacing.
>> 
>> Agreed, Yes, I don’t think we should write everything from the scratch, but
>> carry forwarding tech debt (if any) and design decisions which makes new
>> features in future difficult to develop is something that we need to
>> consider. I second Dinesh’s thought on taking the best parts from available
>> projects to move forward with the right solution which works great and
>> easily pluggable.
>> 
>> -
>> Vinay Chella
>> 
>> 
>>> On Tue, Aug 28, 2018 at 10:03 PM Mick Semb Wever <mc...@apache.org> wrote:
>>> 
>>> 
>>>> the argument that something new should be written because an existing
>>> project has tech debt, and we'll do it the right way this time, is a
>> pretty
>>> common software engineering mistake. The thing you’re replacing usually
>>> needs to have some really serious problems to make it worth replacing.
>>> 
>>> 
>>> Thanks for writing this Blake. I'm no fan of writing from scratch.
>> Working
>>> with other people's code is the joy of open-source, imho.
>>> 
>>> Reaper is not a big project. None of its java files are large or
>>> complicated.
>>> This is not the C* codebase we're talking about.
>>> 
>>> It comes with strict code style in place (which the build enforces), unit
>>> and integration tests. The tech debt that I think of first is removing
>>> stuff that we would no longer want to support if it were inside the
>>> Cassandra project. A number of recent refactorings  have proved it's an
>>> easy codebase to work with.
>>> 
>>> It's also worth noting that Cassandra-4.x adoption is still some away, in
>>> which time Reaper will only continue to grow and gain users.
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>>> For additional commands, e-mail: dev-help@cassandra.apache.org
>>> 
>>> 
>> 


Re: Reaper as cassandra-admin

Posted by kurt greaves <ku...@instaclustr.com>.
2c: There's a lot to think about here, and as Blake already mentioned most
people don't have time to dedicate a lot of thought to this at the moment.
There appear to be a lot of voices missing from the discussion, and I think
it's pretty clear this isn't super tied to the freeze, so maybe we should
leave this discussion until next week when everyone can take part? This
kind of goes for every sidecar-related discussion going on at the moment
IMO.

On 29 August 2018 at 16:44, Vinay Chella <vc...@netflix.com.invalid>
wrote:

> > I haven’t settled on a position yet (will have more time think about
> things after the 9/1 freeze), but I wanted to point out that the argument
> that something new should be written because an existing project has tech
> debt, and we'll do it the right way this time, is a pretty common software
> engineering mistake. The thing you’re replacing usually needs to have some
> really serious problems to make it worth replacing.
>
> Agreed, Yes, I don’t think we should write everything from the scratch, but
> carry forwarding tech debt (if any) and design decisions which makes new
> features in future difficult to develop is something that we need to
> consider. I second Dinesh’s thought on taking the best parts from available
> projects to move forward with the right solution which works great and
> easily pluggable.
>
> -
> Vinay Chella
>
>
> On Tue, Aug 28, 2018 at 10:03 PM Mick Semb Wever <mc...@apache.org> wrote:
>
> >
> > > the argument that something new should be written because an existing
> > project has tech debt, and we'll do it the right way this time, is a
> pretty
> > common software engineering mistake. The thing you’re replacing usually
> > needs to have some really serious problems to make it worth replacing.
> >
> >
> > Thanks for writing this Blake. I'm no fan of writing from scratch.
> Working
> > with other people's code is the joy of open-source, imho.
> >
> > Reaper is not a big project. None of its java files are large or
> > complicated.
> > This is not the C* codebase we're talking about.
> >
> > It comes with strict code style in place (which the build enforces), unit
> > and integration tests. The tech debt that I think of first is removing
> > stuff that we would no longer want to support if it were inside the
> > Cassandra project. A number of recent refactorings  have proved it's an
> > easy codebase to work with.
> >
> > It's also worth noting that Cassandra-4.x adoption is still some away, in
> > which time Reaper will only continue to grow and gain users.
> >
> >
> >
>

Re: Reaper as cassandra-admin

Posted by Jeff Jirsa <jj...@gmail.com>.
Agreed here - combining effort and making things pluggable seems like a good solution


-- 
Jeff Jirsa


On Aug 28, 2018, at 11:44 PM, Vinay Chella <vc...@netflix.com.INVALID> wrote:

>> I haven’t settled on a position yet (will have more time think about
> things after the 9/1 freeze), but I wanted to point out that the argument
> that something new should be written because an existing project has tech
> debt, and we'll do it the right way this time, is a pretty common software
> engineering mistake. The thing you’re replacing usually needs to have some
> really serious problems to make it worth replacing.
> 
> Agreed, Yes, I don’t think we should write everything from the scratch, but
> carry forwarding tech debt (if any) and design decisions which makes new
> features in future difficult to develop is something that we need to
> consider. I second Dinesh’s thought on taking the best parts from available
> projects to move forward with the right solution which works great and
> easily pluggable.
> 
> -
> Vinay Chella
> 
> 
>> On Tue, Aug 28, 2018 at 10:03 PM Mick Semb Wever <mc...@apache.org> wrote:
>> 
>> 
>>> the argument that something new should be written because an existing
>> project has tech debt, and we'll do it the right way this time, is a pretty
>> common software engineering mistake. The thing you’re replacing usually
>> needs to have some really serious problems to make it worth replacing.
>> 
>> 
>> Thanks for writing this Blake. I'm no fan of writing from scratch. Working
>> with other people's code is the joy of open-source, imho.
>> 
>> Reaper is not a big project. None of its java files are large or
>> complicated.
>> This is not the C* codebase we're talking about.
>> 
>> It comes with strict code style in place (which the build enforces), unit
>> and integration tests. The tech debt that I think of first is removing
>> stuff that we would no longer want to support if it were inside the
>> Cassandra project. A number of recent refactorings  have proved it's an
>> easy codebase to work with.
>> 
>> It's also worth noting that Cassandra-4.x adoption is still some away, in
>> which time Reaper will only continue to grow and gain users.
>> 
>> 
>> 


Re: Reaper as cassandra-admin

Posted by Vinay Chella <vc...@netflix.com.INVALID>.
> I haven’t settled on a position yet (will have more time to think about
> things after the 9/1 freeze), but I wanted to point out that the argument
> that something new should be written because an existing project has tech
> debt, and we'll do it the right way this time, is a pretty common software
> engineering mistake. The thing you’re replacing usually needs to have some
> really serious problems to make it worth replacing.

Agreed. I don’t think we should write everything from scratch, but
carrying forward tech debt (if any) and design decisions that make future
features difficult to develop is something that we need to consider. I
second Dinesh’s thought on taking the best parts from the available
projects to move forward with the right solution, one that works great and
is easily pluggable.

-
Vinay Chella


On Tue, Aug 28, 2018 at 10:03 PM Mick Semb Wever <mc...@apache.org> wrote:

>
> > the argument that something new should be written because an existing
> project has tech debt, and we'll do it the right way this time, is a pretty
> common software engineering mistake. The thing you’re replacing usually
> needs to have some really serious problems to make it worth replacing.
>
>
> Thanks for writing this Blake. I'm no fan of writing from scratch. Working
> with other people's code is the joy of open-source, imho.
>
> Reaper is not a big project. None of its java files are large or
> complicated.
> This is not the C* codebase we're talking about.
>
> It comes with strict code style in place (which the build enforces), unit
> and integration tests. The tech debt that I think of first is removing
> stuff that we would no longer want to support if it were inside the
> Cassandra project. A number of recent refactorings  have proved it's an
> easy codebase to work with.
>
> It's also worth noting that Cassandra-4.x adoption is still some away, in
> which time Reaper will only continue to grow and gain users.
>
>
>

Re: Reaper as cassandra-admin

Posted by Mick Semb Wever <mc...@apache.org>.
> the argument that something new should be written because an existing project has tech debt, and we'll do it the right way this time, is a pretty common software engineering mistake. The thing you’re replacing usually needs to have some really serious problems to make it worth replacing.

Thanks for writing this Blake. I'm no fan of writing from scratch. Working with other people's code is the joy of open-source, imho.

Reaper is not a big project. None of its Java files are large or complicated.
This is not the C* codebase we're talking about.

It comes with a strict code style in place (which the build enforces), plus unit and integration tests. The tech debt that I think of first is removing stuff that we would no longer want to support if it were inside the Cassandra project. A number of recent refactorings have proved it's an easy codebase to work with.

It's also worth noting that Cassandra-4.x adoption is still some way away, in which time Reaper will only continue to grow and gain users.


Re: Reaper as cassandra-admin

Posted by Blake Eggleston <be...@apple.com>.
> FTR nobody has called Reaper a "hopeless mess".

I didn't mean they did. I just meant that it's generally a bad idea to do a rewrite unless the thing being rewritten is a hopeless mess, which Reaper probably isn't. I realize this isn't technically a rewrite since we're not talking about actually rewriting something that's part of the project, but a lot of the same reasoning applies to starting work on a new admin tool vs using Reaper as a starting point. It's not a strictly technical decision either. The community of users and developers already established around Reaper is also a consideration.

On August 28, 2018 at 3:53:02 PM, dinesh.joshi@yahoo.com.INVALID (dinesh.joshi@yahoo.com.invalid) wrote:

On Tuesday, August 28, 2018, 2:52:03 PM PDT, Blake Eggleston <be...@apple.com> wrote:
> I’m sure reaper will bring tech debt with it, but I doubt it's a hopeless mess.

FTR nobody has called Reaper a "hopeless mess".

> It would bring a relatively mature project as well as a community of users
> and developers that the other options won’t. It’s probably a lot less work to
> rework whatever shortcomings reaper has, add new-hotness repair

You can bring in parts of a relatively mature project that minimize refactoring & changes that need to be made once imported. You can also bring in the best parts of multiple projects without importing entire codebases.

Dinesh


On August 28, 2018 at 1:40:59 PM, Roopa Tangirala (rtangirala@netflix.com.invalid) wrote:  
I share Dinesh's concern too regarding tech debt with existing codebase.   
Its good we have multiple solutions for repairs which have been always   
painful in Cassandra. It would be great to see the community take the best   
pieces from the available solutions and roll it into the fresh side car   
which will help ease Cassandra's maintenance for lot of folks.   

My main concern with starting with an existing codebase is that it comes   
with tech debt. This is not specific to Reaper but to any codebase that is   
imported as a whole. This means future developers and patches have to work   
within the confines of the decisions that were already made. Practically   
speaking once a codebase is established there is inertia in making   
architectural changes and we're left dealing with technical debt.   



*Regards,*   

*Roopa Tangirala*   

Engineering Manager CDE   

*(408) 438-3156 - mobile*   






On Mon, Aug 27, 2018 at 10:49 PM Dinesh Joshi   
<di...@yahoo.com.invalid> wrote:   

> > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:   
> > We're hoping to get some feedback on our side if that's something people   
> > are interested in. We've gone back and forth privately on our own   
> > preferences, hopes, dreams, etc, but I feel like a public discussion   
> would   
> > be healthy at this point. Does anyone share the view of using Reaper as   
> a   
> > starting point? What concerns to people have?   
>   
>   
> I have briefly looked at the Reaper codebase but I am yet to analyze it   
> better to have a real, meaningful opinion.   
>   
> My main concern with starting with an existing codebase is that it comes   
> with tech debt. This is not specific to Reaper but to any codebase that is   
> imported as a whole. This means future developers and patches have to work   
> within the confines of the decisions that were already made. Practically   
> speaking once a codebase is established there is inertia in making   
> architectural changes and we're left dealing with technical debt.   
>   
> As it stands I am not against the idea of using Reaper's features and I   
> would very much like using mature code that has been tested. I would   
> however like to propose piece-mealing it into the codebase. This will give   
> the community a chance to review what is going in and possibly change some   
> of the design decisions upfront. This will also avoid a situation where we   
> have to make many breaking changes in the initial versions due to   
> refactoring.   
>   
> I would also like it if we could compare and contrast the functionality   
> with Priam or any other interesting sidecars that folks may want to call   
> out. In fact it would be great if we could bring in the best functionality   
> from multiple implementations.   
>   
> Dinesh   
>   
> 

Re: Reaper as cassandra-admin

Posted by "dinesh.joshi@yahoo.com.INVALID" <di...@yahoo.com.INVALID>.
On Tuesday, August 28, 2018, 2:52:03 PM PDT, Blake Eggleston <be...@apple.com> wrote:
> I’m sure reaper will bring tech debt with it, but I doubt it's a hopeless mess.

FTR nobody has called Reaper a "hopeless mess".

> It would bring a relatively mature project as well as a community of users
> and developers that the other options won’t. It’s probably a lot less work to
> rework whatever shortcomings reaper has, add new-hotness repair

You can bring in parts of a relatively mature project that minimize refactoring & changes that need to be made once imported. You can also bring in the best parts of multiple projects without importing entire codebases.

Dinesh


On August 28, 2018 at 1:40:59 PM, Roopa Tangirala (rtangirala@netflix.com.invalid) wrote:
I share Dinesh's concern too regarding tech debt with existing codebase.  
Its good we have multiple solutions for repairs which have been always  
painful in Cassandra. It would be great to see the community take the best  
pieces from the available solutions and roll it into the fresh side car  
which will help ease Cassandra's maintenance for lot of folks.  

My main concern with starting with an existing codebase is that it comes  
with tech debt. This is not specific to Reaper but to any codebase that is  
imported as a whole. This means future developers and patches have to work  
within the confines of the decisions that were already made. Practically  
speaking once a codebase is established there is inertia in making  
architectural changes and we're left dealing with technical debt.  



*Regards,*  

*Roopa Tangirala*  

Engineering Manager CDE  

*(408) 438-3156 - mobile*  






On Mon, Aug 27, 2018 at 10:49 PM Dinesh Joshi  
<di...@yahoo.com.invalid> wrote:  

> > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:  
> > We're hoping to get some feedback on our side if that's something people  
> > are interested in. We've gone back and forth privately on our own  
> > preferences, hopes, dreams, etc, but I feel like a public discussion  
> would  
> > be healthy at this point. Does anyone share the view of using Reaper as  
> a  
> > starting point? What concerns to people have?  
>  
>  
> I have briefly looked at the Reaper codebase but I am yet to analyze it  
> better to have a real, meaningful opinion.  
>  
> My main concern with starting with an existing codebase is that it comes  
> with tech debt. This is not specific to Reaper but to any codebase that is  
> imported as a whole. This means future developers and patches have to work  
> within the confines of the decisions that were already made. Practically  
> speaking once a codebase is established there is inertia in making  
> architectural changes and we're left dealing with technical debt.  
>  
> As it stands I am not against the idea of using Reaper's features and I  
> would very much like using mature code that has been tested. I would  
> however like to propose piece-mealing it into the codebase. This will give  
> the community a chance to review what is going in and possibly change some  
> of the design decisions upfront. This will also avoid a situation where we  
> have to make many breaking changes in the initial versions due to  
> refactoring.  
>  
> I would also like it if we could compare and contrast the functionality  
> with Priam or any other interesting sidecars that folks may want to call  
> out. In fact it would be great if we could bring in the best functionality  
> from multiple implementations.  
>  
> Dinesh  
>  
>    

Re: Reaper as cassandra-admin

Posted by Sumanth Pasupuleti <sp...@netflix.com.INVALID>.
IMO, it is good to see at least three solutions here that address repair
and could be potential candidates for the sidecar. As Joey pointed out,
not all of them may be great at everything they do, so to me it makes
sense to "cherry pick" the best of what each product has and build a good
C* sidecar.

Thanks,
Sumanth

On Tue, Aug 28, 2018 at 2:45 PM, Blake Eggleston <be...@apple.com>
wrote:

> I haven’t settled on a position yet (will have more time think about
> things after the 9/1 freeze), but I wanted to point out that the argument
> that something new should be written because an existing project has tech
> debt, and we'll do it the right way this time, is a pretty common software
> engineering mistake. The thing you’re replacing usually needs to have some
> really serious problems to make it worth replacing.
>
> I’m sure reaper will bring tech debt with it, but I doubt it's a hopeless
> mess. It would bring a relatively mature project as well as a community of
> users and developers that the other options won’t. It’s probably a lot less
> work to rework whatever shortcomings reaper has, add new-hotness repair
> schedulers to it, and get people to actually use them than it would be to
> write something from scratch and build community confidence in it and get
> reaper users to switch.
>
> On August 28, 2018 at 1:40:59 PM, Roopa Tangirala (rtangirala@netflix.com.invalid)
> wrote:
> I share Dinesh's concern too regarding tech debt with existing codebase.
> Its good we have multiple solutions for repairs which have been always
> painful in Cassandra. It would be great to see the community take the
> best
> pieces from the available solutions and roll it into the fresh side car
> which will help ease Cassandra's maintenance for lot of folks.
>
> My main concern with starting with an existing codebase is that it comes
> with tech debt. This is not specific to Reaper but to any codebase that
> is
> imported as a whole. This means future developers and patches have to
> work
> within the confines of the decisions that were already made. Practically
> speaking once a codebase is established there is inertia in making
> architectural changes and we're left dealing with technical debt.
>
>
>
> *Regards,*
>
> *Roopa Tangirala*
>
> Engineering Manager CDE
>
> *(408) 438-3156 - mobile*
>
>
>
>
>
>
> On Mon, Aug 27, 2018 at 10:49 PM Dinesh Joshi
> <di...@yahoo.com.invalid> wrote:
>
> > > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com>
> wrote:
> > > We're hoping to get some feedback on our side if that's something
> people
> > > are interested in. We've gone back and forth privately on our own
> > > preferences, hopes, dreams, etc, but I feel like a public discussion
> > would
> > > be healthy at this point. Does anyone share the view of using Reaper
> as
> > a
> > > starting point? What concerns to people have?
> >
> >
> > I have briefly looked at the Reaper codebase but I am yet to analyze it
> > better to have a real, meaningful opinion.
> >
> > My main concern with starting with an existing codebase is that it
> comes
> > with tech debt. This is not specific to Reaper but to any codebase that
> is
> > imported as a whole. This means future developers and patches have to
> work
> > within the confines of the decisions that were already made.
> Practically
> > speaking once a codebase is established there is inertia in making
> > architectural changes and we're left dealing with technical debt.
> >
> > As it stands I am not against the idea of using Reaper's features and I
> > would very much like using mature code that has been tested. I would
> > however like to propose piece-mealing it into the codebase. This will
> give
> > the community a chance to review what is going in and possibly change
> some
> > of the design decisions upfront. This will also avoid a situation where
> we
> > have to make many breaking changes in the initial versions due to
> > refactoring.
> >
> > I would also like it if we could compare and contrast the functionality
> > with Priam or any other interesting sidecars that folks may want to
> call
> > out. In fact it would be great if we could bring in the best
> functionality
> > from multiple implementations.
> >
> > Dinesh
> >
> >
>

Re: Reaper as cassandra-admin

Posted by Blake Eggleston <be...@apple.com>.
I haven’t settled on a position yet (will have more time to think about things after the 9/1 freeze), but I wanted to point out that the argument that something new should be written because an existing project has tech debt, and we'll do it the right way this time, is a pretty common software engineering mistake. The thing you’re replacing usually needs to have some really serious problems to make it worth replacing.

I’m sure Reaper will bring tech debt with it, but I doubt it's a hopeless mess. It would bring a relatively mature project as well as a community of users and developers that the other options won’t. It’s probably a lot less work to rework whatever shortcomings Reaper has, add new-hotness repair schedulers to it, and get people to actually use them than it would be to write something from scratch, build community confidence in it, and get Reaper users to switch.

On August 28, 2018 at 1:40:59 PM, Roopa Tangirala (rtangirala@netflix.com.invalid) wrote:
I share Dinesh's concern too regarding tech debt with existing codebase.  
Its good we have multiple solutions for repairs which have been always  
painful in Cassandra. It would be great to see the community take the best  
pieces from the available solutions and roll it into the fresh side car  
which will help ease Cassandra's maintenance for lot of folks.  

My main concern with starting with an existing codebase is that it comes  
with tech debt. This is not specific to Reaper but to any codebase that is  
imported as a whole. This means future developers and patches have to work  
within the confines of the decisions that were already made. Practically  
speaking once a codebase is established there is inertia in making  
architectural changes and we're left dealing with technical debt.  



*Regards,*  

*Roopa Tangirala*  

Engineering Manager CDE  

*(408) 438-3156 - mobile*  






On Mon, Aug 27, 2018 at 10:49 PM Dinesh Joshi  
<di...@yahoo.com.invalid> wrote:  

> > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:  
> > We're hoping to get some feedback on our side if that's something people  
> > are interested in. We've gone back and forth privately on our own  
> > preferences, hopes, dreams, etc, but I feel like a public discussion  
> would  
> > be healthy at this point. Does anyone share the view of using Reaper as  
> a  
> > starting point? What concerns to people have?  
>  
>  
> I have briefly looked at the Reaper codebase but I am yet to analyze it  
> better to have a real, meaningful opinion.  
>  
> My main concern with starting with an existing codebase is that it comes  
> with tech debt. This is not specific to Reaper but to any codebase that is  
> imported as a whole. This means future developers and patches have to work  
> within the confines of the decisions that were already made. Practically  
> speaking once a codebase is established there is inertia in making  
> architectural changes and we're left dealing with technical debt.  
>  
> As it stands I am not against the idea of using Reaper's features and I  
> would very much like using mature code that has been tested. I would  
> however like to propose piece-mealing it into the codebase. This will give  
> the community a chance to review what is going in and possibly change some  
> of the design decisions upfront. This will also avoid a situation where we  
> have to make many breaking changes in the initial versions due to  
> refactoring.  
>  
> I would also like it if we could compare and contrast the functionality  
> with Priam or any other interesting sidecars that folks may want to call  
> out. In fact it would be great if we could bring in the best functionality  
> from multiple implementations.  
>  
> Dinesh  
>  
>  

Re: Reaper as cassandra-admin

Posted by Roopa Tangirala <rt...@netflix.com.INVALID>.
I share Dinesh's concern regarding tech debt with an existing codebase.
It's good we have multiple solutions for repairs, which have always been
painful in Cassandra. It would be great to see the community take the best
pieces from the available solutions and roll them into a fresh sidecar,
which will help ease Cassandra's maintenance for a lot of folks.

My main concern with starting with an existing codebase is that it comes
with tech debt. This is not specific to Reaper but to any codebase that is
imported as a whole. This means future developers and patches have to work
within the confines of the decisions that were already made. Practically
speaking once a codebase is established there is inertia in making
architectural changes and we're left dealing with technical debt.



*Regards,*

*Roopa Tangirala*

Engineering Manager CDE

*(408) 438-3156 - mobile*






On Mon, Aug 27, 2018 at 10:49 PM Dinesh Joshi
<di...@yahoo.com.invalid> wrote:

> > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> > We're hoping to get some feedback on our side if that's something people
> > are interested in.  We've gone back and forth privately on our own
> > preferences, hopes, dreams, etc, but I feel like a public discussion
> would
> > be healthy at this point.  Does anyone share the view of using Reaper as
> a
> > starting point?  What concerns to people have?
>
>
> I have briefly looked at the Reaper codebase but I am yet to analyze it
> better to have a real, meaningful opinion.
>
> My main concern with starting with an existing codebase is that it comes
> with tech debt. This is not specific to Reaper but to any codebase that is
> imported as a whole. This means future developers and patches have to work
> within the confines of the decisions that were already made. Practically
> speaking once a codebase is established there is inertia in making
> architectural changes and we're left dealing with technical debt.
>
> As it stands I am not against the idea of using Reaper's features and I
> would very much like using mature code that has been tested. I would
> however like to propose piece-mealing it into the codebase. This will give
> the community a chance to review what is going in and possibly change some
> of the design decisions upfront. This will also avoid a situation where we
> have to make many breaking changes in the initial versions due to
> refactoring.
>
> I would also like it if we could compare and contrast the functionality
> with Priam or any other interesting sidecars that folks may want to call
> out. In fact it would be great if we could bring in the best functionality
> from multiple implementations.
>
> Dinesh
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

Re: Reaper as cassandra-admin

Posted by Joseph Lynch <jo...@gmail.com>.
I and the rest of the Netflix Cassandra team share Dinesh's concerns. I was
excited to work on this project precisely because we were taking only the
best designs, techniques, and functionality out of the community sidecars
such as Priam, Reaper, and any other community tool and building the
simplest possible tool into Cassandra that could deliver the maximum value
to our users with the minimal amount of technical debt. For example, a
distributed, shared-nothing architecture that communicates only through
state transitions in Cassandra data itself seems to be the most robust and
secure architecture (and indeed Reaper appears to be refactoring towards
that).  Fundamental architecture is, in my experience,
very hard to refactor, and often starting fresh with the lessons learned
from the N previous iterations is the faster way to build real value. For
example, Reaper was built to be a repair tool, and that is baked into the
core abstractions. It sounds like the community needs something more like a
distributed task execution engine which is fully pluggable (plug in whatever
ops task you want) and runs scheduled, one-shot, and daemon tasks.

What if we started with a basic framework as proposed in CASSANDRA-14395,
maybe add a pluggable execution engine as the first few commits and then
various community members can contribute plugins/modules that add various
functionality such as repair, backup, distributed restarts, upgrades,
etc..?  We would be striving very hard not to reinvent the wheel, rather we
would want to learn from previous iterations, keep what works well and
leave the rest.
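
To make the "pluggable execution engine" idea concrete, here is a minimal
in-memory sketch, not a design proposal. Every name in it (TaskKind, Task,
Engine, tick) is hypothetical and comes from neither CASSANDRA-14395 nor any
existing sidecar. It assumes a logical clock instead of wall time and ignores
distribution entirely. It only shows one loop driving one-shot, scheduled,
and daemon tasks whose actual work is supplied as plugins:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class TaskKind(Enum):
    ONESHOT = "oneshot"      # runs once, then is dropped
    SCHEDULED = "scheduled"  # runs every `interval` ticks
    DAEMON = "daemon"        # runs on every tick


@dataclass
class Task:
    name: str
    kind: TaskKind
    action: Callable[[], str]  # the plugged-in ops work (repair, backup, ...)
    interval: int = 1          # only consulted for SCHEDULED tasks


@dataclass
class Engine:
    tasks: Dict[str, Task] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)
    now: int = 0  # logical clock; a real engine would use wall time

    def register(self, task: Task) -> None:
        self.tasks[task.name] = task

    def tick(self) -> None:
        """Advance the logical clock by one and run every task that is due."""
        self.now += 1
        # Iterate over a copy so one-shot tasks can be removed while looping.
        for task in list(self.tasks.values()):
            due = (
                task.kind is not TaskKind.SCHEDULED
                or self.now % task.interval == 0
            )
            if not due:
                continue
            self.log.append(f"{self.now}:{task.name}:{task.action()}")
            if task.kind is TaskKind.ONESHOT:
                del self.tasks[task.name]


engine = Engine()
engine.register(Task("upgrade-sstables", TaskKind.ONESHOT, lambda: "ok"))
engine.register(Task("repair", TaskKind.SCHEDULED, lambda: "ok", interval=2))
engine.register(Task("health-check", TaskKind.DAEMON, lambda: "ok"))
for _ in range(3):
    engine.tick()
print("\n".join(engine.log))
```

The point of the sketch is that repair is just one registered task among
others, rather than something the engine itself knows about.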

Regarding Priam, we could offer to donate it but I think that the community
shouldn't accept it because it is full of years of technical debt and
decisions made by Netflix for Netflix. For example Priam currently has four
different backup solutions (three used in production, the latest not used
in production) that we have implemented over the years, and only the latest
one that is not yet in production should be contributed to the official
sidecar. The latest iteration is similar to the architecture of
https://github.com/hashbrowncipher/cassandra-mirror which is capable of per
minute, point in time backups; no previous iteration is capable of this.
Yes the earlier versions are "battle hardened" but we know those
architectures have fundamental flaws, are overly expensive, or simply won't
scale to the next 10x requirement. We have learned from those previous
iterations and are creating the next iteration that will scale another
order of magnitude. I also wouldn’t want to burden reviewers with looking
at the first three implementations or building the mental model all at once
of how Priam works end to end.

Practically speaking, I think it's much more logistically difficult to
accept one of the sidecar projects as is than to build a new one
incrementally. The existing sidecars have dependencies that have to be
vetted, technical debt that must be trimmed, tens of thousands of lines of
code that have to be reviewed, and even if the community wants to make
changes those changes might be prohibitively difficult as the underlying
architecture has solidified.

Furthermore, all of these tools were designed without the notion that they
were shipping with Cassandra, which precluded them from being capable of
next generation features like moving compaction entirely out of the live
request-response path into a separate process that can be limited with e.g.
cgroups to ensure isolation. Also they have supported many versions of
Cassandra over the years and therefore have layers of indirection and
abstraction added simply for dealing with various different APIs and
versions (I personally think the official sidecar should branch with
Cassandra and support current plus previous versions of Cassandra just like
the server does).

I hope that we decide as a community to put all the options on the table in
the open, learn from all of them, and pursue a solution that takes the best
from all the solutions and is unencumbered by historical decisions.

-Joey

Re: Reaper as cassandra-admin

Posted by Dinesh Joshi <di...@yahoo.com.INVALID>.
> On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> We're hoping to get some feedback on our side if that's something people
> are interested in.  We've gone back and forth privately on our own
> preferences, hopes, dreams, etc, but I feel like a public discussion would
> be healthy at this point.  Does anyone share the view of using Reaper as a
> starting point?  What concerns to people have?


I have briefly looked at the Reaper codebase but I have yet to analyze it well enough to have a real, meaningful opinion.

My main concern with starting with an existing codebase is that it comes with tech debt. This is not specific to Reaper but to any codebase that is imported as a whole. This means future developers and patches have to work within the confines of the decisions that were already made. Practically speaking once a codebase is established there is inertia in making architectural changes and we're left dealing with technical debt.

As it stands I am not against the idea of using Reaper's features and I would very much like using mature code that has been tested. I would however like to propose piece-mealing it into the codebase. This will give the community a chance to review what is going in and possibly change some of the design decisions upfront. This will also avoid a situation where we have to make many breaking changes in the initial versions due to refactoring.

I would also like it if we could compare and contrast the functionality with Priam or any other interesting sidecars that folks may want to call out. In fact it would be great if we could bring in the best functionality from multiple implementations.

Dinesh


Re: Reaper as cassandra-admin

Posted by Jonathan Haddad <jo...@jonhaddad.com>.
I don't believe #1 should be an issue, Mick has been reaching out.

Alex and Mick are putting together some architecture documentation, I won't
step on their toes.  Currently you can run Reaper as a single instance that
connects to your entire cluster, multiple instances in HA mode, and we're
finishing up the rework of the code to run it as a sidecar.

On Mon, Aug 27, 2018 at 6:02 PM Jeff Jirsa <jj...@gmail.com> wrote:

> Can you get all of the contributors cleared?
> What’s the architecture? Is it centralized? Is there a sidecar?
>
>
> > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> >
> > Hey folks,
> >
> > Mick brought this up in the sidecar thread, but I wanted to have a clear
> /
> > separate discussion about what we're thinking with regard to contributing
> > Reaper to the C* project.  In my mind, starting with Reaper is a great
> way
> > of having an admin right now, that we know works well at the kind of
> scale
> > we need.  We've worked with a lot of companies putting Reaper in prod (at
> > least 50), running on several hundred clusters.  The codebase has evolved
> > as a direct result of production usage, and we feel it would be great to
> > pair it with the 4.0 release.  There was a LOT of work done on the repair
> > logic to make things work across every supported version of Cassandra,
> with
> > a great deal of documentation as well.
> >
> > In case folks aren't aware, in addition to one off and scheduled repairs,
> > Reaper also does cluster wide snapshots, exposes thread pool stats, and
> > visualizes streaming (in trunk).
> >
> > We're hoping to get some feedback on our side if that's something people
> > are interested in.  We've gone back and forth privately on our own
> > preferences, hopes, dreams, etc, but I feel like a public discussion
> would
> > be healthy at this point.  Does anyone share the view of using Reaper as
> a
> > starting point?  What concerns to people have?
> > --
> > Jon Haddad
> > http://www.rustyrazorblade.com
> > twitter: rustyrazorblade
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>
>

-- 
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade

Re: Reaper as cassandra-admin

Posted by Mick Semb Wever <mc...@apache.org>.
> Can you get all of the contributors cleared?
> What’s the architecture? Is it centralized? Is there a sidecar?


Working on it, Jeff. Contributors are close to cleared. Copyright is either Spotify or Stefan, both of whom have CLAs in place with ASF.
Licenses of all npm dependencies are good. Still gotta audit the Java deps.

Architecture docs need to be fleshed out, especially to address the side-car/management ticket discussions/design.

Reaper is flexible in its design: you can run one or multiple instances. A lot of work has been done to move it towards an eventually-consistent, at-least-once design. The hard work towards the side-car model is, we feel, trialled and battle-tested, but we do need to re-add the pinning of connections and repair segments to localhosts.

There will be questions about some of Reaper's flexibility: for example, will we still want to support the memory and postgres storage backends (neither of which supports distributed/side-car installations)?


> As an aside, it’s frustrating that ya’ll would sit on this for months…

You're quite right Jeff, and I do owe an apology to those that worked on the ticket so far. I've been a bit caught off-guard by it all. I'm really hoping we can start to converge work and ideas from this point on.

regards,
Mick



Re: Reaper as cassandra-admin

Posted by Jeff Jirsa <jj...@gmail.com>.
As an aside, it’s frustrating that ya’ll would sit on this for months (first e-mail was April); you folks have enough people that know the process to know that communicating early and often helps avoid duplicating (expensive) work. 

The best tech needs to go in and we need to leave ourselves with the ability to meet the goals of the original proposal (and then some). The Reaper UI is nice; I wish you’d talked to the other group of folks to combine efforts in April, as we’d be much further ahead.

-- 
Jeff Jirsa


> On Aug 27, 2018, at 6:02 PM, Jeff Jirsa <jj...@gmail.com> wrote:
> 
> Can you get all of the contributors cleared?
> What’s the architecture? Is it centralized? Is there a sidecar?
> 
> 
>> On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
>> 
>> Hey folks,
>> 
>> Mick brought this up in the sidecar thread, but I wanted to have a clear /
>> separate discussion about what we're thinking with regard to contributing
>> Reaper to the C* project.  In my mind, starting with Reaper is a great way
>> of having an admin right now, that we know works well at the kind of scale
>> we need.  We've worked with a lot of companies putting Reaper in prod (at
>> least 50), running on several hundred clusters.  The codebase has evolved
>> as a direct result of production usage, and we feel it would be great to
>> pair it with the 4.0 release.  There was a LOT of work done on the repair
>> logic to make things work across every supported version of Cassandra, with
>> a great deal of documentation as well.
>> 
>> In case folks aren't aware, in addition to one off and scheduled repairs,
>> Reaper also does cluster wide snapshots, exposes thread pool stats, and
>> visualizes streaming (in trunk).
>> 
>> We're hoping to get some feedback on our side if that's something people
>> are interested in.  We've gone back and forth privately on our own
>> preferences, hopes, dreams, etc, but I feel like a public discussion would
>> be healthy at this point.  Does anyone share the view of using Reaper as a
>> starting point?  What concerns to people have?
>> -- 
>> Jon Haddad
>> http://www.rustyrazorblade.com
>> twitter: rustyrazorblade



Re: Reaper as cassandra-admin

Posted by Rahul Singh <ra...@gmail.com>.
I’d be interested in contributing as well. I’ve been working on a skew review / diagnostics tool which feeds off of cfstats/tbstats data (from TXT output to CSV to conditionally formatted Excel) and am starting to store data in C* and wrap a React-based grid around it.

I have backlogged forking the Reaper core / UI (API / front end). It has a lot of potential, specifically if the API / services / UI could be modularized and leverage IoC to add functionality via configuration, not code.

There are a lot of good conventions in both open source and commercial projects out there for web-based administration tools. The most successful ones do the basics related to their tool well and leave the rest to other systems.

The pitfall I’d like the valuable talent in this group to avoid is reinventing the wheel on things that other tools do well, instead of focusing on what admins, architects, and developers need. E.g. if Prometheus and Grafana are good for stats, keep them and just make them easier to set up or compose in Docker.

Another example: there are ideas I had, including a data browser / interactive query interface, but Redash or Zeppelin do a good job for the time being, and no matter how much time I spent on it I probably wouldn’t make a better one.

Rahul Singh
Chief Executive Officer
m 202.905.2818

Anant Corporation
1010 Wisconsin Ave NW, Suite 250
Washington, D.C. 20007

We build and manage digital business technology platforms.
On Aug 27, 2018, 9:22 PM -0400, Mick Semb Wever <mc...@apache.org>, wrote:
>
> > Is there a roadmap or release schedule, so we can get an idea of what
> > the Reaper devs have planned for it?
>
>
> Hi Murukesh,
> there's no roadmap per se, as it's open-source and it's the contributions as they come that make it.
>
> What I know that's in progress or been discussed is:
> - more thorough upgrade tests,
> - support for diagnostic events (C* 4.0),
> - more task/operations: compactions, cleanups, sstableupgrades, etc etc,
> - more metrics (better visualisations, for example see the newly added streaming),
> - making the scheduler repair-agnostic (so any task/operation can be scheduled), and
> - making task/operations not based on jmx calls (preparing for non-jmx type tasks).
>
> regards,
> Mick
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>

Re: Reaper as cassandra-admin

Posted by Mick Semb Wever <mc...@apache.org>.
> Is there a roadmap or release schedule, so we can get an idea of what
> the Reaper devs have planned for it?


Hi Murukesh,
 there's no roadmap per se, as it's open-source and it's the contributions as they come that make it. 

What I know that's in progress or been discussed is:
 - more thorough upgrade tests,
 - support for diagnostic events (C* 4.0),
 - more task/operations: compactions, cleanups, sstableupgrades, etc etc,
 - more metrics (better visualisations, for example see the newly added streaming),
 - making the scheduler repair-agnostic (so any task/operation can be scheduled), and
 - making task/operations not based on jmx calls (preparing for non-jmx type tasks).

regards,
Mick



Re: Reaper as cassandra-admin

Posted by Murukesh Mohanan <mu...@gmail.com>.
Is there a roadmap or release schedule, so we can get an idea of what
the Reaper devs have planned for it?


Yours,
Murukesh Mohanan

On Tue, 28 Aug 2018 at 10:02, Jeff Jirsa <jj...@gmail.com> wrote:
>
> Can you get all of the contributors cleared?
> What’s the architecture? Is it centralized? Is there a sidecar?
>
>
> > On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> >
> > Hey folks,
> >
> > Mick brought this up in the sidecar thread, but I wanted to have a clear /
> > separate discussion about what we're thinking with regard to contributing
> > Reaper to the C* project.  In my mind, starting with Reaper is a great way
> > of having an admin right now, that we know works well at the kind of scale
> > we need.  We've worked with a lot of companies putting Reaper in prod (at
> > least 50), running on several hundred clusters.  The codebase has evolved
> > as a direct result of production usage, and we feel it would be great to
> > pair it with the 4.0 release.  There was a LOT of work done on the repair
> > logic to make things work across every supported version of Cassandra, with
> > a great deal of documentation as well.
> >
> > In case folks aren't aware, in addition to one off and scheduled repairs,
> > Reaper also does cluster wide snapshots, exposes thread pool stats, and
> > visualizes streaming (in trunk).
> >
> > We're hoping to get some feedback on our side if that's something people
> > are interested in.  We've gone back and forth privately on our own
> > preferences, hopes, dreams, etc, but I feel like a public discussion would
> > be healthy at this point.  Does anyone share the view of using Reaper as a
> > starting point?  What concerns to people have?
> > --
> > Jon Haddad
> > http://www.rustyrazorblade.com
> > twitter: rustyrazorblade
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: dev-help@cassandra.apache.org
>



Re: Reaper as cassandra-admin

Posted by Jeff Jirsa <jj...@gmail.com>.
Can you get all of the contributors cleared?
What’s the architecture? Is it centralized? Is there a sidecar?


> On Aug 27, 2018, at 5:36 PM, Jonathan Haddad <jo...@jonhaddad.com> wrote:
> 
> Hey folks,
> 
> Mick brought this up in the sidecar thread, but I wanted to have a clear /
> separate discussion about what we're thinking with regard to contributing
> Reaper to the C* project.  In my mind, starting with Reaper is a great way
> of having an admin right now, that we know works well at the kind of scale
> we need.  We've worked with a lot of companies putting Reaper in prod (at
> least 50), running on several hundred clusters.  The codebase has evolved
> as a direct result of production usage, and we feel it would be great to
> pair it with the 4.0 release.  There was a LOT of work done on the repair
> logic to make things work across every supported version of Cassandra, with
> a great deal of documentation as well.
> 
> In case folks aren't aware, in addition to one off and scheduled repairs,
> Reaper also does cluster wide snapshots, exposes thread pool stats, and
> visualizes streaming (in trunk).
> 
> We're hoping to get some feedback on our side if that's something people
> are interested in.  We've gone back and forth privately on our own
> preferences, hopes, dreams, etc, but I feel like a public discussion would
> be healthy at this point.  Does anyone share the view of using Reaper as a
> starting point?  What concerns to people have?
> -- 
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
