You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spark.apache.org by Matei Zaharia <ma...@gmail.com> on 2016/05/19 15:34:59 UTC

[DISCUSS] Removing or changing maintainer process

Hi folks,

Around 1.5 years ago, Spark added a maintainer process for reviewing API and architectural changes (https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers) to make sure these are seen by people who spent a lot of time on that component. At the time, the worry was that changes might go unnoticed as the project grows, but there were also concerns that this approach makes the project harder to contribute to and less welcoming. Since implementing the model, I think that a good number of developers concluded it doesn't make a huge difference, so because of these concerns, it may be useful to remove it. I've also heard that we should try to keep some other instructions for contributors to find the "right" reviewers, so it would be great to see suggestions on that. For my part, I'd personally prefer something "automatic", such as easily tracking who reviewed each patch and having people look at the commit history of the module they want to work on, instead of a list that needs to be maintained separately.

Matei
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [DISCUSS] Removing or changing maintainer process

Posted by Steve Loughran <st...@hortonworks.com>.
> On 19 May 2016, at 16:34, Matei Zaharia <ma...@gmail.com> wrote:
> 
> Hi folks,
> 
> Around 1.5 years ago, Spark added a maintainer process for reviewing API and architectural changes (https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers) to make sure these are seen by people who spent a lot of time on that component. At the time, the worry was that changes might go unnoticed as the project grows, but there were also concerns that this approach makes the project harder to contribute to and less welcoming. Since implementing the model, I think that a good number of developers concluded it doesn't make a huge difference, so because of these concerns, it may be useful to remove it. I've also heard that we should try to keep some other instructions for contributors to find the "right" reviewers, so it would be great to see suggestions on that. For my part, I'd personally prefer something "automatic", such as easily tracking who reviewed each patch and having people look at the commit history of the module they want to work on, instead of a list that needs to be maintained separately.
> 


Putting on my ASF hat, I'm supportive of this.  An Apache software project is a large, shared, body of code —and committers are expected to be trusted to be allowed to contribute through, and not to break things. That's in their own patches —and in their review of other's work.

Having a nominated owner of a module is marking out bits of the code as territories, which can not only keep others out of it, it can actually put extra responsibility on the maintainer, who, because they are an effective choke point for patches for a module, have to try to keep up with an ever growing patch queue.

Managing code review is a scale problem; any large OSS project, especially those on a RTC process, comes up against it. I'd love to seem some good Software Engineering research/papers in this area

There is one forthcoming soon, "Why Should Architecture-Savvy Developers Write Code?". This looks at some of the development practises of various ASF projects (Hadoop, Hive, Pig, OFBiz, Camel, OpenJPA) and tries to see if there is a correlation in code quality between Linked-In proclaimed architects, code contributions and the (as inferred from JIRA) quality of those classes. Having seen the draft, I think everyone should find it interesting —even if I have a few issues with some of its classifications, at least based on the analysis of my bits.

http://blog.ieeesoftware.org/2016/02/why-should-software-architects-write.html

What I've not seen is anything looking at what makes both a good process for succouring quality contributions, but for doing so in a way which preserves quality, maintainability and an overall coherent architecture of a large system. If there are any, I'd love a copy.


-Steve


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [DISCUSS] Removing or changing maintainer process

Posted by Mridul Muralidharan <mr...@gmail.com>.
+1 (binding) on removing maintainer process.
I agree with your opinion of "automatic " instead of a manual list.


Regards
Mridul

On Thursday, May 19, 2016, Matei Zaharia <ma...@gmail.com> wrote:

> Hi folks,
>
> Around 1.5 years ago, Spark added a maintainer process for reviewing API
> and architectural changes (
> https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers)
> to make sure these are seen by people who spent a lot of time on that
> component. At the time, the worry was that changes might go unnoticed as
> the project grows, but there were also concerns that this approach makes
> the project harder to contribute to and less welcoming. Since implementing
> the model, I think that a good number of developers concluded it doesn't
> make a huge difference, so because of these concerns, it may be useful to
> remove it. I've also heard that we should try to keep some other
> instructions for contributors to find the "right" reviewers, so it would be
> great to see suggestions on that. For my part, I'd personally prefer
> something "automatic", such as easily tracking who reviewed each patch and
> having people look at the commit history of the module they want to work
> on, instead of a list that needs to be maintained separately.
>
> Matei
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org <javascript:;>
> For additional commands, e-mail: dev-help@spark.apache.org <javascript:;>
>
>

Re: [DISCUSS] Removing or changing maintainer process

Posted by Nicholas Chammas <ni...@gmail.com>.
I just heard about mention-bot at PyCon 2016
<https://www.youtube.com/watch?v=gRFHvavxnos>:

https://github.com/facebook/mention-bot

Do you have a GitHub project that is too big for people to subscribe to all
the notifications? The mention bot will automatically mention potential
reviewers on pull requests. It helps getting faster turnaround on pull
requests by involving the right people early on.

mention-bot checks the blame history for the files modified by a PR and
automatically pings the most likely candidates for review. I wonder if it
would work well for us.

Nick

On Thu, May 19, 2016 at 11:47 AM Nicholas Chammas nicholas.chammas@gmail.com
<ht...@gmail.com> wrote:

I’ve also heard that we should try to keep some other instructions for
> contributors to find the “right” reviewers, so it would be great to see
> suggestions on that. For my part, I’d personally prefer something
> “automatic”, such as easily tracking who reviewed each patch and having
> people look at the commit history of the module they want to work on,
> instead of a list that needs to be maintained separately.
>
> Some code review and management tools like Phabricator have a system for
> this <http://phacility.com/phabricator/herald/>, where you can configure
> alerts to automatically ping certain people if a file matching some rule
> (e.g. has this extension, is in this folder, etc.) is modified by a PR.
>
> I think short of deploying Phabricator somehow, probably the most
> realistic option for us to get automatic alerts like this is to have
> someone add that as a feature to the Spark PR Dashboard
> <https://spark-prs.appspot.com/>.
>
> I created an issue for this some time ago if anyone wants to take a crack
> at it: https://github.com/databricks/spark-pr-dashboard/issues/47
>
> Nick
> ​
>
> On Thu, May 19, 2016 at 11:42 AM Tom Graves <tg...@yahoo.com.invalid>
> wrote:
>
>> +1 (binding)
>>
>> Tom
>>
>>
>> On Thursday, May 19, 2016 10:35 AM, Matei Zaharia <
>> matei.zaharia@gmail.com> wrote:
>>
>>
>> Hi folks,
>>
>> Around 1.5 years ago, Spark added a maintainer process for reviewing API
>> and architectural changes (
>> https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers)
>> to make sure these are seen by people who spent a lot of time on that
>> component. At the time, the worry was that changes might go unnoticed as
>> the project grows, but there were also concerns that this approach makes
>> the project harder to contribute to and less welcoming. Since implementing
>> the model, I think that a good number of developers concluded it doesn't
>> make a huge difference, so because of these concerns, it may be useful to
>> remove it. I've also heard that we should try to keep some other
>> instructions for contributors to find the "right" reviewers, so it would be
>> great to see suggestions on that. For my part, I'd personally prefer
>> something "automatic", such as easily tracking who reviewed each patch and
>> having people look at the commit history of the module they want to work
>> on, instead of a list that needs to be maintained separately.
>>
>> Matei
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> For additional commands, e-mail: dev-help@spark.apache.org
>>
>>
>> ​

Re: [DISCUSS] Removing or changing maintainer process

Posted by Nicholas Chammas <ni...@gmail.com>.
I’ve also heard that we should try to keep some other instructions for
contributors to find the “right” reviewers, so it would be great to see
suggestions on that. For my part, I’d personally prefer something
“automatic”, such as easily tracking who reviewed each patch and having
people look at the commit history of the module they want to work on,
instead of a list that needs to be maintained separately.

Some code review and management tools like Phabricator have a system for
this <http://phacility.com/phabricator/herald/>, where you can configure
alerts to automatically ping certain people if a file matching some rule
(e.g. has this extension, is in this folder, etc.) is modified by a PR.

I think short of deploying Phabricator somehow, probably the most realistic
option for us to get automatic alerts like this is to have someone add that
as a feature to the Spark PR Dashboard <https://spark-prs.appspot.com/>.

I created an issue for this some time ago if anyone wants to take a crack
at it: https://github.com/databricks/spark-pr-dashboard/issues/47

Nick
​

On Thu, May 19, 2016 at 11:42 AM Tom Graves <tg...@yahoo.com.invalid>
wrote:

> +1 (binding)
>
> Tom
>
>
> On Thursday, May 19, 2016 10:35 AM, Matei Zaharia <ma...@gmail.com>
> wrote:
>
>
> Hi folks,
>
> Around 1.5 years ago, Spark added a maintainer process for reviewing API
> and architectural changes (
> https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers)
> to make sure these are seen by people who spent a lot of time on that
> component. At the time, the worry was that changes might go unnoticed as
> the project grows, but there were also concerns that this approach makes
> the project harder to contribute to and less welcoming. Since implementing
> the model, I think that a good number of developers concluded it doesn't
> make a huge difference, so because of these concerns, it may be useful to
> remove it. I've also heard that we should try to keep some other
> instructions for contributors to find the "right" reviewers, so it would be
> great to see suggestions on that. For my part, I'd personally prefer
> something "automatic", such as easily tracking who reviewed each patch and
> having people look at the commit history of the module they want to work
> on, instead of a list that needs to be maintained separately.
>
> Matei
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>
>

Re: [DISCUSS] Removing or changing maintainer process

Posted by Tom Graves <tg...@yahoo.com.INVALID>.
+1 (binding)
Tom 

    On Thursday, May 19, 2016 10:35 AM, Matei Zaharia <ma...@gmail.com> wrote:
 

 Hi folks,

Around 1.5 years ago, Spark added a maintainer process for reviewing API and architectural changes (https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers) to make sure these are seen by people who spent a lot of time on that component. At the time, the worry was that changes might go unnoticed as the project grows, but there were also concerns that this approach makes the project harder to contribute to and less welcoming. Since implementing the model, I think that a good number of developers concluded it doesn't make a huge difference, so because of these concerns, it may be useful to remove it. I've also heard that we should try to keep some other instructions for contributors to find the "right" reviewers, so it would be great to see suggestions on that. For my part, I'd personally prefer something "automatic", such as easily tracking who reviewed each patch and having people look at the commit history of the module they want to work on, instead of a list that needs to be maintained separately.

Matei
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


  

Re: [DISCUSS] Removing or changing maintainer process

Posted by Andrew Or <an...@databricks.com>.
+1, some maintainers are hard to find

2016-05-19 9:03 GMT-07:00 Imran Rashid <ir...@cloudera.com>:

> +1 (binding) on removal of maintainers
>
> I dont' have a strong opinion yet on how to have a system for finding the
> right reviewers.  I agree it would be nice to have something to help you
> find reviewers, though I'm a little skeptical of anything automatic.
>
> On Thu, May 19, 2016 at 10:34 AM, Matei Zaharia <ma...@gmail.com>
> wrote:
>
>> Hi folks,
>>
>> Around 1.5 years ago, Spark added a maintainer process for reviewing API
>> and architectural changes (
>> https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers)
>> to make sure these are seen by people who spent a lot of time on that
>> component. At the time, the worry was that changes might go unnoticed as
>> the project grows, but there were also concerns that this approach makes
>> the project harder to contribute to and less welcoming. Since implementing
>> the model, I think that a good number of developers concluded it doesn't
>> make a huge difference, so because of these concerns, it may be useful to
>> remove it. I've also heard that we should try to keep some other
>> instructions for contributors to find the "right" reviewers, so it would be
>> great to see suggestions on that. For my part, I'd personally prefer
>> something "automatic", such as easily tracking who reviewed each patch and
>> having people look at the commit history of the module they want to work
>> on, instead of a list that needs to be maintained separately.
>>
>> Matei
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> For additional commands, e-mail: dev-help@spark.apache.org
>>
>>
>

Re: [DISCUSS] Removing or changing maintainer process

Posted by Imran Rashid <ir...@cloudera.com>.
+1 (binding) on removal of maintainers

I dont' have a strong opinion yet on how to have a system for finding the
right reviewers.  I agree it would be nice to have something to help you
find reviewers, though I'm a little skeptical of anything automatic.

On Thu, May 19, 2016 at 10:34 AM, Matei Zaharia <ma...@gmail.com>
wrote:

> Hi folks,
>
> Around 1.5 years ago, Spark added a maintainer process for reviewing API
> and architectural changes (
> https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers)
> to make sure these are seen by people who spent a lot of time on that
> component. At the time, the worry was that changes might go unnoticed as
> the project grows, but there were also concerns that this approach makes
> the project harder to contribute to and less welcoming. Since implementing
> the model, I think that a good number of developers concluded it doesn't
> make a huge difference, so because of these concerns, it may be useful to
> remove it. I've also heard that we should try to keep some other
> instructions for contributors to find the "right" reviewers, so it would be
> great to see suggestions on that. For my part, I'd personally prefer
> something "automatic", such as easily tracking who reviewed each patch and
> having people look at the commit history of the module they want to work
> on, instead of a list that needs to be maintained separately.
>
> Matei
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>