You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spark.apache.org by Kay Ousterhout <ke...@eecs.berkeley.edu> on 2014/07/21 22:44:54 UTC

-1s on pull requests?

Hi all,

As the number of committers / contributors on Spark has increased, there
are cases where pull requests get merged before all the review comments
have been addressed. This happens say when one committer points out a
problem with the pull request, and another committer doesn't see the
earlier comment and merges the PR before the comment has been addressed.
 This is especially tricky for pull requests with a large number of
comments, because it can be difficult to notice early comments describing
blocking issues.

This also happens when something accidentally gets merged after the tests
have started but before tests have passed.

Do folks have ideas on how we can handle this issue? Are there other
projects that have good ways of handling this? It looks like for Hadoop,
people can -1 / +1 on the JIRA.

-Kay

Re: -1s on pull requests?

Posted by Nicholas Chammas <ni...@gmail.com>.
On Sun, Aug 3, 2014 at 4:35 PM, Nicholas Chammas <nicholas.chammas@gmail.com
> wrote:

> Include the commit hash in the "tests have started/completed" messages, so
> that it's clear what code exactly is/has been tested for each test cycle.


This is now captured in this JIRA issue
<https://issues.apache.org/jira/browse/SPARK-2912> and completed in this PR
<https://github.com/apache/spark/pull/1816> which has been merged in to
master.

Example of old style: tests starting
<https://github.com/apache/spark/pull/1819#issuecomment-51416510> / tests
finished <https://github.com/apache/spark/pull/1819#issuecomment-51417477>
(with
new classes)

Example of new style: tests starting
<https://github.com/apache/spark/pull/1816#issuecomment-51855254> / tests
finished <https://github.com/apache/spark/pull/1816#issuecomment-51855255>
(with
new classes)

Nick

Re: -1s on pull requests?

Posted by Patrick Wendell <pw...@gmail.com>.
Sure thing - feel free to ping me off list if you need pointers. The script
just does string concatenation and a curl to post the comment... I think it
should be pretty accessible!

- Patrick


On Sun, Aug 3, 2014 at 9:12 PM, Nicholas Chammas <nicholas.chammas@gmail.com
> wrote:

> On Sun, Aug 3, 2014 at 11:29 PM, Patrick Wendell <pw...@gmail.com>
> wrote:
>
> Nick - Any interest in doing these? this is all doable from within the
>> spark repo itself because our QA harness scripts are in there:
>>
>> https://github.com/apache/spark/blob/master/dev/run-tests-jenkins
>>
>> If not, could you make a JIRA for them and put it under "Project Infra".
>>
> I'll make the JIRA and think about how to do this stuff. I'll have to
> understand what that run-tests-jenkins script does and see how easy it is
> to extend.
>
> Nick
> 
>

Re: -1s on pull requests?

Posted by Nicholas Chammas <ni...@gmail.com>.
On Sun, Aug 3, 2014 at 11:29 PM, Patrick Wendell <pw...@gmail.com> wrote:

Nick - Any interest in doing these? this is all doable from within the
> spark repo itself because our QA harness scripts are in there:
>
> https://github.com/apache/spark/blob/master/dev/run-tests-jenkins
>
> If not, could you make a JIRA for them and put it under "Project Infra".
>
I’ll make the JIRA and think about how to do this stuff. I’ll have to
understand what that run-tests-jenkins script does and see how easy it is
to extend.

Nick
​

Re: -1s on pull requests?

Posted by Xiangrui Meng <me...@gmail.com>.
I think the build number is included in the SparkQA message, for
example: https://github.com/apache/spark/pull/1788

The build number 17941 is in the URL
"https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17941/consoleFull".
Just need to be careful to match the number.

Another solution is to kill running Jenkins jobs if there is a code change.

On Tue, Aug 5, 2014 at 8:48 AM, Nicholas Chammas
<ni...@gmail.com> wrote:
>>
>> 1. Include the commit hash in the "tests have started/completed"
>>
>
> FYI: Looks like Xiangrui's already got a JIRA issue for this.
>
> SPARK-2622: Add Jenkins build numbers to SparkQA messages
> <https://issues.apache.org/jira/browse/SPARK-2622>
>
> 2. "Pin" a message to the start or end of the PR
>
>
> Should new JIRA issues for this item fall under the following umbrella
> issue?
>
> SPARK-2230: Improvements to Jenkins QA Harness
> <https://issues.apache.org/jira/browse/SPARK-2230>
>
> Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: -1s on pull requests?

Posted by Nicholas Chammas <ni...@gmail.com>.
>
> 1. Include the commit hash in the "tests have started/completed"
>

FYI: Looks like Xiangrui's already got a JIRA issue for this.

SPARK-2622: Add Jenkins build numbers to SparkQA messages
<https://issues.apache.org/jira/browse/SPARK-2622>

2. "Pin" a message to the start or end of the PR


Should new JIRA issues for this item fall under the following umbrella
issue?

SPARK-2230: Improvements to Jenkins QA Harness
<https://issues.apache.org/jira/browse/SPARK-2230>

Nick

Re: -1s on pull requests?

Posted by Patrick Wendell <pw...@gmail.com>.
>
>    1. Include the commit hash in the "tests have started/completed"
>    messages, so that it's clear what code exactly is/has been tested for
> each
>    test cycle.
>

Great idea - I think this is easy to do given the current architecture. We
already have access to the commit ID in the same script that posts the
comments.

   2. "Pin" a message to the start or end of the PR that is updated with
>    the status of the PR. "Testing not complete"; "New commits since last
>    test"; "Tests failed"; etc. It should be easy for committers to get the
>    status of the PR at a glance, without scrolling through the comment
> history.
>

This also is a good idea - I think this would be doable since the github
API allows us to edit comments, but it's a bit tricker. I think it would
require first making an API call to get the "status comment" ID and then
updating it.


>
> Nick
>

Nick - Any interest in doing these? this is all doable from within the
spark repo itself because our QA harness scripts are in there:

https://github.com/apache/spark/blob/master/dev/run-tests-jenkins

If not, could you make a JIRA for them and put it under "Project Infra".

- Patrick

Re: -1s on pull requests?

Posted by Nicholas Chammas <ni...@gmail.com>.
On Mon, Jul 21, 2014 at 4:44 PM, Kay Ousterhout <ke...@eecs.berkeley.edu>
wrote:

> This also happens when something accidentally gets merged after the tests
> have started but before tests have passed.
>

Some improvements to SparkQA <https://github.com/SparkQA> could help with
this. May I suggest:

   1. Include the commit hash in the "tests have started/completed"
   messages, so that it's clear what code exactly is/has been tested for each
   test cycle.
   2. "Pin" a message to the start or end of the PR that is updated with
   the status of the PR. "Testing not complete"; "New commits since last
   test"; "Tests failed"; etc. It should be easy for committers to get the
   status of the PR at a glance, without scrolling through the comment history.

Nick

Re: -1s on pull requests?

Posted by Henry Saputra <he...@gmail.com>.
There is ASF guidelines about Voting, including code review for
patches: http://www.apache.org/foundation/voting.html

Some ASF project do three +1 votes are required (to the issues like
JIRA or Github PR in this case) for a patch unless it is tagged with
lazy consensus [1] of like 48 hours.
For patches that are not critical, waiting for a while to let some
time for additional committers to review would be the best way to go.

Another thing is that all contributors need to be patience once their
patches have been submitted and pending reviewed. This is part of
being in open community.

Hope this helps.


- Henry

[1] http://www.apache.org/foundation/glossary.html#LazyConsensus

On Mon, Jul 21, 2014 at 1:59 PM, Patrick Wendell <pw...@gmail.com> wrote:
> I've always operated under the assumption that if a commiter makes a
> comment on a PR, and that's not addressed, that should block the PR
> from being merged (even without a specific "-1"). I don't know of any
> cases where this has intentionally been violated, but I do think this
> happens accidentally some times.
>
> Unfortunately, we are not allowed to use those github hooks because of
> the way the ASF github integration works.
>
> I've lately been using a custom-made tool to help review pull
> requests. One thing I could do is add a feature here saying which
> committers have said LGTM on a PR (vs the ones that have commented).
> We could also indicate the latest test status as Green/Yellow/Red
> based on the Jenkins comments:
>
> http://pwendell.github.io/spark-github-shim/
>
> As a warning to potential users, my tool might crash your browser.
>
> - Patrick
>
> On Mon, Jul 21, 2014 at 1:44 PM, Kay Ousterhout <ke...@eecs.berkeley.edu> wrote:
>> Hi all,
>>
>> As the number of committers / contributors on Spark has increased, there
>> are cases where pull requests get merged before all the review comments
>> have been addressed. This happens say when one committer points out a
>> problem with the pull request, and another committer doesn't see the
>> earlier comment and merges the PR before the comment has been addressed.
>>  This is especially tricky for pull requests with a large number of
>> comments, because it can be difficult to notice early comments describing
>> blocking issues.
>>
>> This also happens when something accidentally gets merged after the tests
>> have started but before tests have passed.
>>
>> Do folks have ideas on how we can handle this issue? Are there other
>> projects that have good ways of handling this? It looks like for Hadoop,
>> people can -1 / +1 on the JIRA.
>>
>> -Kay

Re: -1s on pull requests?

Posted by Patrick Wendell <pw...@gmail.com>.
I've always operated under the assumption that if a commiter makes a
comment on a PR, and that's not addressed, that should block the PR
from being merged (even without a specific "-1"). I don't know of any
cases where this has intentionally been violated, but I do think this
happens accidentally some times.

Unfortunately, we are not allowed to use those github hooks because of
the way the ASF github integration works.

I've lately been using a custom-made tool to help review pull
requests. One thing I could do is add a feature here saying which
committers have said LGTM on a PR (vs the ones that have commented).
We could also indicate the latest test status as Green/Yellow/Red
based on the Jenkins comments:

http://pwendell.github.io/spark-github-shim/

As a warning to potential users, my tool might crash your browser.

- Patrick

On Mon, Jul 21, 2014 at 1:44 PM, Kay Ousterhout <ke...@eecs.berkeley.edu> wrote:
> Hi all,
>
> As the number of committers / contributors on Spark has increased, there
> are cases where pull requests get merged before all the review comments
> have been addressed. This happens say when one committer points out a
> problem with the pull request, and another committer doesn't see the
> earlier comment and merges the PR before the comment has been addressed.
>  This is especially tricky for pull requests with a large number of
> comments, because it can be difficult to notice early comments describing
> blocking issues.
>
> This also happens when something accidentally gets merged after the tests
> have started but before tests have passed.
>
> Do folks have ideas on how we can handle this issue? Are there other
> projects that have good ways of handling this? It looks like for Hadoop,
> people can -1 / +1 on the JIRA.
>
> -Kay

Re: -1s on pull requests?

Posted by Mridul Muralidharan <mr...@gmail.com>.
Just came across this mail, thanks for initiating this discussion Kay.
To add; another issue which recurs is very rapid commit's: before most
contributors have had a chance to even look at the changes proposed.
There is not much prior discussion on the jira or pr, and the time
between submitting the PR and committing it is < 12 hours.

Particularly relevant when contributors are not on US timezones and/or
colocated; I have raised this a few times before when the commit had
other side effects not considered.
On flip side we have PR's which have been languishing for weeks with
little or no activity from committers side - making the contribution
stale; so too long a delay is also definitely not the direction to
take either !



Regards,
Mridul



On Tue, Jul 22, 2014 at 2:14 AM, Kay Ousterhout <ke...@eecs.berkeley.edu> wrote:
> Hi all,
>
> As the number of committers / contributors on Spark has increased, there
> are cases where pull requests get merged before all the review comments
> have been addressed. This happens say when one committer points out a
> problem with the pull request, and another committer doesn't see the
> earlier comment and merges the PR before the comment has been addressed.
>  This is especially tricky for pull requests with a large number of
> comments, because it can be difficult to notice early comments describing
> blocking issues.
>
> This also happens when something accidentally gets merged after the tests
> have started but before tests have passed.
>
> Do folks have ideas on how we can handle this issue? Are there other
> projects that have good ways of handling this? It looks like for Hadoop,
> people can -1 / +1 on the JIRA.
>
> -Kay

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: -1s on pull requests?

Posted by Shivaram Venkataraman <sh...@eecs.berkeley.edu>.
One way to do this would be to have a Github hook that parses -1s or +1s
and posts a commit status [1] (like say Travis [2]) right next to the PR.
Does anybody know of an existing tool that does this ?

Shivaram

[1] https://github.com/blog/1227-commit-status-api
[2]
http://blog.travis-ci.com/2012-09-04-pull-requests-just-got-even-more-awesome/


On Mon, Jul 21, 2014 at 1:44 PM, Kay Ousterhout <ke...@eecs.berkeley.edu>
wrote:

> Hi all,
>
> As the number of committers / contributors on Spark has increased, there
> are cases where pull requests get merged before all the review comments
> have been addressed. This happens say when one committer points out a
> problem with the pull request, and another committer doesn't see the
> earlier comment and merges the PR before the comment has been addressed.
>  This is especially tricky for pull requests with a large number of
> comments, because it can be difficult to notice early comments describing
> blocking issues.
>
> This also happens when something accidentally gets merged after the tests
> have started but before tests have passed.
>
> Do folks have ideas on how we can handle this issue? Are there other
> projects that have good ways of handling this? It looks like for Hadoop,
> people can -1 / +1 on the JIRA.
>
> -Kay
>