You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@spark.apache.org by Reynold Xin <rx...@databricks.com> on 2016/04/18 21:02:15 UTC

auto closing pull requests that have been inactive > 30 days?

We have hit a new high in open pull requests: 469 today. While we can
certainly get more review bandwidth, many of these are old and still open
for other reasons. Some are stale because the original authors have become
busy and inactive, and some others are stale because the committers are not
sure whether the patch would be useful, but have not rejected the patch
explicitly. We can cut down the signal to noise ratio by closing pull
requests that have been inactive for greater than 30 days, with a nice
message. I just checked and this would close ~ half of the pull requests.

For example:

"Thank you for creating this pull request. Since this pull request has been
inactive for 30 days, we are automatically closing it. Closing the pull
request does not remove it from history and will retain all the diff and
review comments. If you have the bandwidth and would like to continue
pushing this forward, please reopen it. Thanks again!"

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Reynold Xin <rx...@databricks.com>.
Cody,

Thanks for commenting. "inactive" here means no code push nor comments. So
any "ping" would actually keep the pr in the open queue. Getting
auto-closed also by no means indicate the pull request can't be reopened.

On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org> wrote:

> For what it's worth, I have definitely had PRs that sat inactive for
> more than 30 days due to committers not having time to look at them,
> but did eventually end up successfully being merged.
>
> I guess if this just ends up being a committer ping and reopening the
> PR, it's fine, but I don't know if it really addresses the underlying
> issue.
>
> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com> wrote:
> > We have hit a new high in open pull requests: 469 today. While we can
> > certainly get more review bandwidth, many of these are old and still open
> > for other reasons. Some are stale because the original authors have
> become
> > busy and inactive, and some others are stale because the committers are
> not
> > sure whether the patch would be useful, but have not rejected the patch
> > explicitly. We can cut down the signal to noise ratio by closing pull
> > requests that have been inactive for greater than 30 days, with a nice
> > message. I just checked and this would close ~ half of the pull requests.
> >
> > For example:
> >
> > "Thank you for creating this pull request. Since this pull request has
> been
> > inactive for 30 days, we are automatically closing it. Closing the pull
> > request does not remove it from history and will retain all the diff and
> > review comments. If you have the bandwidth and would like to continue
> > pushing this forward, please reopen it. Thanks again!"
> >
> >
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Reynold Xin <rx...@databricks.com>.
Thanks a lot for commenting. We are getting great feedback on this thread.
The take-aways are:

1. In general people prefer having explicit reasons why pull requests
should be closed. We should push committers to leave messages that are more
explicit about why certain PR should be closed or not. I can't agree more.
But this is not mutually exclusive.


2.  It is difficult to deal with the scale we are talking about. There is
not a single measure that could "fix" everything.

Spark is as far as I know one of the most active open source projects in
terms of contributions, in part because we have made it very easy to accept
contributions. There have been very few open source projects that needed to
deal with this scale. Actually if you look at all the historic PRs, we
closed 12k and have ~450 open. That's less than 4% of the prs outstanding
-- not a bad number. The actual ratio is likely even lower because many of
the 450 open will be merged in the future.

I also took a look at some of the most popular projects on github (e.g.
jquery, angular, react) -- they either have far fewer merged pull requests
or a higher ratio of open-to-close. So we are actually doing pretty well.
But of course there is always room for improvement.




On Mon, Apr 18, 2016 at 8:46 PM, Nicholas Chammas <
nicholas.chammas@gmail.com> wrote:

> Relevant: https://github.com/databricks/spark-pr-dashboard/issues/1
>
> A lot of this was discussed a while back when the PR Dashboard was first
> introduced, and several times before and after that as well. (e.g. August
> 2014
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Handling-stale-PRs-td8015.html>
> )
>
> If there is not enough momentum to build the tooling that people are
> discussing here, then perhaps Reynold's suggestion is the most practical
> one that is likely to see the light of day.
>
> I think asking committers to be more active in commenting on PRs is
> theoretically the correct thing to do, but impractical. I'm not a
> committer, but I would guess that most of them are already way
> overcommitted (ha!) and asking them to do more just won't yield results.
>
> We've had several instances in the past where we all tried to rally
> <https://mail-archives.apache.org/mod_mbox/spark-dev/201412.mbox/%3CCAOhmDzeR4cG_wXgKTOxsG8s34KrQEzYgjFZDOYMgu9VhYJBRDg@mail.gmail.com%3E>
> and be more proactive about giving feedback, closing PRs, and nudging
> contributors who have gone silent. My observation is that the level of
> energy required to "properly" curate PR activity in that way is simply not
> sustainable. People can do it for a few weeks and then things revert to the
> way they are now.
>
> Perhaps the missing link that would make this sustainable is better
> tooling. If you think so and can sling some Javascript, you might want to
> contribute to the PR Dashboard <https://spark-prs.appspot.com/>.
>
> Perhaps the missing link is something else: A different PR review process;
> more committers; a higher barrier to contributing; a combination thereof;
> etc...
>
> Also relevant: http://danluu.com/discourage-oss/
>
> By the way, some people noted that closing PRs may discourage
> contributors. I think our open PR count alone is very discouraging. Under
> what circumstances would you feel encouraged to open a PR against a project
> that has hundreds of open PRs, some from many, many months ago
> <https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
> ?
>
> Nick
>
>
> 2016년 4월 18일 (월) 오후 10:30, Ted Yu <yu...@gmail.com>님이 작성:
>
>> During the months of November / December, the 30 day period should be
>> relaxed.
>>
>> Some people(at least in US) may take extended vacation during that time.
>>
>> For Chinese developers, Spring Festival would bear similar circumstance.
>>
>> On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>>
>>> I also think this might not have to be closed only because it is
>>> inactive.
>>>
>>>
>>> How about closing issues after 30 days when a committer's comment is
>>> added at the last without responses from the author?
>>>
>>>
>>> IMHO, If the committers are not sure whether the patch would be useful,
>>> then I think they should leave some comments why they are not sure, not
>>> just ignoring.
>>>
>>> Or, simply they could ask the author to prove that the patch is useful
>>> or safe with some references and tests.
>>>
>>>
>>> I think it might be nicer than that users are supposed to keep pinging.
>>> **Personally**, apparently, I am sometimes a bit worried if pinging
>>> multiple times can be a bit annoying.
>>>
>>>
>>>
>>> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:
>>>
>>>> It would be better to have a specific technical reason why this PR
>>>> should be closed, either the implementation is not good or the problem is
>>>> not valid, or something else. That will actually help the contributor to
>>>> shape their codes and reopen the PR again. Otherwise reasons like "feel
>>>> free to reopen for so-and-so reason" is actually discouraging and no
>>>> difference than directly close the PR.
>>>>
>>>> Just my two cents.
>>>>
>>>> Thanks
>>>> Jerry
>>>>
>>>>
>>>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Having a PR closed, especially if due to committers not having hte
>>>>> bandwidth to check on things, will be very discouraging to new folks.
>>>>> Doubly so for those inexperienced with opensource. Even if the message
>>>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>>>> confidence are going to see reopening as "pestering" and busy folks
>>>>> are going to see it as a clear indication that their work is not even
>>>>> valuable enough for a human to give a reason for closing. In either
>>>>> case, the cost of reopening is substantially higher than that button
>>>>> press.
>>>>>
>>>>> How about we start by keeping a report of "at-risk" PRs that have been
>>>>> stale for 30 days to make it easier for committers to look at the prs
>>>>> that have been long inactive?
>>>>>
>>>>>
>>>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> > The cost of "reopen" is close to zero, because it is just clicking a
>>>>> button.
>>>>> > I think you were referring to the cost of closing the pull request,
>>>>> and you
>>>>> > are assuming people look at the pull requests that have been
>>>>> inactive for a
>>>>> > long time. That seems equally likely (or unlikely) as committers
>>>>> looking at
>>>>> > the recently closed pull requests.
>>>>> >
>>>>> > In either case, most pull requests are scanned through by us when
>>>>> they are
>>>>> > first open, and if they are important enough, usually they get merged
>>>>> > quickly or a target version is set in JIRA. We can definitely
>>>>> improve that
>>>>> > by making it more explicit.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> From committers' perspective, would they look at closed PRs ?
>>>>> >>
>>>>> >> If not, the cost is not close to zero.
>>>>> >> Meaning, some potentially useful PRs would never see the light of
>>>>> day.
>>>>> >>
>>>>> >> My two cents.
>>>>> >>
>>>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> Part of it is how difficult it is to automate this. We can build a
>>>>> >>> perfect engine with a lot of rules that understand everything. But
>>>>> the more
>>>>> >>> complicated rules we need, the more unlikely for any of these to
>>>>> happen. So
>>>>> >>> I'd rather do this and create a nice enough message to tell
>>>>> contributors
>>>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>>>> zero (i.e.
>>>>> >>> click a button on the pull request).
>>>>> >>>
>>>>> >>>
>>>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com>
>>>>> wrote:
>>>>> >>>>
>>>>> >>>> bq. close the ones where they don't respond for a week
>>>>> >>>>
>>>>> >>>> Does this imply that the script understands response from human ?
>>>>> >>>>
>>>>> >>>> Meaning, would the script use some regex which signifies that the
>>>>> >>>> contributor is willing to close the PR ?
>>>>> >>>>
>>>>> >>>> If the contributor is willing to close, why wouldn't he / she do
>>>>> it
>>>>> >>>> him/herself ?
>>>>> >>>>
>>>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>>>> holden@pigscanfly.ca>
>>>>> >>>> wrote:
>>>>> >>>>>
>>>>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>>> >>>>> understand wanting to keep the open PRs limited to ones which
>>>>> have a
>>>>> >>>>> reasonable chance of being merged.
>>>>> >>>>>
>>>>> >>>>> What about if we filtered for non-mergeable PRs or instead left a
>>>>> >>>>> comment asking the author to respond if they are still available
>>>>> to move the
>>>>> >>>>> PR forward - and close the ones where they don't respond for a
>>>>> week?
>>>>> >>>>>
>>>>> >>>>> Just a suggestion.
>>>>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> I had one PR which got merged after 3 months.
>>>>> >>>>>>
>>>>> >>>>>> If the inactivity was due to contributor, I think it can be
>>>>> closed
>>>>> >>>>>> after 30 days.
>>>>> >>>>>> But if the inactivity was due to lack of review, the PR should
>>>>> be kept
>>>>> >>>>>> open.
>>>>> >>>>>>
>>>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>>>>> cody@koeninger.org>
>>>>> >>>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> For what it's worth, I have definitely had PRs that sat
>>>>> inactive for
>>>>> >>>>>>> more than 30 days due to committers not having time to look at
>>>>> them,
>>>>> >>>>>>> but did eventually end up successfully being merged.
>>>>> >>>>>>>
>>>>> >>>>>>> I guess if this just ends up being a committer ping and
>>>>> reopening the
>>>>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>>>>> underlying
>>>>> >>>>>>> issue.
>>>>> >>>>>>>
>>>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <
>>>>> rxin@databricks.com>
>>>>> >>>>>>> wrote:
>>>>> >>>>>>> > We have hit a new high in open pull requests: 469 today.
>>>>> While we
>>>>> >>>>>>> > can
>>>>> >>>>>>> > certainly get more review bandwidth, many of these are old
>>>>> and
>>>>> >>>>>>> > still open
>>>>> >>>>>>> > for other reasons. Some are stale because the original
>>>>> authors have
>>>>> >>>>>>> > become
>>>>> >>>>>>> > busy and inactive, and some others are stale because the
>>>>> committers
>>>>> >>>>>>> > are not
>>>>> >>>>>>> > sure whether the patch would be useful, but have not
>>>>> rejected the
>>>>> >>>>>>> > patch
>>>>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by
>>>>> closing
>>>>> >>>>>>> > pull
>>>>> >>>>>>> > requests that have been inactive for greater than 30 days,
>>>>> with a
>>>>> >>>>>>> > nice
>>>>> >>>>>>> > message. I just checked and this would close ~ half of the
>>>>> pull
>>>>> >>>>>>> > requests.
>>>>> >>>>>>> >
>>>>> >>>>>>> > For example:
>>>>> >>>>>>> >
>>>>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>>>>> request
>>>>> >>>>>>> > has been
>>>>> >>>>>>> > inactive for 30 days, we are automatically closing it.
>>>>> Closing the
>>>>> >>>>>>> > pull
>>>>> >>>>>>> > request does not remove it from history and will retain all
>>>>> the
>>>>> >>>>>>> > diff and
>>>>> >>>>>>> > review comments. If you have the bandwidth and would like to
>>>>> >>>>>>> > continue
>>>>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>> >>>>>>>
>>>>> >>>>>>
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> --
>>>>> >>>>> Cell : 425-233-8271
>>>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>>>> >>>>>
>>>>> >>>>
>>>>> >>>
>>>>> >>
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> busbey
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>
>>>>>
>>>>
>>>
>>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Saisai Shao <sa...@gmail.com>.
>>>By the way, some people noted that closing PRs may discourage
contributors. I think our open PR count alone is very discouraging. Under
what circumstances would you feel encouraged to open a PR against a project
that has hundreds of open PRs, some from many, many months ago
<https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
?

I think the original meaning of "discouraging contributors" is  closing
without specific technical reasons, or just lack of bandwidth. These PRs
may not be so important for committers/maintainers, but for individual
contributor especially new open source guy a simple fix for a famous
project means a lot. We actually can have other solutions like setting a
high bar beforehand to reduce the PR number.

Thanks
Jerry



On Tue, Apr 19, 2016 at 11:46 AM, Nicholas Chammas <
nicholas.chammas@gmail.com> wrote:

> Relevant: https://github.com/databricks/spark-pr-dashboard/issues/1
>
> A lot of this was discussed a while back when the PR Dashboard was first
> introduced, and several times before and after that as well. (e.g. August
> 2014
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Handling-stale-PRs-td8015.html>
> )
>
> If there is not enough momentum to build the tooling that people are
> discussing here, then perhaps Reynold's suggestion is the most practical
> one that is likely to see the light of day.
>
> I think asking committers to be more active in commenting on PRs is
> theoretically the correct thing to do, but impractical. I'm not a
> committer, but I would guess that most of them are already way
> overcommitted (ha!) and asking them to do more just won't yield results.
>
> We've had several instances in the past where we all tried to rally
> <https://mail-archives.apache.org/mod_mbox/spark-dev/201412.mbox/%3CCAOhmDzeR4cG_wXgKTOxsG8s34KrQEzYgjFZDOYMgu9VhYJBRDg@mail.gmail.com%3E>
> and be more proactive about giving feedback, closing PRs, and nudging
> contributors who have gone silent. My observation is that the level of
> energy required to "properly" curate PR activity in that way is simply not
> sustainable. People can do it for a few weeks and then things revert to the
> way they are now.
>
> Perhaps the missing link that would make this sustainable is better
> tooling. If you think so and can sling some Javascript, you might want to
> contribute to the PR Dashboard <https://spark-prs.appspot.com/>.
>
> Perhaps the missing link is something else: A different PR review process;
> more committers; a higher barrier to contributing; a combination thereof;
> etc...
>
> Also relevant: http://danluu.com/discourage-oss/
>
> By the way, some people noted that closing PRs may discourage
> contributors. I think our open PR count alone is very discouraging. Under
> what circumstances would you feel encouraged to open a PR against a project
> that has hundreds of open PRs, some from many, many months ago
> <https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
> ?
>
> Nick
>
>
> 2016년 4월 18일 (월) 오후 10:30, Ted Yu <yu...@gmail.com>님이 작성:
>
>> During the months of November / December, the 30 day period should be
>> relaxed.
>>
>> Some people(at least in US) may take extended vacation during that time.
>>
>> For Chinese developers, Spring Festival would bear similar circumstance.
>>
>> On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>>
>>> I also think this might not have to be closed only because it is
>>> inactive.
>>>
>>>
>>> How about closing issues after 30 days when a committer's comment is
>>> added at the last without responses from the author?
>>>
>>>
>>> IMHO, If the committers are not sure whether the patch would be useful,
>>> then I think they should leave some comments why they are not sure, not
>>> just ignoring.
>>>
>>> Or, simply they could ask the author to prove that the patch is useful
>>> or safe with some references and tests.
>>>
>>>
>>> I think it might be nicer than that users are supposed to keep pinging.
>>> **Personally**, apparently, I am sometimes a bit worried if pinging
>>> multiple times can be a bit annoying.
>>>
>>>
>>>
>>> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:
>>>
>>>> It would be better to have a specific technical reason why this PR
>>>> should be closed, either the implementation is not good or the problem is
>>>> not valid, or something else. That will actually help the contributor to
>>>> shape their codes and reopen the PR again. Otherwise reasons like "feel
>>>> free to reopen for so-and-so reason" is actually discouraging and no
>>>> difference than directly close the PR.
>>>>
>>>> Just my two cents.
>>>>
>>>> Thanks
>>>> Jerry
>>>>
>>>>
>>>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Having a PR closed, especially if due to committers not having hte
>>>>> bandwidth to check on things, will be very discouraging to new folks.
>>>>> Doubly so for those inexperienced with opensource. Even if the message
>>>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>>>> confidence are going to see reopening as "pestering" and busy folks
>>>>> are going to see it as a clear indication that their work is not even
>>>>> valuable enough for a human to give a reason for closing. In either
>>>>> case, the cost of reopening is substantially higher than that button
>>>>> press.
>>>>>
>>>>> How about we start by keeping a report of "at-risk" PRs that have been
>>>>> stale for 30 days to make it easier for committers to look at the prs
>>>>> that have been long inactive?
>>>>>
>>>>>
>>>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> > The cost of "reopen" is close to zero, because it is just clicking a
>>>>> button.
>>>>> > I think you were referring to the cost of closing the pull request,
>>>>> and you
>>>>> > are assuming people look at the pull requests that have been
>>>>> inactive for a
>>>>> > long time. That seems equally likely (or unlikely) as committers
>>>>> looking at
>>>>> > the recently closed pull requests.
>>>>> >
>>>>> > In either case, most pull requests are scanned through by us when
>>>>> they are
>>>>> > first open, and if they are important enough, usually they get merged
>>>>> > quickly or a target version is set in JIRA. We can definitely
>>>>> improve that
>>>>> > by making it more explicit.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> From committers' perspective, would they look at closed PRs ?
>>>>> >>
>>>>> >> If not, the cost is not close to zero.
>>>>> >> Meaning, some potentially useful PRs would never see the light of
>>>>> day.
>>>>> >>
>>>>> >> My two cents.
>>>>> >>
>>>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> Part of it is how difficult it is to automate this. We can build a
>>>>> >>> perfect engine with a lot of rules that understand everything. But
>>>>> the more
>>>>> >>> complicated rules we need, the more unlikely for any of these to
>>>>> happen. So
>>>>> >>> I'd rather do this and create a nice enough message to tell
>>>>> contributors
>>>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>>>> zero (i.e.
>>>>> >>> click a button on the pull request).
>>>>> >>>
>>>>> >>>
>>>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com>
>>>>> wrote:
>>>>> >>>>
>>>>> >>>> bq. close the ones where they don't respond for a week
>>>>> >>>>
>>>>> >>>> Does this imply that the script understands response from human ?
>>>>> >>>>
>>>>> >>>> Meaning, would the script use some regex which signifies that the
>>>>> >>>> contributor is willing to close the PR ?
>>>>> >>>>
>>>>> >>>> If the contributor is willing to close, why wouldn't he / she do
>>>>> it
>>>>> >>>> him/herself ?
>>>>> >>>>
>>>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>>>> holden@pigscanfly.ca>
>>>>> >>>> wrote:
>>>>> >>>>>
>>>>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>>> >>>>> understand wanting to keep the open PRs limited to ones which
>>>>> have a
>>>>> >>>>> reasonable chance of being merged.
>>>>> >>>>>
>>>>> >>>>> What about if we filtered for non-mergeable PRs or instead left a
>>>>> >>>>> comment asking the author to respond if they are still available
>>>>> to move the
>>>>> >>>>> PR forward - and close the ones where they don't respond for a
>>>>> week?
>>>>> >>>>>
>>>>> >>>>> Just a suggestion.
>>>>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> I had one PR which got merged after 3 months.
>>>>> >>>>>>
>>>>> >>>>>> If the inactivity was due to contributor, I think it can be
>>>>> closed
>>>>> >>>>>> after 30 days.
>>>>> >>>>>> But if the inactivity was due to lack of review, the PR should
>>>>> be kept
>>>>> >>>>>> open.
>>>>> >>>>>>
>>>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>>>>> cody@koeninger.org>
>>>>> >>>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> For what it's worth, I have definitely had PRs that sat
>>>>> inactive for
>>>>> >>>>>>> more than 30 days due to committers not having time to look at
>>>>> them,
>>>>> >>>>>>> but did eventually end up successfully being merged.
>>>>> >>>>>>>
>>>>> >>>>>>> I guess if this just ends up being a committer ping and
>>>>> reopening the
>>>>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>>>>> underlying
>>>>> >>>>>>> issue.
>>>>> >>>>>>>
>>>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <
>>>>> rxin@databricks.com>
>>>>> >>>>>>> wrote:
>>>>> >>>>>>> > We have hit a new high in open pull requests: 469 today.
>>>>> While we
>>>>> >>>>>>> > can
>>>>> >>>>>>> > certainly get more review bandwidth, many of these are old
>>>>> and
>>>>> >>>>>>> > still open
>>>>> >>>>>>> > for other reasons. Some are stale because the original
>>>>> authors have
>>>>> >>>>>>> > become
>>>>> >>>>>>> > busy and inactive, and some others are stale because the
>>>>> committers
>>>>> >>>>>>> > are not
>>>>> >>>>>>> > sure whether the patch would be useful, but have not
>>>>> rejected the
>>>>> >>>>>>> > patch
>>>>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by
>>>>> closing
>>>>> >>>>>>> > pull
>>>>> >>>>>>> > requests that have been inactive for greater than 30 days,
>>>>> with a
>>>>> >>>>>>> > nice
>>>>> >>>>>>> > message. I just checked and this would close ~ half of the
>>>>> pull
>>>>> >>>>>>> > requests.
>>>>> >>>>>>> >
>>>>> >>>>>>> > For example:
>>>>> >>>>>>> >
>>>>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>>>>> request
>>>>> >>>>>>> > has been
>>>>> >>>>>>> > inactive for 30 days, we are automatically closing it.
>>>>> Closing the
>>>>> >>>>>>> > pull
>>>>> >>>>>>> > request does not remove it from history and will retain all
>>>>> the
>>>>> >>>>>>> > diff and
>>>>> >>>>>>> > review comments. If you have the bandwidth and would like to
>>>>> >>>>>>> > continue
>>>>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>> >>>>>>>
>>>>> >>>>>>
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> --
>>>>> >>>>> Cell : 425-233-8271
>>>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>>>> >>>>>
>>>>> >>>>
>>>>> >>>
>>>>> >>
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> busbey
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>
>>>>>
>>>>
>>>
>>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Ted Yu <yu...@gmail.com>.
bq. there should be more committers or they are asked to be more active.

Bingo.

bq. they can't be closed only because it is "expired" with a copy and
pasted message.

+1

On Mon, Apr 18, 2016 at 9:14 PM, Hyukjin Kwon <gu...@gmail.com> wrote:

> I don't think asking committers to be more active is impractical. I am
> not too sure if other projects apply the same rules here but
>
> I think if a project is being more popular, then I think it is appropriate
> that there should be more committers or they are asked to be more active.
>
>
> In addition, I believe there are a lot of PRs waiting for committer's
> comments.
>
>
> If committers are too busy to review a PR, then I think they better ask
> authors to provide the evidence to decide, maybe with a message such as
>
> "I am currently too busy to review or decide. Could you please add some evidence/benchmark/performance
> test or survey for demands?"
>
>
> If the evidence is not enough or not easy to see, then they can ask to
> simplify the evidence or make a proper conclusion, maybe with a message
> such as
>
> "I think the evidence is not enough/trustable because .... Could you
> please simplify/provide some more evidence?".
>
>
>
> Or, I think they can be manually closed with a explicit message such as
>
> "This is closed for now because we are not sure for this patch because.."
>
>
> I think they can't be closed only because it is "expired" with a copy and
> pasted message.
>
>
>
> 2016-04-19 12:46 GMT+09:00 Nicholas Chammas <ni...@gmail.com>:
>
>> Relevant: https://github.com/databricks/spark-pr-dashboard/issues/1
>>
>> A lot of this was discussed a while back when the PR Dashboard was first
>> introduced, and several times before and after that as well. (e.g. August
>> 2014
>> <http://apache-spark-developers-list.1001551.n3.nabble.com/Handling-stale-PRs-td8015.html>
>> )
>>
>> If there is not enough momentum to build the tooling that people are
>> discussing here, then perhaps Reynold's suggestion is the most practical
>> one that is likely to see the light of day.
>>
>> I think asking committers to be more active in commenting on PRs is
>> theoretically the correct thing to do, but impractical. I'm not a
>> committer, but I would guess that most of them are already way
>> overcommitted (ha!) and asking them to do more just won't yield results.
>>
>> We've had several instances in the past where we all tried to rally
>> <https://mail-archives.apache.org/mod_mbox/spark-dev/201412.mbox/%3CCAOhmDzeR4cG_wXgKTOxsG8s34KrQEzYgjFZDOYMgu9VhYJBRDg@mail.gmail.com%3E>
>> and be more proactive about giving feedback, closing PRs, and nudging
>> contributors who have gone silent. My observation is that the level of
>> energy required to "properly" curate PR activity in that way is simply not
>> sustainable. People can do it for a few weeks and then things revert to the
>> way they are now.
>>
>> Perhaps the missing link that would make this sustainable is better
>> tooling. If you think so and can sling some Javascript, you might want
>> to contribute to the PR Dashboard <https://spark-prs.appspot.com/>.
>>
>> Perhaps the missing link is something else: A different PR review
>> process; more committers; a higher barrier to contributing; a combination
>> thereof; etc...
>>
>> Also relevant: http://danluu.com/discourage-oss/
>>
>> By the way, some people noted that closing PRs may discourage
>> contributors. I think our open PR count alone is very discouraging. Under
>> what circumstances would you feel encouraged to open a PR against a project
>> that has hundreds of open PRs, some from many, many months ago
>> <https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
>> ?
>>
>> Nick
>>
>>
>> 2016년 4월 18일 (월) 오후 10:30, Ted Yu <yu...@gmail.com>님이 작성:
>>
>>> During the months of November / December, the 30 day period should be
>>> relaxed.
>>>
>>> Some people(at least in US) may take extended vacation during that time.
>>>
>>> For Chinese developers, Spring Festival would bear similar circumstance.
>>>
>>> On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>>
>>>> I also think this might not have to be closed only because it is
>>>> inactive.
>>>>
>>>>
>>>> How about closing issues after 30 days when a committer's comment is
>>>> added at the last without responses from the author?
>>>>
>>>>
>>>> IMHO, If the committers are not sure whether the patch would be
>>>> useful, then I think they should leave some comments why they are not sure,
>>>> not just ignoring.
>>>>
>>>> Or, simply they could ask the author to prove that the patch is useful
>>>> or safe with some references and tests.
>>>>
>>>>
>>>> I think it might be nicer than that users are supposed to keep pinging.
>>>> **Personally**, apparently, I am sometimes a bit worried if pinging
>>>> multiple times can be a bit annoying.
>>>>
>>>>
>>>>
>>>> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:
>>>>
>>>>> It would be better to have a specific technical reason why this PR
>>>>> should be closed, either the implementation is not good or the problem is
>>>>> not valid, or something else. That will actually help the contributor to
>>>>> shape their codes and reopen the PR again. Otherwise reasons like "feel
>>>>> free to reopen for so-and-so reason" is actually discouraging and no
>>>>> difference than directly close the PR.
>>>>>
>>>>> Just my two cents.
>>>>>
>>>>> Thanks
>>>>> Jerry
>>>>>
>>>>>
>>>>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Having a PR closed, especially if due to committers not having hte
>>>>>> bandwidth to check on things, will be very discouraging to new folks.
>>>>>> Doubly so for those inexperienced with opensource. Even if the message
>>>>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>>>>> confidence are going to see reopening as "pestering" and busy folks
>>>>>> are going to see it as a clear indication that their work is not even
>>>>>> valuable enough for a human to give a reason for closing. In either
>>>>>> case, the cost of reopening is substantially higher than that button
>>>>>> press.
>>>>>>
>>>>>> How about we start by keeping a report of "at-risk" PRs that have been
>>>>>> stale for 30 days to make it easier for committers to look at the prs
>>>>>> that have been long inactive?
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com>
>>>>>> wrote:
>>>>>> > The cost of "reopen" is close to zero, because it is just clicking
>>>>>> a button.
>>>>>> > I think you were referring to the cost of closing the pull request,
>>>>>> and you
>>>>>> > are assuming people look at the pull requests that have been
>>>>>> inactive for a
>>>>>> > long time. That seems equally likely (or unlikely) as committers
>>>>>> looking at
>>>>>> > the recently closed pull requests.
>>>>>> >
>>>>>> > In either case, most pull requests are scanned through by us when
>>>>>> they are
>>>>>> > first open, and if they are important enough, usually they get
>>>>>> merged
>>>>>> > quickly or a target version is set in JIRA. We can definitely
>>>>>> improve that
>>>>>> > by making it more explicit.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> From committers' perspective, would they look at closed PRs ?
>>>>>> >>
>>>>>> >> If not, the cost is not close to zero.
>>>>>> >> Meaning, some potentially useful PRs would never see the light of
>>>>>> day.
>>>>>> >>
>>>>>> >> My two cents.
>>>>>> >>
>>>>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> Part of it is how difficult it is to automate this. We can build a
>>>>>> >>> perfect engine with a lot of rules that understand everything.
>>>>>> But the more
>>>>>> >>> complicated rules we need, the more unlikely for any of these to
>>>>>> happen. So
>>>>>> >>> I'd rather do this and create a nice enough message to tell
>>>>>> contributors
>>>>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>>>>> zero (i.e.
>>>>>> >>> click a button on the pull request).
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com>
>>>>>> wrote:
>>>>>> >>>>
>>>>>> >>>> bq. close the ones where they don't respond for a week
>>>>>> >>>>
>>>>>> >>>> Does this imply that the script understands response from human ?
>>>>>> >>>>
>>>>>> >>>> Meaning, would the script use some regex which signifies that the
>>>>>> >>>> contributor is willing to close the PR ?
>>>>>> >>>>
>>>>>> >>>> If the contributor is willing to close, why wouldn't he / she do
>>>>>> it
>>>>>> >>>> him/herself ?
>>>>>> >>>>
>>>>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>>>>> holden@pigscanfly.ca>
>>>>>> >>>> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>>>> >>>>> understand wanting to keep the open PRs limited to ones which
>>>>>> have a
>>>>>> >>>>> reasonable chance of being merged.
>>>>>> >>>>>
>>>>>> >>>>> What about if we filtered for non-mergeable PRs or instead left
>>>>>> a
>>>>>> >>>>> comment asking the author to respond if they are still
>>>>>> available to move the
>>>>>> >>>>> PR forward - and close the ones where they don't respond for a
>>>>>> week?
>>>>>> >>>>>
>>>>>> >>>>> Just a suggestion.
>>>>>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>>>> >>>>>>
>>>>>> >>>>>> I had one PR which got merged after 3 months.
>>>>>> >>>>>>
>>>>>> >>>>>> If the inactivity was due to contributor, I think it can be
>>>>>> closed
>>>>>> >>>>>> after 30 days.
>>>>>> >>>>>> But if the inactivity was due to lack of review, the PR should
>>>>>> be kept
>>>>>> >>>>>> open.
>>>>>> >>>>>>
>>>>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>>>>>> cody@koeninger.org>
>>>>>> >>>>>> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> For what it's worth, I have definitely had PRs that sat
>>>>>> inactive for
>>>>>> >>>>>>> more than 30 days due to committers not having time to look
>>>>>> at them,
>>>>>> >>>>>>> but did eventually end up successfully being merged.
>>>>>> >>>>>>>
>>>>>> >>>>>>> I guess if this just ends up being a committer ping and
>>>>>> reopening the
>>>>>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>>>>>> underlying
>>>>>> >>>>>>> issue.
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <
>>>>>> rxin@databricks.com>
>>>>>> >>>>>>> wrote:
>>>>>> >>>>>>> > We have hit a new high in open pull requests: 469 today.
>>>>>> While we
>>>>>> >>>>>>> > can
>>>>>> >>>>>>> > certainly get more review bandwidth, many of these are old
>>>>>> and
>>>>>> >>>>>>> > still open
>>>>>> >>>>>>> > for other reasons. Some are stale because the original
>>>>>> authors have
>>>>>> >>>>>>> > become
>>>>>> >>>>>>> > busy and inactive, and some others are stale because the
>>>>>> committers
>>>>>> >>>>>>> > are not
>>>>>> >>>>>>> > sure whether the patch would be useful, but have not
>>>>>> rejected the
>>>>>> >>>>>>> > patch
>>>>>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by
>>>>>> closing
>>>>>> >>>>>>> > pull
>>>>>> >>>>>>> > requests that have been inactive for greater than 30 days,
>>>>>> with a
>>>>>> >>>>>>> > nice
>>>>>> >>>>>>> > message. I just checked and this would close ~ half of the
>>>>>> pull
>>>>>> >>>>>>> > requests.
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > For example:
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>>>>>> request
>>>>>> >>>>>>> > has been
>>>>>> >>>>>>> > inactive for 30 days, we are automatically closing it.
>>>>>> Closing the
>>>>>> >>>>>>> > pull
>>>>>> >>>>>>> > request does not remove it from history and will retain all
>>>>>> the
>>>>>> >>>>>>> > diff and
>>>>>> >>>>>>> > review comments. If you have the bandwidth and would like to
>>>>>> >>>>>>> > continue
>>>>>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>>> >>>>>>> >
>>>>>> >>>>>>> >
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>> >>>>>>>
>>>>>> >>>>>>
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> --
>>>>>> >>>>> Cell : 425-233-8271
>>>>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>
>>>>>> >>
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> busbey
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Hyukjin Kwon <gu...@gmail.com>.
I agree with you, Sean, almost all.

If feedback can be given enough, it might be okay to apply the rule as
Reynold said it looks they are not mutually exclusive (although I still
think there should be more committers or should be more active because the
main reason for this looks there are PRs not reviewed or closed without
feedback enough. I think there should be some comments even though it
closes the issue and the comments are "I don't think it is worth adding
because...". This could help develop contributer's logical process which
would eventually decrease committer's help).

FWIW, I strongly feel at least those 10% of other remaining "good" PRs
should be given feedback and comments enough because **Personally** I am
pretty sure that the PRs closed by the rule would never be reopened
including those 10% and I believe the copy-and-pasted message would do
nothing but just notifying "closing this as it is expired". We don't really
care how nice the messages left by Jenkins are.

Lastly, I think being open and closed might mean something and that might
be the reason why there are some PRs not closed even though comitters think
they don't think it is worth adding.
On 19 Apr 2016 1:14 p.m., "Hyukjin Kwon" <gu...@gmail.com> wrote:

> I don't think asking committers to be more active is impractical. I am
> not too sure if other projects apply the same rules here but
>
> I think if a project is being more popular, then I think it is appropriate
> that there should be more committers or they are asked to be more active.
>
>
> In addition, I believe there are a lot of PRs waiting for committer's
> comments.
>
>
> If committers are too busy to review a PR, then I think they better ask
> authors to provide the evidence to decide, maybe with a message such as
>
> "I am currently too busy to review or decide. Could you please add some evidence/benchmark/performance
> test or survey for demands?"
>
>
> If the evidence is not enough or not easy to see, then they can ask to
> simplify the evidence or make a proper conclusion, maybe with a message
> such as
>
> "I think the evidence is not enough/trustable because .... Could you
> please simplify/provide some more evidence?".
>
>
>
> Or, I think they can be manually closed with a explicit message such as
>
> "This is closed for now because we are not sure for this patch because.."
>
>
> I think they can't be closed only because it is "expired" with a copy and
> pasted message.
>
>
>
> 2016-04-19 12:46 GMT+09:00 Nicholas Chammas <ni...@gmail.com>:
>
>> Relevant: https://github.com/databricks/spark-pr-dashboard/issues/1
>>
>> A lot of this was discussed a while back when the PR Dashboard was first
>> introduced, and several times before and after that as well. (e.g. August
>> 2014
>> <http://apache-spark-developers-list.1001551.n3.nabble.com/Handling-stale-PRs-td8015.html>
>> )
>>
>> If there is not enough momentum to build the tooling that people are
>> discussing here, then perhaps Reynold's suggestion is the most practical
>> one that is likely to see the light of day.
>>
>> I think asking committers to be more active in commenting on PRs is
>> theoretically the correct thing to do, but impractical. I'm not a
>> committer, but I would guess that most of them are already way
>> overcommitted (ha!) and asking them to do more just won't yield results.
>>
>> We've had several instances in the past where we all tried to rally
>> <https://mail-archives.apache.org/mod_mbox/spark-dev/201412.mbox/%3CCAOhmDzeR4cG_wXgKTOxsG8s34KrQEzYgjFZDOYMgu9VhYJBRDg@mail.gmail.com%3E>
>> and be more proactive about giving feedback, closing PRs, and nudging
>> contributors who have gone silent. My observation is that the level of
>> energy required to "properly" curate PR activity in that way is simply not
>> sustainable. People can do it for a few weeks and then things revert to the
>> way they are now.
>>
>> Perhaps the missing link that would make this sustainable is better
>> tooling. If you think so and can sling some Javascript, you might want
>> to contribute to the PR Dashboard <https://spark-prs.appspot.com/>.
>>
>> Perhaps the missing link is something else: A different PR review
>> process; more committers; a higher barrier to contributing; a combination
>> thereof; etc...
>>
>> Also relevant: http://danluu.com/discourage-oss/
>>
>> By the way, some people noted that closing PRs may discourage
>> contributors. I think our open PR count alone is very discouraging. Under
>> what circumstances would you feel encouraged to open a PR against a project
>> that has hundreds of open PRs, some from many, many months ago
>> <https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
>> ?
>>
>> Nick
>>
>>
>> 2016년 4월 18일 (월) 오후 10:30, Ted Yu <yu...@gmail.com>님이 작성:
>>
>>> During the months of November / December, the 30 day period should be
>>> relaxed.
>>>
>>> Some people(at least in US) may take extended vacation during that time.
>>>
>>> For Chinese developers, Spring Festival would bear similar circumstance.
>>>
>>> On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gu...@gmail.com>
>>> wrote:
>>>
>>>> I also think this might not have to be closed only because it is
>>>> inactive.
>>>>
>>>>
>>>> How about closing issues after 30 days when a committer's comment is
>>>> added at the last without responses from the author?
>>>>
>>>>
>>>> IMHO, If the committers are not sure whether the patch would be
>>>> useful, then I think they should leave some comments why they are not sure,
>>>> not just ignoring.
>>>>
>>>> Or, simply they could ask the author to prove that the patch is useful
>>>> or safe with some references and tests.
>>>>
>>>>
>>>> I think it might be nicer than that users are supposed to keep pinging.
>>>> **Personally**, apparently, I am sometimes a bit worried if pinging
>>>> multiple times can be a bit annoying.
>>>>
>>>>
>>>>
>>>> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:
>>>>
>>>>> It would be better to have a specific technical reason why this PR
>>>>> should be closed, either the implementation is not good or the problem is
>>>>> not valid, or something else. That will actually help the contributor to
>>>>> shape their codes and reopen the PR again. Otherwise reasons like "feel
>>>>> free to reopen for so-and-so reason" is actually discouraging and no
>>>>> difference than directly close the PR.
>>>>>
>>>>> Just my two cents.
>>>>>
>>>>> Thanks
>>>>> Jerry
>>>>>
>>>>>
>>>>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Having a PR closed, especially if due to committers not having hte
>>>>>> bandwidth to check on things, will be very discouraging to new folks.
>>>>>> Doubly so for those inexperienced with opensource. Even if the message
>>>>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>>>>> confidence are going to see reopening as "pestering" and busy folks
>>>>>> are going to see it as a clear indication that their work is not even
>>>>>> valuable enough for a human to give a reason for closing. In either
>>>>>> case, the cost of reopening is substantially higher than that button
>>>>>> press.
>>>>>>
>>>>>> How about we start by keeping a report of "at-risk" PRs that have been
>>>>>> stale for 30 days to make it easier for committers to look at the prs
>>>>>> that have been long inactive?
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com>
>>>>>> wrote:
>>>>>> > The cost of "reopen" is close to zero, because it is just clicking
>>>>>> a button.
>>>>>> > I think you were referring to the cost of closing the pull request,
>>>>>> and you
>>>>>> > are assuming people look at the pull requests that have been
>>>>>> inactive for a
>>>>>> > long time. That seems equally likely (or unlikely) as committers
>>>>>> looking at
>>>>>> > the recently closed pull requests.
>>>>>> >
>>>>>> > In either case, most pull requests are scanned through by us when
>>>>>> they are
>>>>>> > first open, and if they are important enough, usually they get
>>>>>> merged
>>>>>> > quickly or a target version is set in JIRA. We can definitely
>>>>>> improve that
>>>>>> > by making it more explicit.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> From committers' perspective, would they look at closed PRs ?
>>>>>> >>
>>>>>> >> If not, the cost is not close to zero.
>>>>>> >> Meaning, some potentially useful PRs would never see the light of
>>>>>> day.
>>>>>> >>
>>>>>> >> My two cents.
>>>>>> >>
>>>>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> Part of it is how difficult it is to automate this. We can build a
>>>>>> >>> perfect engine with a lot of rules that understand everything.
>>>>>> But the more
>>>>>> >>> complicated rules we need, the more unlikely for any of these to
>>>>>> happen. So
>>>>>> >>> I'd rather do this and create a nice enough message to tell
>>>>>> contributors
>>>>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>>>>> zero (i.e.
>>>>>> >>> click a button on the pull request).
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com>
>>>>>> wrote:
>>>>>> >>>>
>>>>>> >>>> bq. close the ones where they don't respond for a week
>>>>>> >>>>
>>>>>> >>>> Does this imply that the script understands response from human ?
>>>>>> >>>>
>>>>>> >>>> Meaning, would the script use some regex which signifies that the
>>>>>> >>>> contributor is willing to close the PR ?
>>>>>> >>>>
>>>>>> >>>> If the contributor is willing to close, why wouldn't he / she do
>>>>>> it
>>>>>> >>>> him/herself ?
>>>>>> >>>>
>>>>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>>>>> holden@pigscanfly.ca>
>>>>>> >>>> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>>>> >>>>> understand wanting to keep the open PRs limited to ones which
>>>>>> have a
>>>>>> >>>>> reasonable chance of being merged.
>>>>>> >>>>>
>>>>>> >>>>> What about if we filtered for non-mergeable PRs or instead left
>>>>>> a
>>>>>> >>>>> comment asking the author to respond if they are still
>>>>>> available to move the
>>>>>> >>>>> PR forward - and close the ones where they don't respond for a
>>>>>> week?
>>>>>> >>>>>
>>>>>> >>>>> Just a suggestion.
>>>>>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>>>> >>>>>>
>>>>>> >>>>>> I had one PR which got merged after 3 months.
>>>>>> >>>>>>
>>>>>> >>>>>> If the inactivity was due to contributor, I think it can be
>>>>>> closed
>>>>>> >>>>>> after 30 days.
>>>>>> >>>>>> But if the inactivity was due to lack of review, the PR should
>>>>>> be kept
>>>>>> >>>>>> open.
>>>>>> >>>>>>
>>>>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>>>>>> cody@koeninger.org>
>>>>>> >>>>>> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> For what it's worth, I have definitely had PRs that sat
>>>>>> inactive for
>>>>>> >>>>>>> more than 30 days due to committers not having time to look
>>>>>> at them,
>>>>>> >>>>>>> but did eventually end up successfully being merged.
>>>>>> >>>>>>>
>>>>>> >>>>>>> I guess if this just ends up being a committer ping and
>>>>>> reopening the
>>>>>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>>>>>> underlying
>>>>>> >>>>>>> issue.
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <
>>>>>> rxin@databricks.com>
>>>>>> >>>>>>> wrote:
>>>>>> >>>>>>> > We have hit a new high in open pull requests: 469 today.
>>>>>> While we
>>>>>> >>>>>>> > can
>>>>>> >>>>>>> > certainly get more review bandwidth, many of these are old
>>>>>> and
>>>>>> >>>>>>> > still open
>>>>>> >>>>>>> > for other reasons. Some are stale because the original
>>>>>> authors have
>>>>>> >>>>>>> > become
>>>>>> >>>>>>> > busy and inactive, and some others are stale because the
>>>>>> committers
>>>>>> >>>>>>> > are not
>>>>>> >>>>>>> > sure whether the patch would be useful, but have not
>>>>>> rejected the
>>>>>> >>>>>>> > patch
>>>>>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by
>>>>>> closing
>>>>>> >>>>>>> > pull
>>>>>> >>>>>>> > requests that have been inactive for greater than 30 days,
>>>>>> with a
>>>>>> >>>>>>> > nice
>>>>>> >>>>>>> > message. I just checked and this would close ~ half of the
>>>>>> pull
>>>>>> >>>>>>> > requests.
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > For example:
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>>>>>> request
>>>>>> >>>>>>> > has been
>>>>>> >>>>>>> > inactive for 30 days, we are automatically closing it.
>>>>>> Closing the
>>>>>> >>>>>>> > pull
>>>>>> >>>>>>> > request does not remove it from history and will retain all
>>>>>> the
>>>>>> >>>>>>> > diff and
>>>>>> >>>>>>> > review comments. If you have the bandwidth and would like to
>>>>>> >>>>>>> > continue
>>>>>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>>> >>>>>>> >
>>>>>> >>>>>>> >
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>> >>>>>>>
>>>>>> >>>>>>
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> --
>>>>>> >>>>> Cell : 425-233-8271
>>>>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>
>>>>>> >>
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> busbey
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Hyukjin Kwon <gu...@gmail.com>.
I don't think asking committers to be more active is impractical. I am not
too sure if other projects apply the same rules here but

I think if a project is being more popular, then I think it is appropriate
that there should be more committers or they are asked to be more active.


In addition, I believe there are a lot of PRs waiting for committer's
comments.


If committers are too busy to review a PR, then I think they better ask
authors to provide the evidence to decide, maybe with a message such as

"I am currently too busy to review or decide. Could you please add
some evidence/benchmark/performance
test or survey for demands?"


If the evidence is not enough or not easy to see, then they can ask to
simplify the evidence or make a proper conclusion, maybe with a message
such as

"I think the evidence is not enough/trustable because .... Could you
please simplify/provide
some more evidence?".



Or, I think they can be manually closed with a explicit message such as

"This is closed for now because we are not sure for this patch because.."


I think they can't be closed only because it is "expired" with a copy and
pasted message.



2016-04-19 12:46 GMT+09:00 Nicholas Chammas <ni...@gmail.com>:

> Relevant: https://github.com/databricks/spark-pr-dashboard/issues/1
>
> A lot of this was discussed a while back when the PR Dashboard was first
> introduced, and several times before and after that as well. (e.g. August
> 2014
> <http://apache-spark-developers-list.1001551.n3.nabble.com/Handling-stale-PRs-td8015.html>
> )
>
> If there is not enough momentum to build the tooling that people are
> discussing here, then perhaps Reynold's suggestion is the most practical
> one that is likely to see the light of day.
>
> I think asking committers to be more active in commenting on PRs is
> theoretically the correct thing to do, but impractical. I'm not a
> committer, but I would guess that most of them are already way
> overcommitted (ha!) and asking them to do more just won't yield results.
>
> We've had several instances in the past where we all tried to rally
> <https://mail-archives.apache.org/mod_mbox/spark-dev/201412.mbox/%3CCAOhmDzeR4cG_wXgKTOxsG8s34KrQEzYgjFZDOYMgu9VhYJBRDg@mail.gmail.com%3E>
> and be more proactive about giving feedback, closing PRs, and nudging
> contributors who have gone silent. My observation is that the level of
> energy required to "properly" curate PR activity in that way is simply not
> sustainable. People can do it for a few weeks and then things revert to the
> way they are now.
>
> Perhaps the missing link that would make this sustainable is better
> tooling. If you think so and can sling some Javascript, you might want to
> contribute to the PR Dashboard <https://spark-prs.appspot.com/>.
>
> Perhaps the missing link is something else: A different PR review process;
> more committers; a higher barrier to contributing; a combination thereof;
> etc...
>
> Also relevant: http://danluu.com/discourage-oss/
>
> By the way, some people noted that closing PRs may discourage
> contributors. I think our open PR count alone is very discouraging. Under
> what circumstances would you feel encouraged to open a PR against a project
> that has hundreds of open PRs, some from many, many months ago
> <https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
> ?
>
> Nick
>
>
> 2016년 4월 18일 (월) 오후 10:30, Ted Yu <yu...@gmail.com>님이 작성:
>
>> During the months of November / December, the 30 day period should be
>> relaxed.
>>
>> Some people(at least in US) may take extended vacation during that time.
>>
>> For Chinese developers, Spring Festival would bear similar circumstance.
>>
>> On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gu...@gmail.com>
>> wrote:
>>
>>> I also think this might not have to be closed only because it is
>>> inactive.
>>>
>>>
>>> How about closing issues after 30 days when a committer's comment is
>>> added at the last without responses from the author?
>>>
>>>
>>> IMHO, If the committers are not sure whether the patch would be useful,
>>> then I think they should leave some comments why they are not sure, not
>>> just ignoring.
>>>
>>> Or, simply they could ask the author to prove that the patch is useful
>>> or safe with some references and tests.
>>>
>>>
>>> I think it might be nicer than that users are supposed to keep pinging.
>>> **Personally**, apparently, I am sometimes a bit worried if pinging
>>> multiple times can be a bit annoying.
>>>
>>>
>>>
>>> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:
>>>
>>>> It would be better to have a specific technical reason why this PR
>>>> should be closed, either the implementation is not good or the problem is
>>>> not valid, or something else. That will actually help the contributor to
>>>> shape their codes and reopen the PR again. Otherwise reasons like "feel
>>>> free to reopen for so-and-so reason" is actually discouraging and no
>>>> difference than directly close the PR.
>>>>
>>>> Just my two cents.
>>>>
>>>> Thanks
>>>> Jerry
>>>>
>>>>
>>>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com>
>>>> wrote:
>>>>
>>>>> Having a PR closed, especially if due to committers not having hte
>>>>> bandwidth to check on things, will be very discouraging to new folks.
>>>>> Doubly so for those inexperienced with opensource. Even if the message
>>>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>>>> confidence are going to see reopening as "pestering" and busy folks
>>>>> are going to see it as a clear indication that their work is not even
>>>>> valuable enough for a human to give a reason for closing. In either
>>>>> case, the cost of reopening is substantially higher than that button
>>>>> press.
>>>>>
>>>>> How about we start by keeping a report of "at-risk" PRs that have been
>>>>> stale for 30 days to make it easier for committers to look at the prs
>>>>> that have been long inactive?
>>>>>
>>>>>
>>>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> > The cost of "reopen" is close to zero, because it is just clicking a
>>>>> button.
>>>>> > I think you were referring to the cost of closing the pull request,
>>>>> and you
>>>>> > are assuming people look at the pull requests that have been
>>>>> inactive for a
>>>>> > long time. That seems equally likely (or unlikely) as committers
>>>>> looking at
>>>>> > the recently closed pull requests.
>>>>> >
>>>>> > In either case, most pull requests are scanned through by us when
>>>>> they are
>>>>> > first open, and if they are important enough, usually they get merged
>>>>> > quickly or a target version is set in JIRA. We can definitely
>>>>> improve that
>>>>> > by making it more explicit.
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com>
>>>>> wrote:
>>>>> >>
>>>>> >> From committers' perspective, would they look at closed PRs ?
>>>>> >>
>>>>> >> If not, the cost is not close to zero.
>>>>> >> Meaning, some potentially useful PRs would never see the light of
>>>>> day.
>>>>> >>
>>>>> >> My two cents.
>>>>> >>
>>>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> >>>
>>>>> >>> Part of it is how difficult it is to automate this. We can build a
>>>>> >>> perfect engine with a lot of rules that understand everything. But
>>>>> the more
>>>>> >>> complicated rules we need, the more unlikely for any of these to
>>>>> happen. So
>>>>> >>> I'd rather do this and create a nice enough message to tell
>>>>> contributors
>>>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>>>> zero (i.e.
>>>>> >>> click a button on the pull request).
>>>>> >>>
>>>>> >>>
>>>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com>
>>>>> wrote:
>>>>> >>>>
>>>>> >>>> bq. close the ones where they don't respond for a week
>>>>> >>>>
>>>>> >>>> Does this imply that the script understands response from human ?
>>>>> >>>>
>>>>> >>>> Meaning, would the script use some regex which signifies that the
>>>>> >>>> contributor is willing to close the PR ?
>>>>> >>>>
>>>>> >>>> If the contributor is willing to close, why wouldn't he / she do
>>>>> it
>>>>> >>>> him/herself ?
>>>>> >>>>
>>>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>>>> holden@pigscanfly.ca>
>>>>> >>>> wrote:
>>>>> >>>>>
>>>>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>>> >>>>> understand wanting to keep the open PRs limited to ones which
>>>>> have a
>>>>> >>>>> reasonable chance of being merged.
>>>>> >>>>>
>>>>> >>>>> What about if we filtered for non-mergeable PRs or instead left a
>>>>> >>>>> comment asking the author to respond if they are still available
>>>>> to move the
>>>>> >>>>> PR forward - and close the ones where they don't respond for a
>>>>> week?
>>>>> >>>>>
>>>>> >>>>> Just a suggestion.
>>>>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> I had one PR which got merged after 3 months.
>>>>> >>>>>>
>>>>> >>>>>> If the inactivity was due to contributor, I think it can be
>>>>> closed
>>>>> >>>>>> after 30 days.
>>>>> >>>>>> But if the inactivity was due to lack of review, the PR should
>>>>> be kept
>>>>> >>>>>> open.
>>>>> >>>>>>
>>>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>>>>> cody@koeninger.org>
>>>>> >>>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> For what it's worth, I have definitely had PRs that sat
>>>>> inactive for
>>>>> >>>>>>> more than 30 days due to committers not having time to look at
>>>>> them,
>>>>> >>>>>>> but did eventually end up successfully being merged.
>>>>> >>>>>>>
>>>>> >>>>>>> I guess if this just ends up being a committer ping and
>>>>> reopening the
>>>>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>>>>> underlying
>>>>> >>>>>>> issue.
>>>>> >>>>>>>
>>>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <
>>>>> rxin@databricks.com>
>>>>> >>>>>>> wrote:
>>>>> >>>>>>> > We have hit a new high in open pull requests: 469 today.
>>>>> While we
>>>>> >>>>>>> > can
>>>>> >>>>>>> > certainly get more review bandwidth, many of these are old
>>>>> and
>>>>> >>>>>>> > still open
>>>>> >>>>>>> > for other reasons. Some are stale because the original
>>>>> authors have
>>>>> >>>>>>> > become
>>>>> >>>>>>> > busy and inactive, and some others are stale because the
>>>>> committers
>>>>> >>>>>>> > are not
>>>>> >>>>>>> > sure whether the patch would be useful, but have not
>>>>> rejected the
>>>>> >>>>>>> > patch
>>>>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by
>>>>> closing
>>>>> >>>>>>> > pull
>>>>> >>>>>>> > requests that have been inactive for greater than 30 days,
>>>>> with a
>>>>> >>>>>>> > nice
>>>>> >>>>>>> > message. I just checked and this would close ~ half of the
>>>>> pull
>>>>> >>>>>>> > requests.
>>>>> >>>>>>> >
>>>>> >>>>>>> > For example:
>>>>> >>>>>>> >
>>>>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>>>>> request
>>>>> >>>>>>> > has been
>>>>> >>>>>>> > inactive for 30 days, we are automatically closing it.
>>>>> Closing the
>>>>> >>>>>>> > pull
>>>>> >>>>>>> > request does not remove it from history and will retain all
>>>>> the
>>>>> >>>>>>> > diff and
>>>>> >>>>>>> > review comments. If you have the bandwidth and would like to
>>>>> >>>>>>> > continue
>>>>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>> >>>>>>> >
>>>>> >>>>>>> >
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>> >>>>>>>
>>>>> >>>>>>
>>>>> >>>>>
>>>>> >>>>>
>>>>> >>>>> --
>>>>> >>>>> Cell : 425-233-8271
>>>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>>>> >>>>>
>>>>> >>>>
>>>>> >>>
>>>>> >>
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> busbey
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>
>>>>>
>>>>
>>>
>>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Nicholas Chammas <ni...@gmail.com>.
Relevant: https://github.com/databricks/spark-pr-dashboard/issues/1

A lot of this was discussed a while back when the PR Dashboard was first
introduced, and several times before and after that as well. (e.g. August
2014
<http://apache-spark-developers-list.1001551.n3.nabble.com/Handling-stale-PRs-td8015.html>
)

If there is not enough momentum to build the tooling that people are
discussing here, then perhaps Reynold's suggestion is the most practical
one that is likely to see the light of day.

I think asking committers to be more active in commenting on PRs is
theoretically the correct thing to do, but impractical. I'm not a
committer, but I would guess that most of them are already way
overcommitted (ha!) and asking them to do more just won't yield results.

We've had several instances in the past where we all tried to rally
<https://mail-archives.apache.org/mod_mbox/spark-dev/201412.mbox/%3CCAOhmDzeR4cG_wXgKTOxsG8s34KrQEzYgjFZDOYMgu9VhYJBRDg@mail.gmail.com%3E>
and be more proactive about giving feedback, closing PRs, and nudging
contributors who have gone silent. My observation is that the level of
energy required to "properly" curate PR activity in that way is simply not
sustainable. People can do it for a few weeks and then things revert to the
way they are now.

Perhaps the missing link that would make this sustainable is better
tooling. If you think so and can sling some Javascript, you might want to
contribute to the PR Dashboard <https://spark-prs.appspot.com/>.

Perhaps the missing link is something else: A different PR review process;
more committers; a higher barrier to contributing; a combination thereof;
etc...

Also relevant: http://danluu.com/discourage-oss/

By the way, some people noted that closing PRs may discourage contributors.
I think our open PR count alone is very discouraging. Under what
circumstances would you feel encouraged to open a PR against a project that
has hundreds of open PRs, some from many, many months ago
<https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
?

Nick


2016년 4월 18일 (월) 오후 10:30, Ted Yu <yu...@gmail.com>님이 작성:

> During the months of November / December, the 30 day period should be
> relaxed.
>
> Some people(at least in US) may take extended vacation during that time.
>
> For Chinese developers, Spring Festival would bear similar circumstance.
>
> On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> I also think this might not have to be closed only because it is
>> inactive.
>>
>>
>> How about closing issues after 30 days when a committer's comment is
>> added at the last without responses from the author?
>>
>>
>> IMHO, If the committers are not sure whether the patch would be useful,
>> then I think they should leave some comments why they are not sure, not
>> just ignoring.
>>
>> Or, simply they could ask the author to prove that the patch is useful
>> or safe with some references and tests.
>>
>>
>> I think it might be nicer than that users are supposed to keep pinging.
>> **Personally**, apparently, I am sometimes a bit worried if pinging
>> multiple times can be a bit annoying.
>>
>>
>>
>> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:
>>
>>> It would be better to have a specific technical reason why this PR
>>> should be closed, either the implementation is not good or the problem is
>>> not valid, or something else. That will actually help the contributor to
>>> shape their codes and reopen the PR again. Otherwise reasons like "feel
>>> free to reopen for so-and-so reason" is actually discouraging and no
>>> difference than directly close the PR.
>>>
>>> Just my two cents.
>>>
>>> Thanks
>>> Jerry
>>>
>>>
>>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com>
>>> wrote:
>>>
>>>> Having a PR closed, especially if due to committers not having hte
>>>> bandwidth to check on things, will be very discouraging to new folks.
>>>> Doubly so for those inexperienced with opensource. Even if the message
>>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>>> confidence are going to see reopening as "pestering" and busy folks
>>>> are going to see it as a clear indication that their work is not even
>>>> valuable enough for a human to give a reason for closing. In either
>>>> case, the cost of reopening is substantially higher than that button
>>>> press.
>>>>
>>>> How about we start by keeping a report of "at-risk" PRs that have been
>>>> stale for 30 days to make it easier for committers to look at the prs
>>>> that have been long inactive?
>>>>
>>>>
>>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com>
>>>> wrote:
>>>> > The cost of "reopen" is close to zero, because it is just clicking a
>>>> button.
>>>> > I think you were referring to the cost of closing the pull request,
>>>> and you
>>>> > are assuming people look at the pull requests that have been inactive
>>>> for a
>>>> > long time. That seems equally likely (or unlikely) as committers
>>>> looking at
>>>> > the recently closed pull requests.
>>>> >
>>>> > In either case, most pull requests are scanned through by us when
>>>> they are
>>>> > first open, and if they are important enough, usually they get merged
>>>> > quickly or a target version is set in JIRA. We can definitely improve
>>>> that
>>>> > by making it more explicit.
>>>> >
>>>> >
>>>> >
>>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com> wrote:
>>>> >>
>>>> >> From committers' perspective, would they look at closed PRs ?
>>>> >>
>>>> >> If not, the cost is not close to zero.
>>>> >> Meaning, some potentially useful PRs would never see the light of
>>>> day.
>>>> >>
>>>> >> My two cents.
>>>> >>
>>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>>>> wrote:
>>>> >>>
>>>> >>> Part of it is how difficult it is to automate this. We can build a
>>>> >>> perfect engine with a lot of rules that understand everything. But
>>>> the more
>>>> >>> complicated rules we need, the more unlikely for any of these to
>>>> happen. So
>>>> >>> I'd rather do this and create a nice enough message to tell
>>>> contributors
>>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>>> zero (i.e.
>>>> >>> click a button on the pull request).
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com>
>>>> wrote:
>>>> >>>>
>>>> >>>> bq. close the ones where they don't respond for a week
>>>> >>>>
>>>> >>>> Does this imply that the script understands response from human ?
>>>> >>>>
>>>> >>>> Meaning, would the script use some regex which signifies that the
>>>> >>>> contributor is willing to close the PR ?
>>>> >>>>
>>>> >>>> If the contributor is willing to close, why wouldn't he / she do it
>>>> >>>> him/herself ?
>>>> >>>>
>>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>>> holden@pigscanfly.ca>
>>>> >>>> wrote:
>>>> >>>>>
>>>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>> >>>>> understand wanting to keep the open PRs limited to ones which
>>>> have a
>>>> >>>>> reasonable chance of being merged.
>>>> >>>>>
>>>> >>>>> What about if we filtered for non-mergeable PRs or instead left a
>>>> >>>>> comment asking the author to respond if they are still available
>>>> to move the
>>>> >>>>> PR forward - and close the ones where they don't respond for a
>>>> week?
>>>> >>>>>
>>>> >>>>> Just a suggestion.
>>>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> I had one PR which got merged after 3 months.
>>>> >>>>>>
>>>> >>>>>> If the inactivity was due to contributor, I think it can be
>>>> closed
>>>> >>>>>> after 30 days.
>>>> >>>>>> But if the inactivity was due to lack of review, the PR should
>>>> be kept
>>>> >>>>>> open.
>>>> >>>>>>
>>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>>>> cody@koeninger.org>
>>>> >>>>>> wrote:
>>>> >>>>>>>
>>>> >>>>>>> For what it's worth, I have definitely had PRs that sat
>>>> inactive for
>>>> >>>>>>> more than 30 days due to committers not having time to look at
>>>> them,
>>>> >>>>>>> but did eventually end up successfully being merged.
>>>> >>>>>>>
>>>> >>>>>>> I guess if this just ends up being a committer ping and
>>>> reopening the
>>>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>>>> underlying
>>>> >>>>>>> issue.
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <
>>>> rxin@databricks.com>
>>>> >>>>>>> wrote:
>>>> >>>>>>> > We have hit a new high in open pull requests: 469 today.
>>>> While we
>>>> >>>>>>> > can
>>>> >>>>>>> > certainly get more review bandwidth, many of these are old and
>>>> >>>>>>> > still open
>>>> >>>>>>> > for other reasons. Some are stale because the original
>>>> authors have
>>>> >>>>>>> > become
>>>> >>>>>>> > busy and inactive, and some others are stale because the
>>>> committers
>>>> >>>>>>> > are not
>>>> >>>>>>> > sure whether the patch would be useful, but have not rejected
>>>> the
>>>> >>>>>>> > patch
>>>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by
>>>> closing
>>>> >>>>>>> > pull
>>>> >>>>>>> > requests that have been inactive for greater than 30 days,
>>>> with a
>>>> >>>>>>> > nice
>>>> >>>>>>> > message. I just checked and this would close ~ half of the
>>>> pull
>>>> >>>>>>> > requests.
>>>> >>>>>>> >
>>>> >>>>>>> > For example:
>>>> >>>>>>> >
>>>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>>>> request
>>>> >>>>>>> > has been
>>>> >>>>>>> > inactive for 30 days, we are automatically closing it.
>>>> Closing the
>>>> >>>>>>> > pull
>>>> >>>>>>> > request does not remove it from history and will retain all
>>>> the
>>>> >>>>>>> > diff and
>>>> >>>>>>> > review comments. If you have the bandwidth and would like to
>>>> >>>>>>> > continue
>>>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>> >>>>>>> >
>>>> >>>>>>> >
>>>> >>>>>>>
>>>> >>>>>>>
>>>> ---------------------------------------------------------------------
>>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>> >>>>>>>
>>>> >>>>>>
>>>> >>>>>
>>>> >>>>>
>>>> >>>>> --
>>>> >>>>> Cell : 425-233-8271
>>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>>> >>>>>
>>>> >>>>
>>>> >>>
>>>> >>
>>>> >
>>>>
>>>>
>>>>
>>>> --
>>>> busbey
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>
>>>>
>>>
>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Ted Yu <yu...@gmail.com>.
During the months of November / December, the 30 day period should be
relaxed.

Some people(at least in US) may take extended vacation during that time.

For Chinese developers, Spring Festival would bear similar circumstance.

On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gu...@gmail.com> wrote:

> I also think this might not have to be closed only because it is inactive.
>
>
> How about closing issues after 30 days when a committer's comment is added
> at the last without responses from the author?
>
>
> IMHO, If the committers are not sure whether the patch would be useful,
> then I think they should leave some comments why they are not sure, not
> just ignoring.
>
> Or, simply they could ask the author to prove that the patch is useful or
> safe with some references and tests.
>
>
> I think it might be nicer than that users are supposed to keep pinging.
> **Personally**, apparently, I am sometimes a bit worried if pinging
> multiple times can be a bit annoying.
>
>
>
> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:
>
>> It would be better to have a specific technical reason why this PR should
>> be closed, either the implementation is not good or the problem is not
>> valid, or something else. That will actually help the contributor to shape
>> their codes and reopen the PR again. Otherwise reasons like "feel free
>> to reopen for so-and-so reason" is actually discouraging and no difference
>> than directly close the PR.
>>
>> Just my two cents.
>>
>> Thanks
>> Jerry
>>
>>
>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com> wrote:
>>
>>> Having a PR closed, especially if due to committers not having hte
>>> bandwidth to check on things, will be very discouraging to new folks.
>>> Doubly so for those inexperienced with opensource. Even if the message
>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>> confidence are going to see reopening as "pestering" and busy folks
>>> are going to see it as a clear indication that their work is not even
>>> valuable enough for a human to give a reason for closing. In either
>>> case, the cost of reopening is substantially higher than that button
>>> press.
>>>
>>> How about we start by keeping a report of "at-risk" PRs that have been
>>> stale for 30 days to make it easier for committers to look at the prs
>>> that have been long inactive?
>>>
>>>
>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com>
>>> wrote:
>>> > The cost of "reopen" is close to zero, because it is just clicking a
>>> button.
>>> > I think you were referring to the cost of closing the pull request,
>>> and you
>>> > are assuming people look at the pull requests that have been inactive
>>> for a
>>> > long time. That seems equally likely (or unlikely) as committers
>>> looking at
>>> > the recently closed pull requests.
>>> >
>>> > In either case, most pull requests are scanned through by us when they
>>> are
>>> > first open, and if they are important enough, usually they get merged
>>> > quickly or a target version is set in JIRA. We can definitely improve
>>> that
>>> > by making it more explicit.
>>> >
>>> >
>>> >
>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com> wrote:
>>> >>
>>> >> From committers' perspective, would they look at closed PRs ?
>>> >>
>>> >> If not, the cost is not close to zero.
>>> >> Meaning, some potentially useful PRs would never see the light of day.
>>> >>
>>> >> My two cents.
>>> >>
>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>>> wrote:
>>> >>>
>>> >>> Part of it is how difficult it is to automate this. We can build a
>>> >>> perfect engine with a lot of rules that understand everything. But
>>> the more
>>> >>> complicated rules we need, the more unlikely for any of these to
>>> happen. So
>>> >>> I'd rather do this and create a nice enough message to tell
>>> contributors
>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>> zero (i.e.
>>> >>> click a button on the pull request).
>>> >>>
>>> >>>
>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com>
>>> wrote:
>>> >>>>
>>> >>>> bq. close the ones where they don't respond for a week
>>> >>>>
>>> >>>> Does this imply that the script understands response from human ?
>>> >>>>
>>> >>>> Meaning, would the script use some regex which signifies that the
>>> >>>> contributor is willing to close the PR ?
>>> >>>>
>>> >>>> If the contributor is willing to close, why wouldn't he / she do it
>>> >>>> him/herself ?
>>> >>>>
>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>> holden@pigscanfly.ca>
>>> >>>> wrote:
>>> >>>>>
>>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>> >>>>> understand wanting to keep the open PRs limited to ones which have
>>> a
>>> >>>>> reasonable chance of being merged.
>>> >>>>>
>>> >>>>> What about if we filtered for non-mergeable PRs or instead left a
>>> >>>>> comment asking the author to respond if they are still available
>>> to move the
>>> >>>>> PR forward - and close the ones where they don't respond for a
>>> week?
>>> >>>>>
>>> >>>>> Just a suggestion.
>>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> I had one PR which got merged after 3 months.
>>> >>>>>>
>>> >>>>>> If the inactivity was due to contributor, I think it can be closed
>>> >>>>>> after 30 days.
>>> >>>>>> But if the inactivity was due to lack of review, the PR should be
>>> kept
>>> >>>>>> open.
>>> >>>>>>
>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>>> cody@koeninger.org>
>>> >>>>>> wrote:
>>> >>>>>>>
>>> >>>>>>> For what it's worth, I have definitely had PRs that sat inactive
>>> for
>>> >>>>>>> more than 30 days due to committers not having time to look at
>>> them,
>>> >>>>>>> but did eventually end up successfully being merged.
>>> >>>>>>>
>>> >>>>>>> I guess if this just ends up being a committer ping and
>>> reopening the
>>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>>> underlying
>>> >>>>>>> issue.
>>> >>>>>>>
>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <
>>> rxin@databricks.com>
>>> >>>>>>> wrote:
>>> >>>>>>> > We have hit a new high in open pull requests: 469 today. While
>>> we
>>> >>>>>>> > can
>>> >>>>>>> > certainly get more review bandwidth, many of these are old and
>>> >>>>>>> > still open
>>> >>>>>>> > for other reasons. Some are stale because the original authors
>>> have
>>> >>>>>>> > become
>>> >>>>>>> > busy and inactive, and some others are stale because the
>>> committers
>>> >>>>>>> > are not
>>> >>>>>>> > sure whether the patch would be useful, but have not rejected
>>> the
>>> >>>>>>> > patch
>>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by
>>> closing
>>> >>>>>>> > pull
>>> >>>>>>> > requests that have been inactive for greater than 30 days,
>>> with a
>>> >>>>>>> > nice
>>> >>>>>>> > message. I just checked and this would close ~ half of the pull
>>> >>>>>>> > requests.
>>> >>>>>>> >
>>> >>>>>>> > For example:
>>> >>>>>>> >
>>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>>> request
>>> >>>>>>> > has been
>>> >>>>>>> > inactive for 30 days, we are automatically closing it. Closing
>>> the
>>> >>>>>>> > pull
>>> >>>>>>> > request does not remove it from history and will retain all the
>>> >>>>>>> > diff and
>>> >>>>>>> > review comments. If you have the bandwidth and would like to
>>> >>>>>>> > continue
>>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>> >>>>>>> >
>>> >>>>>>> >
>>> >>>>>>>
>>> >>>>>>>
>>> ---------------------------------------------------------------------
>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>> >>>>>>>
>>> >>>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> --
>>> >>>>> Cell : 425-233-8271
>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> busbey
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>
>>>
>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Hyukjin Kwon <gu...@gmail.com>.
I also think this might not have to be closed only because it is inactive.


How about closing issues after 30 days when a committer's comment is added
at the last without responses from the author?


IMHO, If the committers are not sure whether the patch would be useful,
then I think they should leave some comments why they are not sure, not
just ignoring.

Or, simply they could ask the author to prove that the patch is useful or
safe with some references and tests.


I think it might be nicer than that users are supposed to keep pinging.
**Personally**, apparently, I am sometimes a bit worried if pinging
multiple times can be a bit annoying.



2016-04-19 9:56 GMT+09:00 Saisai Shao <sa...@gmail.com>:

> It would be better to have a specific technical reason why this PR should
> be closed, either the implementation is not good or the problem is not
> valid, or something else. That will actually help the contributor to shape
> their codes and reopen the PR again. Otherwise reasons like "feel free to
> reopen for so-and-so reason" is actually discouraging and no difference
> than directly close the PR.
>
> Just my two cents.
>
> Thanks
> Jerry
>
>
> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com> wrote:
>
>> Having a PR closed, especially if due to committers not having hte
>> bandwidth to check on things, will be very discouraging to new folks.
>> Doubly so for those inexperienced with opensource. Even if the message
>> says "feel free to reopen for so-and-so reason", new folks who lack
>> confidence are going to see reopening as "pestering" and busy folks
>> are going to see it as a clear indication that their work is not even
>> valuable enough for a human to give a reason for closing. In either
>> case, the cost of reopening is substantially higher than that button
>> press.
>>
>> How about we start by keeping a report of "at-risk" PRs that have been
>> stale for 30 days to make it easier for committers to look at the prs
>> that have been long inactive?
>>
>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com> wrote:
>> > The cost of "reopen" is close to zero, because it is just clicking a
>> button.
>> > I think you were referring to the cost of closing the pull request, and
>> you
>> > are assuming people look at the pull requests that have been inactive
>> for a
>> > long time. That seems equally likely (or unlikely) as committers
>> looking at
>> > the recently closed pull requests.
>> >
>> > In either case, most pull requests are scanned through by us when they
>> are
>> > first open, and if they are important enough, usually they get merged
>> > quickly or a target version is set in JIRA. We can definitely improve
>> that
>> > by making it more explicit.
>> >
>> >
>> >
>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com> wrote:
>> >>
>> >> From committers' perspective, would they look at closed PRs ?
>> >>
>> >> If not, the cost is not close to zero.
>> >> Meaning, some potentially useful PRs would never see the light of day.
>> >>
>> >> My two cents.
>> >>
>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
>> wrote:
>> >>>
>> >>> Part of it is how difficult it is to automate this. We can build a
>> >>> perfect engine with a lot of rules that understand everything. But
>> the more
>> >>> complicated rules we need, the more unlikely for any of these to
>> happen. So
>> >>> I'd rather do this and create a nice enough message to tell
>> contributors
>> >>> sometimes mistake happen but the cost to reopen is approximately zero
>> (i.e.
>> >>> click a button on the pull request).
>> >>>
>> >>>
>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com> wrote:
>> >>>>
>> >>>> bq. close the ones where they don't respond for a week
>> >>>>
>> >>>> Does this imply that the script understands response from human ?
>> >>>>
>> >>>> Meaning, would the script use some regex which signifies that the
>> >>>> contributor is willing to close the PR ?
>> >>>>
>> >>>> If the contributor is willing to close, why wouldn't he / she do it
>> >>>> him/herself ?
>> >>>>
>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <holden@pigscanfly.ca
>> >
>> >>>> wrote:
>> >>>>>
>> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
>> >>>>> understand wanting to keep the open PRs limited to ones which have a
>> >>>>> reasonable chance of being merged.
>> >>>>>
>> >>>>> What about if we filtered for non-mergeable PRs or instead left a
>> >>>>> comment asking the author to respond if they are still available to
>> move the
>> >>>>> PR forward - and close the ones where they don't respond for a week?
>> >>>>>
>> >>>>> Just a suggestion.
>> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> I had one PR which got merged after 3 months.
>> >>>>>>
>> >>>>>> If the inactivity was due to contributor, I think it can be closed
>> >>>>>> after 30 days.
>> >>>>>> But if the inactivity was due to lack of review, the PR should be
>> kept
>> >>>>>> open.
>> >>>>>>
>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
>> cody@koeninger.org>
>> >>>>>> wrote:
>> >>>>>>>
>> >>>>>>> For what it's worth, I have definitely had PRs that sat inactive
>> for
>> >>>>>>> more than 30 days due to committers not having time to look at
>> them,
>> >>>>>>> but did eventually end up successfully being merged.
>> >>>>>>>
>> >>>>>>> I guess if this just ends up being a committer ping and reopening
>> the
>> >>>>>>> PR, it's fine, but I don't know if it really addresses the
>> underlying
>> >>>>>>> issue.
>> >>>>>>>
>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rxin@databricks.com
>> >
>> >>>>>>> wrote:
>> >>>>>>> > We have hit a new high in open pull requests: 469 today. While
>> we
>> >>>>>>> > can
>> >>>>>>> > certainly get more review bandwidth, many of these are old and
>> >>>>>>> > still open
>> >>>>>>> > for other reasons. Some are stale because the original authors
>> have
>> >>>>>>> > become
>> >>>>>>> > busy and inactive, and some others are stale because the
>> committers
>> >>>>>>> > are not
>> >>>>>>> > sure whether the patch would be useful, but have not rejected
>> the
>> >>>>>>> > patch
>> >>>>>>> > explicitly. We can cut down the signal to noise ratio by closing
>> >>>>>>> > pull
>> >>>>>>> > requests that have been inactive for greater than 30 days, with
>> a
>> >>>>>>> > nice
>> >>>>>>> > message. I just checked and this would close ~ half of the pull
>> >>>>>>> > requests.
>> >>>>>>> >
>> >>>>>>> > For example:
>> >>>>>>> >
>> >>>>>>> > "Thank you for creating this pull request. Since this pull
>> request
>> >>>>>>> > has been
>> >>>>>>> > inactive for 30 days, we are automatically closing it. Closing
>> the
>> >>>>>>> > pull
>> >>>>>>> > request does not remove it from history and will retain all the
>> >>>>>>> > diff and
>> >>>>>>> > review comments. If you have the bandwidth and would like to
>> >>>>>>> > continue
>> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
>> >>>>>>> >
>> >>>>>>> >
>> >>>>>>>
>> >>>>>>>
>> ---------------------------------------------------------------------
>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> Cell : 425-233-8271
>> >>>>> Twitter: https://twitter.com/holdenkarau
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>>
>>
>> --
>> busbey
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> For additional commands, e-mail: dev-help@spark.apache.org
>>
>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Saisai Shao <sa...@gmail.com>.
It would be better to have a specific technical reason why this PR should
be closed, either the implementation is not good or the problem is not
valid, or something else. That will actually help the contributor to shape
their codes and reopen the PR again. Otherwise reasons like "feel free to
reopen for so-and-so reason" is actually discouraging and no difference
than directly close the PR.

Just my two cents.

Thanks
Jerry


On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <bu...@cloudera.com> wrote:

> Having a PR closed, especially if due to committers not having hte
> bandwidth to check on things, will be very discouraging to new folks.
> Doubly so for those inexperienced with opensource. Even if the message
> says "feel free to reopen for so-and-so reason", new folks who lack
> confidence are going to see reopening as "pestering" and busy folks
> are going to see it as a clear indication that their work is not even
> valuable enough for a human to give a reason for closing. In either
> case, the cost of reopening is substantially higher than that button
> press.
>
> How about we start by keeping a report of "at-risk" PRs that have been
> stale for 30 days to make it easier for committers to look at the prs
> that have been long inactive?
>
> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com> wrote:
> > The cost of "reopen" is close to zero, because it is just clicking a
> button.
> > I think you were referring to the cost of closing the pull request, and
> you
> > are assuming people look at the pull requests that have been inactive
> for a
> > long time. That seems equally likely (or unlikely) as committers looking
> at
> > the recently closed pull requests.
> >
> > In either case, most pull requests are scanned through by us when they
> are
> > first open, and if they are important enough, usually they get merged
> > quickly or a target version is set in JIRA. We can definitely improve
> that
> > by making it more explicit.
> >
> >
> >
> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com> wrote:
> >>
> >> From committers' perspective, would they look at closed PRs ?
> >>
> >> If not, the cost is not close to zero.
> >> Meaning, some potentially useful PRs would never see the light of day.
> >>
> >> My two cents.
> >>
> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com>
> wrote:
> >>>
> >>> Part of it is how difficult it is to automate this. We can build a
> >>> perfect engine with a lot of rules that understand everything. But the
> more
> >>> complicated rules we need, the more unlikely for any of these to
> happen. So
> >>> I'd rather do this and create a nice enough message to tell
> contributors
> >>> sometimes mistake happen but the cost to reopen is approximately zero
> (i.e.
> >>> click a button on the pull request).
> >>>
> >>>
> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com> wrote:
> >>>>
> >>>> bq. close the ones where they don't respond for a week
> >>>>
> >>>> Does this imply that the script understands response from human ?
> >>>>
> >>>> Meaning, would the script use some regex which signifies that the
> >>>> contributor is willing to close the PR ?
> >>>>
> >>>> If the contributor is willing to close, why wouldn't he / she do it
> >>>> him/herself ?
> >>>>
> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <ho...@pigscanfly.ca>
> >>>> wrote:
> >>>>>
> >>>>> Personally I'd rather err on the side of keeping PRs open, but I
> >>>>> understand wanting to keep the open PRs limited to ones which have a
> >>>>> reasonable chance of being merged.
> >>>>>
> >>>>> What about if we filtered for non-mergeable PRs or instead left a
> >>>>> comment asking the author to respond if they are still available to
> move the
> >>>>> PR forward - and close the ones where they don't respond for a week?
> >>>>>
> >>>>> Just a suggestion.
> >>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
> >>>>>>
> >>>>>> I had one PR which got merged after 3 months.
> >>>>>>
> >>>>>> If the inactivity was due to contributor, I think it can be closed
> >>>>>> after 30 days.
> >>>>>> But if the inactivity was due to lack of review, the PR should be
> kept
> >>>>>> open.
> >>>>>>
> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <
> cody@koeninger.org>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> For what it's worth, I have definitely had PRs that sat inactive
> for
> >>>>>>> more than 30 days due to committers not having time to look at
> them,
> >>>>>>> but did eventually end up successfully being merged.
> >>>>>>>
> >>>>>>> I guess if this just ends up being a committer ping and reopening
> the
> >>>>>>> PR, it's fine, but I don't know if it really addresses the
> underlying
> >>>>>>> issue.
> >>>>>>>
> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com>
> >>>>>>> wrote:
> >>>>>>> > We have hit a new high in open pull requests: 469 today. While we
> >>>>>>> > can
> >>>>>>> > certainly get more review bandwidth, many of these are old and
> >>>>>>> > still open
> >>>>>>> > for other reasons. Some are stale because the original authors
> have
> >>>>>>> > become
> >>>>>>> > busy and inactive, and some others are stale because the
> committers
> >>>>>>> > are not
> >>>>>>> > sure whether the patch would be useful, but have not rejected the
> >>>>>>> > patch
> >>>>>>> > explicitly. We can cut down the signal to noise ratio by closing
> >>>>>>> > pull
> >>>>>>> > requests that have been inactive for greater than 30 days, with a
> >>>>>>> > nice
> >>>>>>> > message. I just checked and this would close ~ half of the pull
> >>>>>>> > requests.
> >>>>>>> >
> >>>>>>> > For example:
> >>>>>>> >
> >>>>>>> > "Thank you for creating this pull request. Since this pull
> request
> >>>>>>> > has been
> >>>>>>> > inactive for 30 days, we are automatically closing it. Closing
> the
> >>>>>>> > pull
> >>>>>>> > request does not remove it from history and will retain all the
> >>>>>>> > diff and
> >>>>>>> > review comments. If you have the bandwidth and would like to
> >>>>>>> > continue
> >>>>>>> > pushing this forward, please reopen it. Thanks again!"
> >>>>>>> >
> >>>>>>> >
> >>>>>>>
> >>>>>>>
> ---------------------------------------------------------------------
> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Cell : 425-233-8271
> >>>>> Twitter: https://twitter.com/holdenkarau
> >>>>>
> >>>>
> >>>
> >>
> >
>
>
>
> --
> busbey
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Sean Busbey <bu...@cloudera.com>.
Having a PR closed, especially if due to committers not having hte
bandwidth to check on things, will be very discouraging to new folks.
Doubly so for those inexperienced with opensource. Even if the message
says "feel free to reopen for so-and-so reason", new folks who lack
confidence are going to see reopening as "pestering" and busy folks
are going to see it as a clear indication that their work is not even
valuable enough for a human to give a reason for closing. In either
case, the cost of reopening is substantially higher than that button
press.

How about we start by keeping a report of "at-risk" PRs that have been
stale for 30 days to make it easier for committers to look at the prs
that have been long inactive?

On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rx...@databricks.com> wrote:
> The cost of "reopen" is close to zero, because it is just clicking a button.
> I think you were referring to the cost of closing the pull request, and you
> are assuming people look at the pull requests that have been inactive for a
> long time. That seems equally likely (or unlikely) as committers looking at
> the recently closed pull requests.
>
> In either case, most pull requests are scanned through by us when they are
> first open, and if they are important enough, usually they get merged
> quickly or a target version is set in JIRA. We can definitely improve that
> by making it more explicit.
>
>
>
> On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com> wrote:
>>
>> From committers' perspective, would they look at closed PRs ?
>>
>> If not, the cost is not close to zero.
>> Meaning, some potentially useful PRs would never see the light of day.
>>
>> My two cents.
>>
>> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com> wrote:
>>>
>>> Part of it is how difficult it is to automate this. We can build a
>>> perfect engine with a lot of rules that understand everything. But the more
>>> complicated rules we need, the more unlikely for any of these to happen. So
>>> I'd rather do this and create a nice enough message to tell contributors
>>> sometimes mistake happen but the cost to reopen is approximately zero (i.e.
>>> click a button on the pull request).
>>>
>>>
>>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com> wrote:
>>>>
>>>> bq. close the ones where they don't respond for a week
>>>>
>>>> Does this imply that the script understands response from human ?
>>>>
>>>> Meaning, would the script use some regex which signifies that the
>>>> contributor is willing to close the PR ?
>>>>
>>>> If the contributor is willing to close, why wouldn't he / she do it
>>>> him/herself ?
>>>>
>>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <ho...@pigscanfly.ca>
>>>> wrote:
>>>>>
>>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>>> understand wanting to keep the open PRs limited to ones which have a
>>>>> reasonable chance of being merged.
>>>>>
>>>>> What about if we filtered for non-mergeable PRs or instead left a
>>>>> comment asking the author to respond if they are still available to move the
>>>>> PR forward - and close the ones where they don't respond for a week?
>>>>>
>>>>> Just a suggestion.
>>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>>>>
>>>>>> I had one PR which got merged after 3 months.
>>>>>>
>>>>>> If the inactivity was due to contributor, I think it can be closed
>>>>>> after 30 days.
>>>>>> But if the inactivity was due to lack of review, the PR should be kept
>>>>>> open.
>>>>>>
>>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> For what it's worth, I have definitely had PRs that sat inactive for
>>>>>>> more than 30 days due to committers not having time to look at them,
>>>>>>> but did eventually end up successfully being merged.
>>>>>>>
>>>>>>> I guess if this just ends up being a committer ping and reopening the
>>>>>>> PR, it's fine, but I don't know if it really addresses the underlying
>>>>>>> issue.
>>>>>>>
>>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com>
>>>>>>> wrote:
>>>>>>> > We have hit a new high in open pull requests: 469 today. While we
>>>>>>> > can
>>>>>>> > certainly get more review bandwidth, many of these are old and
>>>>>>> > still open
>>>>>>> > for other reasons. Some are stale because the original authors have
>>>>>>> > become
>>>>>>> > busy and inactive, and some others are stale because the committers
>>>>>>> > are not
>>>>>>> > sure whether the patch would be useful, but have not rejected the
>>>>>>> > patch
>>>>>>> > explicitly. We can cut down the signal to noise ratio by closing
>>>>>>> > pull
>>>>>>> > requests that have been inactive for greater than 30 days, with a
>>>>>>> > nice
>>>>>>> > message. I just checked and this would close ~ half of the pull
>>>>>>> > requests.
>>>>>>> >
>>>>>>> > For example:
>>>>>>> >
>>>>>>> > "Thank you for creating this pull request. Since this pull request
>>>>>>> > has been
>>>>>>> > inactive for 30 days, we are automatically closing it. Closing the
>>>>>>> > pull
>>>>>>> > request does not remove it from history and will retain all the
>>>>>>> > diff and
>>>>>>> > review comments. If you have the bandwidth and would like to
>>>>>>> > continue
>>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cell : 425-233-8271
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>
>>>>
>>>
>>
>



-- 
busbey

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: auto closing pull requests that have been inactive > 30 days?

Posted by Reynold Xin <rx...@databricks.com>.
The cost of "reopen" is close to zero, because it is just clicking a
button. I think you were referring to the cost of closing the pull request,
and you are assuming people look at the pull requests that have been
inactive for a long time. That seems equally likely (or unlikely) as
committers looking at the recently closed pull requests.

In either case, most pull requests are scanned through by us when they are
first open, and if they are important enough, usually they get merged
quickly or a target version is set in JIRA. We can definitely improve that
by making it more explicit.



On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yu...@gmail.com> wrote:

> From committers' perspective, would they look at closed PRs ?
>
> If not, the cost is not close to zero.
> Meaning, some potentially useful PRs would never see the light of day.
>
> My two cents.
>
> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com> wrote:
>
>> Part of it is how difficult it is to automate this. We can build a
>> perfect engine with a lot of rules that understand everything. But the more
>> complicated rules we need, the more unlikely for any of these to happen. So
>> I'd rather do this and create a nice enough message to tell contributors
>> sometimes mistake happen but the cost to reopen is approximately zero (i.e.
>> click a button on the pull request).
>>
>>
>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com> wrote:
>>
>>> bq. close the ones where they don't respond for a week
>>>
>>> Does this imply that the script understands response from human ?
>>>
>>> Meaning, would the script use some regex which signifies that the
>>> contributor is willing to close the PR ?
>>>
>>> If the contributor is willing to close, why wouldn't he / she do it
>>> him/herself ?
>>>
>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <ho...@pigscanfly.ca>
>>> wrote:
>>>
>>>> Personally I'd rather err on the side of keeping PRs open, but I
>>>> understand wanting to keep the open PRs limited to ones which have a
>>>> reasonable chance of being merged.
>>>>
>>>> What about if we filtered for non-mergeable PRs or instead left a
>>>> comment asking the author to respond if they are still available to move
>>>> the PR forward - and close the ones where they don't respond for a week?
>>>>
>>>> Just a suggestion.
>>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>>
>>>>> I had one PR which got merged after 3 months.
>>>>>
>>>>> If the inactivity was due to contributor, I think it can be closed
>>>>> after 30 days.
>>>>> But if the inactivity was due to lack of review, the PR should be kept
>>>>> open.
>>>>>
>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org>
>>>>> wrote:
>>>>>
>>>>>> For what it's worth, I have definitely had PRs that sat inactive for
>>>>>> more than 30 days due to committers not having time to look at them,
>>>>>> but did eventually end up successfully being merged.
>>>>>>
>>>>>> I guess if this just ends up being a committer ping and reopening the
>>>>>> PR, it's fine, but I don't know if it really addresses the underlying
>>>>>> issue.
>>>>>>
>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com>
>>>>>> wrote:
>>>>>> > We have hit a new high in open pull requests: 469 today. While we
>>>>>> can
>>>>>> > certainly get more review bandwidth, many of these are old and
>>>>>> still open
>>>>>> > for other reasons. Some are stale because the original authors have
>>>>>> become
>>>>>> > busy and inactive, and some others are stale because the committers
>>>>>> are not
>>>>>> > sure whether the patch would be useful, but have not rejected the
>>>>>> patch
>>>>>> > explicitly. We can cut down the signal to noise ratio by closing
>>>>>> pull
>>>>>> > requests that have been inactive for greater than 30 days, with a
>>>>>> nice
>>>>>> > message. I just checked and this would close ~ half of the pull
>>>>>> requests.
>>>>>> >
>>>>>> > For example:
>>>>>> >
>>>>>> > "Thank you for creating this pull request. Since this pull request
>>>>>> has been
>>>>>> > inactive for 30 days, we are automatically closing it. Closing the
>>>>>> pull
>>>>>> > request does not remove it from history and will retain all the
>>>>>> diff and
>>>>>> > review comments. If you have the bandwidth and would like to
>>>>>> continue
>>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>>> >
>>>>>> >
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Cell : 425-233-8271
>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>>>>
>>>
>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Ted Yu <yu...@gmail.com>.
>From committers' perspective, would they look at closed PRs ?

If not, the cost is not close to zero.
Meaning, some potentially useful PRs would never see the light of day.

My two cents.

On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rx...@databricks.com> wrote:

> Part of it is how difficult it is to automate this. We can build a perfect
> engine with a lot of rules that understand everything. But the more
> complicated rules we need, the more unlikely for any of these to happen. So
> I'd rather do this and create a nice enough message to tell contributors
> sometimes mistake happen but the cost to reopen is approximately zero (i.e.
> click a button on the pull request).
>
>
> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com> wrote:
>
>> bq. close the ones where they don't respond for a week
>>
>> Does this imply that the script understands response from human ?
>>
>> Meaning, would the script use some regex which signifies that the
>> contributor is willing to close the PR ?
>>
>> If the contributor is willing to close, why wouldn't he / she do it
>> him/herself ?
>>
>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>>
>>> Personally I'd rather err on the side of keeping PRs open, but I
>>> understand wanting to keep the open PRs limited to ones which have a
>>> reasonable chance of being merged.
>>>
>>> What about if we filtered for non-mergeable PRs or instead left a
>>> comment asking the author to respond if they are still available to move
>>> the PR forward - and close the ones where they don't respond for a week?
>>>
>>> Just a suggestion.
>>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>>
>>>> I had one PR which got merged after 3 months.
>>>>
>>>> If the inactivity was due to contributor, I think it can be closed
>>>> after 30 days.
>>>> But if the inactivity was due to lack of review, the PR should be kept
>>>> open.
>>>>
>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org>
>>>> wrote:
>>>>
>>>>> For what it's worth, I have definitely had PRs that sat inactive for
>>>>> more than 30 days due to committers not having time to look at them,
>>>>> but did eventually end up successfully being merged.
>>>>>
>>>>> I guess if this just ends up being a committer ping and reopening the
>>>>> PR, it's fine, but I don't know if it really addresses the underlying
>>>>> issue.
>>>>>
>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com>
>>>>> wrote:
>>>>> > We have hit a new high in open pull requests: 469 today. While we can
>>>>> > certainly get more review bandwidth, many of these are old and still
>>>>> open
>>>>> > for other reasons. Some are stale because the original authors have
>>>>> become
>>>>> > busy and inactive, and some others are stale because the committers
>>>>> are not
>>>>> > sure whether the patch would be useful, but have not rejected the
>>>>> patch
>>>>> > explicitly. We can cut down the signal to noise ratio by closing pull
>>>>> > requests that have been inactive for greater than 30 days, with a
>>>>> nice
>>>>> > message. I just checked and this would close ~ half of the pull
>>>>> requests.
>>>>> >
>>>>> > For example:
>>>>> >
>>>>> > "Thank you for creating this pull request. Since this pull request
>>>>> has been
>>>>> > inactive for 30 days, we are automatically closing it. Closing the
>>>>> pull
>>>>> > request does not remove it from history and will retain all the diff
>>>>> and
>>>>> > review comments. If you have the bandwidth and would like to continue
>>>>> > pushing this forward, please reopen it. Thanks again!"
>>>>> >
>>>>> >
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>
>>>>>
>>>>
>>>
>>> --
>>> Cell : 425-233-8271
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>>
>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Reynold Xin <rx...@databricks.com>.
Part of it is how difficult it is to automate this. We can build a perfect
engine with a lot of rules that understand everything. But the more
complicated rules we need, the more unlikely for any of these to happen. So
I'd rather do this and create a nice enough message to tell contributors
sometimes mistake happen but the cost to reopen is approximately zero (i.e.
click a button on the pull request).


On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yu...@gmail.com> wrote:

> bq. close the ones where they don't respond for a week
>
> Does this imply that the script understands response from human ?
>
> Meaning, would the script use some regex which signifies that the
> contributor is willing to close the PR ?
>
> If the contributor is willing to close, why wouldn't he / she do it
> him/herself ?
>
> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <ho...@pigscanfly.ca>
> wrote:
>
>> Personally I'd rather err on the side of keeping PRs open, but I
>> understand wanting to keep the open PRs limited to ones which have a
>> reasonable chance of being merged.
>>
>> What about if we filtered for non-mergeable PRs or instead left a comment
>> asking the author to respond if they are still available to move the PR
>> forward - and close the ones where they don't respond for a week?
>>
>> Just a suggestion.
>> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>>
>>> I had one PR which got merged after 3 months.
>>>
>>> If the inactivity was due to contributor, I think it can be closed after
>>> 30 days.
>>> But if the inactivity was due to lack of review, the PR should be kept
>>> open.
>>>
>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org>
>>> wrote:
>>>
>>>> For what it's worth, I have definitely had PRs that sat inactive for
>>>> more than 30 days due to committers not having time to look at them,
>>>> but did eventually end up successfully being merged.
>>>>
>>>> I guess if this just ends up being a committer ping and reopening the
>>>> PR, it's fine, but I don't know if it really addresses the underlying
>>>> issue.
>>>>
>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com>
>>>> wrote:
>>>> > We have hit a new high in open pull requests: 469 today. While we can
>>>> > certainly get more review bandwidth, many of these are old and still
>>>> open
>>>> > for other reasons. Some are stale because the original authors have
>>>> become
>>>> > busy and inactive, and some others are stale because the committers
>>>> are not
>>>> > sure whether the patch would be useful, but have not rejected the
>>>> patch
>>>> > explicitly. We can cut down the signal to noise ratio by closing pull
>>>> > requests that have been inactive for greater than 30 days, with a nice
>>>> > message. I just checked and this would close ~ half of the pull
>>>> requests.
>>>> >
>>>> > For example:
>>>> >
>>>> > "Thank you for creating this pull request. Since this pull request
>>>> has been
>>>> > inactive for 30 days, we are automatically closing it. Closing the
>>>> pull
>>>> > request does not remove it from history and will retain all the diff
>>>> and
>>>> > review comments. If you have the bandwidth and would like to continue
>>>> > pushing this forward, please reopen it. Thanks again!"
>>>> >
>>>> >
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>
>>>>
>>>
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Ted Yu <yu...@gmail.com>.
bq. close the ones where they don't respond for a week

Does this imply that the script understands response from human ?

Meaning, would the script use some regex which signifies that the
contributor is willing to close the PR ?

If the contributor is willing to close, why wouldn't he / she do it
him/herself ?

On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <ho...@pigscanfly.ca> wrote:

> Personally I'd rather err on the side of keeping PRs open, but I
> understand wanting to keep the open PRs limited to ones which have a
> reasonable chance of being merged.
>
> What about if we filtered for non-mergeable PRs or instead left a comment
> asking the author to respond if they are still available to move the PR
> forward - and close the ones where they don't respond for a week?
>
> Just a suggestion.
> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>
>> I had one PR which got merged after 3 months.
>>
>> If the inactivity was due to contributor, I think it can be closed after
>> 30 days.
>> But if the inactivity was due to lack of review, the PR should be kept
>> open.
>>
>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org>
>> wrote:
>>
>>> For what it's worth, I have definitely had PRs that sat inactive for
>>> more than 30 days due to committers not having time to look at them,
>>> but did eventually end up successfully being merged.
>>>
>>> I guess if this just ends up being a committer ping and reopening the
>>> PR, it's fine, but I don't know if it really addresses the underlying
>>> issue.
>>>
>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com>
>>> wrote:
>>> > We have hit a new high in open pull requests: 469 today. While we can
>>> > certainly get more review bandwidth, many of these are old and still
>>> open
>>> > for other reasons. Some are stale because the original authors have
>>> become
>>> > busy and inactive, and some others are stale because the committers
>>> are not
>>> > sure whether the patch would be useful, but have not rejected the patch
>>> > explicitly. We can cut down the signal to noise ratio by closing pull
>>> > requests that have been inactive for greater than 30 days, with a nice
>>> > message. I just checked and this would close ~ half of the pull
>>> requests.
>>> >
>>> > For example:
>>> >
>>> > "Thank you for creating this pull request. Since this pull request has
>>> been
>>> > inactive for 30 days, we are automatically closing it. Closing the pull
>>> > request does not remove it from history and will retain all the diff
>>> and
>>> > review comments. If you have the bandwidth and would like to continue
>>> > pushing this forward, please reopen it. Thanks again!"
>>> >
>>> >
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>
>>>
>>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Marcin Tustin <mt...@handybook.com>.
+1 and at the same time maybe surface a report to this list of PRs which
need committer action and have only had submitters responding to pings in
the last 30 days?

On Mon, Apr 18, 2016 at 3:33 PM, Holden Karau <ho...@pigscanfly.ca> wrote:

> Personally I'd rather err on the side of keeping PRs open, but I
> understand wanting to keep the open PRs limited to ones which have a
> reasonable chance of being merged.
>
> What about if we filtered for non-mergeable PRs or instead left a comment
> asking the author to respond if they are still available to move the PR
> forward - and close the ones where they don't respond for a week?
>
> Just a suggestion.
>
> On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:
>
>> I had one PR which got merged after 3 months.
>>
>> If the inactivity was due to contributor, I think it can be closed after
>> 30 days.
>> But if the inactivity was due to lack of review, the PR should be kept
>> open.
>>
>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org>
>> wrote:
>>
>>> For what it's worth, I have definitely had PRs that sat inactive for
>>> more than 30 days due to committers not having time to look at them,
>>> but did eventually end up successfully being merged.
>>>
>>> I guess if this just ends up being a committer ping and reopening the
>>> PR, it's fine, but I don't know if it really addresses the underlying
>>> issue.
>>>
>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com>
>>> wrote:
>>> > We have hit a new high in open pull requests: 469 today. While we can
>>> > certainly get more review bandwidth, many of these are old and still
>>> open
>>> > for other reasons. Some are stale because the original authors have
>>> become
>>> > busy and inactive, and some others are stale because the committers
>>> are not
>>> > sure whether the patch would be useful, but have not rejected the patch
>>> > explicitly. We can cut down the signal to noise ratio by closing pull
>>> > requests that have been inactive for greater than 30 days, with a nice
>>> > message. I just checked and this would close ~ half of the pull
>>> requests.
>>> >
>>> > For example:
>>> >
>>> > "Thank you for creating this pull request. Since this pull request has
>>> been
>>> > inactive for 30 days, we are automatically closing it. Closing the pull
>>> > request does not remove it from history and will retain all the diff
>>> and
>>> > review comments. If you have the bandwidth and would like to continue
>>> > pushing this forward, please reopen it. Thanks again!"
>>> >
>>> >
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>
>>>
>>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>
>

-- 
Want to work at Handy? Check out our culture deck and open roles 
<http://www.handy.com/careers>
Latest news <http://www.handy.com/press> at Handy
Handy just raised $50m 
<http://venturebeat.com/2015/11/02/on-demand-home-service-handy-raises-50m-in-round-led-by-fidelity/> led 
by Fidelity


Re: auto closing pull requests that have been inactive > 30 days?

Posted by Holden Karau <ho...@pigscanfly.ca>.
Personally I'd rather err on the side of keeping PRs open, but I understand
wanting to keep the open PRs limited to ones which have a reasonable chance
of being merged.

What about if we filtered for non-mergeable PRs or instead left a comment
asking the author to respond if they are still available to move the PR
forward - and close the ones where they don't respond for a week?

Just a suggestion.
On Monday, April 18, 2016, Ted Yu <yu...@gmail.com> wrote:

> I had one PR which got merged after 3 months.
>
> If the inactivity was due to contributor, I think it can be closed after
> 30 days.
> But if the inactivity was due to lack of review, the PR should be kept
> open.
>
> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <cody@koeninger.org
> <javascript:_e(%7B%7D,'cvml','cody@koeninger.org');>> wrote:
>
>> For what it's worth, I have definitely had PRs that sat inactive for
>> more than 30 days due to committers not having time to look at them,
>> but did eventually end up successfully being merged.
>>
>> I guess if this just ends up being a committer ping and reopening the
>> PR, it's fine, but I don't know if it really addresses the underlying
>> issue.
>>
>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rxin@databricks.com
>> <javascript:_e(%7B%7D,'cvml','rxin@databricks.com');>> wrote:
>> > We have hit a new high in open pull requests: 469 today. While we can
>> > certainly get more review bandwidth, many of these are old and still
>> open
>> > for other reasons. Some are stale because the original authors have
>> become
>> > busy and inactive, and some others are stale because the committers are
>> not
>> > sure whether the patch would be useful, but have not rejected the patch
>> > explicitly. We can cut down the signal to noise ratio by closing pull
>> > requests that have been inactive for greater than 30 days, with a nice
>> > message. I just checked and this would close ~ half of the pull
>> requests.
>> >
>> > For example:
>> >
>> > "Thank you for creating this pull request. Since this pull request has
>> been
>> > inactive for 30 days, we are automatically closing it. Closing the pull
>> > request does not remove it from history and will retain all the diff and
>> > review comments. If you have the bandwidth and would like to continue
>> > pushing this forward, please reopen it. Thanks again!"
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>> <javascript:_e(%7B%7D,'cvml','dev-unsubscribe@spark.apache.org');>
>> For additional commands, e-mail: dev-help@spark.apache.org
>> <javascript:_e(%7B%7D,'cvml','dev-help@spark.apache.org');>
>>
>>
>

-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Ted Yu <yu...@gmail.com>.
I had one PR which got merged after 3 months.

If the inactivity was due to contributor, I think it can be closed after 30
days.
But if the inactivity was due to lack of review, the PR should be kept open.

On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger <co...@koeninger.org> wrote:

> For what it's worth, I have definitely had PRs that sat inactive for
> more than 30 days due to committers not having time to look at them,
> but did eventually end up successfully being merged.
>
> I guess if this just ends up being a committer ping and reopening the
> PR, it's fine, but I don't know if it really addresses the underlying
> issue.
>
> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com> wrote:
> > We have hit a new high in open pull requests: 469 today. While we can
> > certainly get more review bandwidth, many of these are old and still open
> > for other reasons. Some are stale because the original authors have
> become
> > busy and inactive, and some others are stale because the committers are
> not
> > sure whether the patch would be useful, but have not rejected the patch
> > explicitly. We can cut down the signal to noise ratio by closing pull
> > requests that have been inactive for greater than 30 days, with a nice
> > message. I just checked and this would close ~ half of the pull requests.
> >
> > For example:
> >
> > "Thank you for creating this pull request. Since this pull request has
> been
> > inactive for 30 days, we are automatically closing it. Closing the pull
> > request does not remove it from history and will retain all the diff and
> > review comments. If you have the bandwidth and would like to continue
> > pushing this forward, please reopen it. Thanks again!"
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
> For additional commands, e-mail: dev-help@spark.apache.org
>
>

Re: auto closing pull requests that have been inactive > 30 days?

Posted by Cody Koeninger <co...@koeninger.org>.
For what it's worth, I have definitely had PRs that sat inactive for
more than 30 days due to committers not having time to look at them,
but did eventually end up successfully being merged.

I guess if this just ends up being a committer ping and reopening the
PR, it's fine, but I don't know if it really addresses the underlying
issue.

On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <rx...@databricks.com> wrote:
> We have hit a new high in open pull requests: 469 today. While we can
> certainly get more review bandwidth, many of these are old and still open
> for other reasons. Some are stale because the original authors have become
> busy and inactive, and some others are stale because the committers are not
> sure whether the patch would be useful, but have not rejected the patch
> explicitly. We can cut down the signal to noise ratio by closing pull
> requests that have been inactive for greater than 30 days, with a nice
> message. I just checked and this would close ~ half of the pull requests.
>
> For example:
>
> "Thank you for creating this pull request. Since this pull request has been
> inactive for 30 days, we are automatically closing it. Closing the pull
> request does not remove it from history and will retain all the diff and
> review comments. If you have the bandwidth and would like to continue
> pushing this forward, please reopen it. Thanks again!"
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: auto closing pull requests that have been inactive > 30 days?

Posted by Sean Owen <so...@cloudera.com>.
I support this. We used to do this, right? Anecdotally, from watching
the stream most days, most stale PRs are, in descending order of
frequency:

1. Probably not a good change, and not looked at (as a result)
2. Abandoned by submitter at some stage
3. Not an important change, not so bad, not really reviewed
4. A good change that needs review

Whether your PR is #1 or #4 is a matter of perspective. But, I
disagree with the tacit assumption that we're mostly talking about
good PRs being closed because nobody could be bothered; #4 is, I
think, well under 10%.

So generating reports and warnings etc don't seem to address that.
Closing merely means "at the moment there's not a reason to expect
this to proceed, but that could change". Unlike JIRA we don't have
more nuanced resolutions like "WontFix" vs "Later". Welcome the
submitter to reopen if they really think it should be kept alive in
good faith.

As for always stating a close reason: well, a lot of PRs are simply
not very good code, or features that just don't look that useful
relative to their cost. Is it more polite to soft-close or honestly
say "I don't think this is worth adding"?

There is a carrying cost to not doing this. Right now being "Open" is
fairly meaningless, and I've long since stopped bothering reviewing
the backlog of open PRs since it's noise, instead sifting for good new
ones to fast-track.


I agree with comments here that suggest more can be pushed back on the
contributors. We've started with a campaign to get people to read
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
first, which would solve a lot of the "mystery pull request" problems
if followed: no real description, no test, no real problem statement.

Put another way, any contribution that is clearly explained, cleanly
implemented, and makes a good case why its pros outweigh its cons, is
pretty consistently reviewed and quickly. Pushing on contributors to
do these things won't harm good contributions, which already do these
things; it'll make it harder for bad contributions to distract from
them.

And I think the effect of a change like this is, in the main, to push
back mostly on less good contributions.

On Mon, Apr 18, 2016 at 8:02 PM, Reynold Xin <rx...@databricks.com> wrote:
> We have hit a new high in open pull requests: 469 today. While we can
> certainly get more review bandwidth, many of these are old and still open
> for other reasons. Some are stale because the original authors have become
> busy and inactive, and some others are stale because the committers are not
> sure whether the patch would be useful, but have not rejected the patch
> explicitly. We can cut down the signal to noise ratio by closing pull
> requests that have been inactive for greater than 30 days, with a nice
> message. I just checked and this would close ~ half of the pull requests.
>
> For example:
>
> "Thank you for creating this pull request. Since this pull request has been
> inactive for 30 days, we are automatically closing it. Closing the pull
> request does not remove it from history and will retain all the diff and
> review comments. If you have the bandwidth and would like to continue
> pushing this forward, please reopen it. Thanks again!"
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org