Posted to dev@spark.apache.org by Hyukjin Kwon <gu...@gmail.com> on 2019/05/15 09:24:55 UTC

Resolving all JIRAs affecting EOL releases

Hi all,

I would like to propose resolving all JIRAs that affect EOL releases (2.2
and below) or that have no affected version specified. I was rather against
this approach and considered it a last resort when we discussed it roughly
3 years ago. Now I think we should go ahead with it. See below.

I have been taking care of this almost every day for the past 3 years. The
number of open JIRAs keeps increasing and never goes down. It is now over
2500.
Did you know that in JIRA we can only page through up to 1000 items? So we
currently have difficulty even going through every JIRA; we have to filter
manually and check each one.
The number has grown beyond a manageable size.

I am not suggesting this without having tried anything else. This is what
has been tried, as far as I am aware:

  1. Roughly 3 years ago, Sean tried to gather committers and even
    non-committers to bring this number down. At that time, we were only
    able to keep the number where it was. After we lost that momentum, it
    started increasing again.
  2. I have scanned _all_ the previous JIRAs more than twice, roughly once
    a year, and resolved them where I could. The remaining ones are mostly
    obsolete but lack enough information to investigate further.
  3. I strictly follow "Contributing to JIRA Maintenance"
    (https://spark.apache.org/contributing.html) and resolve JIRAs
    accordingly.
  4. Encouraging other people to comment on JIRAs or actively resolve them.

One thing I have realised is that the increasing number of committers
barely helps with this (although it might help if somebody active in JIRA
becomes a committer).

One important thing I should note is that it is now quite difficult to
reproduce and test issues found in EOL releases. We have to git clone,
check out, build and test, and then see whether the issue still exists
upstream and fix it. This is non-trivial overhead.

Therefore, I would like to propose resolving _all_ the JIRAs that target
EOL releases (2.2 and below).
Please let me know if anyone has any concerns or objections.

Thanks.

Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
I took action on those JIRAs.

The JIRAs that had not been updated in the last year and had an EOL release
as their affected version are now:
  - Resolved with 'Incomplete' status
  - Labeled 'bulk-closed'.
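
As an illustration of the selection criteria, here is a minimal Python sketch of the filter logic. It is not how the cleanup was performed (that used a JQL filter and JIRA's bulk action). The function name, its arguments, and the sample data are hypothetical; the patterns are the ones from the versionMatch clauses in the JQL quoted in this thread, and versionMatch is assumed to behave like a regex match anchored at the start of the version string.

```python
import re
from datetime import datetime, timedelta

# Patterns for releases that were still maintained, copied from the
# versionMatch(...) clauses of the JQL in this thread.
SUPPORTED_PATTERNS = [r"^3.*", r"^2.4.*", r"^2.3.*"]
OPEN_STATUSES = {"Open", "In Progress", "Reopened"}


def is_bulk_close_candidate(status, affected_versions, updated, now):
    """Hypothetical helper: True if an issue would match the filter."""
    if status not in OPEN_STATUSES:
        return False
    # "updated <= -52w": skip anything touched within the last 52 weeks.
    if updated > now - timedelta(weeks=52):
        return False
    # "affectedVersion = EMPTY": no affected version at all is a candidate.
    if not affected_versions:
        return True
    # Otherwise a candidate only if no affected version is a supported release.
    return not any(
        re.match(pattern, version)
        for version in affected_versions
        for pattern in SUPPORTED_PATTERNS
    )


now = datetime(2019, 5, 21)
old = datetime(2017, 3, 1)
print(is_bulk_close_candidate("Open", ["2.1.0"], old, now))           # True
print(is_bulk_close_candidate("Open", ["2.1.0", "2.4.0"], old, now))  # False
print(is_bulk_close_candidate("Open", [], old, now))                  # True
```

An issue that lists both an EOL release and a supported release is left open in this sketch, which mirrors the NOT (... OR ...) clause of the JQL.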

Thanks guys.

On Tue, May 21, 2019 at 8:35 AM, shane knapp <sk...@berkeley.edu> wrote:

> alright, i found 3 jiras that i was able to close:
>
>    1. SPARK-19612 <https://issues.apache.org/jira/browse/SPARK-19612>
>    2. SPARK-22996 <https://issues.apache.org/jira/browse/SPARK-22996>
>    3. SPARK-22766 <https://issues.apache.org/jira/browse/SPARK-22766>
>
>
> On Sun, May 19, 2019 at 6:43 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>
>> Thanks Shane .. the URL I linked somehow didn't work in other people's
>> browsers. Hope this link works:
>>
>>
>> https://issues.apache.org/jira/browse/SPARK-23492?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w
>>
>> I will take action around this time tomorrow, since there were some
>> more changes to make at the last minute.
>>
>>
>> On Sun, May 19, 2019 at 6:39 PM, Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> I will add one more condition on "updated", so that it additionally
>>> excludes issues updated within the last year but left open against EOL
>>> releases.
>>>
>>> project = SPARK
>>>   AND status in (Open, "In Progress", Reopened)
>>>   AND (
>>>     affectedVersion = EMPTY OR
>>>     NOT (affectedVersion in versionMatch("^3.*")
>>>       OR affectedVersion in versionMatch("^2.4.*")
>>>       OR affectedVersion in versionMatch("^2.3.*")
>>>     )
>>>   )
>>>   AND updated <= -52w
>>>
>>>
>>> https://issues.apache.org/jira/issues/?filter=12344168&jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w
>>>
>>> This still reduces the open JIRAs to under 1000, which I originally
>>> targeted.
>>>
>>>
>>>
>>> On Sun, May 19, 2019 at 6:08 PM, Sean Owen <sr...@gmail.com> wrote:
>>>
>>>> I'd only tweak this to perhaps not close JIRAs that have been updated
>>>> recently -- even just avoiding things updated in the last month. For
>>>> example this would close
>>>> https://issues.apache.org/jira/browse/SPARK-27758 which was opened
>>>> Friday (though, for other reasons it should probably be closed). Still I
>>>> don't mind it under the logic that it has been reported against 2.1.0.
>>>>
>>>> On the other hand, I'd go further and close _anything_ not updated in a
>>>> long time, like a year (or 2 if feeling conservative). That is there's
>>>> probably a lot of old cruft out there that wasn't marked with an Affected
>>>> Version, before that was required.
>>>>
>>>> On Sat, May 18, 2019 at 10:48 PM Hyukjin Kwon <gu...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks guys.
>>>>>
>>>>> This thread got more than 3 PMC votes without any objection. I
>>>>> slightly edited JQL from Abdeali's suggestion (thanks, Abdeali).
>>>>>
>>>>>
>>>>> JQL:
>>>>>
>>>>> project = SPARK
>>>>>   AND status in (Open, "In Progress", Reopened)
>>>>>   AND (
>>>>>     affectedVersion = EMPTY OR
>>>>>     NOT (affectedVersion in versionMatch("^3.*")
>>>>>       OR affectedVersion in versionMatch("^2.4.*")
>>>>>       OR affectedVersion in versionMatch("^2.3.*")
>>>>>     )
>>>>>   )
>>>>>
>>>>>
>>>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)
>>>>>
>>>>>
>>>>> It means we will resolve all JIRAs that have EOL releases as affected
>>>>> versions, including those with no affected version specified. This
>>>>> will reduce the open JIRAs to under 900.
>>>>>
>>>>> It looks like I can use the bulk action feature in JIRA. Tomorrow at
>>>>> a similar time, I will
>>>>> - Label those JIRAs as 'bulk-closed'
>>>>> - Resolve them with `Incomplete` status.
>>>>>
>>>>> Please double check the list and let me know if you have any
>>>>> concerns.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sat, May 18, 2019 at 12:22 PM, Dongjoon Hyun <do...@gmail.com> wrote:
>>>>>
>>>>>> +1, too.
>>>>>>
>>>>>> Thank you, Hyukjin!
>>>>>>
>>>>>> Bests,
>>>>>> Dongjoon.
>>>>>>
>>>>>>
>>>>>> On Fri, May 17, 2019 at 9:07 AM Imran Rashid
>>>>>> <ir...@cloudera.com.invalid> wrote:
>>>>>>
>>>>>>> +1, thanks for taking this on
>>>>>>>
>>>>>>> On Wed, May 15, 2019 at 7:26 PM Hyukjin Kwon <gu...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> oh, wait. 'Incomplete' can still make sense in this way then.
>>>>>>>> Yes, I am good with 'Incomplete' too.
>>>>>>>>
>>>>>>>> On Thu, May 16, 2019 at 11:24 AM, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I actually recently used 'Incomplete'  a bit when the JIRA is
>>>>>>>>> basically too poorly formed (like just copying and pasting an error) ...
>>>>>>>>>
>>>>>>>>> I was thinking about 'Unresolved' status or 'Auto Closed' too. I
>>>>>>>>> double checked that they can be reopened after resolution as well.
>>>>>>>>>
>>>>>>>>> [image: Screen Shot 2019-05-16 at 10.35.14 AM.png]
>>>>>>>>> [image: Screen Shot 2019-05-16 at 10.35.39 AM.png]
>>>>>>>>>
>>>>>>>>> On Thu, May 16, 2019 at 11:04 AM, Sean Owen <sr...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Agree, anything without an Affected Version should be old enough
>>>>>>>>>> to time out.
>>>>>>>>>> I might use "Incomplete" or something as the status, as we
>>>>>>>>>> haven't otherwise used that. Maybe that's simpler than a label. But,
>>>>>>>>>> anything like that sounds good.
>>>>>>>>>>
>>>>>>>>>> On Wed, May 15, 2019 at 8:40 PM Hyukjin Kwon <gu...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> BTW, affected version became a required field (I don't remember
>>>>>>>>>>> exactly when .. I believe it was around when we worked on Spark 2.3):
>>>>>>>>>>>
>>>>>>>>>>> [image: Screen Shot 2019-05-16 at 10.29.50 AM.png]
>>>>>>>>>>>
>>>>>>>>>>> So, including all EOL versions and affected versions not
>>>>>>>>>>> specified will roughly work.
>>>>>>>>>>> Using "Cannot Reproduce" as its status and 'bulk-closed' label
>>>>>>>>>>> makes the best sense to me.
>>>>>>>>>>>
>>>>>>>>>>> Okie. I want to leave this open for roughly a week before taking
>>>>>>>>>>> any actual action. If there's no more feedback, I will do as I said ^
>>>>>>>>>>> next week.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 15, 2019 at 11:33 PM, Josh Rosen <ro...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 in favor of some sort of JIRA cleanup.
>>>>>>>>>>>>
>>>>>>>>>>>> My only request is that we attach some sort of 'bulk-closed'
>>>>>>>>>>>> label to issues that we close via JIRA filter batch operations (and resolve
>>>>>>>>>>>> the issues as "Timed Out" / "Cannot Reproduce", not "Fixed"). Using a label
>>>>>>>>>>>> makes it easier to audit what was closed, simplifying the process of
>>>>>>>>>>>> identifying and re-opening valid issues caught in our dragnet.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 15, 2019 at 7:19 AM Sean Owen <sr...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I gave up looking through JIRAs a long time ago, so, big
>>>>>>>>>>>>> respect for
>>>>>>>>>>>>> continuing to try to triage them. I am afraid we're missing a
>>>>>>>>>>>>> few
>>>>>>>>>>>>> important bug reports in the torrent, but most JIRAs are not
>>>>>>>>>>>>> well-formed, just questions, stale, or simply things that
>>>>>>>>>>>>> won't be
>>>>>>>>>>>>> added. I do think it's important to reflect that reality, and
>>>>>>>>>>>>> so I'm
>>>>>>>>>>>>> always in favor of more aggressively closing JIRAs. I think
>>>>>>>>>>>>> this is
>>>>>>>>>>>>> more standard practice, from projects like TensorFlow/Keras,
>>>>>>>>>>>>> pandas,
>>>>>>>>>>>>> etc to just automatically drop Issues that don't see activity
>>>>>>>>>>>>> for N
>>>>>>>>>>>>> days. We won't do that, but, are probably on the other hand
>>>>>>>>>>>>> far too
>>>>>>>>>>>>> lax in closing them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Remember that JIRAs stay searchable and can be reopened, so
>>>>>>>>>>>>> it's not
>>>>>>>>>>>>> like we lose much information.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'd close anything that hasn't had activity in 2 years (?), as
>>>>>>>>>>>>> a start.
>>>>>>>>>>>>> I like the idea of closing things that only affect an EOL
>>>>>>>>>>>>> release,
>>>>>>>>>>>>> but, many items aren't marked, so may need to cast the net
>>>>>>>>>>>>> wider.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think only then does it make sense to look at bothering to
>>>>>>>>>>>>> reproduce
>>>>>>>>>>>>> or evaluate the 1000s that will still remain.
>>>>>>>>>>>>>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>

Re: Resolving all JIRAs affecting EOL releases

Posted by shane knapp <sk...@berkeley.edu>.
alright, i found 3 jiras that i was able to close:

   1. SPARK-19612 <https://issues.apache.org/jira/browse/SPARK-19612>
   2. SPARK-22996 <https://issues.apache.org/jira/browse/SPARK-22996>
   3. SPARK-22766 <https://issues.apache.org/jira/browse/SPARK-22766>



-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu

Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks Shane .. the URL I linked somehow didn't work in other people's
browsers. Hope this link works:

https://issues.apache.org/jira/browse/SPARK-23492?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w

I will take action around this time tomorrow, since there were some more
changes to make at the last minute.
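
The link above carries the JQL filter percent-encoded in its `jql` query parameter, which makes it hard to read and easy to break when pasted. As a rough sketch, assuming the filter text with the "updated" condition from this thread and a hypothetical helper name `jira_search_url`, the following Python shows how such a link can be built and decoded again with the standard library. The choice of `safe="()*"` is an assumption made so that parentheses and asterisks stay unescaped, as they are in the links in this thread.

```python
from urllib.parse import quote, unquote

# JQL filter with the "updated" condition, as given in this thread.
JQL = """project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
    )
  )
  AND updated <= -52w"""


def jira_search_url(jql, base="https://issues.apache.org/jira/issues/"):
    """Hypothetical helper: build a search link with the JQL percent-encoded."""
    return base + "?jql=" + quote(jql, safe="()*")


url = jira_search_url(JQL)
print(url)

# Decoding the query parameter recovers the readable JQL text.
decoded = unquote(url.split("?jql=", 1)[1])
print(decoded == JQL)  # True
```

Decoding a received link the same way is a quick check that the filter it carries is the one under discussion.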


2019년 5월 19일 (일) 오후 6:39, Hyukjin Kwon <gu...@gmail.com>님이 작성:

> I will add one more condition for "updated". So, it will additionally
> avoid things updated within one year but left open against EOL releases.
>
> project = SPARK
>   AND status in (Open, "In Progress", Reopened)
>   AND (
>     affectedVersion = EMPTY OR
>     NOT (affectedVersion in versionMatch("^3.*")
>       OR affectedVersion in versionMatch("^2.4.*")
>       OR affectedVersion in versionMatch("^2.3.*")
>     )
>   )
>   AND updated <= -52w
>
>
> https://issues.apache.org/jira/issues/?filter=12344168&jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w
>
> This still reduces JIRAs under 1000 which I originally targeted.
>
>
>
> 2019년 5월 19일 (일) 오후 6:08, Sean Owen <sr...@gmail.com>님이 작성:
>
>> I'd only tweak this to perhaps not close JIRAs that have been updated
>> recently -- even just avoiding things updated in the last month. For
>> example this would close
>> https://issues.apache.org/jira/browse/SPARK-27758 which was opened
>> Friday (though, for other reasons it should probably be closed). Still I
>> don't mind it under the logic that it has been reported against 2.1.0.
>>
>> On the other hand, I'd go further and close _anything_ not updated in a
>> long time, like a year (or 2 if feeling conservative). That is there's
>> probably a lot of old cruft out there that wasn't marked with an Affected
>> Version, before that was required.
>>

Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
I will add one more condition on "updated". With it, the filter will
additionally skip issues that were updated within the last year, even if
they are still open against EOL releases.

project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
    )
  )
  AND updated <= -52w

https://issues.apache.org/jira/issues/?filter=12344168&jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w

This still brings the number of open JIRAs under 1000, which was my original
target.
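
For anyone who wants to sanity-check what the filter above selects, here is a
minimal Python sketch of the same logic. It is only an emulation under
assumptions: the issue dictionaries, their field names ("status",
"affected_versions", "updated") and the EXAMPLE-n keys are hypothetical, and
the real matching is done by JIRA, not by this code. The three patterns are
copied verbatim from the versionMatch() calls.

```python
import re
from datetime import datetime, timedelta

# Patterns copied verbatim from the three versionMatch() calls in the JQL.
SUPPORTED_PATTERNS = [re.compile(p) for p in (r"^3.*", r"^2.4.*", r"^2.3.*")]
OPEN_STATUSES = {"Open", "In Progress", "Reopened"}


def would_bulk_close(issue, now):
    """Return True if a (hypothetical) issue dict would match the filter."""
    if issue["status"] not in OPEN_STATUSES:
        return False
    versions = issue["affected_versions"]
    # False for an empty list, so "no affected version" is selected as well.
    lists_supported = any(
        p.match(v) for v in versions for p in SUPPORTED_PATTERNS
    )
    # Emulates `updated <= -52w`: last update at least 52 weeks before `now`.
    stale = issue["updated"] <= now - timedelta(weeks=52)
    return (not lists_supported) and stale


NOW = datetime(2019, 5, 19)

# Hypothetical sample issues, not real SPARK tickets.
SAMPLE_ISSUES = {
    "EXAMPLE-1": {"status": "Open", "affected_versions": ["2.1.0"],
                  "updated": datetime(2017, 3, 1)},
    "EXAMPLE-2": {"status": "Open", "affected_versions": ["2.1.0"],
                  "updated": datetime(2019, 5, 17)},
    "EXAMPLE-3": {"status": "Reopened", "affected_versions": [],
                  "updated": datetime(2016, 8, 10)},
    "EXAMPLE-4": {"status": "Open", "affected_versions": ["2.2.0", "2.4.0"],
                  "updated": datetime(2017, 6, 1)},
    "EXAMPLE-5": {"status": "Resolved", "affected_versions": ["1.6.0"],
                  "updated": datetime(2015, 1, 1)},
}

selected = sorted(
    key for key, issue in SAMPLE_ISSUES.items() if would_bulk_close(issue, NOW)
)
print(selected)
```

In this sketch EXAMPLE-2 is skipped only because of the new "updated"
condition: it is reported against an EOL release but was touched two days
before NOW, which is the kind of recently updated issue the extra condition
is meant to leave open.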




Re: Resolving all JIRAs affecting EOL releases

Posted by Sean Owen <sr...@gmail.com>.
I'd only tweak this to perhaps not close JIRAs that have been updated
recently -- even just avoiding things updated in the last month. For
example, this would close
https://issues.apache.org/jira/browse/SPARK-27758, which
was opened Friday (though it should probably be closed for other reasons).
Still, I don't mind it, on the logic that it was reported against
2.1.0.

On the other hand, I'd go further and close _anything_ not updated in a
long time, like a year (or 2 if feeling conservative). That is, there's
probably a lot of old cruft out there that wasn't marked with an Affected
Version before that was required.


Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
Thanks guys.

This thread got more than 3 PMC votes without any objection. I slightly
edited the JQL from Abdeali's suggestion (thanks, Abdeali).


JQL:

project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
    )
  )

https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)
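
The long link above is just the JQL text percent-encoded into the "jql"
query parameter. As a minimal sketch (assuming Python's standard urllib, and
that keeping "(", ")" and "*" unescaped is only a cosmetic choice that
mirrors the link), the same kind of URL can be rebuilt like this; the result
may differ from the link in trailing whitespace:

```python
from urllib.parse import quote, unquote

# The JQL from the message, copied as plain text.
jql = """project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
    )
  )"""

# Percent-encode everything except "(", ")" and "*", as the link does.
encoded = quote(jql, safe="()*")
search_url = "https://issues.apache.org/jira/issues/?jql=" + encoded

# Decoding the parameter gives back the original query text.
decoded = unquote(encoded)
print(search_url)
```

Decoding the parameter is also a quick way to read what a shared JIRA search
link will actually run before clicking it.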


This means we will resolve all JIRAs whose affected versions are all EOL
releases, including those with no affected version specified. This will
bring the number of open JIRAs under 900.

It looks like I can use the bulk action feature in JIRA. Tomorrow at around
the same time, I will:
- Label those JIRAs as 'bulk-closed'
- Resolve them as `Incomplete`.

Please double-check the list and let me know if you have any concerns.
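
The plan above uses the bulk action feature in the JIRA UI. Purely as an
illustration of the two steps, here is a hedged sketch of what a scripted
equivalent might send per issue to JIRA's REST "transitions" endpoint
(POST /rest/api/2/issue/{key}/transitions). This is an assumption, not what
was actually run: the transition id "5" is hypothetical and differs per
workflow, and the labels and resolution fields can only be set this way if
they are on the transition screen. The sketch only builds the request body
and does not contact any server.

```python
import json


def build_bulk_close_payload(transition_id):
    """Build the JSON body for one issue (transition_id is workflow-specific)."""
    return {
        "transition": {"id": transition_id},
        # Resolve as Incomplete, as agreed in this thread.
        "fields": {"resolution": {"name": "Incomplete"}},
        # Add the audit label requested earlier in the thread.
        "update": {"labels": [{"add": "bulk-closed"}]},
    }


payload = build_bulk_close_payload("5")  # "5" is a hypothetical transition id
body = json.dumps(payload, indent=2)
print(body)

# JQL that could later be used to audit what was closed this way.
AUDIT_JQL = 'project = SPARK AND labels = "bulk-closed" AND resolution = Incomplete'
```

The label is what makes the operation reversible in practice: a search on
'bulk-closed' lists everything that was swept up, so valid issues can be
found and reopened.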





On Sat, May 18, 2019 at 12:22 PM, Dongjoon Hyun <do...@gmail.com> wrote:

> +1, too.
>
> Thank you, Hyukjin!
>
> Bests,
> Dongjoon.
>
>
> On Fri, May 17, 2019 at 9:07 AM Imran Rashid <ir...@cloudera.com.invalid>
> wrote:
>
>> +1, thanks for taking this on
>>
>> On Wed, May 15, 2019 at 7:26 PM Hyukjin Kwon <gu...@gmail.com> wrote:
>>
>>> Oh, wait. 'Incomplete' can still make sense in this case, then.
>>> Yes, I am fine with 'Incomplete' too.
>>>
>>> On Thu, May 16, 2019 at 11:24 AM, Hyukjin Kwon <gu...@gmail.com> wrote:
>>>
>>>> I actually used 'Incomplete' a bit recently when a JIRA was basically
>>>> too poorly formed (like just a copied-and-pasted error) ...
>>>>
>>>> I was also thinking about an 'Unresolved' or 'Auto Closed' status. I
>>>> double-checked that issues can still be reopened after being resolved.
>>>>
>>>> [image: Screen Shot 2019-05-16 at 10.35.14 AM.png]
>>>> [image: Screen Shot 2019-05-16 at 10.35.39 AM.png]
>>>>
>>>> On Thu, May 16, 2019 at 11:04 AM, Sean Owen <sr...@gmail.com> wrote:
>>>>
>>>>> Agree, anything without an Affected Version should be old enough to
>>>>> time out.
>>>>> I might use "Incomplete" or something as the status, as we haven't
>>>>> otherwise used that. Maybe that's simpler than a label. But, anything like
>>>>> that sounds good.
>>>>>
>>>>> On Wed, May 15, 2019 at 8:40 PM Hyukjin Kwon <gu...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> BTW, Affected Version became a required field (I don't remember
>>>>>> exactly when; I believe it was around when we were working on Spark 2.3):
>>>>>>
>>>>>> [image: Screen Shot 2019-05-16 at 10.29.50 AM.png]
>>>>>>
>>>>>> So, covering all EOL versions plus issues with no affected version
>>>>>> specified will roughly work.
>>>>>> Using "Cannot Reproduce" as the status together with a 'bulk-closed'
>>>>>> label makes the most sense to me.
>>>>>>
>>>>>> OK. I want to keep this open for roughly a week before taking any
>>>>>> action. If there is no more feedback, I will do as I said above next
>>>>>> week.
>>>>>>
>>>>>>
>>>>>> On Wed, May 15, 2019 at 11:33 PM, Josh Rosen <ro...@gmail.com> wrote:
>>>>>>
>>>>>>> +1 in favor of some sort of JIRA cleanup.
>>>>>>>
>>>>>>> My only request is that we attach some sort of 'bulk-closed' label
>>>>>>> to issues that we close via JIRA filter batch operations (and resolve the
>>>>>>> issues as "Timed Out" / "Cannot Reproduce", not "Fixed"). Using a label
>>>>>>> makes it easier to audit what was closed, simplifying the process of
>>>>>>> identifying and re-opening valid issues caught in our dragnet.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, May 15, 2019 at 7:19 AM Sean Owen <sr...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I gave up looking through JIRAs a long time ago, so, big respect for
>>>>>>>> continuing to try to triage them. I am afraid we're missing a few
>>>>>>>> important bug reports in the torrent, but most JIRAs are not
>>>>>>>> well-formed, just questions, stale, or simply things that won't be
>>>>>>>> added. I do think it's important to reflect that reality, and so I'm
>>>>>>>> always in favor of more aggressively closing JIRAs. I think this is
>>>>>>>> more standard practice, from projects like TensorFlow/Keras, pandas,
>>>>>>>> etc to just automatically drop Issues that don't see activity for N
>>>>>>>> days. We won't do that, but, are probably on the other hand far too
>>>>>>>> lax in closing them.
>>>>>>>>
>>>>>>>> Remember that JIRAs stay searchable and can be reopened, so it's not
>>>>>>>> like we lose much information.
>>>>>>>>
>>>>>>>> I'd close anything that hasn't had activity in 2 years (?), as a start.
>>>>>>>> I like the idea of closing things that only affect an EOL release,
>>>>>>>> but, many items aren't marked, so may need to cast the net wider.
>>>>>>>>
>>>>>>>> I think only then does it make sense to look at bothering to reproduce
>>>>>>>> or evaluate the 1000s that will still remain.
>>>>>>>>
>>>>>>>> On Wed, May 15, 2019 at 4:25 AM Hyukjin Kwon <gu...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> >
>>>>>>>> > Hi all,
>>>>>>>> >
>>>>>>>> > I would like to propose to resolve all JIRAs that affects EOL
>>>>>>>> releases - 2.2 and below. and affected version
>>>>>>>> > not specified. I was rather against this way and considered this
>>>>>>>> as last resort in roughly 3 years ago
>>>>>>>> > when we discussed. Now I think we should go ahead with this. See
>>>>>>>> below.
>>>>>>>> >
>>>>>>>> > I have been talking care of this for so long time almost every
>>>>>>>> day those 3 years. The number of JIRAs
>>>>>>>> > keeps increasing and it does never go down. Now the number is
>>>>>>>> going over 2500 JIRAs.
>>>>>>>> > Did you guys know? in JIRA, we can only go through page by page
>>>>>>>> up to 1000 items. So, currently we're even
>>>>>>>> > having difficulties to go through every JIRA. We should manually
>>>>>>>> filter out and check each.
>>>>>>>> > The number is going over the manageable size.
>>>>>>>> >
>>>>>>>> > I am not suggesting this without anything actually trying. This
>>>>>>>> is what we have tried within my visibility:
>>>>>>>> >
>>>>>>>> >   1. In roughly 3 years ago, Sean tried to gather committers and
>>>>>>>> even non-committers people to sort
>>>>>>>> >     out this number. At that time, we were only able to keep this
>>>>>>>> number as is. After we lost this momentum,
>>>>>>>> >     it kept increasing back.
>>>>>>>> >   2. At least I scanned _all_ the previous JIRAs at least more
>>>>>>>> than two times and resolved them. Roughly
>>>>>>>> >     once a year. The rest of them are mostly obsolete but not
>>>>>>>> enough information to investigate further.
>>>>>>>> >   3. I strictly stick to "Contributing to JIRA Maintenance"
>>>>>>>> https://spark.apache.org/contributing.html and
>>>>>>>> >     resolve JIRAs.
>>>>>>>> >   4. Promoting other people to comment on JIRA or actively
>>>>>>>> resolve them.
>>>>>>>> >
>>>>>>>> > One of the facts I realised is the increasing number of
>>>>>>>> committers doesn't virtually help this much (although
>>>>>>>> > it might be helpful if somebody active in JIRA becomes a
>>>>>>>> committer.)
>>>>>>>> >
>>>>>>>> > One of the important thing I should note is that, it's now almost
>>>>>>>> pretty difficult to reproduce and test the
>>>>>>>> > issues found in EOL releases. We should git clone, checkout,
>>>>>>>> build and test. And then, see if that issue
>>>>>>>> > still exists in upstream, and fix. This is non-trivial overhead.
>>>>>>>> >
>>>>>>>> > Therefore, I would like to propose resolving _all_ the JIRAs that
>>>>>>>> targets EOL releases - 2.2 and below.
>>>>>>>> > Please let me know if anyone has some concerns or objections.
>>>>>>>> >
>>>>>>>> > Thanks.
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe e-mail: dev-unsubscribe@spark.apache.org
>>>>>>>>
>>>>>>>>

Re: Resolving all JIRAs affecting EOL releases

Posted by Dongjoon Hyun <do...@gmail.com>.
+1, too.

Thank you, Hyukjin!

Bests,
Dongjoon.



Re: Resolving all JIRAs affecting EOL releases

Posted by Imran Rashid <ir...@cloudera.com.INVALID>.
+1, thanks for taking this on


Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
Oh, wait. 'Incomplete' can still make sense in that case, then.
Yes, I am fine with 'Incomplete' too.


Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
I actually used 'Incomplete' a few times recently when a JIRA was too
poorly formed (for example, just a copied-and-pasted error) ...

I was also thinking about an 'Unresolved' or 'Auto Closed' status. I
double-checked that issues resolved this way can be reopened as well.

[image: Screen Shot 2019-05-16 at 10.35.14 AM.png]
[image: Screen Shot 2019-05-16 at 10.35.39 AM.png]


Re: Resolving all JIRAs affecting EOL releases

Posted by Sean Owen <sr...@gmail.com>.
Agree, anything without an Affected Version should be old enough to time
out.
I might use "Incomplete" or something as the status, as we haven't
otherwise used that. Maybe that's simpler than a label. But, anything like
that sounds good.


Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
BTW, "Affects Version" became a required field (I don't remember exactly
when; I believe it was around the time we were working on Spark 2.3):

[image: Screen Shot 2019-05-16 at 10.29.50 AM.png]

So, covering all EOL versions plus issues with no affected version specified
should roughly work.
Resolving them as "Cannot Reproduce" with a 'bulk-closed' label makes the
most sense to me.

OK. I'd like to leave this open for roughly a week before taking any action.
If there's no further feedback, I will proceed as described above next week.
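
For anyone who would rather script this than use the JIRA bulk-edit UI, here
is a rough sketch of the two pieces involved: deciding whether an issue only
names EOL releases (or none), and the request body for resolving it as
"Cannot Reproduce" with the 'bulk-closed' label. The helper names, the list
of supported release lines, and the transition id are assumptions for
illustration. The body follows the general shape accepted by JIRA's REST
"transitions" endpoint, but whether resolution and labels can be set during
a transition depends on how the project workflow is configured, so treat it
as a starting point only.

```python
# Assumption: release lines still maintained at the time of this thread.
SUPPORTED_PREFIXES = ("2.3.", "2.4.", "3.")


def is_eol_only(affected_versions):
    """True if the issue names no supported release, or names none at all."""
    return not any(v.startswith(SUPPORTED_PREFIXES) for v in affected_versions)


def build_resolve_payload(transition_id):
    """Body for POST /rest/api/2/issue/{key}/transitions.

    transition_id is instance-specific (hypothetical here); look it up with
    GET /rest/api/2/issue/{key}/transitions before using it.
    """
    return {
        "transition": {"id": transition_id},
        "fields": {"resolution": {"name": "Cannot Reproduce"}},
        "update": {"labels": [{"add": "bulk-closed"}]},
    }


if __name__ == "__main__":
    print(is_eol_only(["1.6.0", "2.2.1"]))   # only EOL releases named
    print(is_eol_only([]))                   # no affected version given
    print(is_eol_only(["2.2.0", "2.4.3"]))   # also affects a live release
    print(build_resolve_payload("5"))        # "5" is a placeholder id
```

Keeping the label in the same request as the resolution means every issue
closed this way stays findable later with a single label search.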


On Wed, May 15, 2019 at 11:33 PM, Josh Rosen <ro...@gmail.com> wrote:

> +1 in favor of some sort of JIRA cleanup.
>
> My only request is that we attach some sort of 'bulk-closed' label to
> issues that we close via JIRA filter batch operations (and resolve the
> issues as "Timed Out" / "Cannot Reproduce", not "Fixed"). Using a label
> makes it easier to audit what was closed, simplifying the process of
> identifying and re-opening valid issues caught in our dragnet.

Re: Resolving all JIRAs affecting EOL releases

Posted by Josh Rosen <ro...@gmail.com>.
+1 in favor of some sort of JIRA cleanup.

My only request is that we attach some sort of 'bulk-closed' label to
issues that we close via JIRA filter batch operations (and resolve the
issues as "Timed Out" / "Cannot Reproduce", not "Fixed"). Using a label
makes it easier to audit what was closed, simplifying the process of
identifying and re-opening valid issues caught in our dragnet.


On Wed, May 15, 2019 at 7:19 AM Sean Owen <sr...@gmail.com> wrote:

> I gave up looking through JIRAs a long time ago, so, big respect for
> continuing to try to triage them. I am afraid we're missing a few
> important bug reports in the torrent, but most JIRAs are not
> well-formed, just questions, stale, or simply things that won't be
> added. I do think it's important to reflect that reality, and so I'm
> always in favor of more aggressively closing JIRAs. I think this is
> more standard practice, from projects like TensorFlow/Keras, pandas,
> etc to just automatically drop Issues that don't see activity for N
> days. We won't do that, but, are probably on the other hand far too
> lax in closing them.
>
> Remember that JIRAs stay searchable and can be reopened, so it's not
> like we lose much information.
>
> I'd close anything that hasn't had activity in 2 years (?), as a start.
> I like the idea of closing things that only affect an EOL release,
> but, many items aren't marked, so may need to cast the net wider.
>
> I think only then does it make sense to look at bothering to reproduce
> or evaluate the 1000s that will still remain.

Re: Resolving all JIRAs affecting EOL releases

Posted by Sean Owen <sr...@gmail.com>.
I gave up looking through JIRAs a long time ago, so, big respect for
continuing to try to triage them. I am afraid we're missing a few
important bug reports in the torrent, but most JIRAs are not
well-formed, just questions, stale, or simply things that won't be
added. I do think it's important to reflect that reality, and so I'm
always in favor of closing JIRAs more aggressively. I think it is
more standard practice, in projects like TensorFlow/Keras, pandas,
etc., to just automatically close issues that don't see activity for N
days. We won't do that, but on the other hand we are probably far too
lax in closing them.

Remember that JIRAs stay searchable and can be reopened, so it's not
like we lose much information.

I'd close anything that hasn't had activity in 2 years (?), as a start.
I like the idea of closing things that only affect an EOL release,
but, many items aren't marked, so may need to cast the net wider.

I think only then does it make sense to look at bothering to reproduce
or evaluate the 1000s that will still remain.
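
To make the "no activity in 2 years" cut concrete, here is a minimal sketch
that computes the cutoff date and builds a JQL filter from it. The function
names are made up for illustration, and the filter is only one plausible way
to express the idea; it relies on the standard JQL "updated" field with a
"yyyy-MM-dd" date literal.

```python
from datetime import date


def stale_cutoff(today, years=2):
    """Date `years` before `today`; Feb 29 falls back to Feb 28."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # today is Feb 29 and the target year is not a leap year
        return today.replace(year=today.year - years, day=28)


def stale_jql(today, years=2, project="SPARK"):
    """JQL for unresolved issues not updated since the cutoff date."""
    cutoff = stale_cutoff(today, years).isoformat()
    return (f"project = {project} AND resolution = Unresolved "
            f'AND updated <= "{cutoff}"')


if __name__ == "__main__":
    print(stale_jql(date(2019, 5, 15)))
```

An absolute date is used instead of a relative offset so the same filter
gives the same result set when someone audits the closure later.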

On Wed, May 15, 2019 at 4:25 AM Hyukjin Kwon <gu...@gmail.com> wrote:
>
> Hi all,
>
> I would like to propose resolving all JIRAs that affect EOL releases (2.2 and below) or that have no affected
> version specified. I was rather against this approach and considered it a last resort when we discussed it
> roughly 3 years ago. Now I think we should go ahead with it. See below.
>
> I have been taking care of this almost every day for those 3 years. The number of open JIRAs
> keeps increasing and never goes down. It is now over 2500.
> Did you know that in JIRA we can only page through up to 1000 items? So we currently
> have difficulty even going through every JIRA; we have to filter manually and check each one.
> The number is growing beyond a manageable size.
>
> I am not suggesting this without having tried anything else. This is what we have tried, as far as I am aware:
>
>   1. Roughly 3 years ago, Sean tried to gather committers and even non-committers to bring
>     this number down. At that time, we were only able to keep the number where it was. After we lost that
>     momentum, it started increasing again.
>   2. I have scanned _all_ the existing JIRAs more than twice, roughly once a year, and resolved what I could.
>     The rest are mostly obsolete but lack enough information to investigate further.
>   3. I strictly follow "Contributing to JIRA Maintenance" at https://spark.apache.org/contributing.html when
>     resolving JIRAs.
>   4. I have encouraged other people to comment on JIRAs or actively resolve them.
>
> One thing I have realised is that the increasing number of committers does not really help with this (although
> it might help if somebody active in JIRA becomes a committer).
>
> One important thing to note is that it is now quite difficult to reproduce and test
> issues found in EOL releases. We have to git clone, check out, build and test, and then see whether the issue
> still exists upstream and fix it. This is non-trivial overhead.
>
> Therefore, I would like to propose resolving _all_ the JIRAs that target EOL releases (2.2 and below).
> Please let me know if anyone has concerns or objections.
>
> Thanks.

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscribe@spark.apache.org


Re: Resolving all JIRAs affecting EOL releases

Posted by Hyukjin Kwon <gu...@gmail.com>.
Yeah, a more sophisticated condition is welcome. My only goal is to bring
the backlog down to a manageable size.

I would go for the option that closes more tickets, ideally leaving under
1000 Open (and Reopened) tickets, so that we can at least go through them in
one pass without having to come up with non-overlapping filters.
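
As a quick sanity check on that target, the sketch below takes the counts
Abdeali reported in this thread and computes how many tickets would remain
open under each version-only cut. It assumes each "Does not affect X+" count
is exactly the set that would be closed and that the numbers are a snapshot
from the day they were posted; the names are made up for illustration.

```python
TOTAL_OPEN = 3882  # open issues, as reported in this thread

# "Open + Does not affect X+" counts, as reported in this thread.
NOT_AFFECTING = {
    "3.0": 2795,
    "2.4": 2373,
    "2.3": 1765,
    "2.2": 1322,
    "2.1": 967,
    "2.0": 651,
}


def remaining_after_closing(total, closed):
    """Open tickets left after closing `closed` of `total`."""
    return total - closed


def options_under(target, total=TOTAL_OPEN, counts=NOT_AFFECTING):
    """Version cuts that would leave fewer than `target` tickets open."""
    return [v for v, n in counts.items()
            if remaining_after_closing(total, n) < target]


if __name__ == "__main__":
    for version, closed in NOT_AFFECTING.items():
        left = remaining_after_closing(TOTAL_OPEN, closed)
        print(f"close 'does not affect {version}+': {left} remain")
    print(options_under(1000))
```

Under these assumptions even the widest cut leaves more than 1000 tickets
open, so a version-only filter would likely need to be combined with
something else, such as an inactivity cutoff.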

On Wed, 15 May 2019, 19:33 Abdeali Kothari, <ab...@gmail.com>
wrote:

> Was thinking that getting an estimated statistic of the number of issues
> that would be closed if this is done would help.
>
> Open issues: 3882 (project = SPARK AND status in (Open, "In Progress",
> Reopened))
> Open + Does not affect 3.0+ = 2795
> Open + Does not affect 2.4+ = 2373
> Open + Does not affect 2.3+ = 1765
> Open + Does not affect 2.2+ = 1322
> Open + Does not affect 2.1+ = 967
> Open + Does not affect 2.0+ = 651
>
> Open + Does not affect 2.0+ + Priority not in (Urgent, Blocker, Critical,
> High) [JQL1] = 838
> Open + Does not affect 2.0+ + Priority not in (Urgent, Blocker, Critical,
> High, Major) = 206
> Open + Does not affect 2.2+ + Priority not in (Urgent, Blocker, Critical,
> High) [JQL2] = 1303
> Open + Does not affect 2.2+ + Priority not in (Urgent, Blocker, Critical,
> High, Major) = 397
> Open + Does not affect 2.3+ + Priority not in (Urgent, Blocker, Critical,
> High) = 1743
> Open + Does not affect 2.3+ + Priority not in (Urgent, Blocker, Critical,
> High, Major) = 550
>
> Resolving ALL seems a bit overkill to me.
> My current opinion seems like:
>  - Resolving "Open + Does not affect 2.0+" is something that should be
> done, as I doubt anyone would be looking at the 1.x versions anymore (651
> tasks)
>  - Resolving "Open + Does not affect 2.3+ + Priority not in (Urgent,
> Blocker, Critical, High, Major)" may be a good idea (an additional ~1k
> tasks)
> The issues with priority Urgent/Blocker/Critical should be triaged - as it
> may have something important.
>
>
> [JQL1]:
> project = SPARK
>  AND status in (Open, "In Progress", Reopened)
>  AND NOT (affectedVersion in versionMatch("^[2-3].*"))
>  AND priority NOT IN (Urgent, Blocker, Critical, High)
>
> [JQL2]:
> project = SPARK
>  AND status in (Open, "In Progress", Reopened)
>  AND NOT (affectedVersion in versionMatch("^3.*") OR affectedVersion in
> versionMatch("^2.4.*") OR affectedVersion in versionMatch("^2.3.*") OR
> affectedVersion in versionMatch("^2.2.*"))
>  AND priority NOT IN (Urgent, Blocker, Critical, High)

Re: Resolving all JIRAs affecting EOL releases

Posted by Abdeali Kothari <ab...@gmail.com>.
I thought it would help to have an estimate of how many issues would be
closed under each option.

Open issues: 3882 (project = SPARK AND status in (Open, "In Progress", Reopened))
Open + Does not affect 3.0+ = 2795
Open + Does not affect 2.4+ = 2373
Open + Does not affect 2.3+ = 1765
Open + Does not affect 2.2+ = 1322
Open + Does not affect 2.1+ = 967
Open + Does not affect 2.0+ = 651

Open + Does not affect 2.0+ + Priority not in (Urgent, Blocker, Critical, High) [JQL1] = 838
Open + Does not affect 2.0+ + Priority not in (Urgent, Blocker, Critical, High, Major) = 206
Open + Does not affect 2.2+ + Priority not in (Urgent, Blocker, Critical, High) [JQL2] = 1303
Open + Does not affect 2.2+ + Priority not in (Urgent, Blocker, Critical, High, Major) = 397
Open + Does not affect 2.3+ + Priority not in (Urgent, Blocker, Critical, High) = 1743
Open + Does not affect 2.3+ + Priority not in (Urgent, Blocker, Critical, High, Major) = 550

Resolving ALL seems like overkill to me.
My current opinion:
 - Resolving "Open + Does not affect 2.0+" is something that should be
done, as I doubt anyone is still looking at the 1.x versions (651
tasks)
 - Resolving "Open + Does not affect 2.3+ + Priority not in (Urgent,
Blocker, Critical, High, Major)" may be a good idea (an additional ~1k
tasks)
Issues with priority Urgent/Blocker/Critical should be triaged, as they
may contain something important.


[JQL1]:
project = SPARK
 AND status in (Open, "In Progress", Reopened)
 AND NOT (affectedVersion in versionMatch("^[2-3].*"))
 AND priority NOT IN (Urgent, Blocker, Critical, High)

[JQL2]:
project = SPARK
 AND status in (Open, "In Progress", Reopened)
 AND NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
      OR affectedVersion in versionMatch("^2.2.*"))
 AND priority NOT IN (Urgent, Blocker, Critical, High)
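
For anyone who wants to check what [JQL2] would select without running it
against JIRA, here is a small local sketch of the same logic over exported
issue records. The record layout, the sample issues and the function name
are made up for illustration. Two deliberate differences from the JQL: the
dots in the version patterns are escaped here, and issues with no affected
version are assumed to pass the NOT clause, which should be verified against
the real instance.

```python
import re

# Mirrors the four versionMatch patterns in [JQL2], with the dots escaped.
SUPPORTED = re.compile(r"^(3\.|2\.[2-4]\.)")

OPEN_STATUSES = {"Open", "In Progress", "Reopened"}
HIGH_PRIORITIES = {"Urgent", "Blocker", "Critical", "High"}


def selected_by_jql2(issue):
    """True if the issue would be picked up for bulk resolution."""
    if issue["status"] not in OPEN_STATUSES:
        return False
    # Any supported release among the affected versions excludes the issue.
    if any(SUPPORTED.match(v) for v in issue["affected_versions"]):
        return False
    return issue["priority"] not in HIGH_PRIORITIES


# Hypothetical sample records, not real SPARK issues.
SAMPLE = [
    {"key": "EX-1", "status": "Open", "priority": "Minor",
     "affected_versions": ["1.6.0"]},
    {"key": "EX-2", "status": "Open", "priority": "Critical",
     "affected_versions": ["1.6.0"]},
    {"key": "EX-3", "status": "Reopened", "priority": "Major",
     "affected_versions": ["2.1.0", "2.3.1"]},
    {"key": "EX-4", "status": "Resolved", "priority": "Minor",
     "affected_versions": ["1.5.0"]},
    {"key": "EX-5", "status": "In Progress", "priority": "Major",
     "affected_versions": []},
]

if __name__ == "__main__":
    print([i["key"] for i in SAMPLE if selected_by_jql2(i)])
```

In this sample EX-2 is kept open for triage because of its priority, and
EX-3 because one of its affected versions is still supported.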

