Posted to dev@beam.apache.org by Danny McCormick <da...@google.com> on 2022/06/10 13:42:28 UTC

Clean Up GitHub Labels

Hey everyone,

After migrating over from Jira, our labels are somewhat messy and not as
helpful as they could be. Specifically, there are 2 sets of problems:


1. There is significant overlap between the labels imported from Jira and
the labels we already had in GitHub for our PRs. For example, there was
already a “Go” GitHub label, and as part of the migration we imported a
“sdk-go” label.


2. Because GitHub doesn’t provide an OR syntax in its searching, it is much
harder to search for things like “all io labels” because the io issues are
sharded across a number of io tags (e.g. io-java-aws, io-java-amqp,
io-py-avro, etc…). This applies to other areas like runner issues,
portability issues, and issues by language as well.

I put together a quick one-page proposal on how we can remove the label
duplication and make searching easier by decomposing our labels into their
smallest components. Please let me know if you have any thoughts or
suggestions!
https://docs.google.com/document/d/14S5coM_vfRrwygoQ9_NClJWmY5s30_L_J5yCurLW-XU/edit?usp=sharing

Thanks,
Danny
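
For illustration, here is a minimal Python sketch of the decomposition idea
described above. The helper names (decompose, search_query, has_all) and the
"py" to "python" alias are assumptions made for this example and do not come
from the proposal document. The only GitHub behaviour it relies on is that
repeated label: qualifiers in an issue search are combined with AND.

```python
from __future__ import annotations

# Hypothetical alias table: maps an abbreviated component to a fuller name.
# The real component vocabulary is defined in the proposal, not here.
ALIASES = {"py": "python"}


def decompose(label: str) -> list[str]:
    """Split a compound label such as 'io-java-aws' into its components."""
    parts = [part for part in label.lower().split("-") if part]
    return [ALIASES.get(part, part) for part in parts]


def search_query(*labels: str) -> str:
    """Build an issue search string; each label: qualifier narrows the result."""
    qualifiers = [f"label:{name}" for name in labels]
    return " ".join(["is:issue", "is:open", *qualifiers])


def has_all(issue_labels: list[str], required: list[str]) -> bool:
    """Return True when the issue carries every required component label."""
    return set(required) <= set(issue_labels)


if __name__ == "__main__":
    print(decompose("io-java-aws"))
    print(search_query("io", "java"))
    print(has_all(decompose("io-py-avro"), ["io"]))
```

With labels split this way, "all io issues" becomes a search on the single
io label, and "io issues in Java" a search on io and java together, with no
OR needed.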

Re: Clean Up GitHub Labels

Posted by Danny McCormick <da...@google.com>.
Just to add on here, all elements of the above proposal have been
implemented as well. Thanks Robert for doing a bunch of manual deletions
here!

On Thu, Jun 16, 2022 at 12:54 PM Robert Burke <ro...@frantil.com> wrote:

> I've just helped Danny clear out the long tail of unused labels. We're
> down to 5 pages of GitHub labels (144 distinct labels).
>
> I've coloured the P0 and P1 labels red, P2 orange, and P3 yellow, since
> leaving them uncoloured seems wrong. It's trivial to change them to a
> single colour if that's what we decide to do.
>
> We still have a fairly broad set of labels that are on only 1-5 issues each.
>
> You can view the current state of labels here:
>
> https://github.com/apache/beam/labels

Re: Clean Up GitHub Labels

Posted by Robert Burke <ro...@frantil.com>.
I've just helped Danny clear out the long tail of unused labels. We're down
to 5 pages of GitHub labels (144 distinct labels).

I've coloured the P0 and P1 labels red, P2 orange, and P3 yellow, since
leaving them uncoloured seems wrong. It's trivial to change them to a
single colour if that's what we decide to do.

We still have a fairly broad set of labels that are on only 1-5 issues each.

You can view the current state of labels here:

https://github.com/apache/beam/labels


On Wed, Jun 15, 2022, 1:11 PM Aizhamal Nurmamat kyzy <ai...@apache.org>
wrote:

> Thank you, Danny!

Re: Clean Up GitHub Labels

Posted by Aizhamal Nurmamat kyzy <ai...@apache.org>.
Thank you, Danny!

On Wed, Jun 15, 2022 at 8:31 AM Danny McCormick <da...@google.com>
wrote:

> Given the general consensus here, I put up a PR to enforce this for new
> issues here - https://github.com/apache/beam/pull/21888.
>
> Once that's in, I'll run a script to update all the existing labels to the
> new preferred scheme and we can delete the old labels that we don't need
> anymore.
>
> Thanks,
> Danny

Re: Clean Up GitHub Labels

Posted by Danny McCormick <da...@google.com>.
Given the general consensus here, I put up a PR to enforce this for new
issues here - https://github.com/apache/beam/pull/21888.

Once that's in, I'll run a script to update all the existing labels to the
new preferred scheme and we can delete the old labels that we don't need
anymore.

Thanks,
Danny
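
As a rough sketch of what such a one-off relabelling pass might compute, the
Python below works out, for a single issue, which labels to add and which to
remove. The RENAMES table and the function name plan_relabel are hypothetical:
the actual old-to-new mapping is whatever the proposal settles on, and this
sketch makes no GitHub API calls.

```python
from __future__ import annotations

# Hypothetical old-to-new mapping, for illustration only.
RENAMES = {
    "sdk-go": ["go"],
    "io-java-aws": ["io", "java", "aws"],
}


def plan_relabel(labels: list[str]) -> tuple[list[str], list[str]]:
    """Return (labels_to_add, labels_to_remove) for one issue."""
    # Compare case-insensitively so an existing "Go" label counts as "go".
    present = {label.lower() for label in labels}
    to_add: list[str] = []
    to_remove: list[str] = []
    for old in labels:
        replacements = RENAMES.get(old.lower())
        if replacements is None:
            continue  # Label is not part of the migration; leave it alone.
        to_remove.append(old)
        for new in replacements:
            if new not in present and new not in to_add:
                to_add.append(new)
    return to_add, to_remove


if __name__ == "__main__":
    print(plan_relabel(["sdk-go", "Go", "P2"]))
    print(plan_relabel(["io-java-aws"]))
```

Computing the plan separately from applying it makes it possible to review
the changes before any label is touched, and the old labels can be deleted
only once no issue still carries them.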

On Tue, Jun 14, 2022 at 4:53 PM Kenneth Knowles <ke...@apache.org> wrote:

> +1 sounds good to me
>
> One thing I did a lot of when triaging Jiras was moving them from one
> component to another, after which people who cared about those components
> would go through them. Making the labels more straightforward for users
> would streamline that.
>
> Kenn

Re: Clean Up GitHub Labels

Posted by Kenneth Knowles <ke...@apache.org>.
+1 sounds good to me

One thing I did a lot of when triaging Jiras was moving them from one
component to another, after which people who cared about those components
would go through them. Making the labels more straightforward for users
would streamline that.

Kenn

On Sun, Jun 12, 2022 at 9:04 PM Chamikara Jayalath <ch...@google.com>
wrote:

> +1 for this in general. Also, as noted in the proposal, decomposing
> labels should be done on a case-by-case basis since in some cases that
> might result in creating labels that do not have proper context.
>
> Thanks,
> Cham

Re: Clean Up GitHub Labels

Posted by Chamikara Jayalath <ch...@google.com>.
+1 for this in general. Also, as noted in the proposal, decomposing
labels should be done on a case-by-case basis since in some cases that
might result in creating labels that do not have proper context.

Thanks,
Cham

On Fri, Jun 10, 2022 at 8:35 AM Robert Burke <ro...@frantil.com> wrote:

> +1. I like this consolidation proposal, but I also like thinking through
> conjunctions. :)

Re: Clean Up GitHub Labels

Posted by Robert Burke <ro...@frantil.com>.
+1. I like this consolidation proposal, but I also like thinking through
conjunctions. :)

On Fri, Jun 10, 2022, 6:42 AM Danny McCormick <da...@google.com>
wrote:

> Hey everyone,
>
> After migrating over from Jira, our labels are somewhat messy and not as
> helpful as they could be. Specifically, there are 2 sets of problems:
>
>
> 1. There is significant overlap between the labels imported from Jira and
> the labels we already had in GitHub for our PRs. For example, there was
> already a “Go” GitHub label, and as part of the migration we imported a
> “sdk-go” label.
>
>
> 2. Because GitHub doesn’t provide an OR syntax in its searching, it is
> much harder to search for things like “all io labels” because the io issues
> are sharded across a number of io tags (e.g. io-java-aws, io-java-amqp,
> io-py-avro, etc…). This applies to other areas like runner issues,
> portability issues, and issues by language as well.
>
> I put together a quick one-page proposal on how we can remove the label
> duplication and make searching easier by decomposing our labels into their
> smallest components. Please let me know if you have any thoughts or
> suggestions!
> https://docs.google.com/document/d/14S5coM_vfRrwygoQ9_NClJWmY5s30_L_J5yCurLW-XU/edit?usp=sharing
>
> Thanks,
> Danny
>