Posted to dev@flink.apache.org by Piotr Nowojski <pi...@ververica.com> on 2019/05/15 10:13:25 UTC

Re: [DISCUSS] Clean up and reorganize the JIRA components

Hi,

I would like to propose two changes:

1. Renaming “Runtime / Operators” to “Runtime / Task” or something like “Runtime / Processing”. “Runtime / Operators” confused me, since it sounded like it covers concrete implementations of the operators, like “WindowOperator” or various join implementations.

2. I think we should add an additional component for benchmarks and benchmarking infrastructure. While this is a more complicated topic (because of the setup and how it is run), it should be on the same level as correctness tests.

Piotrek

> On 20 Feb 2019, at 10:53, Robert Metzger <rm...@apache.org> wrote:
> 
> Thanks a lot Timo!
> 
> I will start a vote Chesnay!
> 
> On Wed, Feb 20, 2019 at 10:11 AM Timo Walther <tw...@apache.org> wrote:
> 
>> +1 for the vote. Btw I can help cleaning up the "Table API & SQL"
>> component. It seems to be the biggest with 1229 Issues.
>> 
>> Thanks,
>> Timo
>> 
>> On 20.02.19 at 10:09, Chesnay Schepler wrote:
>>> I would prefer if you'd start a vote with a new cleaned up proposal.
>>> 
>>> On 18.02.2019 15:23, Robert Metzger wrote:
>>>> I added "Runtime / Configuration" to the proposal:
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
>>>> 
>>>> 
>>>> Since this discussion has been open for 10 days, I assume we have
>>>> reached
>>>> consensus here. I will soon start renaming components.
>>>> 
>>>> On Wed, Feb 13, 2019 at 10:51 AM Chesnay Schepler <ch...@apache.org>
>>>> wrote:
>>>> 
>>>>> The only parent I can think of is "Infrastructure", but I don't quite
>>>>> like it :/
>>>>> 
>>>>> +1 for "Runtime / Configuration"; this is too general to be placed in
>>>>> coordination imo.
>>>>> 
>>>>> On 12.02.2019 18:25, Robert Metzger wrote:
>>>>>> Thanks a lot for your feedback Chesnay!
>>>>>> 
>>>>>> re build/travis/release: Do you have a good idea for a common
>>>>>> parent for
>>>>>> "Build System", "Travis" and "Release System"?
>>>>>> 
>>>>>> re legacy: Okay, I see your point. I will keep the Legacy Components
>>>>> prefix.
>>>>>> re library: I think I don't have an argument here. My proposal is
>>>>>> based on
>>>>>> what I felt as being right :) I added the "Library / " prefix to the
>>>>>> proposal.
>>>>>> 
>>>>>> re core/config: From the proposed components, I see the best match
>>>>>> with
>>>>>> "Runtime / Coordination", but I agree that this example is
>>>>>> difficult to
>>>>>> place into my proposed scheme. Do you think we should introduce
>>>>>> "Runtime
>>>>> /
>>>>>> Configuration" as a component?
>>>>>> 
>>>>>> 
>>>>>> I updated the proposal accordingly!
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Feb 12, 2019 at 12:19 PM Chesnay Schepler <chesnay@apache.org> wrote:
>>>>>> 
>>>>>>> re build/travis/release: No, I'm against merging build system, travis
>>>>>>> and release system.
>>>>>>> 
>>>>>>> re legacy: So going forward you're proposing to move dropped features
>>>>>>> into the legacy bucket and make it impossible to search for specific
>>>>>>> issues for that component? There's 0 overhead to having these
>>>>>>> components, so I really don't get the benefit here, but see the
>>>>> overhead.
>>>>>>> I don't buy the argument of "people will not open issues if the
>>>>>>> component doesn't exist", they will just leave the component field
>>>>>>> blank
>>>>>>> or add a random one (that would be wrong). In fact, if you had a
>>>>>>> storm/tez component (that users would adhere to) then it would be
>>>>>>> _easier_ to figure out whether an issue can be rejected right away.
>>>>>>> 
>>>>>>> re library: If you are against a library category, what's your
>>>>>>> argument
>>>>>>> for a connector category?
>>>>>>> 
>>>>>>> re tests: I don't mind "tests" being removed from tickets about test
>>>>>>> instabilities, but you specified the migration as "rename E2E tests"
>>>>>>> which is not equivalent.
>>>>>>> Under what category would you file modifications to
>>>>> flink-test-utils-junit?
>>>>>>> I would propose to not differentiate between e2e and other tests; I
>>>>>>> would go along with "Test infrastructure", and remove the major
>>>>>>> "Tests"
>>>>>>> category.
>>>>>>> 
>>>>>>> re core/config: As an example, where (under Runtime) would you
>>>>>>> place the
>>>>>>> introduction of the ConfigOption class?
>>>>>>> 
>>>>>>> On 11.02.2019 11:31, Robert Metzger wrote:
>>>>>>>> Thanks a lot for your feedback!
>>>>>>>> 
>>>>>>>> @Timo:
>>>>>>>> I've followed your suggestions and updated the proposed names in the
>>>>>>> wiki.
>>>>>>>> Regarding a new "SQL/Connectors" component: I (with admittedly
>>>>>>>> not much
>>>>>>>> knowledge) would not add this component at the moment, and put
>>>>>>>> the SQL
>>>>>>>> stuff into the respective connector component.
>>>>>>>> It is probably pretty difficult for a user to decide whether a bug
>>>>>>>> belongs to "SQL/Connector" or "Connectors/Kafka" when Kafka in SQL
>>>>>>>> does not work.
>>>>>>>> @Chesnay:
>>>>>>>> - You are suggesting to rename "Build System" to "Maven" and still
>>>>> merge
>>>>>>> it
>>>>>>>> with "Travis", "Release System" etc. as in the proposal?
>>>>>>>> 
>>>>>>>> - "Runtime / Control Plane" vs "Runtime / Coordination" -- I
>>>>>>>> changed the
>>>>>>>> proposal
>>>>>>>> 
>>>>>>>> - Re. "Documentation": Yes, I think that would be better in the long
>>>>> run.
>>>>>>>> We are already in a situation where there are groups within the
>>>>> community
>>>>>>>> focusing on certain areas of the code (such as SQL, the runtime,
>>>>>>>> connectors). Those groups will monitor their components, but it will
>>>>> be a
>>>>>>>> lot of overhead for them to monitor the "Documentation" component.
>>>>>>>> We can also try to assign documentation components to both
>>>>>>> "Documentation"
>>>>>>>> and the affected component, such as "Runtime / Metrics".
>>>>>>>> 
>>>>>>>> - Removed "Misc / " prefix.
>>>>>>>> 
>>>>>>>> - "Legacy Components": Legacy components usually have
>>>>>>>> very few
>>>>>>>> tickets. "Flink on Tez" has 13, "Storm Compat" ~30, and JIRA has
>>>>>>>> a bulk
>>>>>>>> edit feature :)
>>>>>>>> The benefit of having it generalized is that people will probably
>>>>>>>> not
>>>>> add
>>>>>>>> tickets to it.
>>>>>>>> 
>>>>>>>> - "Libraries /" prefix: I don't think that it is necessary. Some
>>>>>>> libraries
>>>>>>>> might grow in the future (like the Table API), then we need to
>>>>>>>> rename.
>>>>>>>> The "flink-libraries" module does contain stuff like the sql
>>>>>>>> client or
>>>>>>> the
>>>>>>>> python api, which are already covered by other components in my
>>>>> proposal
>>>>>>> --
>>>>>>>> so going with the maven module structure is not an argument here.
>>>>>>>> 
>>>>>>>> - "End to end infrastructure" and "Tests": The same argument as
>>>>>>>> with the
>>>>>>>> "Documentation" applies here. The maintainers of Kafka, Metrics, ...
>>>>>>> should
>>>>>>>> get visibility into "their" test instabilities through "their"
>>>>>>> components.
>>>>>>>> Not many people will feel responsible for the "Tests" component.
>>>>>>>> 
>>>>>>>> For "Core" and "Configuration", I will move the tickets to the
>>>>>>> appropriate
>>>>>>>> components in "Runtime /".
>>>>>>>> 
>>>>>>>> For "API / Scala": Good point. I will add that component.
>>>>>>>> 
>>>>>>>> How to do it? I will just go through the pain and do it.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Robert
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, Feb 8, 2019 at 2:40 PM Chesnay Schepler <chesnay@apache.org> wrote:
>>>>>>>>> Some concerns:
>>>>>>>>> 
>>>>>>>>> Travis and build system / release system are entirely different. I
>>>>> would
>>>>>>>>> even keep the release system away from the build-system, as it
>>>>>>>>> is more
>>>>>>>>> about the release scripts and documentation, while the latter is
>>>>>>>>> about
>>>>>>>>> maven. Actually I'd just rename build-system to maven.
>>>>>>>>> 
>>>>>>>>> Control Plane is a term I've never heard before in this context;
>>>>>>>>> I'd
>>>>>>>>> replace it with Coordination.
>>>>>>>>> 
>>>>>>>>> The "Documentation" description refers to it as a "Fallback
>>>>> component".
>>>>>>>>> In other words, if I make a change to the metrics documentation I
>>>>>>>>> shouldn't use this component any more?
>>>>>>>>> 
>>>>>>>>> I don't see the benefit of a `Misc` major category. I'd attribute
>>>>>>>>> everything that doesn't have a major category implicitly to "Misc".
>>>>>>>>> 
>>>>>>>>> Not a fan of a generalized "Legacy components" category; this seems
>>>>>>>>> unnecessary. It's also a bit weird going forward as we'd have to
>>>>>>>>> touch
>>>>>>>>> every JIRA for a component if we drop it.
>>>>>>>>> 
>>>>>>>>> How come gelly/CEP don't have a Major category (libraries?)
>>>>>>>>> 
>>>>>>>>> "End to end infrastructure" is not equivalent to "E2E tests".
>>>>>>>>> Infrastructure is not about fixing failing tests, which is what we
>>>>>>>>> partially used this component for so far.
>>>>>>>>> 
>>>>>>>>> I don't believe you can get rid of the generic "Tests" component;
>>>>>>>>> consider any changes to the `flink-test-utils-junit` module.
>>>>>>>>> 
>>>>>>>>> You propose deleting "Core" and "Configuration" but haven't
>>>>>>>>> listed any
>>>>>>>>> migration paths.
>>>>>>>>> 
>>>>>>>>> If there's an API / Python category there should also be an API /
>>>>>>>>> Scala
>>>>>>>>> category. This could also include the scala-shell. Note that the
>>>>>>>>> existing Scala API category is not mentioned anywhere in the
>>>>>>>>> document.
>>>>>>>>> 
>>>>>>>>> How do you actually want to do the migration?
>>>>>>>>> 
>>>>>>>>> On 08.02.2019 13:13, Timo Walther wrote:
>>>>>>>>>> Hi Robert,
>>>>>>>>>> 
>>>>>>>>>> thanks for starting this discussion. I was also about to suggest
>>>>>>>>>> splitting the `Table API & SQL` component because it contains
>>>>>>>>>> already
>>>>>>>>>> more than 1000 issues.
>>>>>>>>>> 
>>>>>>>>>> My comments:
>>>>>>>>>> 
>>>>>>>>>> - Rename "SQL/Shell" to "SQL/Client" because the long-term goal
>>>>>>>>>> might
>>>>>>>>>> not only be a CLI interface. I would keep the generic name "SQL
>>>>>>>>>> Client" for now. This is also what is written in FLIPs,
>>>>> presentations,
>>>>>>>>>> and documentation.
>>>>>>>>>> - Rename "SQL/Query Planner" to "SQL/Planner": a query is a read-only
>>>>>>>>>> operation, but we support things like INSERT INTO etc. Planner is
>>>>> more
>>>>>>>>>> generic.
>>>>>>>>>> - Rename "Gelly" to "Graph Processing". New users don't know what
>>>>>>>>>> Gelly means. This is the only component that has a "feature
>>>>>>>>>> name". I
>>>>>>>>>> don't know if we want to stick with that in the future.
>>>>>>>>>> - Not sure about this: Introduce a "SQL/Connectors"? Because SQL
>>>>>>>>>> connectors are tightly bound to SQL internals but also to the
>>>>>>>>>> connector itself.
>>>>>>>>>> - Rename "Connectors/HCatalog" to "Connectors/Hive". This name is
>>>>> more
>>>>>>>>>> generic and reflects the efforts about Hive Metastore and catalog
>>>>>>>>>> integration that is currently taking place.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Timo
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 08.02.19 at 12:39, Robert Metzger wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>> 
>>>>>>>>>>> I am currently trying to improve how the Flink community is
>>>>>>>>>>> handling
>>>>>>>>>>> incoming pull requests and JIRA tickets.
>>>>>>>>>>> 
>>>>>>>>>>> I've looked at how other big communities are handling such a high
>>>>>>>>>>> number of
>>>>>>>>>>> contributions, and I found that many are using GitHub labels
>>>>>>>>>>> extensively.
>>>>>>>>>>> An integral part of the label use is to tag PRs with the
>>>>>>>>>>> component /
>>>>>>>>>>> area
>>>>>>>>>>> they belong to. I think the most obvious and logical way of
>>>>>>>>>>> tagging
>>>>>>>>>>> the PRs
>>>>>>>>>>> is by using the JIRA components. This will force us to keep
>>>>>>>>>>> the JIRA
>>>>>>>>>>> tickets well-organized, if we want the PRs to be organized :)
>>>>>>>>>>> I will soon start a separate discussion for the GitHub labels.
>>>>>>>>>>> 
>>>>>>>>>>> Let's first discuss the JIRA components.
>>>>>>>>>>> 
>>>>>>>>>>> I've created the following Wiki page with my proposal of the new
>>>>>>>>>>> components,
>>>>>>>>>>> and how to migrate from the existing components:
>>>>>>>>>>> 
>>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
>>>>> 
>>>>>>>>>>> Please comment here or directly in the Wiki to let me know
>>>>>>>>>>> what you
>>>>>>>>>>> think.
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Robert
>>>>>>>>>>> 
>>>>> 
>> 
>> 


Re: [DISCUSS] Clean up and reorganize the JIRA components

Posted by Robert Metzger <rm...@apache.org>.
Hi,
Thanks for your explanation.
I've added "Benchmarks" and renamed "Runtime / Operators".

On Mon, May 20, 2019 at 10:59 AM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> > Concrete operator implementations will then go into the "API / DataStream"?
> > (or "API / DataSet" or Table)
> > Afaik, there were some ideas to share operator implementations between
> > DataStream and Table
>
> Yes & yes. I think for now we could keep the concrete operator
> implementations under API / DataStream and we can split them out once we
> have a true use case for that. Unless this is confusing for someone, in that
> case we could split it now to API / DataStream Operators.
>
> >> 2. I think we should add an additional component for benchmarks and
> >> benchmarking infrastructure. While this is a more complicated topic
> >> (because of the setup and how it is run), it should be on the same
> >> level as correctness tests.
> >>
> >
> > I'm not sure if it is a good idea to add a "Benchmarks" component into
> > the Flink JIRA. Afaik, the benchmarks are managed from here?
> > https://github.com/dataArtisans/flink/tree/benchmark-request
>
> Not all of them; some of them are in apache/flink. And that might be
> subject to change in the future. Ideally we would have the benchmarking code
> in the same repository, if not for some licensing issues. Also if we ever
> implement full cluster benchmarks (not using JMH), they could also reside
> in the Flink repository.
>
> Regardless of that, does it matter where the benchmarks are? In my opinion
> the only thing that matters is that benchmarks are just another form of
> tests/verification: we have unit tests, integration tests, end-to-end
> tests and also benchmarks at various levels. Why should those things be
> treated differently?
>
> > Doesn't it make sense to track issues with GH issues there?
> > Or asking more broadly, what types of issues would you see in that
> > component?
>
> Same kind of issues as for any other type of tests. For example:
> - release blocker Jira issue that benchmarks are broken and are not
> testing anything (from time to time we have to fix something in the
> benchmarking setup, and it has also happened a couple of times that
> benchmarks discovered release-blocking regressions in Flink)
> - Jira issue to fix some benchmark
> - Jira issue to implement a missing benchmark
> - …
>
> Piotrek
>
> > On 17 May 2019, at 14:41, Robert Metzger <rm...@apache.org> wrote:
> >
> > Hi,
> >
> > 1. Renaming “Runtime / Operators” to “Runtime / Task” or something like
> >> “Runtime / Processing”. “Runtime / Operators” was confusing me, since it
> >> sounded like it covers concrete implementations of the operators, like
> >> “WindowOperator” or various join implementations.
> >>
> >
> > I'm fine with this renaming.
> > Concrete operator implementations will then go into the "API / DataStream"?
> > (or "API / DataSet" or Table)
> > Afaik, there were some ideas to share operator implementations between
> > DataStream and Table. If that's the case, we would have to find a good
> > component for that as well.
> >
> >
> >>
> >> 2. I think we should add an additional component for benchmarks and
> >> benchmarking infrastructure. While this is a more complicated topic
> >> (because of the setup and how it is run), it should be on the same
> >> level as correctness tests.
> >>
> >
> > I'm not sure if it is a good idea to add a "Benchmarks" component into
> > the Flink JIRA. Afaik, the benchmarks are managed from here?
> > https://github.com/dataArtisans/flink/tree/benchmark-request
> > Doesn't it make sense to track issues with GH issues there?
> > Or asking more broadly, what types of issues would you see in that
> > component?
> >
> >
> >>
> >> Piotrek
> >>
>
>

Re: [DISCUSS] Clean up and reorganize the JIRA components

Posted by Piotr Nowojski <pi...@ververica.com>.
Hi,

> Concrete operator implementations will then go into the "API / DataStream"?
> (or "API / DataSet" or Table)
> Afaik, there were some ideas to share operator implementations between
> DataStream and Table

Yes & yes. I think for now we could keep the concrete operator implementations under API / DataStream and we can split them out once we have a true use case for that. Unless this is confusing for someone, in that case we could split it now to API / DataStream Operators.

>> 2. I think we should add an additional component for benchmarks and
>> benchmarking infrastructure. While this is a more complicated topic (because
>> of the setup and how it is run), it should be on the same level as
>> correctness tests.
>> 
> 
> I'm not sure if it is a good idea to add a "Benchmarks" component into the
> Flink JIRA. Afaik, the benchmarks are managed from here?
> https://github.com/dataArtisans/flink/tree/benchmark-request

Not all of them; some of them are in apache/flink. And that might be subject to change in the future. Ideally we would have the benchmarking code in the same repository, if not for some licensing issues. Also if we ever implement full cluster benchmarks (not using JMH), they could also reside in the Flink repository.

Regardless of that, does it matter where the benchmarks are? In my opinion the only thing that matters is that benchmarks are just another form of tests/verification: we have unit tests, integration tests, end-to-end tests and also benchmarks at various levels. Why should those things be treated differently?

> Doesn't it make sense to track issues with GH issues there?
> Or asking more broadly, what types of issues would you see in that
> component?

Same kind of issues as for any other type of tests. For example:
- a release-blocker Jira issue reporting that the benchmarks are broken and not testing anything (from time to time we have to fix something in the benchmarking setup, and a couple of times the benchmarks have uncovered release-blocking regressions in Flink)
- Jira issue to fix some benchmark
- Jira issue to implement a missing benchmark
- …

Piotrek

> On 17 May 2019, at 14:41, Robert Metzger <rm...@apache.org> wrote:
> 
> Hi,
> 
> 1. Renaming “Runtime / Operators” to “Runtime / Task” or something like
>> “Runtime / Processing”. “Runtime / Operators” was confusing me, since it
>> sounded like it covers concrete implementations of the operators, like
>> “WindowOperator” or various join implementations.
>> 
> 
> I'm fine with this renaming.
> Concrete operator implementations will then go into the "API / DataStream"?
> (or "API / DataSet" or Table)
> Afaik, there were some ideas to share operator implementations between
> DataStream and Table. If that's the case, we would have to find a good
> component for that as well.
> 
> 
>> 
>> 2. I think we should add additional component for benchmarks and
>> benchmarking infrastructure. While this is more complicated topic (because
>> of the setup and how is it running), it should be on the same level as
>> correctness tests.
>> 
> 
> I'm not sure if it is a good idea to add a "Benchmarks" component into the
> Flink JIRA. Afaik, the benchmarks are managed from here?
> https://github.com/dataArtisans/flink/tree/benchmark-request
> Doesn't it make sense to track issues with GH issues there?
> Or asking more broadly, what types of issues would you see in that
> component?
> 
> 
>> 
>> Piotrek
>> 
>>> On 20 Feb 2019, at 10:53, Robert Metzger <rm...@apache.org> wrote:
>>> 
>>> Thanks a lot Timo!
>>> 
>>> I will start a vote Chesnay!
>>> 
>>> On Wed, Feb 20, 2019 at 10:11 AM Timo Walther <tw...@apache.org>
>> wrote:
>>> 
>>>> +1 for the vote. Btw I can help cleaning up the "Table API & SQL"
>>>> component. It seems to be the biggest with 1229 Issues.
>>>> 
>>>> Thanks,
>>>> Timo
>>>> 
>>>> On 20.02.19 at 10:09, Chesnay Schepler wrote:
>>>>> I would prefer if you'd start a vote with a new cleaned up proposal.
>>>>> 
>>>>> On 18.02.2019 15:23, Robert Metzger wrote:
>>>>>> I added "Runtime / Configuration" to the proposal:
>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
>>>>>> 
>>>>>> 
>>>>>> Since this discussion has been open for 10 days, I assume we have
>>>>>> reached
>>>>>> consensus here. I will soon start renaming components.
>>>>>> 
>>>>>> On Wed, Feb 13, 2019 at 10:51 AM Chesnay Schepler <chesnay@apache.org
>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> The only parent I can think of is "Infrastructure", but I don't quite
>>>>>>> like it :/
>>>>>>> 
>>>>>>> +1 for "Runtime / Configuration"; this is too general to be placed in
>>>>>>> coordination imo.
>>>>>>> 
>>>>>>> On 12.02.2019 18:25, Robert Metzger wrote:
>>>>>>>> Thanks a lot for your feedback Chesnay!
>>>>>>>> 
>>>>>>>> re build/travis/release: Do you have a good idea for a common
>>>>>>>> parent for
>>>>>>>> "Build System", "Travis" and "Release System"?
>>>>>>>> 
>>>>>>>> re legacy: Okay, I see your point. I will keep the Legacy Components
>>>>>>> prefix.
>>>>>>>> re library: I think I don't have an argument here. My proposal is
>>>>>>>> based on
>>>>>>>> what I felt as being right :) I added the "Library / " prefix to the
>>>>>>>> proposal.
>>>>>>>> 
>>>>>>>> re core/config: From the proposed components, I see the best match
>>>>>>>> with
>>>>>>>> "Runtime / Coordination", but I agree that this example is
>>>>>>>> difficult to
>>>>>>>> place into my proposed scheme. Do you think we should introduce
>>>>>>>> "Runtime
>>>>>>> /
>>>>>>>> Configuration" as a component?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I updated the proposal accordingly!
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, Feb 12, 2019 at 12:19 PM Chesnay Schepler <
>> chesnay@apache.org
>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> re build/travis/release: No, I'm against merging build system,
>> travis
>>>>>>>>> and release system.
>>>>>>>>> 
>>>>>>>>> re legacy: So going forward you're proposing to move dropped
>> features
>>>>>>>>> into the legacy bucket and make it impossible to search for
>> specific
>>>>>>>>> issues for that component? There's 0 overhead to having these
>>>>>>>>> components, so I really don't get the benefit here, but see the
>>>>>>> overhead.
>>>>>>>>> I don't buy the argument of "people will not open issues if the
>>>>>>>>> component doesn't exist", they will just leave the component field
>>>>>>>>> blank
>>>>>>>>> or add a random one (that would be wrong). In fact, if you had a
>>>>>>>>> storm/tez component (that users would adhere to) then it would be
>>>>>>>>> _easier_ to figure out whether an issue can be rejected right away.
>>>>>>>>> 
>>>>>>>>> re library: If you are against a library category, what's your
>>>>>>>>> argument
>>>>>>>>> for a connector category?
>>>>>>>>> 
>>>>>>>>> re tests: I don't mind "tests" being removed from tickets about
>> test
>>>>>>>>> instabilities, but you specified the migration as "rename E2E
>> tests"
>>>>>>>>> which is not equivalent.
>>>>>>>>> Under what category would you file modifications to
>>>>>>> flink-test-utils-junit?
>>>>>>>>> I would propose to not differentiate between e2e and other tests; I
>>>>>>>>> would go along with "Test infrastructure", and remove the major
>>>>>>>>> "Tests"
>>>>>>>>> category.
>>>>>>>>> 
>>>>>>>>> re core/config: As an example, where (under Runtime) would you
>>>>>>>>> place the
>>>>>>>>> introduction of the ConfigOption class?
>>>>>>>>> 
>>>>>>>>> On 11.02.2019 11:31, Robert Metzger wrote:
>>>>>>>>>> Thanks a lot for your feedback!
>>>>>>>>>> 
>>>>>>>>>> @Timo:
>>>>>>>>>> I've followed your suggestions and updated the proposed names in
>> the
>>>>>>>>> wiki.
>>>>>>>>>> Regarding a new "SQL/Connectors" component: I (with admittedly
>>>>>>>>>> not much
>>>>>>>>>> knowledge) would not add this component at the moment, and put
>>>>>>>>>> the SQL
>>>>>>>>>> stuff into the respective connector component.
>>>>>>>>>> It is probably pretty difficult for a user to decide whether a bug
>>>>>>>>> belongs
>>>>>>>>>> to "SQL/Connector" or "Connectors/Kafka" when Kafka in SQL does
>> not
>>>>>>> work.
>>>>>>>>>> @Chesnay:
>>>>>>>>>> - You are suggesting to rename "Build System" to "Maven" and still
>>>>>>> merge
>>>>>>>>> it
>>>>>>>>>> with "Travis", "Release System" etc. as in the proposal?
>>>>>>>>>> 
>>>>>>>>>> - "Runtime / Control Plane" vs "Runtime / Coordination" -- I
>>>>>>>>>> changed the
>>>>>>>>>> proposal
>>>>>>>>>> 
>>>>>>>>>> - Re. "Documentation": Yes, I think that would be better in the
>> long
>>>>>>> run.
>>>>>>>>>> We are already in a situation where there are groups within the
>>>>>>> community
>>>>>>>>>> focusing on certain areas of the code (such as SQL, the runtime,
>>>>>>>>>> connectors). Those groups will monitor their components, but it
>> will
>>>>>>> be a
>>>>>>>>>> lot of overhead for them to monitor the "Documentation" component.
>>>>>>>>>> We can also try to assign documentation components to both
>>>>>>>>> "Documentation"
>>>>>>>>>> and the affected component, such as "Runtime / Metrics".
>>>>>>>>>> 
>>>>>>>>>> - Removed "Misc / " prefix.
>>>>>>>>>> 
>>>>>>>>>> - "Legacy Components": Legacy components usually have
>>>>>>>>>> very few
>>>>>>>>>> tickets. "Flink on Tez" has 13, "Storm Compat" ~30, and JIRA has
>>>>>>>>>> a bulk
>>>>>>>>>> edit feature :)
>>>>>>>>>> The benefit of having it generalized is that people will probably
>>>>>>>>>> not
>>>>>>> add
>>>>>>>>>> tickets to it.
>>>>>>>>>> 
>>>>>>>>>> - "Libraries /" prefix: I don't think that it is necessary. Some
>>>>>>>>> libraries
>>>>>>>>>> might grow in the future (like the Table API), then we need to
>>>>>>>>>> rename.
>>>>>>>>>> the "flink-libraries" module does contain stuff like the sql
>>>>>>>>>> client or
>>>>>>>>> the
>>>>>>>>>> python api, which are already covered by other components in my
>>>>>>> proposal
>>>>>>>>> --
>>>>>>>>>> so going with the maven module structure is not an argument here.
>>>>>>>>>> 
>>>>>>>>>> - "End to end infrastructure" and "Tests": The same argument as
>>>>>>>>>> with the
>>>>>>>>>> "Documentation" applies here. The maintainers of Kafka, Metrics,
>> ..
>>>>>>>>> should
>>>>>>>>>> get visibility into "their" test instabilities through "their"
>>>>>>>>> components.
>>>>>>>>>> Not many people will feel responsible for the "Tests" component.
>>>>>>>>>> 
>>>>>>>>>> For "Core" and "Configuration", I will move the tickets to the
>>>>>>>>> appropriate
>>>>>>>>>> components in "Runtime /".
>>>>>>>>>> 
>>>>>>>>>> For "API / Scala": Good point. I will add that component.
>>>>>>>>>> 
>>>>>>>>>> How to do it? I will just go through the pain and do it.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Robert
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Fri, Feb 8, 2019 at 2:40 PM Chesnay Schepler <
>> chesnay@apache.org
>>>>> 
>>>>>>>>> wrote:
>>>>>>>>>>> Some concerns:
>>>>>>>>>>> 
>>>>>>>>>>> Travis and build system / release system are entirely different.
>> I
>>>>>>> would
>>>>>>>>>>> even keep the release system away from the build-system, as it
>>>>>>>>>>> is more
>>>>>>>>>>> about the release scripts and documentation, while the latter is
>>>>>>>>>>> about
>>>>>>>>>>> maven. Actually I'd just rename build-system to maven.
>>>>>>>>>>> 
>>>>>>>>>>> Control Plane is a term I've never heard before in this context;
>>>>>>>>>>> I'd
>>>>>>>>>>> replace it with Coordination.
>>>>>>>>>>> 
>>>>>>>>>>> The "Documentation" description refers to it as a "Fallback
>>>>>>> component".
>>>>>>>>>>> In other words, if I make a change to the metrics documentation I
>>>>>>>>>>> shouldn't use this component any more?
>>>>>>>>>>> 
>>>>>>>>>>> I don't see the benefit of a `Misc` major category. I'd attribute
>>>>>>>>>>> everything that doesn't have a major category implicitly to
>> "Misc".
>>>>>>>>>>> 
>>>>>>>>>>> Not a fan of a generalized "Legacy components" category; this
>> seems
>>>>>>>>>>> unnecessary. It's also a bit weird going forward as we'd have to
>>>>>>>>>>> touch
>>>>>>>>>>> every JIRA for a component if we drop it.
>>>>>>>>>>> 
>>>>>>>>>>> How come gelly/CEP don't have a Major category (libraries?)
>>>>>>>>>>> 
>>>>>>>>>>> "End to end infrastructure" is not equivalent to "E2E tests".
>>>>>>>>>>> Infrastructure is not about fixing failing tests, which is what
>> we
>>>>>>>>>>> partially used this component for so far.
>>>>>>>>>>> 
>>>>>>>>>>> I don't believe you can get rid of the generic "Tests" component;
>>>>>>>>>>> consider any changes to the `flink-test-utils-junit` module.
>>>>>>>>>>> 
>>>>>>>>>>> You propose deleting "Core" and "Configuration" but haven't
>>>>>>>>>>> listed any
>>>>>>>>>>> migration paths.
>>>>>>>>>>> 
>>>>>>>>>>> If there's a API / Python category there should also be a API /
>>>>>>>>>>> Scala
>>>>>>>>>>> category. This could also include the scala-shell. Note that the
>>>>>>>>>>> existing Scala API category is not mentioned anywhere in the
>>>>>>>>>>> document.
>>>>>>>>>>> 
>>>>>>>>>>> How do you actually want to do the migration?
>>>>>>>>>>> 
>>>>>>>>>>> On 08.02.2019 13:13, Timo Walther wrote:
>>>>>>>>>>>> Hi Robert,
>>>>>>>>>>>> 
>>>>>>>>>>>> thanks for starting this discussion. I was also about to suggest
>>>>>>>>>>>> splitting the `Table API & SQL` component because it contains
>>>>>>>>>>>> already
>>>>>>>>>>>> more than 1000 issues.
>>>>>>>>>>>> 
>>>>>>>>>>>> My comments:
>>>>>>>>>>>> 
>>>>>>>>>>>> - Rename "SQL/Shell" to "SQL/Client" because the long-term goal
>>>>>>>>>>>> might
>>>>>>>>>>>> not only be a CLI interface. I would keep the generic name "SQL
>>>>>>>>>>>> Client" for now. This is also what is written in FLIPs,
>>>>>>> presentations,
>>>>>>>>>>>> and documentation.
>>>>>>>>>>>> - Rename "SQL/Query Planner" to "SQL/Planner": a query is a
>> read-only
>>>>>>>>>>>> operation but we support things like INSERT INTO etc.. Planner
>> is
>>>>>>> more
>>>>>>>>>>>> generic.
>>>>>>>>>>>> - Rename "Gelly" to "Graph Processing". New users don't know
>> what
>>>>>>>>>>>> Gelly means. This is the only component that has a "feature
>>>>>>>>>>>> name". I
>>>>>>>>>>>> don't know if we want to stick with that in the future.
>>>>>>>>>>>> - Not sure about this: Introduce a "SQL/Connectors"? Because SQL
>>>>>>>>>>>> connectors are tightly bound to SQL internals but also to the
>>>>>>>>>>>> connector itself.
>>>>>>>>>>>> - Rename "Connectors/HCatalog" to "Connectors/Hive". This name
>> is
>>>>>>> more
>>>>>>>>>>>> generic and reflects the efforts about Hive Metastore and
>> catalog
>>>>>>>>>>>> integration that is currently taking place.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Timo
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On 08.02.19 at 12:39, Robert Metzger wrote:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I am currently trying to improve how the Flink community is
>>>>>>>>>>>>> handling
>>>>>>>>>>>>> incoming pull requests and JIRA tickets.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I've looked at how other big communities are handling such a
>> high
>>>>>>>>>>>>> number of
>>>>>>>>>>>>> contributions, and I found that many are using GitHub labels
>>>>>>>>>>>>> extensively.
>>>>>>>>>>>>> An integral part of the label use is to tag PRs with the
>>>>>>>>>>>>> component /
>>>>>>>>>>>>> area
>>>>>>>>>>>>> they belong to. I think the most obvious and logical way of
>>>>>>>>>>>>> tagging
>>>>>>>>>>>>> the PRs
>>>>>>>>>>>>> is by using the JIRA components. This will force us to keep
>>>>>>>>>>>>> the JIRA
>>>>>>>>>>>>> tickets well-organized, if we want the PRs to be organized :)
>>>>>>>>>>>>> I will soon start a separate discussion for the GitHub labels.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Let's first discuss the JIRA components.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I've created the following Wiki page with my proposal of the
>> new
>>>>>>>>>>>>> component,
>>>>>>>>>>>>> and how to migrate from the existing components:
>>>>>>>>>>>>> 
>>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
>>>>>>> 
>>>>>>>>>>>>> Please comment here or directly in the Wiki to let me know
>>>>>>>>>>>>> what you
>>>>>>>>>>>>> think.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Robert
>>>>>>>>>>>>> 
>>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: [DISCUSS] Clean up and reorganize the JIRA components

Posted by Robert Metzger <rm...@apache.org>.
Hi,

1. Renaming “Runtime / Operators” to “Runtime / Task” or something like
> “Runtime / Processing”. “Runtime / Operators” was confusing me, since it
> sounded like it covers concrete implementations of the operators, like
> “WindowOperator” or various join implementations.
>

I'm fine with this renaming.
Concrete operator implementations will then go into the "API / DataStream"?
(or "API / DataSet" or Table)
Afaik, there were some ideas to share operator implementations between
DataStream and Table. If that's the case, we would have to find a good
component for that as well.


>
> 2. I think we should add additional component for benchmarks and
> benchmarking infrastructure. While this is more complicated topic (because
> of the setup and how is it running), it should be on the same level as
> correctness tests.
>

I'm not sure if it is a good idea to add a "Benchmarks" component into the
Flink JIRA. Afaik, the benchmarks are managed from here?
https://github.com/dataArtisans/flink/tree/benchmark-request
Doesn't it make sense to track issues with GH issues there?
Or asking more broadly, what types of issues would you see in that
component?


>
> Piotrek
>
> > On 20 Feb 2019, at 10:53, Robert Metzger <rm...@apache.org> wrote:
> >
> > Thanks a lot Timo!
> >
> > I will start a vote Chesnay!
> >
> > On Wed, Feb 20, 2019 at 10:11 AM Timo Walther <tw...@apache.org>
> wrote:
> >
> >> +1 for the vote. Btw I can help cleaning up the "Table API & SQL"
> >> component. It seems to be the biggest with 1229 Issues.
> >>
> >> Thanks,
> >> Timo
> >>
> >> On 20.02.19 at 10:09, Chesnay Schepler wrote:
> >>> I would prefer if you'd start a vote with a new cleaned up proposal.
> >>>
> >>> On 18.02.2019 15:23, Robert Metzger wrote:
> >>>> I added "Runtime / Configuration" to the proposal:
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>
> >>>>
> >>>> Since this discussion has been open for 10 days, I assume we have
> >>>> reached
> >>>> consensus here. I will soon start renaming components.
> >>>>
> >>>> On Wed, Feb 13, 2019 at 10:51 AM Chesnay Schepler <chesnay@apache.org
> >
> >>>> wrote:
> >>>>
> >>>>> The only parent I can think of is "Infrastructure", but I don't quite
> >>>>> like it :/
> >>>>>
> >>>>> +1 for "Runtime / Configuration"; this is too general to be placed in
> >>>>> coordination imo.
> >>>>>
> >>>>> On 12.02.2019 18:25, Robert Metzger wrote:
> >>>>>> Thanks a lot for your feedback Chesnay!
> >>>>>>
> >>>>>> re build/travis/release: Do you have a good idea for a common
> >>>>>> parent for
> >>>>>> "Build System", "Travis" and "Release System"?
> >>>>>>
> >>>>>> re legacy: Okay, I see your point. I will keep the Legacy Components
> >>>>> prefix.
> >>>>>> re library: I think I don't have an argument here. My proposal is
> >>>>>> based on
> >>>>>> what I felt as being right :) I added the "Library / " prefix to the
> >>>>>> proposal.
> >>>>>>
> >>>>>> re core/config: From the proposed components, I see the best match
> >>>>>> with
> >>>>>> "Runtime / Coordination", but I agree that this example is
> >>>>>> difficult to
> >>>>>> place into my proposed scheme. Do you think we should introduce
> >>>>>> "Runtime
> >>>>> /
> >>>>>> Configuration" as a component?
> >>>>>>
> >>>>>>
> >>>>>> I updated the proposal accordingly!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Feb 12, 2019 at 12:19 PM Chesnay Schepler <
> chesnay@apache.org
> >>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> re build/travis/release: No, I'm against merging build system,
> travis
> >>>>>>> and release system.
> >>>>>>>
> >>>>>>> re legacy: So going forward you're proposing to move dropped
> features
> >>>>>>> into the legacy bucket and make it impossible to search for
> specific
> >>>>>>> issues for that component? There's 0 overhead to having these
> >>>>>>> components, so I really don't get the benefit here, but see the
> >>>>> overhead.
> >>>>>>> I don't buy the argument of "people will not open issues if the
> >>>>>>> component doesn't exist", they will just leave the component field
> >>>>>>> blank
> >>>>>>> or add a random one (that would be wrong). In fact, if you had a
> >>>>>>> storm/tez component (that users would adhere to) then it would be
> >>>>>>> _easier_ to figure out whether an issue can be rejected right away.
> >>>>>>>
> >>>>>>> re library: If you are against a library category, what's your
> >>>>>>> argument
> >>>>>>> for a connector category?
> >>>>>>>
> >>>>>>> re tests: I don't mind "tests" being removed from tickets about
> test
> >>>>>>> instabilities, but you specified the migration as "rename E2E
> tests"
> >>>>>>> which is not equivalent.
> >>>>>>> Under what category would you file modifications to
> >>>>> flink-test-utils-junit?
> >>>>>>> I would propose to not differentiate between e2e and other tests; I
> >>>>>>> would go along with "Test infrastructure", and remove the major
> >>>>>>> "Tests"
> >>>>>>> category.
> >>>>>>>
> >>>>>>> re core/config: As an example, where (under Runtime) would you
> >>>>>>> place the
> >>>>>>> introduction of the ConfigOption class?
> >>>>>>>
> >>>>>>> On 11.02.2019 11:31, Robert Metzger wrote:
> >>>>>>>> Thanks a lot for your feedback!
> >>>>>>>>
> >>>>>>>> @Timo:
> >>>>>>>> I've followed your suggestions and updated the proposed names in
> the
> >>>>>>> wiki.
> >>>>>>>> Regarding a new "SQL/Connectors" component: I (with admittedly
> >>>>>>>> not much
> >>>>>>>> knowledge) would not add this component at the moment, and put
> >>>>>>>> the SQL
> >>>>>>>> stuff into the respective connector component.
> >>>>>>>> It is probably pretty difficult for a user to decide whether a bug
> >>>>>>> belongs
> >>>>>>>> to "SQL/Connector" or "Connectors/Kafka" when Kafka in SQL does
> not
> >>>>> work.
> >>>>>>>> @Chesnay:
> >>>>>>>> - You are suggesting to rename "Build System" to "Maven" and still
> >>>>> merge
> >>>>>>> it
> >>>>>>>> with "Travis", "Release System" etc. as in the proposal?
> >>>>>>>>
> >>>>>>>> - "Runtime / Control Plane" vs "Runtime / Coordination" -- I
> >>>>>>>> changed the
> >>>>>>>> proposal
> >>>>>>>>
> >>>>>>>> - Re. "Documentation": Yes, I think that would be better in the
> long
> >>>>> run.
> >>>>>>>> We are already in a situation where there are groups within the
> >>>>> community
> >>>>>>>> focusing on certain areas of the code (such as SQL, the runtime,
> >>>>>>>> connectors). Those groups will monitor their components, but it
> will
> >>>>> be a
> >>>>>>>> lot of overhead for them to monitor the "Documentation" component.
> >>>>>>>> We can also try to assign documentation components to both
> >>>>>>> "Documentation"
> >>>>>>>> and the affected component, such as "Runtime / Metrics".
> >>>>>>>>
> >>>>>>>> - Removed "Misc / " prefix.
> >>>>>>>>
> >>>>>>>> - "Legacy Components": Legacy components usually have
> >>>>>>>> very few
> >>>>>>>> tickets. "Flink on Tez" has 13, "Storm Compat" ~30, and JIRA has
> >>>>>>>> a bulk
> >>>>>>>> edit feature :)
> >>>>>>>> The benefit of having it generalized is that people will probably
> >>>>>>>> not
> >>>>> add
> >>>>>>>> tickets to it.
> >>>>>>>>
> >>>>>>>> - "Libraries /" prefix: I don't think that it is necessary. Some
> >>>>>>> libraries
> >>>>>>>> might grow in the future (like the Table API), then we need to
> >>>>>>>> rename.
> >>>>>>>> the "flink-libraries" module does contain stuff like the sql
> >>>>>>>> client or
> >>>>>>> the
> >>>>>>>> python api, which are already covered by other components in my
> >>>>> proposal
> >>>>>>> --
> >>>>>>>> so going with the maven module structure is not an argument here.
> >>>>>>>>
> >>>>>>>> - "End to end infrastructure" and "Tests": The same argument as
> >>>>>>>> with the
> >>>>>>>> "Documentation" applies here. The maintainers of Kafka, Metrics,
> ..
> >>>>>>> should
> >>>>>>>> get visibility into "their" test instabilities through "their"
> >>>>>>> components.
> >>>>>>>> Not many people will feel responsible for the "Tests" component.
> >>>>>>>>
> >>>>>>>> For "Core" and "Configuration", I will move the tickets to the
> >>>>>>> appropriate
> >>>>>>>> components in "Runtime /".
> >>>>>>>>
> >>>>>>>> For "API / Scala": Good point. I will add that component.
> >>>>>>>>
> >>>>>>>> How to do it? I will just go through the pain and do it.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Robert
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Feb 8, 2019 at 2:40 PM Chesnay Schepler <
> chesnay@apache.org
> >>>
> >>>>>>> wrote:
> >>>>>>>>> Some concerns:
> >>>>>>>>>
> >>>>>>>>> Travis and build system / release system are entirely different.
> I
> >>>>> would
> >>>>>>>>> even keep the release system away from the build-system, as it
> >>>>>>>>> is more
> >>>>>>>>> about the release scripts and documentation, while the latter is
> >>>>>>>>> about
> >>>>>>>>> maven. Actually I'd just rename build-system to maven.
> >>>>>>>>>
> >>>>>>>>> Control Plane is a term I've never heard before in this context;
> >>>>>>>>> I'd
> >>>>>>>>> replace it with Coordination.
> >>>>>>>>>
> >>>>>>>>> The "Documentation" description refers to it as a "Fallback
> >>>>> component".
> >>>>>>>>> In other words, if I make a change to the metrics documentation I
> >>>>>>>>> shouldn't use this component any more?
> >>>>>>>>>
> >>>>>>>>> I don't see the benefit of a `Misc` major category. I'd attribute
> >>>>>>>>> everything that doesn't have a major category implicitly to
> "Misc".
> >>>>>>>>>
> >>>>>>>>> Not a fan of a generalized "Legacy components" category; this
> seems
> >>>>>>>>> unnecessary. It's also a bit weird going forward as we'd have to
> >>>>>>>>> touch
> >>>>>>>>> every JIRA for a component if we drop it.
> >>>>>>>>>
> >>>>>>>>> How come gelly/CEP don't have a Major category (libraries?)
> >>>>>>>>>
> >>>>>>>>> "End to end infrastructure" is not equivalent to "E2E tests".
> >>>>>>>>> Infrastructure is not about fixing failing tests, which is what
> we
> >>>>>>>>> partially used this component for so far.
> >>>>>>>>>
> >>>>>>>>> I don't believe you can get rid of the generic "Tests" component;
> >>>>>>>>> consider any changes to the `flink-test-utils-junit` module.
> >>>>>>>>>
> >>>>>>>>> You propose deleting "Core" and "Configuration" but haven't
> >>>>>>>>> listed any
> >>>>>>>>> migration paths.
> >>>>>>>>>
> >>>>>>>>> If there's a API / Python category there should also be a API /
> >>>>>>>>> Scala
> >>>>>>>>> category. This could also include the scala-shell. Note that the
> >>>>>>>>> existing Scala API category is not mentioned anywhere in the
> >>>>>>>>> document.
> >>>>>>>>>
> >>>>>>>>> How do you actually want to do the migration?
> >>>>>>>>>
> >>>>>>>>> On 08.02.2019 13:13, Timo Walther wrote:
> >>>>>>>>>> Hi Robert,
> >>>>>>>>>>
> >>>>>>>>>> thanks for starting this discussion. I was also about to suggest
> >>>>>>>>>> splitting the `Table API & SQL` component because it contains
> >>>>>>>>>> already
> >>>>>>>>>> more than 1000 issues.
> >>>>>>>>>>
> >>>>>>>>>> My comments:
> >>>>>>>>>>
> >>>>>>>>>> - Rename "SQL/Shell" to "SQL/Client" because the long-term goal
> >>>>>>>>>> might
> >>>>>>>>>> not only be a CLI interface. I would keep the generic name "SQL
> >>>>>>>>>> Client" for now. This is also what is written in FLIPs,
> >>>>> presentations,
> >>>>>>>>>> and documentation.
> >>>>>>>>>> - Rename "SQL/Query Planner" to "SQL/Planner": a query is a
> read-only
> >>>>>>>>>> operation but we support things like INSERT INTO etc.. Planner
> is
> >>>>> more
> >>>>>>>>>> generic.
> >>>>>>>>>> - Rename "Gelly" to "Graph Processing". New users don't know
> what
> >>>>>>>>>> Gelly means. This is the only component that has a "feature
> >>>>>>>>>> name". I
> >>>>>>>>>> don't know if we want to stick with that in the future.
> >>>>>>>>>> - Not sure about this: Introduce a "SQL/Connectors"? Because SQL
> >>>>>>>>>> connectors are tightly bound to SQL internals but also to the
> >>>>>>>>>> connector itself.
> >>>>>>>>>> - Rename "Connectors/HCatalog" to "Connectors/Hive". This name
> is
> >>>>> more
> >>>>>>>>>> generic and reflects the efforts about Hive Metastore and
> catalog
> >>>>>>>>>> integration that is currently taking place.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Timo
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 08.02.19 at 12:39, Robert Metzger wrote:
> >>>>>>>>>>> Hi all,
> >>>>>>>>>>>
> >>>>>>>>>>> I am currently trying to improve how the Flink community is
> >>>>>>>>>>> handling
> >>>>>>>>>>> incoming pull requests and JIRA tickets.
> >>>>>>>>>>>
> >>>>>>>>>>> I've looked at how other big communities are handling such a
> high
> >>>>>>>>>>> number of
> >>>>>>>>>>> contributions, and I found that many are using GitHub labels
> >>>>>>>>>>> extensively.
> >>>>>>>>>>> An integral part of the label use is to tag PRs with the
> >>>>>>>>>>> component /
> >>>>>>>>>>> area
> >>>>>>>>>>> they belong to. I think the most obvious and logical way of
> >>>>>>>>>>> tagging
> >>>>>>>>>>> the PRs
> >>>>>>>>>>> is by using the JIRA components. This will force us to keep
> >>>>>>>>>>> the JIRA
> >>>>>>>>>>> tickets well-organized, if we want the PRs to be organized :)
> >>>>>>>>>>> I will soon start a separate discussion for the GitHub labels.
> >>>>>>>>>>>
> >>>>>>>>>>> Let's first discuss the JIRA components.
> >>>>>>>>>>>
> >>>>>>>>>>> I've created the following Wiki page with my proposal of the
> new
> >>>>>>>>>>> component,
> >>>>>>>>>>> and how to migrate from the existing components:
> >>>>>>>>>>>
> >>>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>>
> >>>>>>>>>>> Please comment here or directly in the Wiki to let me know
> >>>>>>>>>>> what you
> >>>>>>>>>>> think.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Robert
> >>>>>>>>>>>
> >>>>>
> >>
> >>
>
>

Re: [DISCUSS] Clean up and reorganize the JIRA components

Posted by Piotr Nowojski <pi...@ververica.com>.
Just to clarify, by adding a benchmark component I simply meant acknowledging that we have benchmarks in both the flink and flink-benchmarks repositories, plus supporting infrastructure (the machine executing the benchmarks, Jenkins and the Codespeed service), and assigning ownership of those components in a similar way as we do with Build System, Tests, etc.

Piotrek

> On 16 May 2019, at 03:51, JingsongLee <lz...@aliyun.com.INVALID> wrote:
> 
> Big +1 to add benchmark component.
> 1. Many of our code changes now require benchmarks. Having a benchmark component makes it much easier for us to align.
> 2. Running benchmarks regularly can also prevent performance degradation caused by our code.
> 
> Best, JingsongLee
> 
> 
> ------------------------------------------------------------------
> From:Kurt Young <yk...@gmail.com>
> Send Time: 2019-05-15 (Wednesday) 20:06
> To:dev <de...@flink.apache.org>
> Subject:Re: [DISCUSS] Clean up and reorganize the JIRA components
> 
> +1 to add benchmark component.
> 
> Best,
> Kurt
> 
> 
> On Wed, May 15, 2019 at 6:13 PM Piotr Nowojski <pi...@ververica.com> wrote:
> 
>> Hi,
>> 
>> I would like to propose two changes:
>> 
>> 1. Renaming “Runtime / Operators” to “Runtime / Task” or something like
>> “Runtime / Processing”. “Runtime / Operators” was confusing me, since it
>> sounded like it covers concrete implementations of the operators, like
>> “WindowOperator” or various join implementations.
>> 
>> 2. I think we should add additional component for benchmarks and
>> benchmarking infrastructure. While this is more complicated topic (because
>> of the setup and how is it running), it should be on the same level as
>> correctness tests.
>> 
>> Piotrek
>> 
>>> On 20 Feb 2019, at 10:53, Robert Metzger <rm...@apache.org> wrote:
>>> 
>>> Thanks a lot Timo!
>>> 
>>> I will start a vote Chesnay!
>>> 
>>> On Wed, Feb 20, 2019 at 10:11 AM Timo Walther <tw...@apache.org>
>> wrote:
>>> 
>>>> +1 for the vote. Btw I can help cleaning up the "Table API & SQL"
>>>> component. It seems to be the biggest with 1229 Issues.
>>>> 
>>>> Thanks,
>>>> Timo
>>>> 
>>>> On 20.02.19 at 10:09, Chesnay Schepler wrote:
>>>>> I would prefer if you'd start a vote with a new cleaned up proposal.
>>>>> 
>>>>> On 18.02.2019 15:23, Robert Metzger wrote:
>>>>>> I added "Runtime / Configuration" to the proposal:
>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
>>>>>> 
>>>>>> 
>>>>>> Since this discussion has been open for 10 days, I assume we have
>>>>>> reached
>>>>>> consensus here. I will soon start renaming components.
>>>>>> 
>>>>>> On Wed, Feb 13, 2019 at 10:51 AM Chesnay Schepler <chesnay@apache.org
>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> The only parent I can think of is "Infrastructure", but I don't quite
>>>>>>> like it :/
>>>>>>> 
>>>>>>> +1 for "Runtime / Configuration"; this is too general to be placed in
>>>>>>> coordination imo.
>>>>>>> 
>>>>>>> On 12.02.2019 18:25, Robert Metzger wrote:
>>>>>>>> Thanks a lot for your feedback Chesnay!
>>>>>>>> 
>>>>>>>> re build/travis/release: Do you have a good idea for a common
>>>>>>>> parent for
>>>>>>>> "Build System", "Travis" and "Release System"?
>>>>>>>> 
>>>>>>>> re legacy: Okay, I see your point. I will keep the Legacy Components
>>>>>>> prefix.
>>>>>>>> re library: I think I don't have an argument here. My proposal is
>>>>>>>> based on
>>>>>>>> what I felt as being right :) I added the "Library / " prefix to the
>>>>>>>> proposal.
>>>>>>>> 
>>>>>>>> re core/config: From the proposed components, I see the best match
>>>>>>>> with
>>>>>>>> "Runtime / Coordination", but I agree that this example is
>>>>>>>> difficult to
>>>>>>>> place into my proposed scheme. Do you think we should introduce
>>>>>>>> "Runtime
>>>>>>> /
>>>>>>>> Configuration" as a component?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I updated the proposal accordingly!
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Tue, Feb 12, 2019 at 12:19 PM Chesnay Schepler <
>> chesnay@apache.org
>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> re build/travis/release: No, I'm against merging build system,
>> travis
>>>>>>>>> and release system.
>>>>>>>>> 
>>>>>>>>> re legacy: So going forward you're proposing to move dropped
>> features
>>>>>>>>> into the legacy bucket and make it impossible to search for
>> specific
>>>>>>>>> issues for that component? There's 0 overhead to having these
>>>>>>>>> components, so I really don't get the benefit here, but see the
>>>>>>> overhead.
>>>>>>>>> I don't buy the argument of "people will not open issues if the
>>>>>>>>> component doesn't exist", they will just leave the component field
>>>>>>>>> blank
>>>>>>>>> or add a random one (that would be wrong). In fact, if you had a
>>>>>>>>> storm/tez component (that users would adhere to) then it would be
>>>>>>>>> _easier_ to figure out whether an issue can be rejected right away.
>>>>>>>>> 
>>>>>>>>> re library: If you are against a library category, what's your
>>>>>>>>> argument
>>>>>>>>> for a connector category?
>>>>>>>>> 
>>>>>>>>> re tests: I don't mind "tests" being removed from tickets about
>> test
>>>>>>>>> instabilities, but you specified the migration as "rename E2E
>> tests"
>>>>>>>>> which is not equivalent.
>>>>>>>>> Under what category would you file modifications to
>>>>>>> flink-test-utils-junit?
>>>>>>>>> I would propose to not differentiate between e2e and other tests; I
>>>>>>>>> would go along with "Test infrastructure", and remove the major
>>>>>>>>> "Tests"
>>>>>>>>> category.
>>>>>>>>> 
>>>>>>>>> re core/config: As an example, where (under Runtime) would you
>>>>>>>>> place the
>>>>>>>>> introduction of the ConfigOption class?
>>>>>>>>> 
>>>>>>>>> On 11.02.2019 11:31, Robert Metzger wrote:
>>>>>>>>>> Thanks a lot for your feedback!
>>>>>>>>>> 
>>>>>>>>>> @Timo:
>>>>>>>>>> I've followed your suggestions and updated the proposed names in
>> the
>>>>>>>>> wiki.
>>>>>>>>>> Regarding a new "SQL/Connectors" component: I (with admittedly
>>>>>>>>>> not much
>>>>>>>>>> knowledge) would not add this component at the moment, and put
>>>>>>>>>> the SQL
>>>>>>>>>> stuff into the respective connector component.
>>>>>>>>>> It is probably pretty difficult for a user to decide whether a bug
>>>>>>>>>> belongs to "SQL/Connector" or to "Connectors/Kafka" when Kafka in SQL
>>>>>>>>>> does not work.
>>>>>>>>>> @Chesnay:
>>>>>>>>>> - You are suggesting to rename "Build System" to "Maven" and still
>>>>>>> merge
>>>>>>>>> it
>>>>>>>>>> with "Travis", "Release System" etc. as in the proposal?
>>>>>>>>>> 
>>>>>>>>>> - "Runtime / Control Plane" vs "Runtime / Coordination" -- I
>>>>>>>>>> changed the
>>>>>>>>>> proposal
>>>>>>>>>> 
>>>>>>>>>> - Re. "Documentation": Yes, I think that would be better in the
>> long
>>>>>>> run.
>>>>>>>>>> We are already in a situation where there are groups within the
>>>>>>> community
>>>>>>>>>> focusing on certain areas of the code (such as SQL, the runtime,
>>>>>>>>>> connectors). Those groups will monitor their components, but it
>> will
>>>>>>> be a
>>>>>>>>>> lot of overhead for them to monitor the "Documentation" component.
>>>>>>>>>> We can also try to assign documentation components to both
>>>>>>>>> "Documentation"
>>>>>>>>>> and the affected component, such as "Runtime / Metrics".
>>>>>>>>>> 
>>>>>>>>>> - Removed "Misc / " prefix.
>>>>>>>>>> 
>>>>>>>>>> - "Legacy Components": Legacy components usually have
>>>>>>>>>> very few
>>>>>>>>>> tickets. "Flink on Tez" has 13, "Storm Compat" ~30, and JIRA has
>>>>>>>>>> a bulk
>>>>>>>>>> edit feature :)
>>>>>>>>>> The benefit of having it generalized is that people will probably
>>>>>>>>>> not
>>>>>>> add
>>>>>>>>>> tickets to it.
>>>>>>>>>> 
>>>>>>>>>> - "Libraries /" prefix: I don't think that it is necessary. Some
>>>>>>>>> libraries
>>>>>>>>>> might grow in the future (like the Table API), then we need to
>>>>>>>>>> rename.
>>>>>>>>>> the "flink-libraries" module does contain stuff like the sql
>>>>>>>>>> client or
>>>>>>>>> the
>>>>>>>>>> python api, which are already covered by other components in my
>>>>>>> proposal
>>>>>>>>> --
>>>>>>>>>> so going with the maven module structure is not an argument here.
>>>>>>>>>> 
>>>>>>>>>> - "End to end infrastructure" and "Tests": The same argument as
>>>>>>>>>> with the
>>>>>>>>>> "Documentation" applies here. The maintainers of Kafka, Metrics,
>> ..
>>>>>>>>> should
>>>>>>>>>> get visibility into "their" test instabilities through "their"
>>>>>>>>> components.
>>>>>>>>>> Not many people will feel responsible for the "Tests" component.
>>>>>>>>>> 
>>>>>>>>>> For "Core" and "Configuration", I will move the tickets to the
>>>>>>>>> appropriate
>>>>>>>>>> components in "Runtime /".
>>>>>>>>>> 
>>>>>>>>>> For "API / Scala": Good point. I will add that component.
>>>>>>>>>> 
>>>>>>>>>> How to do it? I will just go through the pain and do it.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Robert
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Fri, Feb 8, 2019 at 2:40 PM Chesnay Schepler <
>> chesnay@apache.org
>>>>> 
>>>>>>>>> wrote:
>>>>>>>>>>> Some concerns:
>>>>>>>>>>> 
>>>>>>>>>>> Travis and build system / release system are entirely different.
>> I
>>>>>>> would
>>>>>>>>>>> even keep the release system away from the build-system, as it
>>>>>>>>>>> is more
>>>>>>>>>>> about the release scripts and documentation, while the latter is
>>>>>>>>>>> about
>>>>>>>>>>> maven. Actually I'd just rename build-system to maven.
>>>>>>>>>>> 
>>>>>>>>>>> Control Plane is a term I've never heard before in this context;
>>>>>>>>>>> I'd
>>>>>>>>>>> replace it with Coordination.
>>>>>>>>>>> 
>>>>>>>>>>> The "Documentation" description refers to it as a "Fallback
>>>>>>> component".
>>>>>>>>>>> In other words, if I make a change to the metrics documentation I
>>>>>>>>>>> shouldn't use this component any more?
>>>>>>>>>>> 
>>>>>>>>>>> I don't see the benefit of a `Misc` major category. I'd attribute
>>>>>>>>>>> everything that doesn't have a major category implicitly to
>> "Misc".
>>>>>>>>>>> 
>>>>>>>>>>> Not a fan of a generalized "Legacy components" category; this
>> seems
>>>>>>>>>>> unnecessary. It's also a bit weird going forward as we'd have to
>>>>>>>>>>> touch
>>>>>>>>>>> every JIRA for a component if we drop it.
>>>>>>>>>>> 
>>>>>>>>>>> How come gelly/CEP don't have a Major category (libraries?)
>>>>>>>>>>> 
>>>>>>>>>>> "End to end infrastructure" is not equivalent to "E2E tests".
>>>>>>>>>>> Infrastructure is not about fixing failing tests, which is what
>> we
>>>>>>>>>>> partially used this component for so far.
>>>>>>>>>>> 
>>>>>>>>>>> I don't believe you can get rid of the generic "Tests" component;
>>>>>>>>>>> consider any changes to the `flink-test-utils-junit` module.
>>>>>>>>>>> 
>>>>>>>>>>> You propose deleting "Core" and "Configuration" but haven't
>>>>>>>>>>> listed any
>>>>>>>>>>> migration paths.
>>>>>>>>>>> 
>>>>>>>>>>> If there's an API / Python category there should also be an API /
>>>>>>>>>>> Scala
>>>>>>>>>>> category. This could also include the scala-shell. Note that the
>>>>>>>>>>> existing Scala API category is not mentioned anywhere in the
>>>>>>>>>>> document.
>>>>>>>>>>> 
>>>>>>>>>>> How do you actually want to do the migration?
>>>>>>>>>>> 
>>>>>>>>>>> On 08.02.2019 13:13, Timo Walther wrote:
>>>>>>>>>>>> Hi Robert,
>>>>>>>>>>>> 
>>>>>>>>>>>> thanks for starting this discussion. I was also about to suggest
>>>>>>>>>>>> splitting the `Table API & SQL` component because it contains
>>>>>>>>>>>> already
>>>>>>>>>>>> more than 1000 issues.
>>>>>>>>>>>> 
>>>>>>>>>>>> My comments:
>>>>>>>>>>>> 
>>>>>>>>>>>> - Rename "SQL/Shell" to "SQL/Client" because the long-term goal
>>>>>>>>>>>> might
>>>>>>>>>>>> not only be a CLI interface. I would keep the generic name "SQL
>>>>>>>>>>>> Client" for now. This is also what is written in FLIPs,
>>>>>>> presentations,
>>>>>>>>>>>> and documentation.
>>>>>>>>>>>> - Rename "SQL/Query Planner" to "SQL/Planner": a query is a read-only
>>>>>>>>>>>> operation, but we support things like INSERT INTO etc. Planner is more
>>>>>>>>>>>> generic.
>>>>>>>>>>>> - Rename "Gelly" to "Graph Processing". New users don't know
>> what
>>>>>>>>>>>> Gelly means. This is the only component that has a "feature
>>>>>>>>>>>> name". I
>>>>>>>>>>>> don't know if we want to stick with that in the future.
>>>>>>>>>>>> - Not sure about this: Introduce a "SQL/Connectors"? Because SQL
>>>>>>>>>>>> connectors are tightly bound to SQL internals but also to the
>>>>>>>>>>>> connector itself.
>>>>>>>>>>>> - Rename "Connectors/HCatalog" to "Connectors/Hive". This name
>> is
>>>>>>> more
>>>>>>>>>>>> generic and reflects the efforts about Hive Metastore and
>> catalog
>>>>>>>>>>>> integration that is currently taking place.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Timo
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Am 08.02.19 um 12:39 schrieb Robert Metzger:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I am currently trying to improve how the Flink community is
>>>>>>>>>>>>> handling
>>>>>>>>>>>>> incoming pull requests and JIRA tickets.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I've looked at how other big communities are handling such a
>> high
>>>>>>>>>>>>> number of
>>>>>>>>>>>>> contributions, and I found that many are using GitHub labels
>>>>>>>>>>>>> extensively.
>>>>>>>>>>>>> An integral part of the label use is to tag PRs with the
>>>>>>>>>>>>> component /
>>>>>>>>>>>>> area
>>>>>>>>>>>>> they belong to. I think the most obvious and logical way of
>>>>>>>>>>>>> tagging
>>>>>>>>>>>>> the PRs
>>>>>>>>>>>>> is by using the JIRA components. This will force us to keep
>>>>>>>>>>>>> the JIRA
>>>>>>>>>>>>> tickets well-organized, if we want the PRs to be organized :)
>>>>>>>>>>>>> I will soon start a separate discussion for the GitHub labels.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Let's first discuss the JIRA components.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I've created the following Wiki page with my proposal of the
>> new
>>>>>>>>>>>>> component,
>>>>>>>>>>>>> and how to migrate from the existing components:
>>>>>>>>>>>>> 
>>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
>>>>>>> 
>>>>>>>>>>>>> Please comment here or directly in the Wiki to let me know
>>>>>>>>>>>>> what you
>>>>>>>>>>>>> think.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Robert
>>>>>>>>>>>>> 
>>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: [DISCUSS] Clean up and reorganize the JIRA components

Posted by JingsongLee <lz...@aliyun.com.INVALID>.
Big +1 to adding a benchmark component.
1. Many of our code changes now require benchmarks. Having a benchmark component makes it much easier for us to align.
2. Running benchmarks regularly can also prevent performance degradation caused by our code changes.

Best, JingsongLee


------------------------------------------------------------------
From: Kurt Young <yk...@gmail.com>
Send Time: 15 May 2019 (Wednesday) 20:06
To: dev <de...@flink.apache.org>
Subject: Re: [DISCUSS] Clean up and reorganize the JIRA components

+1 to adding a benchmark component.

Best,
Kurt


On Wed, May 15, 2019 at 6:13 PM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> I would like to propose two changes:
>
> 1. Renaming “Runtime / Operators” to “Runtime / Task” or something like
> “Runtime / Processing”. “Runtime / Operators” was confusing me, since it
> sounded like it covers concrete implementations of the operators, like
> “WindowOperator” or various join implementations.
>
> 2. I think we should add an additional component for benchmarks and
> benchmarking infrastructure. While this is a more complicated topic (because
> of the setup and how it is run), it should be on the same level as
> correctness tests.
>
> Piotrek
>
> > On 20 Feb 2019, at 10:53, Robert Metzger <rm...@apache.org> wrote:
> >
> > Thanks a lot Timo!
> >
> > I will start a vote Chesnay!
> >
> > On Wed, Feb 20, 2019 at 10:11 AM Timo Walther <tw...@apache.org>
> wrote:
> >
> >> +1 for the vote. Btw I can help cleaning up the "Table API & SQL"
> >> component. It seems to be the biggest with 1229 Issues.
> >>
> >> Thanks,
> >> Timo
> >>
> >> Am 20.02.19 um 10:09 schrieb Chesnay Schepler:
> >>> I would prefer if you'd start a vote with a new cleaned up proposal.
> >>>
> >>> On 18.02.2019 15:23, Robert Metzger wrote:
> >>>> I added "Runtime / Configuration" to the proposal:
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>
> >>>>
> >>>> Since this discussion has been open for 10 days, I assume we have
> >>>> reached
> >>>> consensus here. I will soon start renaming components.
> >>>>
> >>>> On Wed, Feb 13, 2019 at 10:51 AM Chesnay Schepler <chesnay@apache.org
> >
> >>>> wrote:
> >>>>
> >>>>> The only parent I can think of is "Infrastructure", but I don't quite
> >>>>> like it :/
> >>>>>
> >>>>> +1 for "Runtime / Configuration"; this is too general to be placed in
> >>>>> coordination imo.
> >>>>>
> >>>>> On 12.02.2019 18:25, Robert Metzger wrote:
> >>>>>> Thanks a lot for your feedback Chesnay!
> >>>>>>
> >>>>>> re build/travis/release: Do you have a good idea for a common
> >>>>>> parent for
> >>>>>> "Build System", "Travis" and "Release System"?
> >>>>>>
> >>>>>> re legacy: Okay, I see your point. I will keep the Legacy Components
> >>>>> prefix.
> >>>>>> re library: I think I don't have an argument here. My proposal is
> >>>>>> based on
> >>>>>> what I felt as being right :) I added the "Library / " prefix to the
> >>>>>> proposal.
> >>>>>>
> >>>>>> re core/config: From the proposed components, I see the best match
> >>>>>> with
> >>>>>> "Runtime / Coordination", but I agree that this example is
> >>>>>> difficult to
> >>>>>> place into my proposed scheme. Do you think we should introduce
> >>>>>> "Runtime
> >>>>> /
> >>>>>> Configuration" as a component?
> >>>>>>
> >>>>>>
> >>>>>> I updated the proposal accordingly!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Feb 12, 2019 at 12:19 PM Chesnay Schepler <
> chesnay@apache.org
> >>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> re build/travis/release: No, I'm against merging build system,
> travis
> >>>>>>> and release system.
> >>>>>>>
> >>>>>>> re legacy: So going forward you're proposing to move dropped
> features
> >>>>>>> into the legacy bucket and make it impossible to search for
> specific
> >>>>>>> issues for that component? There's 0 overhead to having these
> >>>>>>> components, so I really don't get the benefit here, but see the
> >>>>> overhead.
> >>>>>>> I don't buy the argument of "people will not open issues if the
> >>>>>>> component doesn't exist", they will just leave the component field
> >>>>>>> blank
> >>>>>>> or add a random one (that would be wrong). In fact, if you had a
> >>>>>>> storm/tez component (that users would adhere to) then it would be
> >>>>>>> _easier_ to figure out whether an issue can be rejected right away.
> >>>>>>>
> >>>>>>> re library: If you are against a library category, what's your
> >>>>>>> argument
> >>>>>>> for a connector category?
> >>>>>>>
> >>>>>>> re tests: I don't mind "tests" being removed from tickets about
> test
> >>>>>>> instabilities, but you specified the migration as "rename E2E
> tests"
> >>>>>>> which is not equivalent.
> >>>>>>> Under what category would you file modifications to
> >>>>> flink-test-utils-junit?
> >>>>>>> I would propose to not differentiate between e2e and other tests; I
> >>>>>>> would go along with "Test infrastructure", and remove the major
> >>>>>>> "Tests"
> >>>>>>> category.
> >>>>>>>
> >>>>>>> re core/config: As an example, where (under Runtime) would you
> >>>>>>> place the
> >>>>>>> introduction of the ConfigOption class?
> >>>>>>>
> >>>>>>> On 11.02.2019 11:31, Robert Metzger wrote:
> >>>>>>>> Thanks a lot for your feedback!
> >>>>>>>>
> >>>>>>>> @Timo:
> >>>>>>>> I've followed your suggestions and updated the proposed names in
> the
> >>>>>>> wiki.
> >>>>>>>> Regarding a new "SQL/Connectors" component: I (with admittedly
> >>>>>>>> not much
> >>>>>>>> knowledge) would not add this component at the moment, and put
> >>>>>>>> the SQL
> >>>>>>>> stuff into the respective connector component.
> >>>>>>>> It is probably pretty difficult for a user to decide whether a bug
> >>>>>>>> belongs to "SQL/Connector" or to "Connectors/Kafka" when Kafka in SQL
> >>>>>>>> does not work.
> >>>>>>>> @Chesnay:
> >>>>>>>> - You are suggesting to rename "Build System" to "Maven" and still
> >>>>> merge
> >>>>>>> it
> >>>>>>>> with "Travis", "Release System" etc. as in the proposal?
> >>>>>>>>
> >>>>>>>> - "Runtime / Control Plane" vs "Runtime / Coordination" -- I
> >>>>>>>> changed the
> >>>>>>>> proposal
> >>>>>>>>
> >>>>>>>> - Re. "Documentation": Yes, I think that would be better in the
> long
> >>>>> run.
> >>>>>>>> We are already in a situation where there are groups within the
> >>>>> community
> >>>>>>>> focusing on certain areas of the code (such as SQL, the runtime,
> >>>>>>>> connectors). Those groups will monitor their components, but it
> will
> >>>>> be a
> >>>>>>>> lot of overhead for them to monitor the "Documentation" component.
> >>>>>>>> We can also try to assign documentation components to both
> >>>>>>> "Documentation"
> >>>>>>>> and the affected component, such as "Runtime / Metrics".
> >>>>>>>>
> >>>>>>>> - Removed "Misc / " prefix.
> >>>>>>>>
> >>>>>>>> - "Legacy Components": Legacy components usually have
> >>>>>>>> very few
> >>>>>>>> tickets. "Flink on Tez" has 13, "Storm Compat" ~30, and JIRA has
> >>>>>>>> a bulk
> >>>>>>>> edit feature :)
> >>>>>>>> The benefit of having it generalized is that people will probably
> >>>>>>>> not
> >>>>> add
> >>>>>>>> tickets to it.
> >>>>>>>>
> >>>>>>>> - "Libraries /" prefix: I don't think that it is necessary. Some
> >>>>>>> libraries
> >>>>>>>> might grow in the future (like the Table API), then we need to
> >>>>>>>> rename.
> >>>>>>>> the "flink-libraries" module does contain stuff like the sql
> >>>>>>>> client or
> >>>>>>> the
> >>>>>>>> python api, which are already covered by other components in my
> >>>>> proposal
> >>>>>>> --
> >>>>>>>> so going with the maven module structure is not an argument here.
> >>>>>>>>
> >>>>>>>> - "End to end infrastructure" and "Tests": The same argument as
> >>>>>>>> with the
> >>>>>>>> "Documentation" applies here. The maintainers of Kafka, Metrics,
> ..
> >>>>>>> should
> >>>>>>>> get visibility into "their" test instabilities through "their"
> >>>>>>> components.
> >>>>>>>> Not many people will feel responsible for the "Tests" component.
> >>>>>>>>
> >>>>>>>> For "Core" and "Configuration", I will move the tickets to the
> >>>>>>> appropriate
> >>>>>>>> components in "Runtime /".
> >>>>>>>>
> >>>>>>>> For "API / Scala": Good point. I will add that component.
> >>>>>>>>
> >>>>>>>> How to do it? I will just go through the pain and do it.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Robert
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Feb 8, 2019 at 2:40 PM Chesnay Schepler <
> chesnay@apache.org
> >>>
> >>>>>>> wrote:
> >>>>>>>>> Some concerns:
> >>>>>>>>>
> >>>>>>>>> Travis and build system / release system are entirely different.
> I
> >>>>> would
> >>>>>>>>> even keep the release system away from the build-system, as it
> >>>>>>>>> is more
> >>>>>>>>> about the release scripts and documentation, while the latter is
> >>>>>>>>> about
> >>>>>>>>> maven. Actually I'd just rename build-system to maven.
> >>>>>>>>>
> >>>>>>>>> Control Plane is a term I've never heard before in this context;
> >>>>>>>>> I'd
> >>>>>>>>> replace it with Coordination.
> >>>>>>>>>
> >>>>>>>>> The "Documentation" description refers to it as a "Fallback
> >>>>> component".
> >>>>>>>>> In other words, if I make a change to the metrics documentation I
> >>>>>>>>> shouldn't use this component any more?
> >>>>>>>>>
> >>>>>>>>> I don't see the benefit of a `Misc` major category. I'd attribute
> >>>>>>>>> everything that doesn't have a major category implicitly to
> "Misc".
> >>>>>>>>>
> >>>>>>>>> Not a fan of a generalized "Legacy components" category; this
> seems
> >>>>>>>>> unnecessary. It's also a bit weird going forward as we'd have to
> >>>>>>>>> touch
> >>>>>>>>> every JIRA for a component if we drop it.
> >>>>>>>>>
> >>>>>>>>> How come gelly/CEP don't have a Major category (libraries?)
> >>>>>>>>>
> >>>>>>>>> "End to end infrastructure" is not equivalent to "E2E tests".
> >>>>>>>>> Infrastructure is not about fixing failing tests, which is what
> we
> >>>>>>>>> partially used this component for so far.
> >>>>>>>>>
> >>>>>>>>> I don't believe you can get rid of the generic "Tests" component;
> >>>>>>>>> consider any changes to the `flink-test-utils-junit` module.
> >>>>>>>>>
> >>>>>>>>> You propose deleting "Core" and "Configuration" but haven't
> >>>>>>>>> listed any
> >>>>>>>>> migration paths.
> >>>>>>>>>
> >>>>>>>>> If there's an API / Python category there should also be an API /
> >>>>>>>>> Scala
> >>>>>>>>> category. This could also include the scala-shell. Note that the
> >>>>>>>>> existing Scala API category is not mentioned anywhere in the
> >>>>>>>>> document.
> >>>>>>>>>
> >>>>>>>>> How do you actually want to do the migration?
> >>>>>>>>>
> >>>>>>>>> On 08.02.2019 13:13, Timo Walther wrote:
> >>>>>>>>>> Hi Robert,
> >>>>>>>>>>
> >>>>>>>>>> thanks for starting this discussion. I was also about to suggest
> >>>>>>>>>> splitting the `Table API & SQL` component because it contains
> >>>>>>>>>> already
> >>>>>>>>>> more than 1000 issues.
> >>>>>>>>>>
> >>>>>>>>>> My comments:
> >>>>>>>>>>
> >>>>>>>>>> - Rename "SQL/Shell" to "SQL/Client" because the long-term goal
> >>>>>>>>>> might
> >>>>>>>>>> not only be a CLI interface. I would keep the generic name "SQL
> >>>>>>>>>> Client" for now. This is also what is written in FLIPs,
> >>>>> presentations,
> >>>>>>>>>> and documentation.
> >>>>>>>>>> - Rename "SQL/Query Planner" to "SQL/Planner": a query is a read-only
> >>>>>>>>>> operation, but we support things like INSERT INTO etc. Planner is more
> >>>>>>>>>> generic.
> >>>>>>>>>> - Rename "Gelly" to "Graph Processing". New users don't know
> what
> >>>>>>>>>> Gelly means. This is the only component that has a "feature
> >>>>>>>>>> name". I
> >>>>>>>>>> don't know if we want to stick with that in the future.
> >>>>>>>>>> - Not sure about this: Introduce a "SQL/Connectors"? Because SQL
> >>>>>>>>>> connectors are tightly bound to SQL internals but also to the
> >>>>>>>>>> connector itself.
> >>>>>>>>>> - Rename "Connectors/HCatalog" to "Connectors/Hive". This name
> is
> >>>>> more
> >>>>>>>>>> generic and reflects the efforts about Hive Metastore and
> catalog
> >>>>>>>>>> integration that is currently taking place.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Timo
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Am 08.02.19 um 12:39 schrieb Robert Metzger:
> >>>>>>>>>>> Hi all,
> >>>>>>>>>>>
> >>>>>>>>>>> I am currently trying to improve how the Flink community is
> >>>>>>>>>>> handling
> >>>>>>>>>>> incoming pull requests and JIRA tickets.
> >>>>>>>>>>>
> >>>>>>>>>>> I've looked at how other big communities are handling such a
> high
> >>>>>>>>>>> number of
> >>>>>>>>>>> contributions, and I found that many are using GitHub labels
> >>>>>>>>>>> extensively.
> >>>>>>>>>>> An integral part of the label use is to tag PRs with the
> >>>>>>>>>>> component /
> >>>>>>>>>>> area
> >>>>>>>>>>> they belong to. I think the most obvious and logical way of
> >>>>>>>>>>> tagging
> >>>>>>>>>>> the PRs
> >>>>>>>>>>> is by using the JIRA components. This will force us to keep
> >>>>>>>>>>> the JIRA
> >>>>>>>>>>> tickets well-organized, if we want the PRs to be organized :)
> >>>>>>>>>>> I will soon start a separate discussion for the GitHub labels.
> >>>>>>>>>>>
> >>>>>>>>>>> Let's first discuss the JIRA components.
> >>>>>>>>>>>
> >>>>>>>>>>> I've created the following Wiki page with my proposal of the
> new
> >>>>>>>>>>> component,
> >>>>>>>>>>> and how to migrate from the existing components:
> >>>>>>>>>>>
> >>>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>>
> >>>>>>>>>>> Please comment here or directly in the Wiki to let me know
> >>>>>>>>>>> what you
> >>>>>>>>>>> think.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Robert
> >>>>>>>>>>>
> >>>>>
> >>
> >>
>
>

Re: [DISCUSS] Clean up and reorganize the JIRA components

Posted by Kurt Young <yk...@gmail.com>.
+1 to adding a benchmark component.

Best,
Kurt


On Wed, May 15, 2019 at 6:13 PM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> I would like to propose two changes:
>
> 1. Renaming “Runtime / Operators” to “Runtime / Task” or something like
> “Runtime / Processing”. “Runtime / Operators” was confusing me, since it
> sounded like it covers concrete implementations of the operators, like
> “WindowOperator” or various join implementations.
>
> 2. I think we should add an additional component for benchmarks and
> benchmarking infrastructure. While this is a more complicated topic (because
> of the setup and how it is run), it should be on the same level as
> correctness tests.
>
> Piotrek
>
> > On 20 Feb 2019, at 10:53, Robert Metzger <rm...@apache.org> wrote:
> >
> > Thanks a lot Timo!
> >
> > I will start a vote Chesnay!
> >
> > On Wed, Feb 20, 2019 at 10:11 AM Timo Walther <tw...@apache.org>
> wrote:
> >
> >> +1 for the vote. Btw I can help cleaning up the "Table API & SQL"
> >> component. It seems to be the biggest with 1229 Issues.
> >>
> >> Thanks,
> >> Timo
> >>
> >> Am 20.02.19 um 10:09 schrieb Chesnay Schepler:
> >>> I would prefer if you'd start a vote with a new cleaned up proposal.
> >>>
> >>> On 18.02.2019 15:23, Robert Metzger wrote:
> >>>> I added "Runtime / Configuration" to the proposal:
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>
> >>>>
> >>>> Since this discussion has been open for 10 days, I assume we have
> >>>> reached
> >>>> consensus here. I will soon start renaming components.
> >>>>
> >>>> On Wed, Feb 13, 2019 at 10:51 AM Chesnay Schepler <chesnay@apache.org
> >
> >>>> wrote:
> >>>>
> >>>>> The only parent I can think of is "Infrastructure", but I don't quite
> >>>>> like it :/
> >>>>>
> >>>>> +1 for "Runtime / Configuration"; this is too general to be placed in
> >>>>> coordination imo.
> >>>>>
> >>>>> On 12.02.2019 18:25, Robert Metzger wrote:
> >>>>>> Thanks a lot for your feedback Chesnay!
> >>>>>>
> >>>>>> re build/travis/release: Do you have a good idea for a common
> >>>>>> parent for
> >>>>>> "Build System", "Travis" and "Release System"?
> >>>>>>
> >>>>>> re legacy: Okay, I see your point. I will keep the Legacy Components
> >>>>> prefix.
> >>>>>> re library: I think I don't have an argument here. My proposal is
> >>>>>> based on
> >>>>>> what I felt as being right :) I added the "Library / " prefix to the
> >>>>>> proposal.
> >>>>>>
> >>>>>> re core/config: From the proposed components, I see the best match
> >>>>>> with
> >>>>>> "Runtime / Coordination", but I agree that this example is
> >>>>>> difficult to
> >>>>>> place into my proposed scheme. Do you think we should introduce
> >>>>>> "Runtime
> >>>>> /
> >>>>>> Configuration" as a component?
> >>>>>>
> >>>>>>
> >>>>>> I updated the proposal accordingly!
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Feb 12, 2019 at 12:19 PM Chesnay Schepler <
> chesnay@apache.org
> >>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> re build/travis/release: No, I'm against merging build system,
> travis
> >>>>>>> and release system.
> >>>>>>>
> >>>>>>> re legacy: So going forward you're proposing to move dropped
> features
> >>>>>>> into the legacy bucket and make it impossible to search for
> specific
> >>>>>>> issues for that component? There's 0 overhead to having these
> >>>>>>> components, so I really don't get the benefit here, but see the
> >>>>> overhead.
> >>>>>>> I don't buy the argument of "people will not open issues if the
> >>>>>>> component doesn't exist", they will just leave the component field
> >>>>>>> blank
> >>>>>>> or add a random one (that would be wrong). In fact, if you had a
> >>>>>>> storm/tez component (that users would adhere to) then it would be
> >>>>>>> _easier_ to figure out whether an issue can be rejected right away.
> >>>>>>>
> >>>>>>> re library: If you are against a library category, what's your
> >>>>>>> argument
> >>>>>>> for a connector category?
> >>>>>>>
> >>>>>>> re tests: I don't mind "tests" being removed from tickets about
> test
> >>>>>>> instabilities, but you specified the migration as "rename E2E
> tests"
> >>>>>>> which is not equivalent.
> >>>>>>> Under what category would you file modifications to
> >>>>> flink-test-utils-junit?
> >>>>>>> I would propose to not differentiate between e2e and other tests; I
> >>>>>>> would go along with "Test infrastructure", and remove the major
> >>>>>>> "Tests"
> >>>>>>> category.
> >>>>>>>
> >>>>>>> re core/config: As an example, where (under Runtime) would you
> >>>>>>> place the
> >>>>>>> introduction of the ConfigOption class?
> >>>>>>>
> >>>>>>> On 11.02.2019 11:31, Robert Metzger wrote:
> >>>>>>>> Thanks a lot for your feedback!
> >>>>>>>>
> >>>>>>>> @Timo:
> >>>>>>>> I've followed your suggestions and updated the proposed names in
> the
> >>>>>>> wiki.
> >>>>>>>> Regarding a new "SQL/Connectors" component: I (with admittedly
> >>>>>>>> not much
> >>>>>>>> knowledge) would not add this component at the moment, and put
> >>>>>>>> the SQL
> >>>>>>>> stuff into the respective connector component.
> >>>>>>>> It is probably pretty difficult for a user to decide whether a bug
> >>>>>>>> belongs to "SQL/Connector" or to "Connectors/Kafka" when Kafka in SQL
> >>>>>>>> does not work.
> >>>>>>>> @Chesnay:
> >>>>>>>> - You are suggesting to rename "Build System" to "Maven" and still
> >>>>> merge
> >>>>>>> it
> >>>>>>>> with "Travis", "Release System" etc. as in the proposal?
> >>>>>>>>
> >>>>>>>> - "Runtime / Control Plane" vs "Runtime / Coordination" -- I
> >>>>>>>> changed the
> >>>>>>>> proposal
> >>>>>>>>
> >>>>>>>> - Re. "Documentation": Yes, I think that would be better in the
> long
> >>>>> run.
> >>>>>>>> We are already in a situation where there are groups within the
> >>>>> community
> >>>>>>>> focusing on certain areas of the code (such as SQL, the runtime,
> >>>>>>>> connectors). Those groups will monitor their components, but it
> will
> >>>>> be a
> >>>>>>>> lot of overhead for them to monitor the "Documentation" component.
> >>>>>>>> We can also try to assign documentation components to both
> >>>>>>> "Documentation"
> >>>>>>>> and the affected component, such as "Runtime / Metrics".
> >>>>>>>>
> >>>>>>>> - Removed "Misc / " prefix.
> >>>>>>>>
> >>>>>>>> - "Legacy Components": Legacy components usually have
> >>>>>>>> very few
> >>>>>>>> tickets. "Flink on Tez" has 13, "Storm Compat" ~30, and JIRA has
> >>>>>>>> a bulk
> >>>>>>>> edit feature :)
> >>>>>>>> The benefit of having it generalized is that people will probably
> >>>>>>>> not
> >>>>> add
> >>>>>>>> tickets to it.
> >>>>>>>>
> >>>>>>>> - "Libraries /" prefix: I don't think that it is necessary. Some
> >>>>>>> libraries
> >>>>>>>> might grow in the future (like the Table API), then we need to
> >>>>>>>> rename.
> >>>>>>>> the "flink-libraries" module does contain stuff like the sql
> >>>>>>>> client or
> >>>>>>> the
> >>>>>>>> python api, which are already covered by other components in my
> >>>>> proposal
> >>>>>>> --
> >>>>>>>> so going with the maven module structure is not an argument here.
> >>>>>>>>
> >>>>>>>> - "End to end infrastructure" and "Tests": The same argument as
> >>>>>>>> with the
> >>>>>>>> "Documentation" applies here. The maintainers of Kafka, Metrics,
> ..
> >>>>>>> should
> >>>>>>>> get visibility into "their" test instabilities through "their"
> >>>>>>> components.
> >>>>>>>> Not many people will feel responsible for the "Tests" component.
> >>>>>>>>
> >>>>>>>> For "Core" and "Configuration", I will move the tickets to the
> >>>>>>> appropriate
> >>>>>>>> components in "Runtime /".
> >>>>>>>>
> >>>>>>>> For "API / Scala": Good point. I will add that component.
> >>>>>>>>
> >>>>>>>> How to do it? I will just go through the pain and do it.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Robert
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, Feb 8, 2019 at 2:40 PM Chesnay Schepler <chesnay@apache.org> wrote:
> >>>>>>>>> Some concerns:
> >>>>>>>>>
> >>>>>>>>> Travis and build system / release system are entirely different.
> I
> >>>>> would
> >>>>>>>>> even keep the release system away from the build-system, as it
> >>>>>>>>> is more
> >>>>>>>>> about the release scripts and documentation, while the latter is
> >>>>>>>>> about
> >>>>>>>>> maven. Actually I'd just rename build-system to maven.
> >>>>>>>>>
> >>>>>>>>> Control Plane is a term I've never heard before in this context;
> >>>>>>>>> I'd
> >>>>>>>>> replace it with Coordination.
> >>>>>>>>>
> >>>>>>>>> The "Documentation" description refers to it as a "Fallback
> >>>>> component".
> >>>>>>>>> In other words, if I make a change to the metrics documentation I
> >>>>>>>>> shouldn't use this component any more?
> >>>>>>>>>
> >>>>>>>>> I don't see the benefit of a `Misc` major category. I'd attribute
> >>>>>>>>> everything that doesn't have a major category implicitly to
> "Misc".
> >>>>>>>>>
> >>>>>>>>> Not a fan of a generalized "Legacy components" category; this
> seems
> >>>>>>>>> unnecessary. It's also a bit weird going forward as we'd have to
> >>>>>>>>> touch
> >>>>>>>>> every JIRA for a component if we drop it.
> >>>>>>>>>
> >>>>>>>>> How come Gelly/CEP don't have a major category (libraries?)
> >>>>>>>>>
> >>>>>>>>> "End to end infrastructure" is not equivalent to "E2E tests".
> >>>>>>>>> Infrastructure is not about fixing failing tests, which is what
> we
> >>>>>>>>> partially used this component for so far.
> >>>>>>>>>
> >>>>>>>>> I don't believe you can get rid of the generic "Tests" component;
> >>>>>>>>> consider any changes to the `flink-test-utils-junit` module.
> >>>>>>>>>
> >>>>>>>>> You propose deleting "Core" and "Configuration" but haven't
> >>>>>>>>> listed any
> >>>>>>>>> migration paths.
> >>>>>>>>>
> >>>>>>>>> If there's an API / Python category there should also be an API /
> >>>>>>>>> Scala
> >>>>>>>>> category. This could also include the scala-shell. Note that the
> >>>>>>>>> existing Scala API category is not mentioned anywhere in the
> >>>>>>>>> document.
> >>>>>>>>>
> >>>>>>>>> How do you actually want to do the migration?
> >>>>>>>>>
> >>>>>>>>> On 08.02.2019 13:13, Timo Walther wrote:
> >>>>>>>>>> Hi Robert,
> >>>>>>>>>>
> >>>>>>>>>> thanks for starting this discussion. I was also about to suggest
> >>>>>>>>>> splitting the `Table API & SQL` component because it already contains
> >>>>>>>>>> more than 1000 issues.
> >>>>>>>>>>
> >>>>>>>>>> My comments:
> >>>>>>>>>>
> >>>>>>>>>> - Rename "SQL/Shell" to "SQL/Client" because the long-term goal
> >>>>>>>>>> might
> >>>>>>>>>> not only be a CLI interface. I would keep the generic name "SQL
> >>>>>>>>>> Client" for now. This is also what is written in FLIPs,
> >>>>> presentations,
> >>>>>>>>>> and documentation.
> >>>>>>>>>> - Rename "SQL/Query Planner" to "SQL/Planner": a query is a read-only
> >>>>>>>>>> operation, but we support things like INSERT INTO etc. Planner is more
> >>>>>>>>>> generic.
> >>>>>>>>>> - Rename "Gelly" to "Graph Processing". New users don't know
> what
> >>>>>>>>>> Gelly means. This is the only component that has a "feature
> >>>>>>>>>> name". I
> >>>>>>>>>> don't know if we want to stick with that in the future.
> >>>>>>>>>> - Not sure about this: Introduce a "SQL/Connectors"? Because SQL
> >>>>>>>>>> connectors are tightly bound to SQL internals but also to the
> >>>>>>>>>> connector itself.
> >>>>>>>>>> - Rename "Connectors/HCatalog" to "Connectors/Hive". This name
> is
> >>>>> more
> >>>>>>>>>> generic and reflects the efforts about Hive Metastore and
> catalog
> >>>>>>>>>> integration that is currently taking place.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Timo
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Am 08.02.19 um 12:39 schrieb Robert Metzger:
> >>>>>>>>>>> Hi all,
> >>>>>>>>>>>
> >>>>>>>>>>> I am currently trying to improve how the Flink community is
> >>>>>>>>>>> handling
> >>>>>>>>>>> incoming pull requests and JIRA tickets.
> >>>>>>>>>>>
> >>>>>>>>>>> I've looked at how other big communities are handling such a
> high
> >>>>>>>>>>> number of
> >>>>>>>>>>> contributions, and I found that many are using GitHub labels
> >>>>>>>>>>> extensively.
> >>>>>>>>>>> An integral part of the label use is to tag PRs with the
> >>>>>>>>>>> component /
> >>>>>>>>>>> area
> >>>>>>>>>>> they belong to. I think the most obvious and logical way of
> >>>>>>>>>>> tagging
> >>>>>>>>>>> the PRs
> >>>>>>>>>>> is by using the JIRA components. This will force us to keep
> >>>>>>>>>>> the JIRA
> >>>>>>>>>>> tickets well-organized, if we want the PRs to be organized :)
> >>>>>>>>>>> I will soon start a separate discussion for the GitHub labels.
> >>>>>>>>>>>
> >>>>>>>>>>> Let's first discuss the JIRA components.
> >>>>>>>>>>>
> >>>>>>>>>>> I've created the following Wiki page with my proposal of the
> new
> >>>>>>>>>>> components,
> >>>>>>>>>>> and how to migrate from the existing components:
> >>>>>>>>>>>
> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/Proposal+for+new+JIRA+Components
> >>>>>>>>>>>
> >>>>>>>>>>> Please comment here or directly in the Wiki to let me know
> >>>>>>>>>>> what you
> >>>>>>>>>>> think.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Robert
> >>>>>>>>>>>
> >>>>>
> >>
> >>
>
>