You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by jincheng sun <su...@gmail.com> on 2019/05/22 11:31:25 UTC

[DISCUSS] Correct the flink pom `artifactId` config before flink 1.9 release

Hi all,

I find a shell issue in `verify_scala_suffixes.sh`(line 145) as follows:
```
grep "${module}_\d\+\.\d\+</artifactId>" "{}"
```
This code want to find out all modules that the module's `artifactId`  with
a `scala_binary_version` suffix.
but the problem is our all `artifactId` value is in the pattern of
`XXX_${scala.binary.version}`, such as:
```
<artifactId>flink-tests_${scala.binary.version}</artifactId>
```
then the find out always empty, so this check did not take effect.

When I correct the script as follows:

```
grep "${module}_\\${scala.binary.version}</artifactId>" "{}"
```
we find there more than 10 modules have incorrect `artifactId` config. as
follows:

1.flink-connector-hive
2.flink-fs-tests
3.flink-queryable-state-client-java
4.flink-sql-connector-elasticsearch6
5.flink-sql-connector-kafka
6.flink-sql-connector-kafka-0.10
7.flink-sql-connector-kafka-0.11
8.flink-sql-connector-kafka-0.9
9.flink-table-api-scala
10.flink-tests
11.flink-yarn-tests

And to fix this issue, we need a big change, such as:
    - correct the `artifactId` config.
    - update the dependency of relation modules.
    - release the connector into the repo.
    - update some of the doc for `connector.zh.md` and `connector.md`.
    - others

From the points of my view, It's better to fix this issue before the flink
1.9 release.

What do you think?

NOTE: Please remind me if I have a missing above!

1. The script code change:
https://github.com/sunjincheng121/flink/commit/736f16a8a76aaef1018cc754f0effec119e43120
2. The CI test result: https://api.travis-ci.org/v3/job/535719615/log.txt

Regards,
Jincheng

Re: [DISCUSS] Correct the flink pom `artifactId` config before flink 1.9 release

Posted by Chesnay Schepler <ch...@apache.org>.
You do have a point, but your output is mostly incorrect.

There are only 3 modules that have a suffix, which don't need it:

  *

    flink-connector-hive

  *

    flink-queryable-state-client-java

  *

    flink-table-api-scala

The remaining do need it since they have dependencies with a 
scala-suffix (mostly on runtime and streaming-java).

Your change does make sense to me, but there's likely another issue in 
the preceding logic that determines which module is scala-free.
flink-tests for example should not be considered scala-free, since it 
relies on flink-runtime which contains scala and hence the scala-lang 
dependencies, but it apparently is given that your change detects something.

On 22/05/2019 13:31, jincheng sun wrote:
> Hi all,
>
> I find a shell issue in `verify_scala_suffixes.sh`(line 145) as follows:
> ```
> grep "${module}_\d\+\.\d\+</artifactId>" "{}"
> ```
> This code want to find out all modules that the module's `artifactId`  with
> a `scala_binary_version` suffix.
> but the problem is our all `artifactId` value is in the pattern of
> `XXX_${scala.binary.version}`, such as:
> ```
> <artifactId>flink-tests_${scala.binary.version}</artifactId>
> ```
> then the find out always empty, so this check did not take effect.
>
> When I correct the script as follows:
>
> ```
> grep "${module}_\\${scala.binary.version}</artifactId>" "{}"
> ```
> we find there more than 10 modules have incorrect `artifactId` config. as
> follows:
>
> 1.flink-connector-hive
> 2.flink-fs-tests
> 3.flink-queryable-state-client-java
> 4.flink-sql-connector-elasticsearch6
> 5.flink-sql-connector-kafka
> 6.flink-sql-connector-kafka-0.10
> 7.flink-sql-connector-kafka-0.11
> 8.flink-sql-connector-kafka-0.9
> 9.flink-table-api-scala
> 10.flink-tests
> 11.flink-yarn-tests
>
> And to fix this issue, we need a big change, such as:
>      - correct the `artifactId` config.
>      - update the dependency of relation modules.
>      - release the connector into the repo.
>      - update some of the doc for `connector.zh.md` and `connector.md`.
>      - others
>
> >From the points of my view, It's better to fix this issue before the flink
> 1.9 release.
>
> What do you think?
>
> NOTE: Please remind me if I have a missing above!
>
> 1. The script code change:
> https://github.com/sunjincheng121/flink/commit/736f16a8a76aaef1018cc754f0effec119e43120
> 2. The CI test result: https://api.travis-ci.org/v3/job/535719615/log.txt
>
> Regards,
> Jincheng
>


Re: [DISCUSS] Correct the flink pom `artifactId` config before flink 1.9 release

Posted by jincheng sun <su...@gmail.com>.
Oh, Thanks for your quick response, I have copied the content to JIRA.

Chesnay Schepler <ch...@apache.org> 于2019年5月23日周四 下午7:03写道:

> Please post your mail as a comment into the JIRA to consolidate the
> discussion there.
>
> On 23/05/2019 12:55, jincheng sun wrote:
> > Thank you for confirming this is an issue,and thanks a lot for your
> double
> > check. Due to the script check logic is incorrect, most of the outputs
> are
> > incorrect.
> >
> > You are right and I just pointed out the problem of the scala-free check
> > and not mentioned the final solution which I think we should discuss it.
> >
> > I tried to understand you and have modified the script a bit and have got
> > the same results as you mentioned(In my local).
> > The main changes are as follows:
> > 1) Adds test modules which should be checked:
> > !flink-fs-tests,!flink-yarn-tests,!flink-tests
> > 2) Marks the modules as infected which depend on scala trasitively or
> > depend on modules suffixed with `_{scala_version}`
> >
> > I think we should discuss the rule of how to check whether a module is
> > scala-free or not.
> > If I understand you correctly, the rule in your mind may be as follows:
> >
> > 1) All the modules should check the dependencies(excluding the
> dependencies
> > introduced by the test code). Regarding to modules flink-fs-tests,
> >     flink-yarn-tests and flink-tests, the dependencies introduced by the
> > test code should also be checked.
> > 2) The checking rule is whether a module depends on scala trasitively or
> > depends on modules suffixed with `_{scala_version}`.
> > Following the above rules, we can get results you mentioned(Only 3
> modules
> > with incorrect artifact id).
> >
> > Open question:
> >
> > Currently, all the test code are also released into the repository, such
> as
> >
> http://central.maven.org/maven2/org/apache/flink/flink-avro/1.8.0/flink-avro-1.8.0-tests.jar
> > .
> > Users can also depend on these jars. My question is why we need to check
> > the test dependencies for modules flink-fs-tests, flink-yarn-tests and
> > flink-tests,
> > but not check the test dependencies for other modules?
> >
> > The solution:
> > If we follow the rule you mentioned, the change is as follows:
> > 1) Correct the check logic for the script
> > 2) Correct the artifact id for modules:  flink-connector-hive,
> > flink-queryable-state-client-java
> > 3) Add the scala dependencies for the module flink-table-api-scala due we
> > plan to add the scala code(I discussed with Timo and Aljoscha)
> >
> > If we should check other test code for other modules, maybe we need more
> > changes. But it depends on the above open question.
> >
> > I have already opened the JIRA:
> > https://issues.apache.org/jira/browse/FLINK-12602
> >
> > Feel free to discuss the solution in this mail thread or in the JIRA.
> >
> > Best,
> > Jincheng
> >
> > Chesnay Schepler <ch...@apache.org> 于2019年5月22日周三 下午8:50写道:
> >
> >> You do have a point, but your output is mostly incorrect.
> >>
> >> There are only 3 modules that have a suffix, which don't need it:
> >>
> >>     -
> >>
> >>     flink-connector-hive
> >>
> >>     -
> >>
> >>     flink-queryable-state-client-java
> >>
> >>     -
> >>
> >>     flink-table-api-scala
> >>
> >>
> >> The remaining do need it since they have dependencies with a
> scala-suffix
> >> (mostly on runtime and streaming-java).
> >>
> >> Your change does make sense to me, but there's likely another issue in
> the
> >> preceding logic that determines which module is scala-free.
> >> flink-tests for example should not be considered scala-free, since it
> >> relies on flink-runtime which contains scala and hence the scala-lang
> >> dependencies, but it apparently is given that your change detects
> something.
> >>
> >> In any case, please open a JIRA:
> >>
> >> On 22/05/2019 13:31, jincheng sun wrote:
> >>
> >> Hi all,
> >>
> >> I find a shell issue in `verify_scala_suffixes.sh`(line 145) as follows:
> >> ```
> >> grep "${module}_\d\+\.\d\+</artifactId>" "{}"
> >> ```
> >> This code want to find out all modules that the module's `artifactId`
> with
> >> a `scala_binary_version` suffix.
> >> but the problem is our all `artifactId` value is in the pattern of
> >> `XXX_${scala.binary.version}`, such as:
> >> ```
> >> <artifactId>flink-tests_${scala.binary.version}</artifactId>
> >> ```
> >> then the find out always empty, so this check did not take effect.
> >>
> >> When I correct the script as follows:
> >>
> >> ```
> >> grep "${module}_\\${scala.binary.version}</artifactId>" "{}"
> >> ```
> >> we find there more than 10 modules have incorrect `artifactId` config.
> as
> >> follows:
> >>
> >> 1.flink-connector-hive
> >> 2.flink-fs-tests
> >> 3.flink-queryable-state-client-java
> >> 4.flink-sql-connector-elasticsearch6
> >> 5.flink-sql-connector-kafka
> >> 6.flink-sql-connector-kafka-0.10
> >> 7.flink-sql-connector-kafka-0.11
> >> 8.flink-sql-connector-kafka-0.9
> >> 9.flink-table-api-scala
> >> 10.flink-tests
> >> 11.flink-yarn-tests
> >>
> >> And to fix this issue, we need a big change, such as:
> >>      - correct the `artifactId` config.
> >>      - update the dependency of relation modules.
> >>      - release the connector into the repo.
> >>      - update some of the doc for `connector.zh.md` and `connector.md`.
> >>      - others
> >>
> >> >From the points of my view, It's better to fix this issue before the
> flink
> >> 1.9 release.
> >>
> >> What do you think?
> >>
> >> NOTE: Please remind me if I have a missing above!
> >>
> >> 1. The script code change:
> https://github.com/sunjincheng121/flink/commit/736f16a8a76aaef1018cc754f0effec119e43120
> >> 2. The CI test result:
> https://api.travis-ci.org/v3/job/535719615/log.txt
> >>
> >> Regards,
> >> Jincheng
> >>
> >>
> >>
> >>
>
>

Re: [DISCUSS] Correct the flink pom `artifactId` config before flink 1.9 release

Posted by Chesnay Schepler <ch...@apache.org>.
Please post your mail as a comment into the JIRA to consolidate the 
discussion there.

On 23/05/2019 12:55, jincheng sun wrote:
> Thank you for confirming this is an issue,and thanks a lot for your double
> check. Due to the script check logic is incorrect, most of the outputs are
> incorrect.
>
> You are right and I just pointed out the problem of the scala-free check
> and not mentioned the final solution which I think we should discuss it.
>
> I tried to understand you and have modified the script a bit and have got
> the same results as you mentioned(In my local).
> The main changes are as follows:
> 1) Adds test modules which should be checked:
> !flink-fs-tests,!flink-yarn-tests,!flink-tests
> 2) Marks the modules as infected which depend on scala trasitively or
> depend on modules suffixed with `_{scala_version}`
>
> I think we should discuss the rule of how to check whether a module is
> scala-free or not.
> If I understand you correctly, the rule in your mind may be as follows:
>
> 1) All the modules should check the dependencies(excluding the dependencies
> introduced by the test code). Regarding to modules flink-fs-tests,
>     flink-yarn-tests and flink-tests, the dependencies introduced by the
> test code should also be checked.
> 2) The checking rule is whether a module depends on scala trasitively or
> depends on modules suffixed with `_{scala_version}`.
> Following the above rules, we can get results you mentioned(Only 3 modules
> with incorrect artifact id).
>
> Open question:
>
> Currently, all the test code are also released into the repository, such as
> http://central.maven.org/maven2/org/apache/flink/flink-avro/1.8.0/flink-avro-1.8.0-tests.jar
> .
> Users can also depend on these jars. My question is why we need to check
> the test dependencies for modules flink-fs-tests, flink-yarn-tests and
> flink-tests,
> but not check the test dependencies for other modules?
>
> The solution:
> If we follow the rule you mentioned, the change is as follows:
> 1) Correct the check logic for the script
> 2) Correct the artifact id for modules:  flink-connector-hive,
> flink-queryable-state-client-java
> 3) Add the scala dependencies for the module flink-table-api-scala due we
> plan to add the scala code(I discussed with Timo and Aljoscha)
>
> If we should check other test code for other modules, maybe we need more
> changes. But it depends on the above open question.
>
> I have already opened the JIRA:
> https://issues.apache.org/jira/browse/FLINK-12602
>
> Feel free to discuss the solution in this mail thread or in the JIRA.
>
> Best,
> Jincheng
>
> Chesnay Schepler <ch...@apache.org> 于2019年5月22日周三 下午8:50写道:
>
>> You do have a point, but your output is mostly incorrect.
>>
>> There are only 3 modules that have a suffix, which don't need it:
>>
>>     -
>>
>>     flink-connector-hive
>>
>>     -
>>
>>     flink-queryable-state-client-java
>>
>>     -
>>
>>     flink-table-api-scala
>>
>>
>> The remaining do need it since they have dependencies with a scala-suffix
>> (mostly on runtime and streaming-java).
>>
>> Your change does make sense to me, but there's likely another issue in the
>> preceding logic that determines which module is scala-free.
>> flink-tests for example should not be considered scala-free, since it
>> relies on flink-runtime which contains scala and hence the scala-lang
>> dependencies, but it apparently is given that your change detects something.
>>
>> In any case, please open a JIRA:
>>
>> On 22/05/2019 13:31, jincheng sun wrote:
>>
>> Hi all,
>>
>> I find a shell issue in `verify_scala_suffixes.sh`(line 145) as follows:
>> ```
>> grep "${module}_\d\+\.\d\+</artifactId>" "{}"
>> ```
>> This code want to find out all modules that the module's `artifactId`  with
>> a `scala_binary_version` suffix.
>> but the problem is our all `artifactId` value is in the pattern of
>> `XXX_${scala.binary.version}`, such as:
>> ```
>> <artifactId>flink-tests_${scala.binary.version}</artifactId>
>> ```
>> then the find out always empty, so this check did not take effect.
>>
>> When I correct the script as follows:
>>
>> ```
>> grep "${module}_\\${scala.binary.version}</artifactId>" "{}"
>> ```
>> we find there more than 10 modules have incorrect `artifactId` config. as
>> follows:
>>
>> 1.flink-connector-hive
>> 2.flink-fs-tests
>> 3.flink-queryable-state-client-java
>> 4.flink-sql-connector-elasticsearch6
>> 5.flink-sql-connector-kafka
>> 6.flink-sql-connector-kafka-0.10
>> 7.flink-sql-connector-kafka-0.11
>> 8.flink-sql-connector-kafka-0.9
>> 9.flink-table-api-scala
>> 10.flink-tests
>> 11.flink-yarn-tests
>>
>> And to fix this issue, we need a big change, such as:
>>      - correct the `artifactId` config.
>>      - update the dependency of relation modules.
>>      - release the connector into the repo.
>>      - update some of the doc for `connector.zh.md` and `connector.md`.
>>      - others
>>
>> >From the points of my view, It's better to fix this issue before the flink
>> 1.9 release.
>>
>> What do you think?
>>
>> NOTE: Please remind me if I have a missing above!
>>
>> 1. The script code change:https://github.com/sunjincheng121/flink/commit/736f16a8a76aaef1018cc754f0effec119e43120
>> 2. The CI test result: https://api.travis-ci.org/v3/job/535719615/log.txt
>>
>> Regards,
>> Jincheng
>>
>>
>>
>>


Re: [DISCUSS] Correct the flink pom `artifactId` config before flink 1.9 release

Posted by jincheng sun <su...@gmail.com>.
Thank you for confirming this is an issue,and thanks a lot for your double
check. Due to the script check logic is incorrect, most of the outputs are
incorrect.

You are right and I just pointed out the problem of the scala-free check
and not mentioned the final solution which I think we should discuss it.

I tried to understand you and have modified the script a bit and have got
the same results as you mentioned(In my local).
The main changes are as follows:
1) Adds test modules which should be checked:
!flink-fs-tests,!flink-yarn-tests,!flink-tests
2) Marks the modules as infected which depend on scala trasitively or
depend on modules suffixed with `_{scala_version}`

I think we should discuss the rule of how to check whether a module is
scala-free or not.
If I understand you correctly, the rule in your mind may be as follows:

1) All the modules should check the dependencies(excluding the dependencies
introduced by the test code). Regarding to modules flink-fs-tests,
   flink-yarn-tests and flink-tests, the dependencies introduced by the
test code should also be checked.
2) The checking rule is whether a module depends on scala trasitively or
depends on modules suffixed with `_{scala_version}`.
Following the above rules, we can get results you mentioned(Only 3 modules
with incorrect artifact id).

Open question:

Currently, all the test code are also released into the repository, such as
http://central.maven.org/maven2/org/apache/flink/flink-avro/1.8.0/flink-avro-1.8.0-tests.jar
.
Users can also depend on these jars. My question is why we need to check
the test dependencies for modules flink-fs-tests, flink-yarn-tests and
flink-tests,
but not check the test dependencies for other modules?

The solution:
If we follow the rule you mentioned, the change is as follows:
1) Correct the check logic for the script
2) Correct the artifact id for modules:  flink-connector-hive,
flink-queryable-state-client-java
3) Add the scala dependencies for the module flink-table-api-scala due we
plan to add the scala code(I discussed with Timo and Aljoscha)

If we should check other test code for other modules, maybe we need more
changes. But it depends on the above open question.

I have already opened the JIRA:
https://issues.apache.org/jira/browse/FLINK-12602

Feel free to discuss the solution in this mail thread or in the JIRA.

Best,
Jincheng

Chesnay Schepler <ch...@apache.org> 于2019年5月22日周三 下午8:50写道:

> You do have a point, but your output is mostly incorrect.
>
> There are only 3 modules that have a suffix, which don't need it:
>
>    -
>
>    flink-connector-hive
>
>    -
>
>    flink-queryable-state-client-java
>
>    -
>
>    flink-table-api-scala
>
>
> The remaining do need it since they have dependencies with a scala-suffix
> (mostly on runtime and streaming-java).
>
> Your change does make sense to me, but there's likely another issue in the
> preceding logic that determines which module is scala-free.
> flink-tests for example should not be considered scala-free, since it
> relies on flink-runtime which contains scala and hence the scala-lang
> dependencies, but it apparently is given that your change detects something.
>
> In any case, please open a JIRA:
>
> On 22/05/2019 13:31, jincheng sun wrote:
>
> Hi all,
>
> I find a shell issue in `verify_scala_suffixes.sh`(line 145) as follows:
> ```
> grep "${module}_\d\+\.\d\+</artifactId>" "{}"
> ```
> This code want to find out all modules that the module's `artifactId`  with
> a `scala_binary_version` suffix.
> but the problem is our all `artifactId` value is in the pattern of
> `XXX_${scala.binary.version}`, such as:
> ```
> <artifactId>flink-tests_${scala.binary.version}</artifactId>
> ```
> then the find out always empty, so this check did not take effect.
>
> When I correct the script as follows:
>
> ```
> grep "${module}_\\${scala.binary.version}</artifactId>" "{}"
> ```
> we find there more than 10 modules have incorrect `artifactId` config. as
> follows:
>
> 1.flink-connector-hive
> 2.flink-fs-tests
> 3.flink-queryable-state-client-java
> 4.flink-sql-connector-elasticsearch6
> 5.flink-sql-connector-kafka
> 6.flink-sql-connector-kafka-0.10
> 7.flink-sql-connector-kafka-0.11
> 8.flink-sql-connector-kafka-0.9
> 9.flink-table-api-scala
> 10.flink-tests
> 11.flink-yarn-tests
>
> And to fix this issue, we need a big change, such as:
>     - correct the `artifactId` config.
>     - update the dependency of relation modules.
>     - release the connector into the repo.
>     - update some of the doc for `connector.zh.md` and `connector.md`.
>     - others
>
> >From the points of my view, It's better to fix this issue before the flink
> 1.9 release.
>
> What do you think?
>
> NOTE: Please remind me if I have a missing above!
>
> 1. The script code change:https://github.com/sunjincheng121/flink/commit/736f16a8a76aaef1018cc754f0effec119e43120
> 2. The CI test result: https://api.travis-ci.org/v3/job/535719615/log.txt
>
> Regards,
> Jincheng
>
>
>
>

Re: [DISCUSS] Correct the flink pom `artifactId` config before flink 1.9 release

Posted by Chesnay Schepler <ch...@apache.org>.
You do have a point, but your output is mostly incorrect.

There are only 3 modules that have a suffix, which don't need it:

  *

    flink-connector-hive

  *

    flink-queryable-state-client-java

  *

    flink-table-api-scala

The remaining do need it since they have dependencies with a 
scala-suffix (mostly on runtime and streaming-java).

Your change does make sense to me, but there's likely another issue in 
the preceding logic that determines which module is scala-free.
flink-tests for example should not be considered scala-free, since it 
relies on flink-runtime which contains scala and hence the scala-lang 
dependencies, but it apparently is given that your change detects something.

In any case, please open a JIRA:

On 22/05/2019 13:31, jincheng sun wrote:
> Hi all,
>
> I find a shell issue in `verify_scala_suffixes.sh`(line 145) as follows:
> ```
> grep "${module}_\d\+\.\d\+</artifactId>" "{}"
> ```
> This code want to find out all modules that the module's `artifactId`  with
> a `scala_binary_version` suffix.
> but the problem is our all `artifactId` value is in the pattern of
> `XXX_${scala.binary.version}`, such as:
> ```
> <artifactId>flink-tests_${scala.binary.version}</artifactId>
> ```
> then the find out always empty, so this check did not take effect.
>
> When I correct the script as follows:
>
> ```
> grep "${module}_\\${scala.binary.version}</artifactId>" "{}"
> ```
> we find there more than 10 modules have incorrect `artifactId` config. as
> follows:
>
> 1.flink-connector-hive
> 2.flink-fs-tests
> 3.flink-queryable-state-client-java
> 4.flink-sql-connector-elasticsearch6
> 5.flink-sql-connector-kafka
> 6.flink-sql-connector-kafka-0.10
> 7.flink-sql-connector-kafka-0.11
> 8.flink-sql-connector-kafka-0.9
> 9.flink-table-api-scala
> 10.flink-tests
> 11.flink-yarn-tests
>
> And to fix this issue, we need a big change, such as:
>      - correct the `artifactId` config.
>      - update the dependency of relation modules.
>      - release the connector into the repo.
>      - update some of the doc for `connector.zh.md` and `connector.md`.
>      - others
>
> >From the points of my view, It's better to fix this issue before the flink
> 1.9 release.
>
> What do you think?
>
> NOTE: Please remind me if I have a missing above!
>
> 1. The script code change:
> https://github.com/sunjincheng121/flink/commit/736f16a8a76aaef1018cc754f0effec119e43120
> 2. The CI test result: https://api.travis-ci.org/v3/job/535719615/log.txt
>
> Regards,
> Jincheng
>