Posted to commits@oozie.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2011/09/08 06:49:09 UTC

[jira] [Created] (OOZIE-197) GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie

GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie
--------------------------------------------------------------

                 Key: OOZIE-197
                 URL: https://issues.apache.org/jira/browse/OOZIE-197
             Project: Oozie
          Issue Type: Bug
            Reporter: Hadoop QA


Currently we are assuming that JARs for Hadoop/Pig coming from Apache/Yahoo/Cloudera have different groupIds (org.apache.*, com.yahoo.*, com.cloudera.*).

Instead of using different groupIds, the different JAR providers (Apache, Yahoo, Cloudera, etc.) should use the same groupId and use the version to identify the JAR provider.

For example, under the proposed model the groupId for Hadoop JARs would be org.apache.hadoop, for Pig org.apache.pig, for Oozie com.yahoo.oozie.

The version would then indicate the origin when it differs from the original provider. For example, the Apache Hadoop version would be 0.22.0, while the corresponding Yahoo version would be y0.22.0.

The main reason for this standardization is to allow developers using these JARs to effectively manage exclusions. For example, today, somebody using a Pig JAR wanting to exclude the dependent Hadoop JARs must do:

dependency: ${pigGroupId}:pig:0.7.0

 exclude: org.apache.hadoop:hadoop-core 

 exclude: com.yahoo.hadoop:hadoop-core

NOTE: Oozie does this: the Pig groupId is parameterized, and hadoop-core must be excluded under every possible groupId. Furthermore, Cloudera must add a third exclusion to its POMs for com.cloudera.hadoop:hadoop-core.
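
As a rough sketch, the exclusions described above would look something like this in a Maven POM. The coordinates are the ones quoted in this issue, and pigGroupId is the build property mentioned above. The exact layout of any real POM is an assumption.

```xml
<!-- Sketch only: one dependency, one exclusion per possible vendor groupId. -->
<dependency>
  <groupId>${pigGroupId}</groupId>
  <artifactId>pig</artifactId>
  <version>0.7.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
    </exclusion>
    <exclusion>
      <groupId>com.yahoo.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
    </exclusion>
    <!-- The third exclusion that Cloudera POMs would need. -->
    <exclusion>
      <groupId>com.cloudera.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Every additional vendor groupId adds another exclusion entry for what is effectively the same artifact.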

This affects not only Oozie but anybody developing applications for Hadoop/Pig using Maven or Ivy.

Cloudera is in the process of normalizing all its groupIds to use the original ones.

Apache is not affected by this as they have the original groupIds for Hadoop/Pig.

Yahoo should change the groupIds for the Hadoop/Pig JARs they publish.

For Oozie we should keep com.yahoo.oozie.
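
For comparison, a sketch of the same Pig dependency under the proposed model, assuming every provider publishes under the original groupIds. A single exclusion would then cover hadoop-core from any provider. The vendor-prefixed Pig version in the comment is hypothetical, by analogy with y0.22.0.

```xml
<!-- Sketch only: assumes all providers use org.apache.pig / org.apache.hadoop. -->
<dependency>
  <groupId>org.apache.pig</groupId>
  <artifactId>pig</artifactId>
  <!-- A Yahoo build would be selected by version instead, e.g. a hypothetical y0.7.0. -->
  <version>0.7.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```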

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (OOZIE-197) GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OOZIE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hadoop QA resolved OOZIE-197.
-----------------------------

    Resolution: Fixed


[jira] [Commented] (OOZIE-197) GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101794#comment-13101794 ] 

Hadoop QA commented on OOZIE-197:
---------------------------------

tucu00 remarked:
Well, it depends. The hadoop-core JARs produced by Apache, Yahoo, Cloudera, etc. are, after all, different versions of the same component.

And that is how Maven groupId:artifactId:version management is designed to work.

If central repositories don't accept a particular version, as a developer, you can always add the corresponding repository to your repository list.
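
As an illustration only, adding such a repository to a POM could look like the fragment below. The id and URL are hypothetical placeholders, not a real vendor repository.

```xml
<!-- Sketch only: id and url are made-up placeholders. -->
<repositories>
  <repository>
    <id>vendor-repo</id>
    <url>https://repo.example.com/maven2</url>
  </repository>
</repositories>
```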

Currently for Oozie (setting the Cloudera distribution aside) this is a pain, because it has to deal with both Apache and Yahoo components for Hadoop and Pig. This significantly complicates the build process.

This complexity affects not only Oozie developers but Oozie users as well, since they have to declare multiple exclusions for the 'same' component just because it may come from Apache or Yahoo.

Bottom line, Maven is designed to handle this efficiently via versions.

That is the reason for this request.


[jira] [Commented] (OOZIE-197) GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099879#comment-13099879 ] 

Hadoop QA commented on OOZIE-197:
---------------------------------

omalley remarked:
It seems dangerous to have multiple orgs all pushing into the same namespace. Furthermore, I don't think most of the forges that sync to the central repository will let you submit for other groups.


[jira] [Commented] (OOZIE-197) GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101795#comment-13101795 ] 

Hadoop QA commented on OOZIE-197:
---------------------------------

tucu00 remarked:
Any further thoughts on this one?


[jira] [Reopened] (OOZIE-197) GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie

Posted by "Roman Shaposhnik (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OOZIE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Roman Shaposhnik reopened OOZIE-197:
------------------------------------



[jira] [Commented] (OOZIE-197) GH-226: Standardize on groupId/artifactId for Hadoop/Pig/Oozie

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/OOZIE-197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101796#comment-13101796 ] 

Hadoop QA commented on OOZIE-197:
---------------------------------

cdouglas remarked:
I didn't find much guidance in the Maven documentation on using versions to distinguish vendors. However, these are the docs on version syntax:
http://docs.codehaus.org/display/MAVEN/Dependency+Mediation+and+Conflict+Resolution

And a proposed (but old) extension to handle vendors, where the repo is cited as sufficient to distinguish sources:
http://docs.codehaus.org/display/MAVEN/Extending+Maven+2.0+Dependencies

It looks like it could be a good solution. Since the Apache project uses <major>.<minor>.<patch> version strings, other groups could use the optional <qualifier> to distinguish themselves, though neither y0.22.0 nor 0.22.0-y would work well with the range syntax, and I'm not sure whether the default version comparison handles non-numeric characters in the major version.

I'm unfamiliar with the forge case. Would it prevent jars from Yahoo or Cloudera from being aggregated (as the namespace wouldn't match the source), or are there other issues?
