Posted to mapreduce-issues@hadoop.apache.org by "Owen O'Malley (JIRA)" <ji...@apache.org> on 2011/06/16 18:43:47 UTC

[jira] [Created] (MAPREDUCE-2600) MR-279: simplify the jars

MR-279: simplify the jars 
--------------------------

                 Key: MAPREDUCE-2600
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2600
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
            Reporter: Owen O'Malley


Currently the MR-279 mapreduce project generates 59 jars from 59 source roots, which can be dramatically simplified.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Robert Joseph Evans (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109532#comment-13109532 ] 

Robert Joseph Evans commented on MAPREDUCE-2600:
------------------------------------------------

There has been no discussion on this for over a month.  Does that mean the issue is decided and that we are not going to reduce the number of jars?  If not, and we do want to change it, we should do it sooner rather than later, because it will be a big refactor and will disrupt development again.  I personally think that we have had enough movement in the code layout and would prefer not to rock the boat any more.  Maven and Ivy seem to be handling the transitive dependency resolution just fine already, so I don't see a big reason to make the change.

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083220#comment-13083220 ] 

Luke Lu commented on MAPREDUCE-2600:
------------------------------------

bq. I don't see how it makes development easier or faster to have lots of little directories.

A smaller module means a smaller code base to start with for a typical feature, and a much faster recompile if one doesn't use an IDE, i.e., just mvn clean install in the module directory. yarn only has 5 modules (including an integration test module), and the mapreduce runtime has 6 modules. Is this "lots of little directories" that's out of control?

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109645#comment-13109645 ] 

Luke Lu commented on MAPREDUCE-2600:
------------------------------------

I talked to Owen at his office a few weeks ago before my vacation. I recall that we agreed that the ideal modules/jars separation would be: yarn-client/server and mapreduce-client/server (mapreduce-server would contain the jobhistory server). But like he mentioned here, he's not pushing the change for 0.23, as the current layout works and he's not working on it :)

We also talked about the dependencies issue between shuffle and nodemanager: NodeManager loads a specific version of ShuffleHandler that depends on a specific version of mapreduce-client-core for a specific version of ShuffleHeader. Even though the current separation is made possible via a service plugin mechanism, the undesirable dependency still exists. The solution to that problem is to have a [generic shuffle service|MAPREDUCE-3060].
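
The service plugin idea mentioned above can be sketched roughly as follows. This is a minimal illustration and not the NodeManager code: the names AuxService, DemoShuffleService and PluginSketch are hypothetical, and the configuration is a plain map standing in for whatever the real configuration mechanism is. The host compiles only against the interface and instantiates the implementation by class name at runtime.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PluginSketch {
    /** Hypothetical plugin contract: the only type the host compiles against. */
    public interface AuxService {
        String name();
    }

    /** Hypothetical plugin; stands in for a shuffle handler shipped in another jar. */
    public static final class DemoShuffleService implements AuxService {
        @Override
        public String name() {
            return "demo-shuffle";
        }
    }

    /**
     * Instantiates each configured class by name, so the host needs the
     * implementation on its runtime classpath but not at compile time.
     */
    static Map<String, AuxService> load(Map<String, String> serviceClasses) {
        Map<String, AuxService> services = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : serviceClasses.entrySet()) {
            try {
                AuxService service = Class.forName(entry.getValue())
                        .asSubclass(AuxService.class)
                        .getDeclaredConstructor()
                        .newInstance();
                services.put(entry.getKey(), service);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot load " + entry.getValue(), e);
            }
        }
        return services;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("shuffle", DemoShuffleService.class.getName());
        System.out.println(load(config).get("shuffle").name());
    }
}
```

A mechanism of this shape removes the compile-time coupling only. The plugin and everything it depends on must still be present on the host's classpath at runtime in matching versions, which is the dependency described above.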

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083235#comment-13083235 ] 

Luke Lu commented on MAPREDUCE-2600:
------------------------------------

bq. 59 JARs for a project seems a bit too much; it seems that JARs are being used instead of Java packages to separate classes.

No, we don't have 59 jars or 59 source roots. We only have 11 source roots/modules; the rest are dependencies. In fact, we do separate modules mostly at package boundaries.

[jira] [Assigned] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Lu reassigned MAPREDUCE-2600:
----------------------------------

    Assignee: Luke Lu

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Mahadev konar (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122517#comment-13122517 ] 

Mahadev konar commented on MAPREDUCE-2600:
------------------------------------------

I'd suggest skipping this, too late :).
                
[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083026#comment-13083026 ] 

Luke Lu commented on MAPREDUCE-2600:
------------------------------------

bq. If the number of jars is the problem, can we just merge the jars at build time the way we want, using the Maven Shade plugin or some such?

I agree with Sharad, the current modules layout is fine. It makes working on individual features faster and easier. People who complain about number of source roots should improve their IDE fu and/or use a better IDE, IMO :)

According to recent conversations with people involved, I got the impression that it's just a packaging issue, i.e., having 3 combined jars plus dependencies in the distribution tarball: yarn-client, yarn-servers and hadoop-mapreduce. So just some maven-shade-plugin fu would suffice.

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Scott Carey (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173416#comment-13173416 ] 

Scott Carey commented on MAPREDUCE-2600:
----------------------------------------

I am a little late to the party here but:
{quote}
It is a big issue for downstream users. Projects that use Hadoop already pick up a lot of jars and increasing the set when all of the versions are the same is a problem. We'll also have users using different versions of the jars, which won't be useful.

Having a source structure that requires an IDE to use isn't making the code easy for people to browse, use and modify. It will also become a maintenance problem as the dependency graph between the components changes.

Yes, you can munge the results together into a single jar as part of the build, but I don't see how it makes development easier or faster to have lots of little directories. 
{quote}

I disagree.  It is a huge issue as a downstream user when the jar granularity is not fine enough.  You don't have to manually pick each jar, so the total number is not the issue.  If set up correctly a user will only pick the _one_ or maybe _two_ jars needed for their use case and maven/ivy/etc pulls in the transitive dependencies for you with the correct versions.  It is a MUCH bigger risk if as a user I don't have the ability to build the package I want that _excludes_ the stuff I don't need without a lot of trouble.  It is not the _number_ of jars that is the problem, it is the total _size_ of all of them and the likelihood of version mismatches with transitive dependencies.  The current issue is not that projects that use Hadoop 'pick up a lot of jars' it is that they 'pick up a lot of jars that are not needed at all'.

A few 'top level' jars that are useful for various use cases as single points of inclusion would be perfect.  This does not imply few jars total; it implies a few that you choose to declare for your use cases. They can pull in any number of other shared hadoop jars that are required for those use cases. It doesn't matter if they are 'the same version'; the user does not need to know, since maven handles that and maven best practices make many jars with the 'same version' a non-issue. 

A user pulls in a mapreduce client jar, and that might also pull in a couple 'common' jars from the same project.  That is the intended best practice of maven.  If the mapreduce client jar were to bundle common stuff in it, and that same common stuff were bundled in say, an hdfs-client jar, then you risk all sorts of trouble as a downstream user with multiple colliding classes on your classpath, the inability to have the tooling (maven) detect and deal with conflicts appropriately, etc.  If it were to bundle stuff that is not useful as a client, that would bloat client application jars and potentially pull in useless transitive dependencies.
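
The collision risk described above can be illustrated with a small check over jar contents. This is only a sketch under stated assumptions: the jar names and class entries are invented, and the contents are given as in-memory lists instead of being read from real jar files. It reports every class entry that appears in more than one jar, which is the situation that arises when two bundled jars each embed the same common classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ClassCollisionSketch {
    /**
     * Maps each class entry that occurs in two or more jars to the jars
     * containing it. Input: jar name -> class entries in that jar.
     */
    static Map<String, List<String>> findCollisions(Map<String, List<String>> jarContents) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : jarContents.entrySet()) {
            for (String classEntry : jar.getValue()) {
                owners.computeIfAbsent(classEntry, k -> new ArrayList<>()).add(jar.getKey());
            }
        }
        // Keep only the entries that are provided by more than one jar.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Hypothetical bundled ("fat") jars that both embed the same common class.
        Map<String, List<String>> jars = new LinkedHashMap<>();
        jars.put("mr-client-bundled.jar",
                Arrays.asList("org/example/common/Conf.class", "org/example/mr/Job.class"));
        jars.put("hdfs-client-bundled.jar",
                Arrays.asList("org/example/common/Conf.class", "org/example/hdfs/Fs.class"));
        System.out.println(findCollisions(jars));
    }
}
```

When the common classes live in their own jar and both clients declare a dependency on it, the same check finds nothing, and the build tool sees one artifact whose version it can mediate.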

If the jars are reduced into only a few big blobs, it will end up more like the absolutely atrocious maven dependency management in 0.20.205 and 0.22.x, where a user who just wants to build a mapreduce program pulls in 20MB of downstream jars that are not needed unless they manually exclude them.

Having more source trees is a slight development burden, but enforces the right encapsulation and organization of dependencies.  One of the benefits of organizing modules in maven is that the end result almost always leads to clearer code boundaries and better architectural separation of concerns.  It also helps define API boundaries and prevent creating leaky abstractions / APIs by accident.

{quote}
Thinking more on it, I am inclined to keep the modules separate as they are currently, instead of combining the source tree.
I am counting the number of modules to be 10-12. So the number of source trees should not be 59, or am I missing something?

The separate modules do help identify the boundaries more clearly and help in enforcing those. Separation just based on java packages is loose. I know this based on the unnecessary pain I went through when I was working on the project split 2 years ago. In the future, refactoring code or doing things like rewriting NM in C++ will be least intrusive with the current module structure.

If the number of jars is the problem, can we just merge the jars at build time the way we want, using the Maven Shade plugin or some such?
{quote}

I agree.  You can use the shade plugin to make a few 'fat' jars for some use cases that live _along side_ the normal artifacts that do not embed any dependencies.


Please, please don't put any jars in a maven repo that bundle dependencies unless they are attached artifacts and not the primary artifact.
Please, please declare the dependencies properly, using 'optional' or 'provided' scope as appropriate to prevent downstream users from pulling in artifacts transitively that a client user does not need.  
I believe that too few jars is worse than too many, when the two items above are done correctly (i.e., maven best practices are followed).  Then as a downstream user, I can easily select the features I want, and trust that the dependencies that are pulled in to my project transitively as a consequence of, say, pulling in a mapreduce client jar, are only the jars needed as a mapreduce client and not the entire freaking hadoop framework or any other extra unnecessary baggage.
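
The effect of declaring dependencies as 'optional' or 'provided' can be sketched with a toy resolver. This is a simplified model and not Maven's resolver: it ignores version mediation, exclusions, and the test and runtime scopes, and every artifact name in it is made up. It only encodes the rule that a dependency's own 'provided' or 'optional' dependencies are not passed on to the downstream project.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TransitiveScopeSketch {
    enum Scope { COMPILE, PROVIDED }

    /** One declared dependency edge; all artifact names used below are made up. */
    static final class Dep {
        final String artifact;
        final Scope scope;
        final boolean optional;

        Dep(String artifact, Scope scope, boolean optional) {
            this.artifact = artifact;
            this.scope = scope;
            this.optional = optional;
        }
    }

    /** Returns the artifacts a project ends up with, in breadth-first order. */
    static Set<String> resolve(String root, Map<String, List<Dep>> declared) {
        Set<String> resolved = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        // Direct dependencies of the project itself are always taken.
        for (Dep dep : declared.getOrDefault(root, Collections.<Dep>emptyList())) {
            if (resolved.add(dep.artifact)) {
                queue.add(dep.artifact);
            }
        }
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (Dep dep : declared.getOrDefault(current, Collections.<Dep>emptyList())) {
                // Provided and optional dependencies are not propagated downstream.
                if (dep.optional || dep.scope == Scope.PROVIDED) {
                    continue;
                }
                if (resolved.add(dep.artifact)) {
                    queue.add(dep.artifact);
                }
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, List<Dep>> declared = new LinkedHashMap<>();
        declared.put("my-app", Arrays.asList(new Dep("mr-client", Scope.COMPILE, false)));
        declared.put("mr-client", Arrays.asList(
                new Dep("common", Scope.COMPILE, false),
                new Dep("server-runtime", Scope.PROVIDED, false),
                new Dep("metrics-extras", Scope.COMPILE, true)));
        declared.put("common", Arrays.asList(new Dep("logging", Scope.COMPILE, false)));
        System.out.println(resolve("my-app", declared));
    }
}
```

In this model the application declares a single artifact and receives only what that client needs; the server-side and optional pieces stay out unless the application asks for them explicitly.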


                
[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (Commented) (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13118166#comment-13118166 ] 

Luke Lu commented on MAPREDUCE-2600:
------------------------------------

I'd like to get some consensus on this issue. What do people think? This is a fairly large source code reorg that's potentially disruptive. We'd better do it sooner rather than later or not do it at all.
                
[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083244#comment-13083244 ] 

Luke Lu commented on MAPREDUCE-2600:
------------------------------------

OK 12 modules as of now:
# yarn-api
# yarn-common
# yarn-server-common
# yarn-server-nodemanager
# yarn-server-resourcemanager
# yarn-server-tests (an integration test module)
# hadoop-mapreduce-client-core
# hadoop-mapreduce-client-common
# hadoop-mapreduce-client-shuffle (shuffle plugin for node manager)
# hadoop-mapreduce-client-app (MR app master)
# hadoop-mapreduce-client-hs  (MR job history server)
# hadoop-mapreduce-client-jobclient


[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083225#comment-13083225 ] 

Alejandro Abdelnur commented on MAPREDUCE-2600:
-----------------------------------------------

59 JARs for a project seems a bit too much; it seems that JARs are being used instead of Java packages to separate classes.

IMO, a more logical set of JARs is along the lines Owen described when opening the JIRA: api, client, server, utils.

Even if IDEs can handle several source roots, 59 becomes cumbersome. Plus, from the Maven side, that means the reactor will do much more work to resolve module dependencies, thus slowing down the build.

Finally, I advise against merging JARs into one; this significantly complicates troubleshooting.
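
For reference, the module ordering the reactor has to compute amounts to a topological sort of the inter-module dependency graph. The sketch below uses Kahn's algorithm and is an illustration only, not Maven's implementation: the dependency edges between the yarn modules are assumptions made for the example and are not read from the project's POMs, and the sketch does not model any per-module build overhead.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReactorOrderSketch {
    /**
     * Orders modules so that each one comes after the modules it depends on.
     * Input: module -> modules it depends on. Dependencies that are not keys
     * of the map are treated as external artifacts and ignored.
     */
    static List<String> buildOrder(Map<String, List<String>> dependsOn) {
        Map<String, Integer> unbuiltDeps = new LinkedHashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            int count = 0;
            for (String dep : entry.getValue()) {
                if (dependsOn.containsKey(dep)) {
                    count++;
                    dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(entry.getKey());
                }
            }
            unbuiltDeps.put(entry.getKey(), count);
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : unbuiltDeps.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String module = ready.remove();
            order.add(module);
            for (String dependent : dependents.getOrDefault(module, Collections.<String>emptyList())) {
                if (unbuiltDeps.merge(dependent, -1, Integer::sum) == 0) {
                    ready.add(dependent);
                }
            }
        }
        if (order.size() != unbuiltDeps.size()) {
            throw new IllegalStateException("Dependency cycle between modules");
        }
        return order;
    }

    public static void main(String[] args) {
        // Assumed edges, for illustration only.
        Map<String, List<String>> modules = new LinkedHashMap<>();
        modules.put("yarn-server-nodemanager", Arrays.asList("yarn-server-common"));
        modules.put("yarn-server-common", Arrays.asList("yarn-common"));
        modules.put("yarn-common", Arrays.asList("yarn-api"));
        modules.put("yarn-api", Collections.<String>emptyList());
        System.out.println(buildOrder(modules));
    }
}
```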


[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082916#comment-13082916 ] 

Sharad Agarwal commented on MAPREDUCE-2600:
-------------------------------------------

I would prefer to break mr-client into two instead of one:
 - mr-client -> jobclient and other user libraries
 - mr-runtime -> MR ApplicationMaster, MapTask, ReduceTask etc.

This will ensure clear separation between user facing libraries and the runtime.

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Sharad Agarwal (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082931#comment-13082931 ] 

Sharad Agarwal commented on MAPREDUCE-2600:
-------------------------------------------

Thinking more on it, I am inclined to keep the modules separate as they are currently, instead of combining the source tree.
I am counting the number of modules to be 10-12. So the number of source trees should not be 59, or am I missing something?

The separate modules do help identify the boundaries more clearly and help in *enforcing* those. Separation just based on java packages is loose. I know this based on the *unnecessary* pain I went through when I was working on the project split 2 years ago. In the future, refactoring code or doing things like rewriting NM in C++ will be least intrusive with the current module structure.

If the number of jars is the problem, can we just merge the jars at build time the way we want, using the Maven Shade plugin or some such?

[jira] [Updated] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Amol Kekre (Updated) (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amol Kekre updated MAPREDUCE-2600:
----------------------------------

          Component/s: mrv2
    Affects Version/s: 0.23.0
    
[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083292#comment-13083292 ] 

Alejandro Abdelnur commented on MAPREDUCE-2600:
-----------------------------------------------

That seems better :).

I'm not familiar with MR2 code distribution, but where do we find the MapReduce APIs? That should be a separate JAR, just the MR interface, no?

Also, when using the client API, do I have to define the dependency for one artifact and I'm done (all the others come as transitive dependencies and are implementation-specific, not exposed to the user)?


[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083521#comment-13083521 ] 

Luke Lu commented on MAPREDUCE-2600:
------------------------------------

bq. when using the client API, do I have to define the dependency for one artifact and I'm done (all the others come as transitive dependencies and are implementation-specific, not exposed to the user)?

If you just need to use the client API, dependency on org.apache.hadoop:hadoop-mapreduce-client-jobclient would suffice. If it's not so, we need to fix it :)



[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13050559#comment-13050559 ] 

Owen O'Malley commented on MAPREDUCE-2600:
------------------------------------------

I'd propose that we have:

mr-client/* -> src/java, src/test
yarn/yarn-api,yarn-common -> yarn/client
yarn/yarn-server/* -> yarn/server

so that we end up with yarn-client, yarn-server, and mapreduce jars. Of course the Java package structure will still separate the different servers from each other.

[jira] [Commented] (MAPREDUCE-2600) MR-279: simplify the jars

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083202#comment-13083202 ] 

Owen O'Malley commented on MAPREDUCE-2600:
------------------------------------------

It is a big issue for downstream users. Projects that use Hadoop already pick up a lot of jars and increasing the set when all of the versions are the same is a problem. We'll also have users using different versions of the jars, which won't be useful.

Having a source structure that requires an IDE to use isn't making the code easy for people to browse, use and modify. It will also become a maintenance problem as the dependency graph between the components changes.

Yes, you can munge the results together into a single jar as part of the build, but I don't see how it makes development easier or faster to have lots of little directories. 

That said, I don't have cycles to do the work right now. If no one else does either, we can postpone the debate.
