You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Luke Lu (JIRA)" <ji...@apache.org> on 2011/05/10 00:55:03 UTC

[jira] [Created] (MAPREDUCE-2480) MR-279: mr app should not depend on hard-coded version of shuffle

MR-279: mr app should not depend on hard-coded version of shuffle
-----------------------------------------------------------------

                 Key: MAPREDUCE-2480
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2480
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
            Reporter: Luke Lu
            Assignee: Luke Lu
             Fix For: 0.23.0


The following commit introduced a dependency of shuffle with hard-coded version for mr app:
{noformat}
commit 6f69742140516be7493c9a9177b81d0516cc9539
Author: Vinod Kumar Vavilapalli <vi...@apache.org>
Date:   Wed May 4 06:53:52 2011 +0000

    Adding user log handling for YARN. Making NM put the user-logs on DFS and providing log-dump tools. Contributed by Vinod Kumar Vavilapalli.
{noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2480) MR-279: mr app should not depend on hard-coded version of shuffle

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031033#comment-13031033 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-2480:
----------------------------------------------------

bq. However a bigger issue is whether app (a user land module) should depend on a specific implementation shuffle (loaded into node manager). I think it should not. The MAPREDUCE_SERVICE_ID should probably resides in a MRServices class in mr common. I have the impression (based on previous conversations) that Chris was gonna refactor the NM service stuff later. It's probably OK as is now.

No, that isn't quite correct. An app really SHOULD depend on a specific implementation of shuffle. An app is tied to a specific version of shuffle be it the request format or the authentication data. Shuffle Service isn't really 'known' to the NM, it is an abstract service started by NM.

> MR-279: mr app should not depend on hard-coded version of shuffle
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Luke Lu
>            Assignee: Luke Lu
>             Fix For: 0.23.0
>
>         Attachments: mr-2480-shuffle-version-v1.patch
>
>
> The following commit introduced a dependency of shuffle with hard-coded version for mr app:
> {noformat}
> commit 6f69742140516be7493c9a9177b81d0516cc9539
> Author: Vinod Kumar Vavilapalli <vi...@apache.org>
> Date:   Wed May 4 06:53:52 2011 +0000
>     Adding user log handling for YARN. Making NM put the user-logs on DFS and providing log-dump tools. Contributed by Vinod Kumar Vavilapalli.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2480) MR-279: mr app should not depend on hard-coded version of shuffle

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031281#comment-13031281 ] 

Luke Lu commented on MAPREDUCE-2480:
------------------------------------

bq. To further clarify, the dependency, ideally, should not be on the implementation but rather on some kind of interface exposed by the shuffle service.

That's exactly my point. It should not depend on a specific implementation class but rather an SPI that both the app and the shuffle implementation depend on. In this case, the SPI is the service id at the moment. If you rename the ShuffleHandler to ShuffleHandlerImpl the code would look wrong too.

Now you have a runtime classpath dependency on a specific shuffle implementation for mr app. If you replace the jar with a different class and update the config, the NM will happily load the service but the mr app won't even run due to class not found exception. 

> MR-279: mr app should not depend on hard-coded version of shuffle
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Luke Lu
>            Assignee: Luke Lu
>             Fix For: 0.23.0
>
>         Attachments: mr-2480-shuffle-version-v1.patch
>
>
> The following commit introduced a dependency of shuffle with hard-coded version for mr app:
> {noformat}
> commit 6f69742140516be7493c9a9177b81d0516cc9539
> Author: Vinod Kumar Vavilapalli <vi...@apache.org>
> Date:   Wed May 4 06:53:52 2011 +0000
>     Adding user log handling for YARN. Making NM put the user-logs on DFS and providing log-dump tools. Contributed by Vinod Kumar Vavilapalli.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2480) MR-279: mr app should not depend on hard-coded version of shuffle

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13031035#comment-13031035 ] 

Vinod Kumar Vavilapalli commented on MAPREDUCE-2480:
----------------------------------------------------

To further clarify, the dependency, ideally, should not be on the implementation but rather on some kind of interface exposed by the shuffle service. Today there isn't any interface separate from the implementation and hence the dependency on the implementation itself.

> MR-279: mr app should not depend on hard-coded version of shuffle
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Luke Lu
>            Assignee: Luke Lu
>             Fix For: 0.23.0
>
>         Attachments: mr-2480-shuffle-version-v1.patch
>
>
> The following commit introduced a dependency of shuffle with hard-coded version for mr app:
> {noformat}
> commit 6f69742140516be7493c9a9177b81d0516cc9539
> Author: Vinod Kumar Vavilapalli <vi...@apache.org>
> Date:   Wed May 4 06:53:52 2011 +0000
>     Adding user log handling for YARN. Making NM put the user-logs on DFS and providing log-dump tools. Contributed by Vinod Kumar Vavilapalli.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (MAPREDUCE-2480) MR-279: mr app should not depend on hard-coded version of shuffle

Posted by "Mahadev konar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar resolved MAPREDUCE-2480.
--------------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

I just committed this to MR-279 branch. thanks luke!

> MR-279: mr app should not depend on hard-coded version of shuffle
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Luke Lu
>            Assignee: Luke Lu
>             Fix For: 0.23.0
>
>         Attachments: mr-2480-shuffle-version-v1.patch
>
>
> The following commit introduced a dependency of shuffle with hard-coded version for mr app:
> {noformat}
> commit 6f69742140516be7493c9a9177b81d0516cc9539
> Author: Vinod Kumar Vavilapalli <vi...@apache.org>
> Date:   Wed May 4 06:53:52 2011 +0000
>     Adding user log handling for YARN. Making NM put the user-logs on DFS and providing log-dump tools. Contributed by Vinod Kumar Vavilapalli.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (MAPREDUCE-2480) MR-279: mr app should not depend on hard-coded version of shuffle

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Lu updated MAPREDUCE-2480:
-------------------------------

    Attachment: mr-2480-shuffle-version-v1.patch

The v1 patch fix the hard-code version.

However a bigger issue is whether app (a user land module) should depend on a specific implementation shuffle (loaded into node manager). I think it should not.  The MAPREDUCE_SERVICE_ID should probably resides in a MRServices class in mr common. I have the impression (based on previous conversations) that Chris was gonna refactor the NM service stuff later. It's probably OK as is now.

> MR-279: mr app should not depend on hard-coded version of shuffle
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Luke Lu
>            Assignee: Luke Lu
>             Fix For: 0.23.0
>
>         Attachments: mr-2480-shuffle-version-v1.patch
>
>
> The following commit introduced a dependency of shuffle with hard-coded version for mr app:
> {noformat}
> commit 6f69742140516be7493c9a9177b81d0516cc9539
> Author: Vinod Kumar Vavilapalli <vi...@apache.org>
> Date:   Wed May 4 06:53:52 2011 +0000
>     Adding user log handling for YARN. Making NM put the user-logs on DFS and providing log-dump tools. Contributed by Vinod Kumar Vavilapalli.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-2480) MR-279: mr app should not depend on hard-coded version of shuffle

Posted by "Luke Lu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030966#comment-13030966 ] 

Luke Lu commented on MAPREDUCE-2480:
------------------------------------

Being able to specify hadoop-mapreduce.version and yarn.version is required by CI builds. To reproduce the problem:
{noformat}
$ mvn -Dhadoop-mapreduce.version=1.0-test -Dyarn.version=1.0-test clean install -P-cbuild -DskipTests
{noformat}

{noformat}
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.

Missing:
----------
1) org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:1.0-SNAPSHOT

  Try downloading the file manually from the project website.

  Then, install it using the command: 
      mvn install:install-file -DgroupId=org.apache.hadoop -DartifactId=hadoop-mapreduce-client-shuffle -Dversion=1.0-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file

  Alternatively, if you host your own repository you can deploy the file there: 
      mvn deploy:deploy-file -DgroupId=org.apache.hadoop -DartifactId=hadoop-mapreduce-client-shuffle -Dversion=1.0-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

  Path to dependency: 
        1) org.apache.hadoop:hadoop-mapreduce-client-app:jar:1.0-test
        2) org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:1.0-SNAPSHOT

----------
1 required artifact is missing.

for artifact: 
  org.apache.hadoop:hadoop-mapreduce-client-app:jar:1.0-test
{noformat}

> MR-279: mr app should not depend on hard-coded version of shuffle
> -----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2480
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2480
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>            Reporter: Luke Lu
>            Assignee: Luke Lu
>             Fix For: 0.23.0
>
>
> The following commit introduced a dependency of shuffle with hard-coded version for mr app:
> {noformat}
> commit 6f69742140516be7493c9a9177b81d0516cc9539
> Author: Vinod Kumar Vavilapalli <vi...@apache.org>
> Date:   Wed May 4 06:53:52 2011 +0000
>     Adding user log handling for YARN. Making NM put the user-logs on DFS and providing log-dump tools. Contributed by Vinod Kumar Vavilapalli.
> {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira