You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Alejandro Abdelnur (JIRA)" <ji...@apache.org> on 2012/07/10 02:16:34 UTC

[jira] [Created] (MAPREDUCE-4417) add support for encrypted shuffle

Alejandro Abdelnur created MAPREDUCE-4417:
---------------------------------------------

             Summary: add support for encrypted shuffle
                 Key: MAPREDUCE-4417
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
             Project: Hadoop Map/Reduce
          Issue Type: New Feature
          Components: mrv2
    Affects Versions: 2.0.0-alpha
            Reporter: Alejandro Abdelnur
            Assignee: Alejandro Abdelnur
             Fix For: 2.0.1-alpha


Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 

When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422764#comment-13422764 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

findbugs warning is in *org.apache.hadoop.fs.FileUtil.symLink* which is not related to this JIRA work.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420912#comment-13420912 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

@eric14, you are correct, on its own is an incomplete feature and it will require more work. I've posted the patch for branch-1 in case somebody is interested. 
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

getting (or trying) to get rid of findbugs. looking into the other errors that I don't see them failing locally
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch
    
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420876#comment-13420876 ] 

eric baldeschwieler commented on MAPREDUCE-4417:
------------------------------------------------

PS I thought we had a process for adding things to 1, which was to propose them during next release planning.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419583#comment-13419583 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537412/MAPREDUCE-4417.patch
  against trunk revision .

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2640//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

@todd, thanks for the detailed review.

I've integrated most of your comments.

* The javadoc style for 'Returns BLAH' and then '@return  BLAH' is Sun javadoc sytle.

* keystore type is case insensitive, 'jks' is the same as 'JKS'. Still I've lowercased that javadoc.

* the ReloadingX509TrustManager will work with an empty keystore if the keystore file is not avail at initialization time, and if the keystore file becomes available later one, it will be loaded. WARNs are logged while the file is not present, so it won't go unnoticed.

* added a init()/destroy() methods where appropriate to be able to shutdown the reload thread gracefully.

* If reload() fails to reload the new keystore, it assumes there are not certs and runs empty until the next reload attempt. Seems a safer assumption that continuing running with obsolete keys.

* While hadoop.ssl.enabled only applies to shuffle, the intention is to use it for the rest of the HTTP endpoints. Thus, a single know would enable SSL. That is why the name of the property and its location (in core-default.xml)

* Regarding having it per job, This would require having shuffler serving both HTTP and HTTPS and denying the endpoint the job is not configured to use. This would require the shuffler to have access to that piece of job configuration. I'd say it is out of scope of this patch, and it could be a future improvement.

* In the TestSSLFactory, the Assert.fail() statements, are sections the test should not make it; they are used for negative tests.

* Client certs are disabled by default. If they are per job, yes they could be shipped via DC. This would require a alternate implementation of the KeyStoresFactory, thus the mechanism is already in place.

                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

patch addressing test-patch complains.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416532#comment-13416532 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536861/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2108 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2614//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2614//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2614//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420832#comment-13420832 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537583/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 4 new or modified test files.

    -1 javac.  The applied patch generated 2049 javac compiler warnings (more than the trunk's current 2007 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2649//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2649//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2649//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2649//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417-branch-1.patch

backport for branch-1.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417307#comment-13417307 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537017/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2108 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2622//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2622//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2622//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416644#comment-13416644 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536884/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2108 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2615//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2615//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2615//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

adding to docs a note about client certs. And a missed hunk to ShuffleHandler.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419678#comment-13419678 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537424/MAPREDUCE-4417.patch
  against trunk revision .

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2642//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422711#comment-13422711 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537902/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 4 new or modified test files.

    -1 javac.  The applied patch generated 2049 javac compiler warnings (more than the trunk's current 2007 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2661//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2661//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2661//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2661//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417908#comment-13417908 ] 

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

- the reformatting of ssl-client.xml.example and ssl-server.xml.example makes it a little hard to read the diff. Is it necessary to reindent, etc?

- Style:
{code}
+ * if the trust certificates keystore file changes, the trustmanager
+ * is refreshed with the new trust certificate entries (using a
+ * {@link ReloadingX509TrustManager} trustmanager).
{code}

Formatting can be improved here - eg TrustManager is a java class, so should probably {@link} it to make it clear you're talking about a specific class and not an abstract concept. (as someone who doesn't know the SSL APIs well, it would make it easier to read)

----

{code}
+    SSLFactory.Mode mode =
+      SSLFactory.Mode.valueOf(conf.get(SSLFactory.SSL_FACTORY_MODE));
{code}
Why are we passing the factory mode through the configuration, instead of just making it a parameter for init()? It seems a little fragile/unnecessary, and a bit confusing since it's not a parameter that the user sets.

----

{code}
+    String keystoreType =
+      conf.get(resolvePropertyName(mode, SSL_KEYSTORE_TYPE_TPL), "jks");
{code}

What's jks? You also use the term "jks" in the conf files, but I don't know what it refers to (again, as an SSL n00b). Improvements:
-- in the config file where you say "default value is 'jks'", add "which enables the blah blah type key store" and some reference to what it means?
-- in the code, add a constant SSL_KEYSTORE_TYPE_DEFAULT, and javadoc with a pointer to what jks is.


----
{code}
+      String keystoreLocation = conf.get(
+        resolvePropertyName(mode, SSL_KEYSTORE_LOCATION_TPL), "");
+      keystorePassword = conf.get(
+        resolvePropertyName(mode, SSL_KEYSTORE_PASSWORD_TPL), "").toCharArray();
+
+      LOG.debug(mode.toString() + " KeyStore: " + keystoreLocation);
+
+      InputStream is = new FileInputStream(keystoreLocation);
{code}

If this property isn't set, you'll end up passing an empty string to the FileInputStream constructor, which will end up giving a hard-to-diagnose message. Check whether {{keystoreLocation.isEmpty()}}, and if it is, throw an appropriate exception including the config name.

Same goes for {{trustStoreLocation}}

----

Style nit: you have several javadocs for getters which are redundant, eg:
{code}
+  /**
+   * Returns the trustmanagers for trusted certificates.
+   *
+   * @return the trustmanagers for trusted certificates.
+   */
{code}
No need to repeat yourself twice - just have the @return line in the javadoc and not the line above it, IMO.

----
{code}
+   * @param type type of truststore file, typically 'JKS'.
{code}
Elsewhere in the code you have "jks" (lower case). Is it case sensitive?

----
{code}
+    } catch (Exception ex) {
+      trustManagerRef.set(null);
+      LOG.warn("Could not load truststore, using empty one : " + ex.toString(),
+               ex);
+    }
{code}
Why should you use an empty one? If the user configures a path to a trust store, and then starts up but the store can't be found, I don't think we should ignore their config. Better to bail out on startup. Then all of the null checks later on in this file could be removed.

----
{code}
+      FileInputStream in = new FileInputStream(file);
+      try {
+        ks.load(in, password.toCharArray());
+        lastLoaded = file.lastModified();
{code}
I think you need to set {{lastLoaded}} _before_ opening the file. Otherwise there's a race where you can miss a change to the file.

----
{code}
+    } catch (Exception ex) {
+      throw new RuntimeException(ex);
+    }
{code}
Maybe use {{Throwables.propagateIfPossible}} here to propagate IOException and GeneralSecurityException first? Seems strange to throw RTE for an IOE when you declare that the method throws IOE.

----
{code}
+  @SuppressWarnings({"InfiniteLoopStatement"})
+  public void run() {
{code}
There's really no way we can get a cleanup hook here to stop the thread at shutdown?

----
{code}
+        } catch (Exception ex) {
+          trustManagerRef.set(null);
+          LOG.warn("Could not load truststore, using empty one : " +
+                   ex.toString(),  ex);
+        }
{code}

If it fails to reload, why not stick to the previous version of the reference instead of falling back to empty?

----
{code}
+ * This SSLFactory uses a {@link ReloadingX509TrustManager} intance,
+ * which reloads public keys if the truststore file changes.
+ * <p/>
+ * This factory is used to configure HTTPS in Hadoop HTTP based endpoints, both
+ * client & server.
{code}
Typo: 'intance'
Style: don't abbreviate the word "and" as '&' -- it's invalid javadoc and also just harder to read.

----
{code}
+  public enum Mode { CLIENT, SERVER }
{code}

This can be {{static}} right? Also, since it's an inner class, you need an {{@InterfaceAudience.Private}} on it, too, or else it shows up in the public javadoc. (unfortunately the annotation doesn't get inherited from its outer class)

----
{code}
+  public static final String SSL_ENABLED =
+    "hadoop.ssl.enabled";
...
+  public static final String KEYSTORES_FACTORY_CLASS =
+    "hadoop.ssl.keystores.factory.class";
{code}
Style: rename all these constants to end in {{_KEY}} so it's clear it's the conf keys and not the values themselves.

----
{code}
+    Configuration sslConf = new Configuration(false);
+    sslConf.setBoolean(SSL_REQUIRE_CLIENT_CERT, requireClientCert);
+    String sslConfResource;
+    if (mode == Mode.CLIENT) {
+        sslConfResource = conf.get(SSL_CLIENT_CONF, "ssl-client.xml");
+    } else {
+      sslConfResource = conf.get(SSL_SERVER_CONF, "ssl-server.xml");
+    }
+    sslConf.addResource(sslConfResource);
{code}
Move this into a private method {{readSslConfiguration(mode)}}? Also, indentation is off in one line here.

----
- Extract the creation of SSLHostnameVerifier into a new method, as well.

----
{code}
+<property>
+  <name>hadoop.ssl.enabled</name>
+  <value>false</value>
+  <description>Whether encrypted shuffle is enabled</description>
+</property>
{code}
If this is specific to encrypted shuffle, the name should reflect that, and it should be in mapred-default.xml, not core-default.xml

I wonder: is there a use case for having this setting per-job in some clusters? Either way, it should definitely be an MR config and not a core config.

----
{code}
+    The keystores factory to use for retriving certificates.
{code}
Typo: retriving

----
{code}
+
+public class KeyStoreUtil {
{code}
Rename to KeyStoreTestUtil, since this is a test-only class.

{code}
+    FileOutputStream out = new FileOutputStream(filename);
+    ks.store(out, password.toCharArray());
+    out.close();
{code}
Need try..finally. A few other places later in this same file that need this fix.

----

{code}
+    // Wait so that the file modification time is different
+    Thread.sleep((tm.getReloadInterval() + 2) * 1000);
{code}
You have this in a bunch of places in the test - but if you set the last modified time of the file, as you do elsewhere in the test, then you shouldn't have to sleep, except for waiting for it to _notice_ the reload. If you change the reload interval to be specified in millis instead of seconds, then you could set it to 10ms or so for the tests and these tests would run a lot faster.

----
- In TestSSLFactory, you use Assert.fail() in a lot of places after catching an Exception. Instead, just let the exception fall through which will fail the test, with the advantage that we'll actually have the stack trace of the exception instead of an unexplained failure message. In the cases where you expect an exception, use {{GenericTestUtils.assertExceptionContains}} to check the text.

----
{code}
+        writeFuture = ch.write(new ChunkedFile(spill, info.startOffset,
+                                               info.partLength, 8192));
{code}
What's 8192 here? Need a constant or config. If it's a buffer size, I'd think 64K or 128K would probably perform better, based on my general experience with java IO.

----

- In the docs, under the ssh-client configuration, it references ssl-server.xml in one spot.
- Typo: "trutsstore" in one place.
- Typo: "will incurs in a significant"

----

General comment: what's the point of client certificates here? They're not a secret, since all users share them. I would think they'd need to be shipped with the job in the distributed-cache, if the use case is for cross-cluster authentication in tools like distcp, since different users may want to distcp from different clusters, and also have different access controls.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415848#comment-13415848 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536752/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2115 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle:

                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.mapred.TestFileInputFormatPathFilter
                  org.apache.hadoop.mapred.TestFieldSelection
                  org.apache.hadoop.mapred.TestBlockLimits
                  org.apache.hadoop.mapred.TestMiniMRClasspath
                  org.apache.hadoop.mapred.TestTextOutputFormat
                  org.apache.hadoop.mapred.TestSequenceFileInputFormat
                  org.apache.hadoop.mapreduce.TestMROutputFormat
                  org.apache.hadoop.mapreduce.lib.chain.TestChainErrors
                  org.apache.hadoop.mapreduce.lib.aggregate.TestMapReduceAggregates
                  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
                  org.apache.hadoop.mapred.jobcontrol.TestJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileAsTextInputFormat
                  org.apache.hadoop.mapred.TestMiniMRBringup
                  org.apache.hadoop.mapreduce.TestValueIterReset
                  org.apache.hadoop.mapreduce.lib.output.TestMRMultipleOutputs
                  org.apache.hadoop.mapred.TestMiniMRChildTask
                  org.apache.hadoop.mapred.TestMapRed
                  org.apache.hadoop.mapred.lib.TestMultipleOutputs
                  org.apache.hadoop.mapred.TestReporter
                  org.apache.hadoop.mapred.TestCollect
                  org.apache.hadoop.mapred.TestReduceFetch
                  org.apache.hadoop.mapred.TestNetworkedJob
                  org.apache.hadoop.mapred.TestTaskCommit
                  org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
                  org.apache.hadoop.mapred.TestClusterMRNotification
                  org.apache.hadoop.mapreduce.TestMapReduce
                  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
                  org.apache.hadoop.mapred.TestJobCounters
                  org.apache.hadoop.mapreduce.lib.db.TestDataDrivenDBInputFormat
                  org.apache.hadoop.mapred.TestMiniMRClientCluster
                  org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
                  org.apache.hadoop.mapreduce.lib.input.TestMultipleInputs
                  org.apache.hadoop.mapred.TestFileOutputCommitter
                  org.apache.hadoop.mapred.TestLazyOutput
                  org.apache.hadoop.mapred.TestLocalMRNotification
                  org.apache.hadoop.mapred.TestJobCleanup
                  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
                  org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
                  org.apache.hadoop.mapred.lib.TestMultithreadedMapRunner
                  org.apache.hadoop.mapreduce.lib.chain.TestSingleElementChain
                  org.apache.hadoop.mapred.TestLineRecordReader
                  org.apache.hadoop.mapred.TestUserDefinedCounters
                  org.apache.hadoop.mapred.TestMapOutputType
                  org.apache.hadoop.mapred.lib.aggregate.TestAggregates
                  org.apache.hadoop.mapreduce.lib.jobcontrol.TestMapReduceJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat
                  org.apache.hadoop.mapreduce.lib.input.TestLineRecordReader
                  org.apache.hadoop.mapred.lib.TestChainMapReduce
                  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
                  org.apache.hadoop.mapred.join.TestDatamerge
                  org.apache.hadoop.io.TestSequenceFileMergeProgress
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileAsBinaryInputFormat
                  org.apache.hadoop.mapred.jobcontrol.TestLocalJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileInputFilter
                  org.apache.hadoop.mapred.TestJavaSerialization
                  org.apache.hadoop.mapred.lib.TestKeyFieldBasedComparator
                  org.apache.hadoop.mapreduce.lib.join.TestJoinDatamerge
                  org.apache.hadoop.mapred.lib.TestLineInputFormat
                  org.apache.hadoop.mapreduce.lib.fieldsel.TestMRFieldSelection
                  org.apache.hadoop.mapred.TestJobSysDirWithDFS
                  org.apache.hadoop.mapred.TestComparators
                  org.apache.hadoop.mapreduce.lib.input.TestNLineInputFormat
                  org.apache.hadoop.mapred.TestMultipleTextOutputFormat
                  org.apache.hadoop.mapreduce.lib.partition.TestMRKeyFieldBasedComparator
                  org.apache.hadoop.mapreduce.lib.output.TestMRSequenceFileAsBinaryOutputFormat
                  org.apache.hadoop.mapreduce.lib.map.TestMultithreadedMapper
                  org.apache.hadoop.mapred.TestJobName
                  org.apache.hadoop.mapred.TestFileOutputFormat
                  org.apache.hadoop.mapreduce.security.TestJHSSecurity
                  org.apache.hadoop.mapred.TestMapProgress
                  org.apache.hadoop.mapreduce.lib.chain.TestMapReduceChain

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2608//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2608//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2608//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418907#comment-13418907 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537273/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2048 javac compiler warnings (more than the trunk's current 2006 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2631//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2631//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2631//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2631//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

addressing Tom's comments and making the keystores pluggable (to later enable other mechanisms -such as jobtoken- to generate certificates on the fly).
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416637#comment-13416637 ] 

Tom White commented on MAPREDUCE-4417:
--------------------------------------

This looks good to me (although, as Alejandro mentioned, I have worked on an earlier version of this, so someone else should review it too). A few minor things I noticed:

* SSLFactory is in a mapreduce package, but in the common project. Just move it to org.apache.hadoop.security.ssl?
* Mark SSLFactory.resolvePropertyName with the VisibleForTesting annotation.
* ReloadingX509TrustManager allows 'this' to escape in its constructor. Perhaps give it a separate initialization method to start the reloader.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420245#comment-13420245 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537503/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 4 new or modified test files.

    -1 javac.  The applied patch generated 2049 javac compiler warnings (more than the trunk's current 2007 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2645//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2645//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2645//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2645//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419674#comment-13419674 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537419/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2048 javac compiler warnings (more than the trunk's current 2006 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat
                  org.apache.hadoop.mapred.TestClusterMapReduceTestCase

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2641//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2641//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2641//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2641//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422561#comment-13422561 ] 

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

bq. -1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
Are you going to update the findbugs exclude file for this warning? We can't commit until this comes back clean.


Otherwise the latest trunk patch looks good.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415721#comment-13415721 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536713/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2115 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle:

                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.mapred.TestFileInputFormatPathFilter
                  org.apache.hadoop.mapred.TestFieldSelection
                  org.apache.hadoop.mapred.TestBlockLimits
                  org.apache.hadoop.mapred.TestMiniMRClasspath
                  org.apache.hadoop.mapred.TestTextOutputFormat
                  org.apache.hadoop.mapred.TestSequenceFileInputFormat
                  org.apache.hadoop.mapreduce.TestMROutputFormat
                  org.apache.hadoop.mapreduce.lib.chain.TestChainErrors
                  org.apache.hadoop.mapreduce.lib.aggregate.TestMapReduceAggregates
                  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
                  org.apache.hadoop.mapred.jobcontrol.TestJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileAsTextInputFormat
                  org.apache.hadoop.mapred.TestMiniMRBringup
                  org.apache.hadoop.mapreduce.TestValueIterReset
                  org.apache.hadoop.mapreduce.lib.output.TestMRMultipleOutputs
                  org.apache.hadoop.mapred.TestMiniMRChildTask
                  org.apache.hadoop.mapred.TestMapRed
                  org.apache.hadoop.mapred.lib.TestMultipleOutputs
                  org.apache.hadoop.mapred.TestReporter
                  org.apache.hadoop.mapred.TestCollect
                  org.apache.hadoop.mapred.TestReduceFetch
                  org.apache.hadoop.mapred.TestNetworkedJob
                  org.apache.hadoop.mapred.TestTaskCommit
                  org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
                  org.apache.hadoop.mapred.TestClusterMRNotification
                  org.apache.hadoop.mapreduce.TestMapReduce
                  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
                  org.apache.hadoop.mapred.TestJobCounters
                  org.apache.hadoop.mapreduce.lib.db.TestDataDrivenDBInputFormat
                  org.apache.hadoop.mapred.TestMiniMRClientCluster
                  org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
                  org.apache.hadoop.mapreduce.lib.input.TestMultipleInputs
                  org.apache.hadoop.mapred.TestFileOutputCommitter
                  org.apache.hadoop.mapred.TestLazyOutput
                  org.apache.hadoop.mapred.TestLocalMRNotification
                  org.apache.hadoop.mapred.TestJobCleanup
                  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
                  org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
                  org.apache.hadoop.mapred.lib.TestMultithreadedMapRunner
                  org.apache.hadoop.mapreduce.lib.chain.TestSingleElementChain
                  org.apache.hadoop.mapred.TestLineRecordReader
                  org.apache.hadoop.mapred.TestUserDefinedCounters
                  org.apache.hadoop.mapred.TestMapOutputType
                  org.apache.hadoop.mapred.lib.aggregate.TestAggregates
                  org.apache.hadoop.mapreduce.lib.jobcontrol.TestMapReduceJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat
                  org.apache.hadoop.mapreduce.lib.input.TestLineRecordReader
                  org.apache.hadoop.mapred.lib.TestChainMapReduce
                  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
                  org.apache.hadoop.mapred.join.TestDatamerge
                  org.apache.hadoop.io.TestSequenceFileMergeProgress
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileAsBinaryInputFormat
                  org.apache.hadoop.mapred.jobcontrol.TestLocalJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileInputFilter
                  org.apache.hadoop.mapred.TestJavaSerialization
                  org.apache.hadoop.mapred.lib.TestKeyFieldBasedComparator
                  org.apache.hadoop.mapreduce.lib.join.TestJoinDatamerge
                  org.apache.hadoop.mapred.lib.TestLineInputFormat
                  org.apache.hadoop.mapreduce.lib.fieldsel.TestMRFieldSelection
                  org.apache.hadoop.mapred.TestJobSysDirWithDFS
                  org.apache.hadoop.mapred.TestComparators
                  org.apache.hadoop.mapreduce.lib.input.TestNLineInputFormat
                  org.apache.hadoop.mapred.TestMultipleTextOutputFormat
                  org.apache.hadoop.mapreduce.lib.partition.TestMRKeyFieldBasedComparator
                  org.apache.hadoop.mapreduce.lib.output.TestMRSequenceFileAsBinaryOutputFormat
                  org.apache.hadoop.mapreduce.lib.map.TestMultithreadedMapper
                  org.apache.hadoop.mapred.TestJobName
                  org.apache.hadoop.mapred.TestFileOutputFormat
                  org.apache.hadoop.mapreduce.security.TestJHSSecurity
                  org.apache.hadoop.mapred.TestMapProgress
                  org.apache.hadoop.mapreduce.lib.chain.TestMapReduceChain

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2603//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2603//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2603//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2603//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2603//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

and not taking care of most of the javac warnings. There are a bunch of them because of direct use of SunX509 classes, but there is not alternative for this.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

rebasing patch to trunk and resolving conflict in index documentation page. Also, simplified some initialization logic in the testcases.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

Thanks Todd, attached patch that takes care of all your comments but the last one. So test-patch runs. I'll update the docs to explain the client cert stuff properly.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416405#comment-13416405 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536844/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2108 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle:

                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2613//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2613//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2613//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 2.1.0-alpha)
                   2.2.0-alpha
     Hadoop Flags: Reviewed
           Status: Resolved  (was: Patch Available)

committed to trunk and branch-2.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.2.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411666#comment-13411666 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

@eric14, 

The driving use case is to avoid data spoofing while on the wire.

Agree, encrypting data at both sides is the obvious follow up to this JIRA in order to have end to end over the wire confidentiality.

In current Hadoop, as you suggest, you can use compression codecs to do encryption on both sides.

However, you can not do that for the shuffle. Thus this JIRA to tackle the shuffle case first.

Of course, this functionality would be disabled by default, even if Kerberos security is enabled. You'll need to set another knob to enable shuffle encryption.

Hope this clarifies.


                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.1-alpha
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Component/s: security
    
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.1-alpha
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

previously missed Tom's last suggestion (forgot to 'git add -u' before creating the patch)
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

rebasing patch as the revert for MAPREDUCE-4423 created some conflicts.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Status: Patch Available  (was: Open)
    
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418839#comment-13418839 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537262/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2048 javac compiler warnings (more than the trunk's current 2006 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapred.TestMiniMRClientCluster
                  org.apache.hadoop.mapreduce.v2.TestMROldApiJobs
                  org.apache.hadoop.mapred.TestJobCounters
                  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
                  org.apache.hadoop.mapred.TestJobName
                  org.apache.hadoop.mapreduce.v2.TestMiniMRProxyUser
                  org.apache.hadoop.mapred.TestClusterMRNotification
                  org.apache.hadoop.mapred.TestReduceFetch
                  org.apache.hadoop.mapreduce.TestChild
                  org.apache.hadoop.mapred.TestLazyOutput
                  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
                  org.apache.hadoop.mapreduce.v2.TestMRJobs
                  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
                  org.apache.hadoop.mapreduce.v2.TestMRJobsWithHistoryService
                  org.apache.hadoop.mapred.TestJobSysDirWithDFS
                  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
                  org.apache.hadoop.mapred.TestMiniMRClasspath
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2629//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2629//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2629//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2629//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413498#comment-13413498 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

When looking at encryption on the wire for the shuffle the alternatives that popped up where transport encryption (HTTPS) and data/spills encryption (doable via a codec).

Using HTTPS requires improving the Fetcher/ShuffleHandler (Netty/JDK-URL) to use HTTPS and configuring certificates. It is a well understood/standard/proven technology and gives you end to end confidentiality, integrity, server authentication (and optionally client authentication), in an out of box manner without room to get things wrong. The server certificates private keys are out of reach from job tasks (they are used by the NM, similar to Kerberos keytabs). 

Using a codec, requires (leveraging a existing plugin point) a compression codec implementation that adds cipher-streams wrappers to the original streams and in addition could delegate to a real compression codec (in order not to lose compression if doing encryption). This requires us choosing a Cipher implementation by hand (which I'm not an expert on) and I'm not sure which one would be the best choice and what are the weaknesses of each one of them (http://en.wikipedia.org/wiki/Stream_cipher#Comparison_Of_Stream_Ciphers). Using a cipher on its own will provide confidentiality but it would not provide integrity or man-in-the-middle protection (unless we end up implementing something like TLS). In addition, both ends are controlled by job tasks, thus it becomes the responsibility of the user to create/distribute/protect the secrets that are basis of confidentiality. In addition, with the codec approach the HTTP shuffle requests/response headers go in the clear which could enable a man-in-the-middle attach.

                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.1-alpha
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422825#comment-13422825 ] 

Tom White commented on MAPREDUCE-4417:
--------------------------------------

+1 for the latest patch.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

patch with complete implementation. Introducing an SSLFactory class in common so it can be used by follow up HADOOP-8581. Patch does not have documentation yet. I'll work on that next.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

And now with testcases passing (the encrypted shuffle testcase was leaving a core-site.xml that was being picked up by other testcases)

Forgot to mention before, all this work is based on an initial implementation by Tom White.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418912#comment-13418912 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

test failure seems unrelated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416642#comment-13416642 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

Thanks for the review Tom. I'll integrate your changes. After a chat with Devaraj and other with you, I'll do some refactoring in how the keystores are produced to enable plugin alterante implementations (a follow up JIRA will be for creating the certs in teh keystore using the jobtoken secrets).
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418970#comment-13418970 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537291/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2048 javac compiler warnings (more than the trunk's current 2006 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2632//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2632//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2632//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2632//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420789#comment-13420789 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

The patch for branch-1, opens an additional port with SSL just for encrypted shuffle, the shuffle servlet (MapOutputServlet) refuses to serve shuffle over the clear HTTP endpoint if SSL is enable:

{code}
      if (shuffleSsl && !request.isSecure()) {
        response.sendError(HttpServletResponse.SC_FORBIDDEN,
          "Encrypted Shuffle is enabled, shuffle is only served over HTTPS");
        return;
      }
{code}

                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416754#comment-13416754 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536907/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2108 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.security.ssl.TestSSLFactory
                  org.apache.hadoop.mapreduce.security.ssl.TestEncryptedShuffle
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2617//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2617//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2617//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411198#comment-13411198 ] 

eric baldeschwieler commented on MAPREDUCE-4417:
------------------------------------------------

What is the driving use case?  

I'd suggest that anyone who wants the data encrypted on the wire, will want it encrypted at rest on both sides as well.  The data is as vulnerable there.  

I wonder if we can come up with an approach that just introduces new plugins and doesn't add any hadoop code?  The right thing is probably to use the compression codecs to encrypt on the way to disk.

thoughts?

                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.1-alpha
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422634#comment-13422634 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

yep, updating findbugs exclusion as part of the commit.

thx
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420998#comment-13420998 ] 

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

bq. PS I thought we had a process for adding things to 1, which was to propose them during next release planning.

That process was proposed but it hasn't been followed at all. There have been plenty of new features going into branch-1 without prior discussion.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411696#comment-13411696 ] 

eric baldeschwieler commented on MAPREDUCE-4417:
------------------------------------------------

Anyone I've talked to who has been concerned about over the wire has also raise the on disk issue.  So, would it be better to put the encryption where we write to disk, where we already compress?  It seems like this might be less invasive and would be more complete.

It has downsides if you do lots of spills, but it is much more complete.  The compaction issue can be addressed by collation work folks are already playing with down the road.

---

Do you already have an HDFS solution in place?  This only covers a fraction of the data traffic.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.1-alpha
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411728#comment-13411728 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

@eric14, my bad you can use a codec in the shuffle, when looking into this I've discarded that option, let me remember exactly why and I'll follow up.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.1-alpha
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422775#comment-13422775 ] 

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

Sorry, that findbugs was my issue due to an accidental commit - I reverted it. +1 from my side assuming Tom and co are still good with it.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

@todd, addressing your last to comments. THX!
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419504#comment-13419504 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12537381/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2048 javac compiler warnings (more than the trunk's current 2006 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.mapreduce.v2.TestMRJobs
                  org.apache.hadoop.mapreduce.v2.TestSpeculativeExecution
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2638//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2638//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2638//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2638//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

patch now includes findbugs exclusion.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy closed MAPREDUCE-4417.
------------------------------------

    
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.2-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "eric baldeschwieler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420875#comment-13420875 ] 

eric baldeschwieler commented on MAPREDUCE-4417:
------------------------------------------------

Why would we add this to 1?  I understand mainline, but not anything else at this point.  I don't think this is a complete approach.  It's going to take additional work to finish this.  Why add the complexity to a stabilized code line?
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415614#comment-13415614 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536699/MAPREDUCE-4417.patch
  against trunk revision .

    -1 @author.  The patch appears to contain 2 @author tags which the Hadoop community has agreed to not allow in code contributions.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2115 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    -1 findbugs.  The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle:

                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.mapreduce.security.ssl.TestEncryptedShuffle

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2602//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2602//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2602//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-shuffle.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2602//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-mapreduce-client-core.html
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2602//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2602//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

And now with documentation.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

fixing testcases that failed after latest refactoring due to incorrect setup.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

minor corrections to the docs for trunk.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417-branch-1.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416406#comment-13416406 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

test failures seem unrelated. javac warnings are because of use of SunX509 classes.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415899#comment-13415899 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536772/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2108 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle:

                  org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
                  org.apache.hadoop.io.file.tfile.TestTFileByteArrays
                  org.apache.hadoop.mapred.TestFileInputFormatPathFilter
                  org.apache.hadoop.mapred.TestFieldSelection
                  org.apache.hadoop.mapred.TestBlockLimits
                  org.apache.hadoop.mapred.TestMiniMRClasspath
                  org.apache.hadoop.mapred.TestTextOutputFormat
                  org.apache.hadoop.mapred.TestSequenceFileInputFormat
                  org.apache.hadoop.mapreduce.TestMROutputFormat
                  org.apache.hadoop.mapreduce.lib.chain.TestChainErrors
                  org.apache.hadoop.mapreduce.lib.aggregate.TestMapReduceAggregates
                  org.apache.hadoop.mapred.TestMiniMRWithDFSWithDistinctUsers
                  org.apache.hadoop.mapred.jobcontrol.TestJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileAsTextInputFormat
                  org.apache.hadoop.mapred.TestMiniMRBringup
                  org.apache.hadoop.mapreduce.TestValueIterReset
                  org.apache.hadoop.mapreduce.lib.output.TestMRMultipleOutputs
                  org.apache.hadoop.mapred.TestMiniMRChildTask
                  org.apache.hadoop.mapred.TestMapRed
                  org.apache.hadoop.mapred.lib.TestMultipleOutputs
                  org.apache.hadoop.mapred.TestReporter
                  org.apache.hadoop.mapred.TestCollect
                  org.apache.hadoop.mapred.TestReduceFetch
                  org.apache.hadoop.mapred.TestNetworkedJob
                  org.apache.hadoop.mapred.TestTaskCommit
                  org.apache.hadoop.mapreduce.lib.output.TestFileOutputCommitter
                  org.apache.hadoop.mapred.TestClusterMRNotification
                  org.apache.hadoop.mapreduce.TestMapReduce
                  org.apache.hadoop.mapred.TestReduceFetchFromPartialMem
                  org.apache.hadoop.mapred.TestJobCounters
                  org.apache.hadoop.mapreduce.lib.db.TestDataDrivenDBInputFormat
                  org.apache.hadoop.mapred.TestMiniMRClientCluster
                  org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter
                  org.apache.hadoop.mapreduce.lib.input.TestMultipleInputs
                  org.apache.hadoop.mapred.TestFileOutputCommitter
                  org.apache.hadoop.mapred.TestLazyOutput
                  org.apache.hadoop.mapred.TestLocalMRNotification
                  org.apache.hadoop.mapred.TestJobCleanup
                  org.apache.hadoop.mapreduce.TestMapReduceLazyOutput
                  org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
                  org.apache.hadoop.mapred.lib.TestMultithreadedMapRunner
                  org.apache.hadoop.mapreduce.lib.chain.TestSingleElementChain
                  org.apache.hadoop.mapred.TestLineRecordReader
                  org.apache.hadoop.mapred.TestUserDefinedCounters
                  org.apache.hadoop.mapred.TestMapOutputType
                  org.apache.hadoop.mapred.lib.aggregate.TestAggregates
                  org.apache.hadoop.mapreduce.lib.jobcontrol.TestMapReduceJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat
                  org.apache.hadoop.mapreduce.lib.input.TestLineRecordReader
                  org.apache.hadoop.mapred.lib.TestChainMapReduce
                  org.apache.hadoop.mapred.TestClusterMapReduceTestCase
                  org.apache.hadoop.mapred.join.TestDatamerge
                  org.apache.hadoop.io.TestSequenceFileMergeProgress
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileAsBinaryInputFormat
                  org.apache.hadoop.mapred.jobcontrol.TestLocalJobControl
                  org.apache.hadoop.mapreduce.lib.input.TestMRSequenceFileInputFilter
                  org.apache.hadoop.mapred.TestJavaSerialization
                  org.apache.hadoop.mapred.lib.TestKeyFieldBasedComparator
                  org.apache.hadoop.mapreduce.lib.join.TestJoinDatamerge
                  org.apache.hadoop.mapred.lib.TestLineInputFormat
                  org.apache.hadoop.mapreduce.lib.fieldsel.TestMRFieldSelection
                  org.apache.hadoop.mapred.TestJobSysDirWithDFS
                  org.apache.hadoop.mapred.TestComparators
                  org.apache.hadoop.mapreduce.lib.input.TestNLineInputFormat
                  org.apache.hadoop.mapred.TestMultipleTextOutputFormat
                  org.apache.hadoop.mapreduce.lib.partition.TestMRKeyFieldBasedComparator
                  org.apache.hadoop.mapreduce.lib.output.TestMRSequenceFileAsBinaryOutputFormat
                  org.apache.hadoop.mapreduce.lib.map.TestMultithreadedMapper
                  org.apache.hadoop.mapred.TestJobName
                  org.apache.hadoop.mapred.TestFileOutputFormat
                  org.apache.hadoop.mapreduce.security.TestJHSSecurity
                  org.apache.hadoop.mapred.TestMapProgress
                  org.apache.hadoop.mapreduce.lib.chain.TestMapReduceChain

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2609//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2609//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2609//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419313#comment-13419313 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

Findbugs issue is a false positive, due to the ShuffleHandler start() method being syncrhonized (wherethe buffer variable gets instantiated), I'll add the corresponding findbug exclusion before committing.

Test failures seem unrelated.

                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

Same as last patch, just adding a comment about the performance impact (per Devaraj's suggestion):

*Using encrypted shuffle will incurs in a significant performance impact. Users should profile this and potentially reserve 1 or more cores for encrypted shuffle.*

                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415890#comment-13415890 ] 

Alejandro Abdelnur commented on MAPREDUCE-4417:
-----------------------------------------------

I've meant 'and now ...'
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

minor correction in testcase where a conf.writeXml was outside of the try block.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13416906#comment-13416906 ] 

Hadoop QA commented on MAPREDUCE-4417:
--------------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536950/MAPREDUCE-4417.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified test files.

    -1 javac.  The applied patch generated 2108 javac compiler warnings (more than the trunk's current 2066 warnings).

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site:

                  org.apache.hadoop.security.ssl.TestSSLFactory
                  org.apache.hadoop.mapreduce.lib.input.TestCombineFileInputFormat

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2618//testReport/
Javac warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2618//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/2618//console

This message is automatically generated.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Alejandro Abdelnur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4417:
------------------------------------------

    Attachment: MAPREDUCE-4417.patch

again, missed to add a file git cache before cutting the previous patch. Now we should be fine.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419435#comment-13419435 ] 

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

- The {{AtomicBoolean running}} doesn't need to be an atomic boolean, since it's already volatile. You can just use a volatile boolean here.
- in {{loadTrustManager}}, you need to get {{file.lastModified}} before you even open the {{FileInputStream}}. Otherwise if the file is replaced in between opening the stream and you getting the mtime, you'll read the old version of the file but think you read the new one (assuming an atomic rename-over-old-file replacement)

Otherwise looks good to me.
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Aaron T. Myers (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13411703#comment-13411703 ] 

Aaron T. Myers commented on MAPREDUCE-4417:
-------------------------------------------

bq. Do you already have an HDFS solution in place? This only covers a fraction of the data traffic.

Just filed: HDFS-3637
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.0.1-alpha
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4417) add support for encrypted shuffle

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418810#comment-13418810 ] 

Todd Lipcon commented on MAPREDUCE-4417:
----------------------------------------

bq. The javadoc style for 'Returns BLAH' and then '@return BLAH' is Sun javadoc sytle.
Ew. That's disgusting. Oh well.

bq. the ReloadingX509TrustManager will work with an empty keystore if the keystore file is not avail at initialization time, and if the keystore file becomes available later one, it will be loaded. WARNs are logged while the file is not present, so it won't go unnoticed.

WARNs in the logs are often not noticed. Don't you think it's simpler to just fail if the conf is not present? If someone configures this and doesn't create the file (or the file is unreadable due to a permissions error), I think it's friendlier to fail fast. Otherwise they'll just end up seeing strange downstream issues like client certs not being properly trusted, which will be more difficult to root-cause back to the trust store configuration without log spelunking.

bq. If reload() fails to reload the new keystore, it assumes there are not certs and runs empty until the next reload attempt. Seems a safer assumption that continuing running with obsolete keys.

My worry here is that people might be using a conf management system to push out the key store files. If the reload happens to trigger right in the middle of a conf mgmt update, and the update is non-atomic, it will see an invalid keystore. I wouldn't want the TT to revert to an empty key store until the next reload interval in that case.

bq. While hadoop.ssl.enabled only applies to shuffle, the intention is to use it for the rest of the HTTP endpoints. Thus, a single know would enable SSL. That is why the name of the property and its location (in core-default.xml)

Given it doesn't currently affect the other HTTP endpoints, I find this very confusing. Why not make a separate config for now, and then once it affects more than just the shuffle, you can change the default for {{mapred.shuffle.use.ssl}} to {{${hadoop.use.ssl}}} to pick up the system-wide default.

bq. In the TestSSLFactory, the Assert.fail() statements, are sections the test should not make it; they are used for negative tests.
I get that. But, if the test breaks, you'll end up with a meaningless failure, instead of a message explaining why it failed. If you let the exception fall through, then the failed unit test would actually have a stack trace that explains why it failed, which aids in debugging.

bq. Client certs are disabled by default. If they are per job, yes they could be shipped via DC. This would require a alternate implementation of the KeyStoresFactory, thus the mechanism is already in place.
Does it need an alternate implementation? The distributed cache files can be put on the classpath already, in which case the existing keystore-loading code should be able to find them. The only change would be in the documentation -- explaining that the client should ship the files via distributed cache rather than putting them in HADOOP_CONF_DIR. Why wouldn't that be enough?
                
> add support for encrypted shuffle
> ---------------------------------
>
>                 Key: MAPREDUCE-4417
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4417
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, security
>    Affects Versions: 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>             Fix For: 2.1.0-alpha
>
>         Attachments: MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch, MAPREDUCE-4417.patch
>
>
> Currently Shuffle fetches go on the clear. While Kerberos provides comprehensive authentication for the cluster, it does not provide confidentiality. 
> When processing sensitive data confidentiality may be desired (at the expense of job performance and resources utilization for doing encryption).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira