Posted to common-dev@hadoop.apache.org by Larry McCay <lm...@hortonworks.com> on 2013/07/02 22:03:42 UTC

[DISCUSS] Hadoop SSO/Token Server Components

All -

As a follow-up to the discussions held during Hadoop Summit, I would like to introduce a discussion topic around the moving parts of a Hadoop SSO/Token Service.
There are a couple of related Jiras that can be referenced and may or may not be updated as a result of this discussion thread.

https://issues.apache.org/jira/browse/HADOOP-9533
https://issues.apache.org/jira/browse/HADOOP-9392

As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:
* An alternative authentication mechanism to Kerberos for user authentication
* A broader capability for integration into enterprise identity and SSO solutions
* Possibly the advertisement/negotiation of available authentication mechanisms
* Backward compatibility for the existing use of Kerberos
* No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
* Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
* Continued support for existing authorization policy/ACLs, etc
* Keeping more fine-grained authorization policies in mind - like attribute-based access control
	- fine-grained access control is a separate but related effort that we must not preclude with this effort
* Cross cluster SSO

In order to tease out the moving parts, here are a couple of high-level, simplified descriptions of the SSO interaction flow:
                               +------+
	+------+ credentials 1 | SSO  |
	|CLIENT|-------------->|SERVER|
	+------+  :tokens      +------+
	  2 |                    
	    | access token
	    V :requested resource
	+-------+
	|HADOOP |
	|SERVICE|
	+-------+
	
The above diagram represents the simplest interaction model for an SSO service in Hadoop.
1. client authenticates to SSO service and acquires an access token
  a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
  b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
  a. access token is presented as appropriate for the service endpoint protocol being used
  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
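
To make the moving parts concrete, here is a minimal in-memory sketch of the flow above. All names and the credential check are illustrative assumptions; it models tokens as opaque handles looked up server-side, whereas the actual proposal calls for cryptographically verifiable tokens so that enforcement points need not call back to the SSO server.

```python
# Minimal in-memory model of the two-step flow; all names are illustrative.
import secrets

class SsoServer:
    def __init__(self):
        self._identities = {}  # identity token -> principal
        self._access = {}      # access token -> (principal, service)

    # 1a. AS endpoint: credentials -> identity token
    def authenticate(self, user, password):
        if password != "hunter2":  # stand-in for a real credential check
            raise PermissionError("bad credentials")
        token = secrets.token_hex(8)
        self._identities[token] = user
        return token

    # 1b. TGS endpoint: identity token -> service-scoped access token
    def grant(self, identity_token, service):
        principal = self._identities[identity_token]
        token = secrets.token_hex(8)
        self._access[token] = (principal, service)
        return token

    # 2b. validation used by the Hadoop service's token handler
    def validate(self, access_token, service):
        principal, audience = self._access[access_token]
        if audience != service:
            raise PermissionError("token was not granted for this service")
        return principal

sso = SsoServer()
identity = sso.authenticate("alice", "hunter2")   # step 1a
access = sso.grant(identity, "hdfs")              # step 1b
print(sso.validate(access, "hdfs"))               # step 2 -> alice
```

Note the scoping in step 1b: an access token is bound to one Hadoop service, so presenting it to a different service is rejected.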
    
    +------+
    |  IdP |
    +------+
    1   ^ credentials
        | :idp_token
        |                      +------+
	+------+  idp_token  2 | SSO  |
	|CLIENT|-------------->|SERVER|
	+------+  :tokens      +------+
	  3 |                    
	    | access token
	    V :requested resource
	+-------+
	|HADOOP |
	|SERVICE|
	+-------+
	

The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.
1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
  a. client presents credentials to an enterprise IdP and receives a token representing the authentication identity
2. client authenticates to SSO service and acquires an access token
  a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
  b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
  a. access token is presented as appropriate for the service endpoint protocol being used
  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
	
Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:

1. SSO Server Instance: this component must expose endpoints both for the authentication of users, by collecting and validating credentials, and for the federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity, as well as access tokens for specific Hadoop services.

2. Authentication Providers: pluggable authentication mechanisms must be easily created and configured for use within the SSO server instance. They will ideally allow the enterprise to plug in their preferred off-the-shelf components as well as provide custom providers. Supporting existing standards for such authentication providers should be a top-priority concern. There are a number of standard approaches in use in the Java world: JAAS LoginModules, servlet filters, JASPIC auth modules, etc. A pluggable provider architecture that allows the enterprise to leverage existing investments in these technologies and existing skill sets would be ideal.
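
As a sketch of what a composable provider architecture might look like, the following is illustrative only: the interface and providers here are hypothetical stand-ins for JAAS/servlet-filter/JASPIC style plugins.

```python
# Hypothetical pluggable provider chain; a real design would likely wrap
# JAAS LoginModules, servlet filters, or JASPIC modules instead.
from abc import ABC, abstractmethod

class AuthenticationProvider(ABC):
    @abstractmethod
    def authenticate(self, credentials):
        """Return the verified principal name, or None on failure."""

class PasswordProvider(AuthenticationProvider):
    USERS = {"alice": "hunter2"}  # stand-in for an LDAP bind or similar

    def authenticate(self, credentials):
        user = credentials.get("user")
        if self.USERS.get(user) == credentials.get("password"):
            return user
        return None

class OtpProvider(AuthenticationProvider):
    def authenticate(self, credentials):
        # stand-in for a one-time-password check (second factor)
        return credentials.get("user") if credentials.get("otp") == "123456" else None

def authenticate_chain(providers, credentials):
    # composable/multifactor: every configured provider must succeed
    principals = [p.authenticate(credentials) for p in providers]
    return principals[0] if principals and all(principals) else None
```

Composing the chain from configuration is what would let an enterprise mix off-the-shelf and custom providers without code changes to the SSO server.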

3. Token Authority: a token authority component would need to have the ability to issue, verify and revoke tokens. This authority will need to be trusted by all enforcement points that need to verify incoming tokens. Using something like PKI for establishing trust will be required.
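
A rough sketch of the issue/verify/revoke responsibilities follows. HMAC over a shared key stands in here for the PKI-based signing the proposal calls for, and all names are illustrative.

```python
# Illustrative token authority; HMAC with a shared key stands in for the
# PKI-based signing/verification that would establish trust in practice.
import base64
import hashlib
import hmac
import json
import uuid

class TokenAuthority:
    def __init__(self, key):
        self._key = key
        self._revoked = set()

    def issue(self, claims):
        claims = dict(claims, jti=uuid.uuid4().hex)  # unique id enables revocation
        body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
        sig = hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()
        return body + "." + sig

    def verify(self, token):
        body, sig = token.rsplit(".", 1)
        expected = hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            raise ValueError("signature check failed")
        claims = json.loads(base64.urlsafe_b64decode(body))
        if claims["jti"] in self._revoked:
            raise ValueError("token has been revoked")
        return claims

    def revoke(self, token):
        self._revoked.add(self.verify(token)["jti"])
```

With PKI, enforcement points would verify signatures against the authority's public certificate rather than sharing the signing key, which is what makes the authority trustable by all enforcement points without distributing secrets.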

4. Hadoop SSO Tokens: the exact shape and form of the SSO tokens will need to be considered in order to determine the means by which trust and integrity are ensured while using them. There may be some abstraction of the underlying format provided through interface-based design, but all token implementations will need to have the same attributes and capabilities in terms of validation and cryptographic verification.
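
For illustration, a common attribute set that any token implementation might be required to expose could look like the following; the field names are assumptions, not a proposed format.

```python
# Hypothetical common attribute set; field names are illustrative only.
import time
from dataclasses import dataclass, field

@dataclass
class SsoToken:
    issuer: str      # the token authority that signed this token
    subject: str     # the verified identity
    audience: str    # the Hadoop service the token was granted for
    issued_at: float = field(default_factory=time.time)
    expires_at: float = 0.0  # 0.0 means no expiry in this sketch

    def is_expired(self, now=None):
        now = time.time() if now is None else now
        return self.expires_at != 0.0 and now >= self.expires_at
```

Whatever the wire format ends up being, pinning down a shared attribute set like this is what lets validation and cryptographic verification be written once against the interface.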

5. SSO Protocol: the lowest common denominator protocol for SSO server interactions across client types would likely be REST. Depending on the REST client in use, it may require explicitly coding to the token flow described in the earlier interaction descriptions, or a plugin may be provided for things like HTTPClient, curl, etc. RPC clients will have this taken care of for them within the SASL layer and will leverage the REST endpoints as well. This likely implies trust requirements for the RPC client to be able to trust the SSO server's identity cert that is presented over SSL.

6. REST Client Agent Plugins: required for encapsulating the interaction with the SSO server for the various client programming models. We may need these for many client types: Java, JavaScript, .NET, Python, cURL, etc.

7. Server Side Authentication Handlers: the server side of the REST, RPC or webui connection will need to be able to validate and verify the incoming Hadoop tokens in order to grant or deny access to requested resources.
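
One way to keep the REST, RPC and webui enforcement points consistent is a shared validation core behind thin per-protocol token extractors. The sketch below is illustrative only; the request shape and status codes are assumptions.

```python
# Sketch of a shared validation core behind per-protocol token extraction;
# the request shape and status codes are assumptions for illustration.

def extract_token(request):
    # REST/webui carry the token in an Authorization header...
    auth = request.get("headers", {}).get("Authorization", "")
    if auth.startswith("Bearer "):
        return auth[len("Bearer "):]
    # ...while RPC would surface it from the SASL exchange.
    return request.get("sasl_token")

def handle(request, verify, service):
    """Grant (200) or deny (401/403) based on the incoming token."""
    token = extract_token(request)
    if token is None:
        return 401  # no token presented
    try:
        claims = verify(token)  # integrity and issuer verification
    except ValueError:
        return 403  # invalid or revoked token
    if claims.get("aud") != service:
        return 403  # token granted for a different service
    return 200
```

Keeping the validation logic in one place means a new enforcement point only needs to supply its own token extraction, not its own trust decisions.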

8. Credential/Trust Management: throughout the system - on both client and server sides - we will need to manage and provide access to PKI and potentially shared-secret artifacts in order to establish the required trust relationships to replace the mutual authentication that would otherwise be provided by using Kerberos everywhere.

So, discussion points:

1. Are there additional components that would be required for a Hadoop SSO service?
2. Should any of the above described components be considered not actually necessary or poorly described?
3. Should we create a new umbrella Jira to identify each of these as a subtask?
4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?

Obviously, each component that we identify will have a jira of its own - more than likely - so we are only trying to identify the high level descriptions for now.

Can we try and drive this discussion to a close by the end of the week? This will allow us to start breaking out into component implementation plans.

thanks,

--larry

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Thanks, Brian!
Look at that - the power of collaboration - the numbering is correct already! ;-)

I am inclined to agree that we should start with the Hadoop SSO Tokens, and am leaning toward a new jira that leaves behind the cruft - but I don't feel very strongly about it being new.
I do feel, especially given Kai's new document, that we have only one.

On Jul 3, 2013, at 2:32 PM, Brian Swan <Br...@microsoft.com> wrote:

> Thanks, Larry, for starting this conversation (and thanks for the great Summit meeting summary you sent out a couple of days ago). To weigh in on your specific discussion points (and renumber them :-))...
> 
> 1. Are there additional components that would be required for a Hadoop SSO service?
> Not that I can see.
> 
> 2. Should any of the above described components be considered not actually necessary or poorly described?
> I think this will be determined as we get into the details of each component. What you've described here is certainly an excellent starting point.
> 
> 3. Should we create a new umbrella Jira to identify each of these as a subtask?
> 4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
> What is described here seems to fit with 9533, though 9533 may contain some details that need further discussion. IMHO, it may be better to file a new umbrella Jira, though I'm not 100% convinced of that. Would be very interested in input from others.
> 
> 5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?
> Is 4 the right place to start? (4. Hadoop SSO Tokens: the exact shape and form of the sso tokens...) It seemed in some 1:1 conversations after the Summit meeting that others may agree with this. Would like to hear if that is the case more broadly.
> 
> -Brian
> 


RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Brian Swan <Br...@microsoft.com>.
Thanks, Larry, for starting this conversation (and thanks for the great Summit meeting summary you sent out a couple of days ago). To weigh in on your specific discussion points (and renumber them :-))...

1. Are there additional components that would be required for a Hadoop SSO service?
Not that I can see.

2. Should any of the above described components be considered not actually necessary or poorly described?
I think this will be determined as we get into the details of each component. What you've described here is certainly an excellent starting point.

3. Should we create a new umbrella Jira to identify each of these as a subtask?
4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
What is described here seems to fit with 9533, though 9533 may contain some details that need further discussion. IMHO, it may be better to file a new umbrella Jira, though I'm not 100% convinced of that. Would be very interested in input from others.

5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?
Is 4 the right place to start? (4. Hadoop SSO Tokens: the exact shape and form of the sso tokens...) It seemed in some 1:1 conversations after the Summit meeting that others may agree with this. Would like to hear if that is the case more broadly.

-Brian



hadoop-common build break

Posted by Amir Sanjar <v1...@us.ibm.com>.
Hi all,
when we build hadoop-common 2.0.4-alpha with a local maven repository while
specifying the -Pdocs (docs profile) option (mvn clean package -Pdocs
-DskipTests -s ../../settings.xml), we get the following error messages:

[WARNING] The POM for org.codehaus.gmaven:gmaven-plugin:jar:1.3 is invalid,
transitive dependencies (if any) will not be available, enable debug
logging for more details
[INFO] ****** FindBugsMojo execute *******
[INFO] canGenerate is true
[INFO] ****** FindBugsMojo executeFindbugs *******
[INFO] Temp File
is /home/sanjar/development/BI/hadoop-2.0.4/hadoop-common-project/hadoop-common/target/findbugsTemp.xml
[INFO]
------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO]
------------------------------------------------------------------------
[INFO] Total time: 16.152s
[INFO] Finished at: Sat Jul 27 18:39:32 CDT 2013
[INFO] Final Memory: 36M/60M
[INFO]
------------------------------------------------------------------------
[ERROR] Failed to execute goal
org.codehaus.mojo:findbugs-maven-plugin:2.3.2:findbugs (default) on project
hadoop-common: Execution default of goal
org.codehaus.mojo:findbugs-maven-plugin:2.3.2:findbugs failed: A required
class was missing while executing
org.codehaus.mojo:findbugs-maven-plugin:2.3.2:findbugs:
org.apache.tools.ant.input.InputHandler
[ERROR] -----------------------------------------------------
[ERROR] realm =    plugin>org.codehaus.mojo:findbugs-maven-plugin:2.3.2
[ERROR] strategy =
org.codehaus.plexus.classworlds.strategy.SelfFirstStrategy
[ERROR] urls[0] =
file:/home/sanjar/development/BI/hadoop-2.0.4/m2_repo/org/codehaus/mojo/findbugs-maven-plugin/2.3.2/findbugs-maven-plugin-2.3.2.jar
[ERROR] urls[1] =
file:/home/sanjar/development/BI/hadoop-2.0.4/m2_repo/com/google/code/findbugs/bcel/1.3.9/bcel-1.3.9.jar


The build completes successfully without the -Pdocs option. What is the missing
class? We have verified that class InputHandler is included in the ant**.jar file
within the local maven repository. Any help would be greatly appreciated.

Best Regards
Amir Sanjar

System Management Architect
PowerLinux Open Source Hadoop development lead
IBM Senior Software Engineer
Phone# 512-286-8393
Fax#      512-838-8858

RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by "Zheng, Kai" <ka...@intel.com>.
Got it, Suresh.

So I guess HADOOP-9797 (and its family) for the UGI change would fit this rule, right? The refactoring improves and cleans up UGI while also preparing for the TokenAuth feature. According to this rule, the changes would be in trunk first. Thanks for your guidance.

Regards,
Kai

-----Original Message-----
From: Suresh Srinivas [mailto:suresh@hortonworks.com] 
Sent: Thursday, September 05, 2013 2:42 PM
To: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

> One aside: if you come across a bug, please try to fix it upstream and 
> then merge into the feature branch rather than cherry-picking patches 
> or only fixing it on the branch. It becomes very awkward to track. -C


Related to this: when refactoring code, which is generally required for large feature development, consider doing the refactoring in trunk first and then making the feature-specific changes in the feature branch. This helps a lot with periodically merging trunk into the feature branch, and it also keeps the eventual merge of the feature back to trunk small and easier to review.

Regards,
Suresh

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Suresh Srinivas <su...@hortonworks.com>.
> One aside: if you come across a bug, please try to fix it upstream and
> then merge into the feature branch rather than cherry-picking patches
> or only fixing it on the branch. It becomes very awkward to track. -C


Related to this: when refactoring code, which is generally required for large
feature development, consider doing the refactoring in trunk first and then
making the feature-specific changes in the feature branch. This helps a lot
with periodically merging trunk into the feature branch, and it also keeps
the eventual merge of the feature back to trunk small and easier to review.

Regards,
Suresh


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Chris Douglas <cd...@apache.org>.
Larry, et al.-

> So, I guess the questions that immediately come to mind are:
> 1. Is there a document that describes the best way to do this?

I'm not aware of anything that speaks to this directly.

> 2. How best do we leverage code being done in one feature branch within
> another?

More than being easily reviewable, branches should be coherent. If two
features need to develop together, then they should be on the same
branch. It's just a mechanism to decouple progress on a feature from
development on trunk; the people invested in the minutiae of the
feature can reach consensus, commit a change, and keep going. It's
isolation effecting specialized evolution, where every intermediate
form need not be viable. Where that's appropriate, it can be a useful
tool, but it's also much heavier than attaching patches to JIRA.
Please don't feel obliged to use it where it doesn't make sense.

One aside: if you come across a bug, please try to fix it upstream and
then merge into the feature branch rather than cherry-picking patches
or only fixing it on the branch. It becomes very awkward to track. -C

On Wed, Sep 4, 2013 at 11:19 AM, Larry McCay <lm...@hortonworks.com> wrote:
> Chris -
>
> I am curious whether there are any guidelines for feature branch use.
>
> The general goals should be to:
> * keep branches as small and as easily reviewable as possible for a given
> feature
> * decouple the pluggable framework from any specific central server
> implementation
> * scope specific content into iterations that can be merged into trunk on
> their own and then development continued in new branches for the next
> iteration
>
> So, I guess the questions that immediately come to mind are:
> 1. Is there a document that describes the best way to do this?
> 2. How best do we leverage code being done in one feature branch within
> another?
>
> Thanks!
>
> --larry
>
>
>
> On Tue, Sep 3, 2013 at 10:00 PM, Zheng, Kai <ka...@intel.com> wrote:
>>
>> This looks good and reasonable to me. Thanks Chris.
>>
>> -----Original Message-----
>> From: Chris Douglas [mailto:cdouglas@apache.org]
>> Sent: Wednesday, September 04, 2013 6:45 AM
>> To: common-dev@hadoop.apache.org
>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>>
>> On Tue, Sep 3, 2013 at 5:20 AM, Larry McCay <lm...@hortonworks.com>
>> wrote:
>> > One outstanding question for me - how do we go about getting the
>> > branches created?
>>
>> Once a group has converged on a purpose- ideally with some initial code
>> from JIRA- please go ahead and create the feature branch in svn.
>> There's no ceremony. -C
>>
>> > On Tue, Aug 6, 2013 at 6:22 PM, Chris Nauroth
>> > <cn...@hortonworks.com>wrote:
>> >
>> >> Near the bottom of the bylaws, it states that addition of a "New
>> >> Branch Committer" requires "Lazy consensus of active PMC members."  I
>> >> think this means that you'll need to get a PMC member to sponsor the
>> >> vote for you.
>> >>  Regular committer votes happen on the private PMC mailing list, and
>> >> I assume it would be the same for a branch committer vote.
>> >>
>> >> http://hadoop.apache.org/bylaws.html
>> >>
>> >> Chris Nauroth
>> >> Hortonworks
>> >> http://hortonworks.com/
>> >>
>> >>
>> >>
>> >> On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay <lm...@hortonworks.com>
>> >> wrote:
>> >>
>> >> > That sounds perfect!
>> >> > I have been thinking of late that we would maybe need an incubator
>> >> project
>> >> > or something for this - which would be unfortunate.
>> >> >
>> >> > This would allow us to move much more quickly with a set of patches
>> >> broken
>> >> > up into consumable/understandable chunks that are made functional
>> >> > more easily within the branch.
>> >> > I assume that we need to start a separate thread for DISCUSS or
>> >> > VOTE to start that process - correct?
>> >> >
>> >> > On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com>
>> >> wrote:
>> >> >
>> >> > > yep, that is what I meant. Thanks Chris
>> >> > >
>> >> > >
>> >> > > On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <
>> >> cnauroth@hortonworks.com
>> >> > >wrote:
>> >> > >
>> >> > >> Perhaps this is also a good opportunity to try out the new
>> >> > >> "branch committers" clause in the bylaws, enabling
>> >> > >> non-committers who are
>> >> > working
>> >> > >> on this to commit to the feature branch.
>> >> > >>
>> >> > >>
>> >> > >>
>> >> >
>> >> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%
>> >> 3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%
>> >> 3E
>> >> > >>
>> >> > >> Chris Nauroth
>> >> > >> Hortonworks
>> >> > >> http://hortonworks.com/
>> >> > >>
>> >> > >>
>> >> > >>
>> >> > >> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur
>> >> > >> <tucu@cloudera.com
>> >> > >>> wrote:
>> >> > >>
>> >> > >>> Larry,
>> >> > >>>
>> >> > >>> Sorry for the delay answering. Thanks for laying down things,
>> >> > >>> yes, it
>> >> > >> makes
>> >> > >>> sense.
>> >> > >>>
>> >> > >>> Given the large scope of the changes, number of JIRAs and
>> >> > >>>> number of developers involved, wouldn't it make sense to create a
>> >> > >>> feature branch
>> >> for
>> >> > >> all
>> >> > >>> this work not to destabilize (more ;) trunk?
>> >> > >>>
>> >> > >>> Thanks again.
>> >> > >>>
>> >> > >>>
>> >> > >>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay
>> >> > >>> <lmccay@hortonworks.com
>> >> >
>> >> > >>> wrote:
>> >> > >>>
>> >> > >>>> The following JIRA was filed to provide a token and basic
>> >> > >>>> authority implementation for this effort:
>> >> > >>>> https://issues.apache.org/jira/browse/HADOOP-9781
>> >> > >>>>
>> >> > >>>> I have attached an initial patch though have yet to submit it
>> >> > >>>> as one
>> >> > >>> since
>> >> > >>>> it is dependent on the patch for CMF that was posted to:
>> >> > >>>> https://issues.apache.org/jira/browse/HADOOP-9534
>> >> > >>>> and this patch still has a couple outstanding issues - javac
>> >> warnings
>> >> > >> for
>> >> > >>>> com.sun classes for certificate generation and 11 javadoc
>> >> warnings.
>> >> > >>>>
>> >> > >>>> Please feel free to review the patches and raise any questions
>> >> > >>>> or
>> >> > >>> concerns
>> >> > >>>> related to them.
>> >> > >>>>
>> >> > >>>> On Jul 26, 2013, at 8:59 PM, Larry McCay
>> >> > >>>> <lm...@hortonworks.com>
>> >> > >> wrote:
>> >> > >>>>
>> >> > >>>>> Hello All -
>> >> > >>>>>
>> >> > >>>>> In an effort to scope an initial iteration that provides
>> >> > >>>>> value to
>> >> the
>> >> > >>>> community while focusing on the pluggable authentication
>> >> > >>>> aspects,
>> >> I've
>> >> > >>>> written a description for "Iteration 1". It identifies the
>> >> > >>>> goal of
>> >> the
>> >> > >>>> iteration, the endstate and a set of initial usecases. It also
>> >> > >> enumerates
>> >> > >>>> the components that are required for each usecase. There is a
>> >> > >>>> scope
>> >> > >>> section
>> >> > >>>> that details specific things that should be kept out of the
>> >> > >>>> first iteration. This is certainly up for discussion. There
>> >> > >>>> may be some of
>> >> > >>> these
>> >> > >>>> things that can be contributed in short order. If we can add
>> >> > >>>> some
>> >> > >> things
>> >> > >>> in
>> >> > >>>> without unnecessary complexity for the identified usecases
>> >> > >>>> then we
>> >> > >>> should.
>> >> > >>>>>
>> >> > >>>>> @Alejandro - please review this and see whether it satisfies
>> >> > >>>>> your
>> >> > >> point
>> >> > >>>> for a definition of what we are building.
>> >> > >>>>>
>> >> > >>>>> In addition to the document that I will paste here as text
>> >> > >>>>> and
>> >> > >> attach a
>> >> > >>>> pdf version, we have a couple patches for components that are
>> >> > >> identified
>> >> > >>> in
>> >> > >>>> the document.
>> >> > >>>>> Specifically, COMP-7 and COMP-8.
>> >> > >>>>>
>> >> > >>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which
>> >> > >>>>> was
>> >> > >> filed
>> >> > >>>> specifically for that functionality.
>> >> > >>>>> COMP-7 is a small set of classes to introduce JsonWebToken as
>> >> > >>>>> the
>> >> > >> token
>> >> > >>>> format and a basic JsonWebTokenAuthority that can issue and
>> >> > >>>> verify
>> >> > >> these
>> >> > >>>> tokens.
>> >> > >>>>>
>> >> > >>>>> Since there is no JIRA for this yet, I will likely file a new
>> >> > >>>>> JIRA
>> >> > >> for
>> >> > >>> a
>> >> > >>>> SSO token implementation.
>> >> > >>>>>
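The issue/verify contract that COMP-7's JsonWebTokenAuthority provides can be sketched roughly as follows. This toy signs with a shared HMAC secret purely for illustration, whereas the proposal would sign JWTs with PKI-managed keys (COMP-8); every name and claim layout below is hypothetical, not the patch's actual API.

```python
# Toy sketch of an issue/verify token authority (illustrative only; the real
# JsonWebTokenAuthority uses PKI signing keys, not a shared HMAC secret).
import base64, hashlib, hmac, json, time

SECRET = b"demo-signing-key"  # stand-in for a managed signing key

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue(subject: str, audience: str, ttl_seconds: int = 3600) -> str:
    """Issue a signed token binding a subject to an audience (service)."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = _b64(json.dumps({"sub": subject, "aud": audience,
                              "exp": int(time.time()) + ttl_seconds}).encode())
    signing_input = f"{header}.{claims}".encode()
    sig = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{claims}.{sig}"

def verify(token: str, audience: str) -> dict:
    """Verify signature, expiry, and intended audience; return the claims."""
    header, claims, sig = token.split(".")
    signing_input = f"{header}.{claims}".encode()
    expected = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    pad = "=" * (-len(claims) % 4)
    decoded = json.loads(base64.urlsafe_b64decode(claims + pad))
    if decoded["exp"] < time.time():
        raise ValueError("token expired")
    if decoded["aud"] != audience:
        raise ValueError("token issued for a different service")
    return decoded

token = issue("alice", "namenode")
print(verify(token, "namenode")["sub"])  # alice
```

Rejecting a token whose "aud" claim names a different service is what lets the usecases below block access when an access token intended for another service is presented.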
>> >> > >>>>> Both of these patches assume to be modules within
>> >> > >>>> hadoop-common/hadoop-common-project.
>> >> > >>>>> While they are relatively small, I think that they will be
>> >> > >>>>> pulled
>> >> in
>> >> > >> by
>> >> > >>>> other modules such as hadoop-auth which would likely not want
>> >> > >>>> a
>> >> > >>> dependency
>> >> > >>>> on something larger like
>> >> > >>> hadoop-common/hadoop-common-project/hadoop-common.
>> >> > >>>>>
>> >> > >>>>> This is certainly something that we should discuss within the
>> >> > >> community
>> >> > >>>> for this effort though - that being, exactly how to add these
>> >> > libraries
>> >> > >>> so
>> >> > >>>> that they are most easily consumed by existing projects.
>> >> > >>>>>
>> >> > >>>>> Anyway, the following is the Iteration-1 document - it is
>> >> > >>>>> also
>> >> > >> attached
>> >> > >>>> as a pdf:
>> >> > >>>>>
>> >> > >>>>> Iteration 1: Pluggable User Authentication and Federation
>> >> > >>>>>
>> >> > >>>>> Introduction
>> >> > >>>>> The intent of this effort is to bootstrap the development of
>> >> > >> pluggable
>> >> > >>>> token-based authentication mechanisms to support certain goals
>> >> > >>>> of enterprise authentication integrations. By restricting the
>> >> > >>>> scope of
>> >> > >> this
>> >> > >>>> effort, we hope to provide immediate benefit to the community
>> >> > >>>> while
>> >> > >>> keeping
>> >> > >>>> the initial contribution to a manageable size that can be
>> >> > >>>> easily
>> >> > >>> reviewed,
>> >> > >>>> understood and extended with further development through
>> >> > >>>> follow up
>> >> > >> JIRAs
>> >> > >>>> and related iterations.
>> >> > >>>>>
>> >> > >>>>> Iteration Endstate
>> >> > >>>>> Once complete, this effort will have extended the
>> >> > >>>>> authentication
>> >> > >>>> mechanisms - for all client types - from the existing: Simple,
>> >> > Kerberos
>> >> > >>> and
>> >> > >>>> Plain (for RPC) to include LDAP authentication and SAML based
>> >> > >> federation.
>> >> > >>>> In addition, the ability to provide additional/custom
>> >> > >>>> authentication mechanisms will be enabled for users to plug in
>> >> > >>>> their preferred
>> >> > >>> mechanisms.
>> >> > >>>>>
>> >> > >>>>> Project Scope
>> >> > >>>>> The scope of this effort is a subset of the features covered
>> >> > >>>>> by the
>> >> > >>>> overviews of HADOOP-9392 and HADOOP-9533. This effort
>> >> > >>>> concentrates
>> >> on
>> >> > >>>> enabling Hadoop to issue, accept/validate SSO tokens of its
>> >> > >>>> own. The pluggable authentication mechanism within SASL/RPC
>> >> > >>>> layer and the authentication filter pluggability for REST and
>> >> > >>>> UI components will
>> >> be
>> >> > >>>> leveraged and extended to support the results of this effort.
>> >> > >>>>>
>> >> > >>>>> Out of Scope
>> >> > >>>>> In order to scope the initial deliverable as the minimally
>> >> > >>>>> viable
>> >> > >>>> product, a handful of things have been simplified or left out
>> >> > >>>> of
>> >> scope
>> >> > >>> for
>> >> > >>>> this effort. This is not meant to say that these aspects are
>> >> > >>>> not
>> >> > useful
>> >> > >>> or
>> >> > >>>> not needed but that they are not necessary for this iteration.
>> >> > >>>> We do however need to ensure that we don't do anything to
>> >> > >>>> preclude adding
>> >> > >> them
>> >> > >>> in
>> >> > >>>> future iterations.
>> >> > >>>>> 1. Additional Attributes - the result of authentication will
>> >> continue
>> >> > >>> to
>> >> > >>>> use the existing hadoop tokens and identity representations.
>> >> > Additional
>> >> > >>>> attributes used for finer grained authorization decisions will
>> >> > >>>> be
>> >> > added
>> >> > >>>> through follow-up efforts.
>> >> > >>>>> 2. Token revocation - the ability to revoke issued identity
>> >> > >>>>> tokens
>> >> > >> will
>> >> > >>>> be added later
>> >> > >>>>> 3. Multi-factor authentication - this will likely require
>> >> additional
>> >> > >>>> attributes and is not necessary for this iteration.
>> >> > >>>>> 4. Authorization changes - we will require additional
>> >> > >>>>> attributes
>> >> for
>> >> > >>> the
>> >> > >>>> fine-grained access control plans. This is not needed for this
>> >> > >> iteration.
>> >> > >>>>> 5. Domains - we assume a single flat domain for all users
>> >> > >>>>> 6. Kinit alternative - we can leverage existing REST clients
>> >> > >>>> such as cURL to retrieve tokens through authentication and
>> >> > >>>> federation for the time being
>> >> > >>>>> 7. A specific authentication framework isn't really necessary
>> >> within
>> >> > >>> the
>> >> > >>>> REST endpoints for this iteration. If one is available then we
>> >> > >>>> can
>> >> use
>> >> > >> it
>> >> > >>>> otherwise we can leverage existing things like Apache Shiro
>> >> > >>>> within a servlet filter.
>> >> > >>>>>
>> >> > >>>>> In Scope
>> >> > >>>>> What is in scope for this effort is defined by the usecases
>> >> described
>> >> > >>>> below. Components required for supporting the usecases are
>> >> summarized
>> >> > >> for
>> >> > >>>> each client type. Each component is a candidate for a JIRA
>> >> > >>>> subtask -
>> >> > >>> though
>> >> > >>>> multiple components are likely to be included in a JIRA to
>> >> represent a
>> >> > >>> set
>> >> > >>>> of functionality rather than individual JIRAs per component.
>> >> > >>>>>
>> >> > >>>>> Terminology and Naming
>> >> > >>>>> The terms and names of components within this document are
>> >> > >>>>> merely
>> >> > >>>> descriptive of the functionality that they represent. Any
>> >> > >>>> similarity
>> >> > or
>> >> > >>>> difference in names or terms from those that are found in
>> >> > >>>> other
>> >> > >> documents
>> >> > >>>> are not intended to make any statement about those other
>> >> > >>>> documents
>> >> or
>> >> > >> the
>> >> > >>>> descriptions within. This document represents the pluggable
>> >> > >>> authentication
>> >> > >>>> mechanisms and server functionality required to replace
>> >> > >>>> Kerberos.
>> >> > >>>>>
>> >> > >>>>> Ultimately, the naming of the implementation classes will be
>> >> > >>>>> a
>> >> > >> product
>> >> > >>>> of the patches accepted by the community.
>> >> > >>>>>
>> >> > >>>>> Usecases:
>> >> > >>>>> client types: REST, CLI, UI
>> >> > >>>>> authentication types: Simple, Kerberos, authentication/LDAP,
>> >> > >>>> federation/SAML
>> >> > >>>>>
>> >> > >>>>> Simple and Kerberos
>> >> > >>>>> Simple and Kerberos usecases continue to work as they do
>> >> > >>>>> today. The
>> >> > >>>> addition of Authentication/LDAP and Federation/SAML are added
>> >> through
>> >> > >> the
>> >> > >>>> existing pluggability points either as they are or with
>> >> > >>>> required
>> >> > >>> extension.
>> >> > >>>> Either way, continued support for Simple and Kerberos must not
>> >> require
>> >> > >>>> changes to existing deployments in the field as a result of
>> >> > >>>> this
>> >> > >> effort.
>> >> > >>>>>
>> >> > >>>>> REST
>> >> > >>>>> USECASE REST-1 Authentication/LDAP:
>> >> > >>>>> For REST clients, we will provide the ability to:
>> >> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
>> >> exposed
>> >> > >> by
>> >> > >>>> an AuthenticationServer instance via REST calls to:
>> >> > >>>>>   a. authenticate - passing username/password returning a
>> >> > >>>>> hadoop
>> >> > >>>> id_token
>> >> > >>>>>   b. get-access-token - from the TokenGrantingService by
>> >> > >>>>> passing
>> >> the
>> >> > >>>> hadoop id_token as an Authorization: Bearer token along with
>> >> > >>>> the
>> >> > >> desired
>> >> > >>>> service name (master service name) returning a hadoop access
>> >> > >>>> token
>> >> > >>>>> 2. Successfully invoke a hadoop service REST API passing the
>> >> > >>>>> hadoop
>> >> > >>>> access token through an HTTP header as an Authorization Bearer
>> >> > >>>> token
>> >> > >>>>>   a. validation of the incoming token on the service endpoint
>> >> > >>>>> is
>> >> > >>>> accomplished by an SSOAuthenticationHandler
>> >> > >>>>> 3. Successfully block access to a REST resource when
>> >> > >>>>> presenting a
>> >> > >>> hadoop
>> >> > >>>> access token intended for a different service
>> >> > >>>>>   a. validation of the incoming token on the service endpoint
>> >> > >>>>> is
>> >> > >>>> accomplished by an SSOAuthenticationHandler
>> >> > >>>>>
>> >> > >>>>> USECASE REST-2 Federation/SAML:
>> >> > >>>>> We will also provide federation capabilities for REST clients
>> >> > >>>>> such
>> >> > >>> that:
>> >> > >>>>> 1. acquire SAML assertion token from a trusted IdP
>> >> > >>>>> (shibboleth?)
>> >> and
>> >> > >>>> persist in a permissions protected file - ie.
>> >> > >> ~/.hadoop_tokens/.idp_token
>> >> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an
>> >> > >>>>> SP
>> >> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
>> >> > instance
>> >> > >>> via
>> >> > >>>> REST calls to:
>> >> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
>> >> Bearer
>> >> > >>>> token returning a hadoop id_token
>> >> > >>>>>      - can copy and paste from commandline or use cat to
>> >> > >>>>> include
>> >> > >>>> the persisted token through -H "Authorization: Bearer $(cat
>> >> > >>>> ~/.hadoop_tokens/.idp_token)"
>> >> > >>>>>   b. get-access-token - from the TokenGrantingService by
>> >> > >>>>> passing
>> >> the
>> >> > >>>> hadoop id_token as an Authorization: Bearer token along with
>> >> > >>>> the
>> >> > >> desired
>> >> > >>>> service name (master service name), returning a
>> >> > >>>> hadoop access token
>> >> > >>>>> 3. Successfully invoke a hadoop service REST API passing the
>> >> > >>>>> hadoop
>> >> > >>>> access token through an HTTP header as an Authorization Bearer
>> >> > >>>> token
>> >> > >>>>>   a. validation of the incoming token on the service endpoint
>> >> > >>>>> is
>> >> > >>>> accomplished by an SSOAuthenticationHandler
>> >> > >>>>> 4. Successfully block access to a REST resource when
>> >> > >>>>> presenting a
>> >> > >>> hadoop
>> >> > >>>> access token intended for a different service
>> >> > >>>>>   a. validation of the incoming token on the service endpoint
>> >> > >>>>> is
>> >> > >>>> accomplished by an SSOAuthenticationHandler
>> >> > >>>>>
>> >> > >>>>> REQUIRED COMPONENTS for REST USECASES:
>> >> > >>>>> COMP-1. REST client - cURL or similar
>> >> > >>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP
>> >> > >>>> endpoint example - returning hadoop id_token
>> >> > >>>>> COMP-3. REST endpoint for federation with SAML Bearer token -
>> >> > >>> shibboleth
>> >> > >>>> SP?|OpenSAML? - returning hadoop id_token
>> >> > >>>>> COMP-4. REST TokenGrantingServer endpoint for acquiring
>> >> > >>>>> hadoop
>> >> access
>> >> > >>>> tokens from hadoop id_tokens
>> >> > >>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop
>> >> > >>>>> access
>> >> > >>>> tokens
>> >> > >>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
>> >> > >>>>> COMP-7. hadoop token and authority implementations
>> >> > >>>>> COMP-8. core services for crypto support for signing, verifying
>> >> > >>>>> and PKI management
>> >> > >>>>>
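The two-step flow behind COMP-2 and COMP-4 (authenticate for a hadoop id_token, then exchange it at the TokenGrantingService for a service-scoped access token) can be sketched as a minimal, self-contained demo. The endpoint paths, JSON payloads, token strings, and the toy in-memory server below are all hypothetical stand-ins, not the proposed implementation; a real AuthenticationServer would perform an LDAP bind and issue signed tokens.

```python
# Toy two-step token flow: authenticate -> id_token, then
# get-access-token (Bearer id_token) -> service-scoped access token.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ISSUED_ID_TOKENS = set()

class ToyAuthServer(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        if self.path == "/authenticate":
            creds = json.loads(body)
            # A real server would do an LDAP bind here.
            if creds.get("username") == "alice" and creds.get("password") == "secret":
                token = "id-token-for-alice"
                ISSUED_ID_TOKENS.add(token)
                self._reply(200, {"id_token": token})
            else:
                self._reply(401, {"error": "bad credentials"})
        elif self.path == "/get-access-token":
            auth = self.headers.get("Authorization", "")
            service = json.loads(body).get("service")
            if auth.startswith("Bearer ") and auth[7:] in ISSUED_ID_TOKENS:
                self._reply(200, {"access_token": f"access-token-for-{service}"})
            else:
                self._reply(401, {"error": "invalid id_token"})

    def _reply(self, status, payload):
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):  # keep output quiet
        pass

def post(url, payload, headers=None):
    req = urllib.request.Request(url, json.dumps(payload).encode(),
                                 headers or {}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

server = HTTPServer(("127.0.0.1", 0), ToyAuthServer)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

# Step 1: authenticate with username/password, receive a hadoop id_token.
id_token = post(f"{base}/authenticate",
                {"username": "alice", "password": "secret"})["id_token"]

# Step 2: exchange the id_token (as a Bearer token) for a service-scoped
# access token from the TokenGrantingService.
access = post(f"{base}/get-access-token", {"service": "namenode"},
              {"Authorization": f"Bearer {id_token}"})["access_token"]
print(access)
server.shutdown()
```

The same exchange is what the CLI usecases below drive from the RPC client: only the transport for presenting the id_token differs.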
>> >> > >>>>> CLI
>> >> > >>>>> USECASE CLI-1 Authentication/LDAP:
>> >> > >>>>> For CLI/RPC clients, we will provide the ability to:
>> >> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
>> >> exposed
>> >> > >> by
>> >> > >>>> an AuthenticationServer instance via REST calls to:
>> >> > >>>>>   a. authenticate - passing username/password returning a
>> >> > >>>>> hadoop
>> >> > >>>> id_token
>> >> > >>>>>      - for RPC clients we need to persist the returned hadoop
>> >> > >> identity
>> >> > >>>> token in a file protected by fs permissions so that it may be
>> >> > leveraged
>> >> > >>>> until expiry
>> >> > >>>>>      - directing the returned response to a file may suffice
>> >> > >>>>> for
>> >> now
>> >> > >>>> something like ">~/.hadoop_tokens/.id_token"
>> >> > >>>>> 2. use hadoop CLI to invoke RPC API on a specific hadoop
>> >> > >>>>> service
>> >> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL
>> >> > >>>>> layer,
>> >> > >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token
>> >> > >>>> and
>> >> passed
>> >> > >> as
>> >> > >>>> Authorization: Bearer token to the get-access-token REST
>> >> > >>>> endpoint
>> >> > >> exposed
>> >> > >>>> by TokenGrantingService returning a hadoop access token
>> >> > >>>>>   b. RPC server side validates the presented hadoop access
>> >> > >>>>> token
>> >> and
>> >> > >>>> continues to serve request
>> >> > >>>>>   c. Successfully invoke a hadoop service RPC API
>> >> > >>>>>
>> >> > >>>>> USECASE CLI-2 Federation/SAML:
>> >> > >>>>> For CLI/RPC clients, we will provide the ability to:
>> >> > >>>>> 1. acquire SAML assertion token from a trusted IdP
>> >> > >>>>> (shibboleth?)
>> >> and
>> >> > >>>> persist in a permissions protected file - ie.
>> >> > >> ~/.hadoop_tokens/.idp_token
>> >> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an
>> >> > >>>>> SP
>> >> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
>> >> > instance
>> >> > >>> via
>> >> > >>>> REST calls to:
>> >> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
>> >> Bearer
>> >> > >>>> token returning a hadoop id_token
>> >> > >>>>>      - can copy and paste from commandline or use cat to
>> >> > >>>>> include
>> >> > >>>> the previously persisted token through -H "Authorization:
>> >> > >>>> Bearer
>> >> > >>>> $(cat ~/.hadoop_tokens/.idp_token)"
>> >> > >>>>> 3. use hadoop CLI to invoke RPC API on a specific hadoop
>> >> > >>>>> service
>> >> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL
>> >> > >>>>> layer,
>> >> > >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token
>> >> > >>>> and
>> >> passed
>> >> > >> as
>> >> > >>>> Authorization: Bearer token to the get-access-token REST
>> >> > >>>> endpoint
>> >> > >> exposed
>> >> > >>>> by TokenGrantingService returning a hadoop access token
>> >> > >>>>>   b. RPC server side validates the presented hadoop access
>> >> > >>>>> token
>> >> and
>> >> > >>>> continues to serve request
>> >> > >>>>>   c. Successfully invoke a hadoop service RPC API
>> >> > >>>>>
>> >> > >>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required
>> >> > >>>>> for
>> >> > >>> REST):
>> >> > >>>>> COMP-9. TokenAuth Method negotiation, etc
>> >> > >>>>> COMP-10. Client side implementation to leverage REST endpoint for
>> >> > >>>> acquiring hadoop access tokens given a hadoop id_token
>> >> > >>>>> COMP-11. Server side implementation to validate incoming
>> >> > >>>>> hadoop
>> >> > >> access
>> >> > >>>> tokens
>> >> > >>>>>
>> >> > >>>>> UI
>> >> > >>>>> Various Hadoop services have their own web UI consoles for
>> >> > >>>> administration and end user interactions. These consoles need
>> >> > >>>> to
>> >> also
>> >> > >>>> benefit from the pluggability of authentication mechanisms to
>> >> > >>>> be on
>> >> > par
>> >> > >>>> with the access control of the cluster REST and RPC APIs.
>> >> > >>>>> Web consoles are protected with an
>> >> > >>>>> WebSSOAuthenticationHandler
>> >> which
>> >> > >>>> will be configured for either authentication or federation.
>> >> > >>>>>
>> >> > >>>>> USECASE UI-1 Authentication/LDAP:
>> >> > >>>>> For the authentication usecase:
>> >> > >>>>> 1. User's browser requests access to a UI console page
>> >> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
>> >> > >>>>> redirects
>> >> > >> the
>> >> > >>>> browser to an IdP web endpoint exposed by the
>> >> > >>>> AuthenticationServer
>> >> > >>> passing
>> >> > >>>> the requested url as the redirect_url
>> >> > >>>>> 3. IdP web endpoint presents the user with a FORM over https
>> >> > >>>>>   a. user provides username/password and submits the FORM
>> >> > >>>>> 4. AuthenticationServer authenticates the user with provided
>> >> > >>> credentials
>> >> > >>>> against the configured LDAP server and:
>> >> > >>>>>   a. leverages a servlet filter or other authentication
>> >> > >>>>> mechanism
>> >> > >> for
>> >> > >>>> the endpoint and authenticates the user with a simple LDAP
>> >> > >>>> bind with username and password
>> >> > >>>>>   b. acquires a hadoop id_token and uses it to acquire the
>> >> > >>>>> required
>> >> > >>>> hadoop access token which is added as a cookie
>> >> > >>>>>   c. redirects the browser to the original service UI
>> >> > >>>>> resource via
>> >> > >> the
>> >> > >>>> provided redirect_url
>> >> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
>> >> > >>> interrogates
>> >> > >>>> the incoming request again for an authcookie that contains an
>> >> > >>>> access
>> >> > >>> token;
>> >> > >>>> upon finding one:
>> >> > >>>>>   a. validates the incoming token
>> >> > >>>>>   b. returns the AuthenticationToken as per
>> >> > >>>>> AuthenticationHandler
>> >> > >>>> contract
>> >> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with
>> >> > >>>>> the
>> >> > >>> expected
>> >> > >>>> token
>> >> > >>>>>   d. serves requested resource for valid tokens
>> >> > >>>>>   e. subsequent requests are handled by the
>> >> > >>>>> AuthenticationFilter
>> >> > >>>> recognition of the hadoop auth cookie
>> >> > >>>>>
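The WebSSOAuthenticationHandler decision in UI-1 (serve the resource when a valid access token rides in on the auth cookie, otherwise redirect the browser to the IdP carrying the original URL as redirect_url) condenses to a small sketch. The cookie name, parameter name, IdP URL, and token values below are made up for illustration; real validation would verify a signed token rather than check set membership.

```python
# Sketch of the cookie-check-or-redirect decision from UI-1.
from urllib.parse import urlencode

VALID_TOKENS = {"access-token-for-webui"}  # stand-in for real token validation
IDP_LOGIN_URL = "https://idp.example.com/login"

def handle_request(requested_url: str, cookies: dict) -> tuple:
    token = cookies.get("hadoop-auth")
    if token in VALID_TOKENS:
        return 200, f"serving {requested_url}"   # step 5d: valid token
    # steps 2-3: no/invalid cookie -> redirect to the IdP FORM login,
    # carrying the originally requested URL so the IdP can send us back.
    return 302, IDP_LOGIN_URL + "?" + urlencode({"redirect_url": requested_url})

status, location = handle_request("/jobtracker", cookies={})
print(status, location)
status, body = handle_request("/jobtracker",
                              cookies={"hadoop-auth": "access-token-for-webui"})
print(status, body)
```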
>> >> > >>>>> USECASE UI-2 Federation/SAML:
>> >> > >>>>> For the federation usecase:
>> >> > >>>>> 1. User's browser requests access to a UI console page
>> >> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
>> >> > >>>>> redirects
>> >> > >> the
>> >> > >>>> browser to an SP web endpoint exposed by the
>> >> > >>>> AuthenticationServer
>> >> > >> passing
>> >> > >>>> the requested url as the redirect_url. This endpoint:
>> >> > >>>>>   a. is dedicated to redirecting to the external IdP passing
>> >> > >>>>> the
>> >> > >>>> required parameters which may include a redirect_url back to
>> >> > >>>> itself
>> >> as
>> >> > >>> well
>> >> > >>>> as encoding the original redirect_url so that it can determine
>> >> > >>>> it on
>> >> > >> the
>> >> > >>>> way back to the client
>> >> > >>>>> 3. the IdP:
>> >> > >>>>>   a. challenges the user for credentials and authenticates the
>> >> > >>>>> user
>> >> > >>>>>   b. creates appropriate token/cookie and redirects back to
>> >> > >>>>> the
>> >> > >>>> AuthenticationServer endpoint
>> >> > >>>>> 4. AuthenticationServer endpoint:
>> >> > >>>>>   a. extracts the expected token/cookie from the incoming
>> >> > >>>>> request
>> >> > >> and
>> >> > >>>> validates it
>> >> > >>>>>   b. creates a hadoop id_token
>> >> > >>>>>   c. acquires a hadoop access token for the id_token
>> >> > >>>>>   d. creates appropriate cookie and redirects back to the
>> >> > >>>>> original
>> >> > >>>> redirect_url - being the requested resource
>> >> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
>> >> > >>> interrogates
>> >> > >>>> the incoming request again for an authcookie that contains an
>> >> > >>>> access
>> >> > >>> token;
>> >> > >>>> upon finding one:
>> >> > >>>>>   a. validates the incoming token
>> >> > >>>>>   b. returns the AuthenticationToken as per
>> >> > >>>>> AuthenticationHandler
>> >> > >>>> contract
>> >> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with
>> >> > >>>>> the
>> >> > >>> expected
>> >> > >>>> token
>> >> > >>>>>   d. serves requested resource for valid tokens
>> >> > >>>>>   e. subsequent requests are handled by the
>> >> > >>>>> AuthenticationFilter
>> >> > >>>> recognition of the hadoop auth cookie
>> >> > >>>>> REQUIRED COMPONENTS for UI USECASES:
>> >> > >>>>> COMP-12. WebSSOAuthenticationHandler
>> >> > >>>>> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM
>> >> > >>>> based login
>> >> > >>>>> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd
>> >> > >>>>> party
>> >> > >>> token
>> >> > >>>> federation
>> >> > >>>>>
>> >> > >>>>>
>> >> > >>>>>
>> >> > >>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
>> >> > >> Brian.Swan@microsoft.com>
>> >> > >>>> wrote:
>> >> > >>>>> Thanks, Larry. That is what I was trying to say, but you've
>> >> > >>>>> said it
>> >> > >>>> better and in more detail. :-) To extract from what you are
>> >> > >>>> saying:
>> >> > "If
>> >> > >>> we
>> >> > >>>> were to reframe the immediate scope to the lowest common
>> >> > >>>> denominator
>> >> > of
>> >> > >>>> what is needed for accepting tokens in authentication plugins
>> >> > >>>> then
>> >> we
>> >> > >>>> gain... an end-state for the lowest common denominator that
>> >> > >>>> enables
>> >> > >> code
>> >> > >>>> patches in the near-term is the best of both worlds."
>> >> > >>>>>
>> >> > >>>>> -Brian
>> >> > >>>>>
>> >> > >>>>> -----Original Message-----
>> >> > >>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>> >> > >>>>> Sent: Wednesday, July 10, 2013 10:40 AM
>> >> > >>>>> To: common-dev@hadoop.apache.org
>> >> > >>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
>> >> > >>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>> >> > >>>>>
>> >> > >>>>> It seems to me that we can have the best of both worlds here...it's all about the scoping.
>> >> > >>>>>
>> >> > >>>>> If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain:
>> >> > >>>>>
>> >> > >>>>> 1. a very manageable scope to define and agree upon
>> >> > >>>>> 2. a deliverable that should be useful in and of itself
>> >> > >>>>> 3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community
>> >> > >>>>>
>> >> > >>>>> So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the "what" we are building instead of the "how" to build it.
>> >> > >>>>> Including:
>> >> > >>>>> a. project structure within hadoop-common-project/common-security or the like
>> >> > >>>>> b. the usecases that would need to be enabled to make it a self contained and useful contribution - without higher level solutions
>> >> > >>>>> c. the JIRA/s for contributing patches
>> >> > >>>>> d. what specific patches will be needed to accomplish the usecases in #b
>> >> > >>>>>
>> >> > >>>>> In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds.
>> >> > >>>>>
>> >> > >>>>> I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once.
>> >> > >>>>>
>> >> > >>>>> @Alejandro - if you have something else in mind that would bootstrap this process - that would be great - please advise.
>> >> > >>>>>
>> >> > >>>>> thoughts?
>> >> > >>>>>
>> >> > >>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan
>> >> > >>>>> <Br...@microsoft.com>
>> >> > >>>> wrote:
>> >> > >>>>>
>> >> > >>>>>> Hi Alejandro, all-
>> >> > >>>>>>
>> >> > >>>>>> There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of "what we are aiming for" forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything).
>> >> > >>>>>>
>> >> > >>>>>> Thanks.
>> >> > >>>>>>
>> >> > >>>>>> -Brian
>> >> > >>>>>>
>> >> > >>>>>> -----Original Message-----
>> >> > >>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
>> >> > >>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
>> >> > >>>>>> To: Larry McCay
>> >> > >>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai
>> >> > >>>>>> Zheng
>> >> > >>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>> >> > >>>>>>
>> >> > >>>>>> Larry, all,
>> >> > >>>>>>
>> >> > >>>>>> It is still not clear to me what end state we are aiming for, or that we even agree on that.
>> >> > >>>>>>
>> >> > >>>>>> IMO, instead of trying to agree on what to do, we should first agree on the final state, then we see what should change, then we see how we change things to get there.
>> >> > >>>>>>
>> >> > >>>>>> The different documents out there focus more on how.
>> >> > >>>>>>
>> >> > >>>>>> We should not try to say how before we know what.
>> >> > >>>>>>
>> >> > >>>>>> Thx.
>> >> > >>>>>>
>> >> > >>>>>>
>> >> > >>>>>>
>> >> > >>>>>>
>> >> > >>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
>> >> > >> lmccay@hortonworks.com
>> >> > >>>>
>> >> > >>>> wrote:
>> >> > >>>>>>
>> >> > >>>>>>> All -
>> >> > >>>>>>>
>> >> > >>>>>>> After combing through this thread - as well as the summit session summary thread - I think that we have the following two items that we can probably move forward with:
>> >> > >>>>>>>
>> >> > >>>>>>> 1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
>> >> > >>>>>>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>> >> > >>>>>>>
>> >> > >>>>>>> I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort.
>> >> > >>>>>>>
>> >> > >>>>>>> @Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well.
>> >> > >>>>>>> @Kai - do you have existing code for the pluggable token authentication mechanism - if not, we can take a stab at representing it with interfaces and/or POC code.
>> >> > >>>>>>> I can stand up and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth.
>> >> > >>>>>>>
>> >> > >>>>>>> These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision.
>> >> > >>>>>>>
>> >> > >>>>>>> In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask for this particular aspect/s - I will leave that detail to Kai.
>> >> > >>>>>>>
>> >> > >>>>>>> @Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction.
>> >> > >>>>>>>
>> >> > >>>>>>> thanks,
>> >> > >>>>>>>
>> >> > >>>>>>> --larry
>> >> > >>>>>>>
>> >> > >>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay
>> >> > >>>>>>> <lm...@hortonworks.com>
>> >> > >>>> wrote:
>> >> > >>>>>>>
>> >> > >>>>>>>> Hi Andy -
>> >> > >>>>>>>>
>> >> > >>>>>>>>> Happy Fourth of July to you and yours.
>> >> > >>>>>>>>
>> >> > >>>>>>>> Same to you and yours. :-) We had some fun in the sun for
>> >> > >>>>>>>> a change - we've had nothing but
>> >> > >>> rain
>> >> > >>>>>>>> on
>> >> > >>>>>>> the east coast lately.
>> >> > >>>>>>>>
>> >> > >>>>>>>>> My concern here is there may have been a misinterpretation or lack of consensus on what is meant by "clean slate"
>> >> > >>>>>>>>
>> >> > >>>>>>>> Apparently so.
>> >> > >>>>>>>> On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from.
>> >> > >>>>>>>>
>> >> > >>>>>>>> You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and ours as well) - and approach the community discussion from a clean slate.
>> >> > >>>>>>>> We seemed to do this at the summit session quite well.
>> >> > >>>>>>>> It was my understanding that this community discussion would live beyond the summit and continue on this list.
>> >> > >>>>>>>>
>> >> > >>>>>>>> While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts.
>> >> > >>>>>>>>
>> >> > >>>>>>>> I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here.
>> >> > >>>>>>>>
>> >> > >>>>>>>> If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste any more time trying to facilitate a community discussion that is apparently not wanted.
>> >> > >>>>>>>>
>> >> > >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and such, which have been disappointing to see crop up; we should be collaboratively coding, not planting flags.
>> >> > >>>>>>>>
>> >> > >>>>>>>> I don't know what you mean by self-appointed master JIRAs. It has certainly not been anyone's intention to disappoint. Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked.
>> >> > >>>>>>>>
>> >> > >>>>>>>> Planting flags... I need to go back and read my discussion point about the JIRA and see how this is the impression that was made.
>> >> > >>>>>>>> That is not how I define success. The only flags that count are code. What we are lacking is the roadmap on which to put the code.
>> >> > >>>>>>>>
>> >> > >>>>>>>>> I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document.
>> >> > >>>>>>>>> Perhaps he and it can be given equal share of the consideration.
>> >> > >>>>>>>>
>> >> > >>>>>>>> I definitely read it as something that has evolved into something approaching what we have been talking about so far. There has not, however, been enough discussion anywhere near the level of detail in that document, and more details are needed for each component in the design.
>> >> > >>>>>>>> Why the work in that document should not be fed into the community discussion as anyone else's would be - I fail to understand.
>> >> > >>>>>>>>
>> >> > >>>>>>>> My suggestion continues to be that you should take that document and speak to the inventory of moving parts as we agreed.
>> >> > >>>>>>>> As these are agreed upon, we will ensure that the appropriate subtasks are filed against whatever JIRA is to host them - don't really care much which it is.
>> >> > >>>>>>>>
>> >> > >>>>>>>> I don't really want to continue with two separate JIRAs - as I stated long ago - but until we understand what the pieces are and how they relate then they can't be consolidated.
>> >> > >>>>>>>> Even if 9533 ended up being repurposed as the server instance of the work - it should be a subtask of a larger one - if that is to be 9392, so be it.
>> >> > >>>>>>>> We still need to define all the pieces of the larger picture before that can be done.
>> >> > >>>>>>>>
>> >> > >>>>>>>> What I thought was the clean slate approach to the discussion seemed a very reasonable way to make all this happen.
>> >> > >>>>>>>> If you would like to restate what you intended by it, or something else equally as reasonable as a way to move forward, that would be awesome.
>> >> > >>>>>>>>
>> >> > >>>>>>>> I will be happy to work toward the roadmap with everyone once it is articulated, understood and actionable.
>> >> > >>>>>>>> In the meantime, I have work to do.
>> >> > >>>>>>>>
>> >> > >>>>>>>> thanks,
>> >> > >>>>>>>>
>> >> > >>>>>>>> --larry
>> >> > >>>>>>>>
>> >> > >>>>>>>> BTW - I meant to quote you in an earlier response and
>> >> > >>>>>>>> ended up saying it
>> >> > >>>>>>> was Aaron instead. Not sure what happened there. :-)
>> >> > >>>>>>>>
>> >> > >>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell
>> >> > >>>>>>>> <apurtell@apache.org
>> >> >
>> >> > >>>> wrote:
>> >> > >>>>>>>>
>> >> > >>>>>>>>> Hi Larry (and all),
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> Happy Fourth of July to you and yours.
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so I'd defer to them on the detailed points.
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> My concern here is there may have been a misinterpretation or lack of consensus on what is meant by "clean slate". Hopefully that can be quickly cleared up. Certainly we did not mean ignore all that came before. The idea was to reset discussions to find common ground and new direction where we are working together, not in conflict, on an agreed upon set of design points and tasks. There's been a lot of good discussion and design preceding that we should figure out how to port over. Nowhere in this picture are self appointed "master JIRAs" and such, which have been disappointing to see crop up; we should be collaboratively coding, not planting flags.
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> I read Kai's latest document as something approaching today's consensus (or at least a common point of view?) rather than a historical document. Perhaps he and it can be given equal share of the consideration.
>> >> > >>>>>>>>>
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>> >> > >>>>>>>>>
>> >> > >>>>>>>>>> Hey Andrew -
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> I largely agree with that statement.
>> >> > >>>>>>>>>> My intention was to let the differences be worked out within the individual components once they were identified and subtasks created.
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> My reference to HSSO was really referring to a SSO *server* based design which was not clearly articulated in the earlier documents.
>> >> > >>>>>>>>>> We aren't trying to compare and contrast one design over another anymore.
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> Let's move this collaboration along as we've mapped out, and the differences in the details will reveal themselves and be addressed within their components.
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> I've actually been looking forward to you weighing in on the actual discussion points in this thread.
>> >> > >>>>>>>>>> Could you do that?
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> At this point, I am most interested in your thoughts on a single jira to represent all of this work and whether we should start discussing the SSO Tokens.
>> >> > >>>>>>>>>> If you think there are discussion points missing from that list, feel free to add to it.
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> thanks,
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> --larry
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
>> >> > >> apurtell@apache.org>
>> >> > >>>>>>> wrote:
>> >> > >>>>>>>>>>
>> >> > >>>>>>>>>>> Hi Larry,
>> >> > >>>>>>>>>>>
>> >> > >>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me point out that, while the differences between the competing JIRAs have been reduced for sure, there were some key differences that didn't just disappear. Subsequent discussion will make that clear. I also disagree with your characterization that we have simply endorsed all of the design decisions of the so-called HSSO; this is taking a mile from an inch. We are here to engage in a collaborative process as peers. I've been encouraged by the spirit of the discussions up to this point and hope that can continue beyond one design summit.
>> >> > >>>>>>>>>>>
>> >> > >>>>>>>>>>>
>> >> > >>>>>>>>>>>
>> >> > >>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
>> >> > >>>>>>>>>>> <lm...@hortonworks.com>
>> >> > >>>>>>>>>> wrote:
>> >> > >>>>>>>>>>>
>> >> > >>>>>>>>>>>> Hi Kai -
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> I think that I need to clarify something...
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at a SSO for Hadoop.
>> >> > >>>>>>>>>>>> We've agreed to leave our previous designs behind and therefore we aren't really seeing it as an HSSO layered on top of TAS approach or an HSSO vs TAS discussion.
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> What we need you to do at this point is to look at those high-level components described on this thread and comment on whether we need additional components, or any that are listed that don't seem necessary to you and why.
>> >> > >>>>>>>>>>>> In other words, we need to define and agree on the work that has to be done.
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> We also need to determine those components that need to be done before anything else can be started.
>> >> > >>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There don't seem to be enough differences between the two to justify separate jiras anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> I am prepared to start a discussion around the shape of the two Hadoop SSO tokens: identity and access - if this is what others feel the next topic should be.
>> >> > >>>>>>>>>>>> If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.
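To seed that discussion, here is a strawman of the field sets the two token types might carry. These names and fields are purely illustrative assumptions, not an agreed or proposed wire format:

```java
// Strawman shapes for the two SSO token types under discussion.
// Field names and types are illustrative assumptions only.
public final class SsoTokens {

    // Identity token: asserts who the principal is, issued by the SSO server.
    public record IdentityToken(String subject, String issuer,
                                long issuedAtMillis, long expiryMillis) {}

    // Access token: scoped to one target service, derived from an identity token.
    public record AccessToken(String subject, String targetService,
                              long expiryMillis, String identityTokenRef) {}

    public static void main(String[] args) {
        IdentityToken id =
            new IdentityToken("alice", "hadoop-sso", 0L, 3_600_000L);
        AccessToken access =
            new AccessToken(id.subject(), "hdfs-namenode", 1_800_000L, "idtok-1");
        System.out.println(access.subject() + " -> " + access.targetService());
    }
}
```

The split mirrors the flow summarized elsewhere in the thread: a long-lived identity token proves who you are, and short-lived, service-scoped access tokens derived from it are what Hadoop services actually accept.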
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> thanks,
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> --larry
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
>> >> > >> kai.zheng@intel.com>
>> >> > >>>>>>> wrote:
>> >> > >>>>>>>>>>>>
>> >> > >>>>>>>>>>>>> Hi Larry,
>> >> > >>>>>>>>>>>>>
>> >> > >>>>>>>>>>>>> Thanks for the update. Good to see that with this update we are now aligned on most points.
>> >> > >>>>>>>>>>>>>
>> >> > >>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
>> >> > >>>>>>>>>>>>> 1. Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
>> >> > >>>>>>>>>>>>> 2. Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
>> >> > >>>>>>>>>>>>> 3. Refined proxy access token and the proxy/impersonation flow;
>> >> > >>>>>>>>>>>>> 4. Refined the browser web SSO flow regarding access to Hadoop web services;
>> >> > >>>>>>>>>>>>> 5. Added Hadoop RPC access flow regard
>> >> > >>>>>>>>>
>> >> > >>>>>>>>>
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> --
>> >> > >>>>>>>>> Best regards,
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> - Andy
>> >> > >>>>>>>>>
>> >> > >>>>>>>>> Problems worthy of attack prove their worth by hitting
>> >> > >>>>>>>>> back. -
>> >> > >>> Piet
>> >> > >>>>>>>>> Hein (via Tom White)
>> >> > >>>>>>>>
>> >> > >>>>>>>
>> >> > >>>>>>>
>> >> > >>>>>>
>> >> > >>>>>>
>> >> > >>>>>> --
>> >> > >>>>>> Alejandro
>> >> > >>>>>>
>> >> > >>>>>
>> >> > >>>>>
>> >> > >>>>>
>> >> > >>>>>
>> >> > >>>>>
>> >> > >>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
>> >> > >>>>
>> >> > >>>>
>> >> > >>>
>> >> > >>>
>> >> > >>> --
>> >> > >>> Alejandro
>> >> > >>>
>> >> > >>
>> >> > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Alejandro
>> >> >
>> >> >
>> >>
>> >
>> > --
>> > CONFIDENTIALITY NOTICE
>> > NOTICE: This message is intended for the use of the individual or
>> > entity to which it is addressed and may contain information that is
>> > confidential, privileged and exempt from disclosure under applicable
>> > law. If the reader of this message is not the intended recipient, you
>> > are hereby notified that any printing, copying, dissemination,
>> > distribution, disclosure or forwarding of this communication is
>> > strictly prohibited. If you have received this communication in error,
>> > please contact the sender immediately and delete it from your system.
>> > Thank You.
>
>
>

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Chris -

I am curious whether there are any guidelines for feature branch use.

The general goals should be to:
* keep branches as small and as easily reviewable as possible for a given
feature
* decouple the pluggable framework from any specific central server
implementation
* scope specific content into iterations that can be merged into trunk on
their own and then development continued in new branches for the next
iteration

So, I guess the questions that immediately come to mind are:
1. Is there a document that describes the best way to do this?
2. How best do we leverage code being done in one feature branch within
another?

Thanks!

--larry



On Tue, Sep 3, 2013 at 10:00 PM, Zheng, Kai <ka...@intel.com> wrote:

> This looks good and reasonable to me. Thanks Chris.
>
> -----Original Message-----
> From: Chris Douglas [mailto:cdouglas@apache.org]
> Sent: Wednesday, September 04, 2013 6:45 AM
> To: common-dev@hadoop.apache.org
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>
> On Tue, Sep 3, 2013 at 5:20 AM, Larry McCay <lm...@hortonworks.com>
> wrote:
> > One outstanding question for me - how do we go about getting the
> > branches created?
>
> Once a group has converged on a purpose- ideally with some initial code
> from JIRA- please go ahead and create the feature branch in svn.
> There's no ceremony. -C
>
> > On Tue, Aug 6, 2013 at 6:22 PM, Chris Nauroth <cnauroth@hortonworks.com
> >wrote:
> >
> >> Near the bottom of the bylaws, it states that addition of a "New
> >> Branch Committer" requires "Lazy consensus of active PMC members."  I
> >> think this means that you'll need to get a PMC member to sponsor the
> vote for you.
> >>  Regular committer votes happen on the private PMC mailing list, and
> >> I assume it would be the same for a branch committer vote.
> >>
> >> http://hadoop.apache.org/bylaws.html
> >>
> >> Chris Nauroth
> >> Hortonworks
> >> http://hortonworks.com/
> >>
> >>
> >>
> >> On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay <lm...@hortonworks.com>
> >> wrote:
> >>
> >> > That sounds perfect!
> >> > I have been thinking of late that we would maybe need an incubator project or something for this - which would be unfortunate.
> >> >
> >> > This would allow us to move much more quickly with a set of patches broken up into consumable/understandable chunks that are made functional more easily within the branch.
> >> > I assume that we need to start a separate thread for DISCUSS or VOTE to start that process - correct?
> >> >
> >> > On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com>
> >> wrote:
> >> >
> >> > > yep, that is what I meant. Thanks Chris
> >> > >
> >> > >
> >> > > On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <
> >> cnauroth@hortonworks.com
> >> > >wrote:
> >> > >
> >> > >> Perhaps this is also a good opportunity to try out the new
> >> > >> "branch committers" clause in the bylaws, enabling
> >> > >> non-committers who are
> >> > working
> >> > >> on this to commit to the feature branch.
> >> > >>
> >> > >>
> >> > >>
> >> >
> >> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
> >> > >>
> >> > >> Chris Nauroth
> >> > >> Hortonworks
> >> > >> http://hortonworks.com/
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur
> >> > >> <tucu@cloudera.com
> >> > >>> wrote:
> >> > >>
> >> > >>> Larry,
> >> > >>>
> >> > >>> Sorry for the delay answering. Thanks for laying down things, yes, it makes sense.
> >> > >>>
> >> > >>> Given the large scope of the changes, number of JIRAs and number of developers involved, wouldn't it make sense to create a feature branch for all this work, so as not to destabilize (more ;) trunk?
> >> > >>>
> >> > >>> Thanks again.
> >> > >>>
> >> > >>>
> >> > >>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay
> >> > >>> <lmccay@hortonworks.com
> >> >
> >> > >>> wrote:
> >> > >>>
> >> > >>>> The following JIRA was filed to provide a token and basic
> >> > >>>> authority implementation for this effort:
> >> > >>>> https://issues.apache.org/jira/browse/HADOOP-9781
> >> > >>>>
> >> > >>>> I have attached an initial patch though have yet to submit it
> >> > >>>> as one
> >> > >>> since
> >> > >>>> it is dependent on the patch for CMF that was posted to:
> >> > >>>> https://issues.apache.org/jira/browse/HADOOP-9534
> >> > >>>> and this patch still has a couple outstanding issues - javac
> >> > >>>> warnings for com.sun classes for certificate generation and 11
> >> > >>>> javadoc warnings.
> >> > >>>>
> >> > >>>> Please feel free to review the patches and raise any questions
> >> > >>>> or
> >> > >>> concerns
> >> > >>>> related to them.
> >> > >>>>
> >> > >>>> On Jul 26, 2013, at 8:59 PM, Larry McCay
> >> > >>>> <lm...@hortonworks.com>
> >> > >> wrote:
> >> > >>>>
> >> > >>>>> Hello All -
> >> > >>>>>
> >> > >>>>> In an effort to scope an initial iteration that provides
> >> > >>>>> value to
> >> the
> >> > >>>> community while focusing on the pluggable authentication
> >> > >>>> aspects,
> >> I've
> >> > >>>> written a description for "Iteration 1". It identifies the
> >> > >>>> goal of
> >> the
> >> > >>>> iteration, the endstate and a set of initial usecases. It also
> >> > >> enumerates
> >> > >>>> the components that are required for each usecase. There is a
> >> > >>>> scope
> >> > >>> section
> >> > >>>> that details specific things that should be kept out of the
> >> > >>>> first iteration. This is certainly up for discussion. There
> >> > >>>> may be some of
> >> > >>> these
> >> > >>>> things that can be contributed in short order. If we can add
> >> > >>>> some
> >> > >> things
> >> > >>> in
> >> > >>>> without unnecessary complexity for the identified usecases
> >> > >>>> then we
> >> > >>> should.
> >> > >>>>>
> >> > >>>>> @Alejandro - please review this and see whether it satisfies
> >> > >>>>> your
> >> > >> point
> >> > >>>> for a definition of what we are building.
> >> > >>>>>
> >> > >>>>> In addition to the document that I will paste here as text
> >> > >>>>> and
> >> > >> attach a
> >> > >>>> pdf version, we have a couple patches for components that are
> >> > >> identified
> >> > >>> in
> >> > >>>> the document.
> >> > >>>>> Specifically, COMP-7 and COMP-8.
> >> > >>>>>
> >> > >>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which
> >> > >>>>> was
> >> > >> filed
> >> > >>>> specifically for that functionality.
> >> > >>>>> COMP-7 is a small set of classes to introduce JsonWebToken as
> >> > >>>>> the
> >> > >> token
> >> > >>>> format and a basic JsonWebTokenAuthority that can issue and
> >> > >>>> verify
> >> > >> these
> >> > >>>> tokens.
> >> > >>>>>
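As a sketch of the token format that COMP-7 refers to (an assumption on my part - the actual JsonWebToken/JsonWebTokenAuthority classes are Java and their claim set is still TBD), a JWS-style token is just three base64url-encoded segments - header, claims, signature - joined by dots. The claim names and signing key below are illustrative only:

```shell
#!/bin/sh
# Sketch of the JWT wire format assumed for a hadoop id_token (COMP-7).
# Claim names and the HMAC key are illustrative, not the real implementation,
# which is expected to sign with PKI material managed by COMP-8.
b64url() { openssl base64 -A | tr '+/' '-_' | tr -d '='; }

KEY='demo-signing-key'
HEADER=$(printf '{"alg":"HS256","typ":"JWT"}' | b64url)
CLAIMS=$(printf '{"iss":"HadoopTokenAuthority","sub":"guest","aud":"hdfs"}' | b64url)
SIG=$(printf '%s.%s' "$HEADER" "$CLAIMS" \
      | openssl dgst -sha256 -hmac "$KEY" -binary | b64url)
ID_TOKEN="$HEADER.$CLAIMS.$SIG"
echo "$ID_TOKEN"
```

A verifier recomputes the HMAC over the first two segments and compares it to the third before trusting any claim.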
> >> > >>>>> Since there is no JIRA for this yet, I will likely file a new
> >> > >>>>> JIRA
> >> > >> for
> >> > >>> a
> >> > >>>> SSO token implementation.
> >> > >>>>>
> >> > >>>>> Both of these patches are assumed to be modules within
> >> > >>>> hadoop-common/hadoop-common-project.
> >> > >>>>> While they are relatively small, I think that they will be
> >> > >>>>> pulled
> >> in
> >> > >> by
> >> > >>>> other modules such as hadoop-auth which would likely not want
> >> > >>>> a
> >> > >>> dependency
> >> > >>>> on something larger like
> >> > >>> hadoop-common/hadoop-common-project/hadoop-common.
> >> > >>>>>
> >> > >>>>> This is certainly something that we should discuss within the
> >> > >> community
> >> > >>>> for this effort though - that being, exactly how to add these
> >> > libraries
> >> > >>> so
> >> > >>>> that they are most easily consumed by existing projects.
> >> > >>>>>
> >> > >>>>> Anyway, the following is the Iteration-1 document - it is
> >> > >>>>> also
> >> > >> attached
> >> > >>>> as a pdf:
> >> > >>>>>
> >> > >>>>> Iteration 1: Pluggable User Authentication and Federation
> >> > >>>>>
> >> > >>>>> Introduction
> >> > >>>>> The intent of this effort is to bootstrap the development of
> >> > >> pluggable
> >> > >>>> token-based authentication mechanisms to support certain goals
> >> > >>>> of enterprise authentication integrations. By restricting the
> >> > >>>> scope of
> >> > >> this
> >> > >>>> effort, we hope to provide immediate benefit to the community
> >> > >>>> while
> >> > >>> keeping
> >> > >>>> the initial contribution to a manageable size that can be
> >> > >>>> easily
> >> > >>> reviewed,
> >> > >>>> understood and extended with further development through
> >> > >>>> follow up
> >> > >> JIRAs
> >> > >>>> and related iterations.
> >> > >>>>>
> >> > >>>>> Iteration Endstate
> >> > >>>>> Once complete, this effort will have extended the
> >> > >>>>> authentication
> >> > >>>> mechanisms - for all client types - from the existing: Simple,
> >> > Kerberos
> >> > >>> and
> >> > >>>> Plain (for RPC) to include LDAP authentication and SAML based
> >> > >> federation.
> >> > >>>> In addition, the ability to provide additional/custom
> >> > >>>> authentication mechanisms will be enabled for users to plug in
> >> > >>>> their preferred
> >> > >>> mechanisms.
> >> > >>>>>
> >> > >>>>> Project Scope
> >> > >>>>> The scope of this effort is a subset of the features covered
> >> > >>>>> by the
> >> > >>>> overviews of HADOOP-9392 and HADOOP-9533. This effort
> >> > >>>> concentrates
> >> on
> >> > >>>> enabling Hadoop to issue, accept/validate SSO tokens of its
> >> > >>>> own. The pluggable authentication mechanism within SASL/RPC
> >> > >>>> layer and the authentication filter pluggability for REST and
> >> > >>>> UI components will
> >> be
> >> > >>>> leveraged and extended to support the results of this effort.
> >> > >>>>>
> >> > >>>>> Out of Scope
> >> > >>>>> In order to scope the initial deliverable as the minimally
> >> > >>>>> viable
> >> > >>>> product, a handful of things have been simplified or left out
> >> > >>>> of
> >> scope
> >> > >>> for
> >> > >>>> this effort. This is not meant to say that these aspects are
> >> > >>>> not
> >> > useful
> >> > >>> or
> >> > >>>> not needed but that they are not necessary for this iteration.
> >> > >>>> We do however need to ensure that we don't do anything to
> >> > >>>> preclude adding
> >> > >> them
> >> > >>> in
> >> > >>>> future iterations.
> >> > >>>>> 1. Additional Attributes - the result of authentication will
> >> continue
> >> > >>> to
> >> > >>>> use the existing hadoop tokens and identity representations.
> >> > Additional
> >> > >>>> attributes used for finer grained authorization decisions will
> >> > >>>> be
> >> > added
> >> > >>>> through follow-up efforts.
> >> > >>>>> 2. Token revocation - the ability to revoke issued identity
> >> > >>>>> tokens
> >> > >> will
> >> > >>>> be added later
> >> > >>>>> 3. Multi-factor authentication - this will likely require
> >> additional
> >> > >>>> attributes and is not necessary for this iteration.
> >> > >>>>> 4. Authorization changes - we will require additional
> >> > >>>>> attributes
> >> for
> >> > >>> the
> >> > >>>> fine-grained access control plans. This is not needed for this
> >> > >> iteration.
> >> > >>>>> 5. Domains - we assume a single flat domain for all users
> >> > >>>>> 6. Kinit alternative - we can leverage existing REST clients such
> >> > >>>>> as cURL to retrieve tokens through authentication and federation
> >> > >>>>> for the time being
> >> > >>>>> 7. A specific authentication framework isn't really necessary
> >> within
> >> > >>> the
> >> > >>>> REST endpoints for this iteration. If one is available then we
> >> > >>>> can use it; otherwise we can leverage existing things like Apache
> >> > >>>> Shiro within a servlet filter.
> >> > >>>>>
> >> > >>>>> In Scope
> >> > >>>>> What is in scope for this effort is defined by the usecases
> >> described
> >> > >>>> below. Components required for supporting the usecases are
> >> summarized
> >> > >> for
> >> > >>>> each client type. Each component is a candidate for a JIRA
> >> > >>>> subtask -
> >> > >>> though
> >> > >>>> multiple components are likely to be included in a JIRA to
> >> represent a
> >> > >>> set
> >> > >>>> of functionality rather than individual JIRAs per component.
> >> > >>>>>
> >> > >>>>> Terminology and Naming
> >> > >>>>> The terms and names of components within this document are
> >> > >>>>> merely
> >> > >>>> descriptive of the functionality that they represent. Any
> >> > >>>> similarity
> >> > or
> >> > >>>> difference in names or terms from those that are found in
> >> > >>>> other
> >> > >> documents
> >> > >>>> are not intended to make any statement about those other
> >> > >>>> documents
> >> or
> >> > >> the
> >> > >>>> descriptions within. This document represents the pluggable
> >> > >>> authentication
> >> > >>>> mechanisms and server functionality required to replace Kerberos.
> >> > >>>>>
> >> > >>>>> Ultimately, the naming of the implementation classes will be
> >> > >>>>> a
> >> > >> product
> >> > >>>> of the patches accepted by the community.
> >> > >>>>>
> >> > >>>>> Usecases:
> >> > >>>>> client types: REST, CLI, UI
> >> > >>>>> authentication types: Simple, Kerberos, authentication/LDAP,
> >> > >>>> federation/SAML
> >> > >>>>>
> >> > >>>>> Simple and Kerberos
> >> > >>>>> Simple and Kerberos usecases continue to work as they do
> >> > >>>>> today. The
> >> > >>>> addition of Authentication/LDAP and Federation/SAML are added
> >> through
> >> > >> the
> >> > >>>> existing pluggability points either as they are or with
> >> > >>>> required
> >> > >>> extension.
> >> > >>>> Either way, continued support for Simple and Kerberos must not
> >> require
> >> > >>>> changes to existing deployments in the field as a result of
> >> > >>>> this
> >> > >> effort.
> >> > >>>>>
> >> > >>>>> REST
> >> > >>>>> USECASE REST-1 Authentication/LDAP:
> >> > >>>>> For REST clients, we will provide the ability to:
> >> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
> >> exposed
> >> > >> by
> >> > >>>> an AuthenticationServer instance via REST calls to:
> >> > >>>>>   a. authenticate - passing username/password returning a
> >> > >>>>> hadoop
> >> > >>>> id_token
> >> > >>>>>   b. get-access-token - from the TokenGrantingService by
> >> > >>>>> passing
> >> the
> >> > >>>> hadoop id_token as an Authorization: Bearer token along with
> >> > >>>> the
> >> > >> desired
> >> > >>>> service name (master service name) returning a hadoop access
> >> > >>>> token
> >> > >>>>> 2. Successfully invoke a hadoop service REST API passing the
> >> > >>>>> hadoop
> >> > >>>> access token through an HTTP header as an Authorization Bearer
> >> > >>>> token
> >> > >>>>>   a. validation of the incoming token on the service endpoint
> >> > >>>>> is
> >> > >>>> accomplished by an SSOAuthenticationHandler
> >> > >>>>> 3. Successfully block access to a REST resource when
> >> > >>>>> presenting a
> >> > >>> hadoop
> >> > >>>> access token intended for a different service
> >> > >>>>>   a. validation of the incoming token on the service endpoint
> >> > >>>>> is
> >> > >>>> accomplished by an SSOAuthenticationHandler
> >> > >>>>>
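A rough sketch of what REST-1 steps 1a/1b might look like with cURL. The host, port and endpoint paths here are placeholders - none of these URLs exist yet - and the cURL calls themselves are left commented since the services are hypothetical; only the local token handling runs:

```shell
#!/bin/sh
# Sketch of the REST-1 flow. AUTH_SERVER and the endpoint paths are
# assumptions, not defined APIs.
AUTH_SERVER='https://authserver.example.com:8443'
TOKEN_DIR="$HOME/.hadoop_tokens"
mkdir -p "$TOKEN_DIR" && chmod 700 "$TOKEN_DIR"

# 1a. authenticate - exchange username/password for a hadoop id_token
# curl -s -u guest:guest-password "$AUTH_SERVER/authenticate" \
#      > "$TOKEN_DIR/.id_token"
printf 'example-id-token' > "$TOKEN_DIR/.id_token"   # stand-in for the demo
chmod 600 "$TOKEN_DIR/.id_token"

# 1b. get-access-token - present the id_token as a Bearer token along with
#     the desired (master) service name to the TokenGrantingService
AUTH_HEADER="Authorization: Bearer $(cat "$TOKEN_DIR/.id_token")"
# curl -s -H "$AUTH_HEADER" \
#      "$AUTH_SERVER/get-access-token?service=namenode" \
#      > "$TOKEN_DIR/.access_token"
echo "$AUTH_HEADER"
```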
> >> > >>>>> USECASE REST-2 Federation/SAML:
> >> > >>>>> We will also provide federation capabilities for REST clients
> >> > >>>>> such
> >> > >>> that:
> >> > >>>>> 1. acquire SAML assertion token from a trusted IdP
> >> > >>>>> (shibboleth?)
> >> and
> >> > >>>> persist in a permissions protected file - ie.
> >> > >> ~/.hadoop_tokens/.idp_token
> >> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an
> >> > >>>>> SP
> >> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> >> > instance
> >> > >>> via
> >> > >>>> REST calls to:
> >> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
> >> Bearer
> >> > >>>> token returning a hadoop id_token
> >> > >>>>>      - can copy and paste from commandline or use cat to include
> >> > >>>>> the persisted token through --header "Authorization: Bearer
> >> > >>>>> $(cat ~/.hadoop_tokens/.id_token)"
> >> > >>>>>   b. get-access-token - from the TokenGrantingService by
> >> > >>>>> passing
> >> the
> >> > >>>> hadoop id_token as an Authorization: Bearer token along with
> >> > >>>> the
> >> > >> desired
> >> > >>>> service name (master service name) to the TokenGrantingService
> >> > >> returning
> >> > >>> a
> >> > >>>> hadoop access token
> >> > >>>>> 3. Successfully invoke a hadoop service REST API passing the
> >> > >>>>> hadoop
> >> > >>>> access token through an HTTP header as an Authorization Bearer
> >> > >>>> token
> >> > >>>>>   a. validation of the incoming token on the service endpoint
> >> > >>>>> is
> >> > >>>> accomplished by an SSOAuthenticationHandler
> >> > >>>>> 4. Successfully block access to a REST resource when
> >> > >>>>> presenting a
> >> > >>> hadoop
> >> > >>>> access token intended for a different service
> >> > >>>>>   a. validation of the incoming token on the service endpoint
> >> > >>>>> is
> >> > >>>> accomplished by an SSOAuthenticationHandler
> >> > >>>>>
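One practical note on the federate call in step 2a: the token has to be inlined via shell command substitution - a single-quoted 'cat ...' would be sent literally. A sketch, with the SP endpoint URL being an assumption:

```shell
#!/bin/sh
# Sketch of REST-2 step 2a. SP_ENDPOINT is hypothetical; the cURL call is
# commented out since the service does not exist yet.
SP_ENDPOINT='https://authserver.example.com:8443/federate'
TOKEN_DIR="$HOME/.hadoop_tokens"
mkdir -p "$TOKEN_DIR" && chmod 700 "$TOKEN_DIR"
printf 'example-saml-assertion' > "$TOKEN_DIR/.idp_token"  # stand-in

# Wrong: single quotes pass the literal text "cat ..." as the token value
BAD_HEADER="Authorization: Bearer 'cat $TOKEN_DIR/.idp_token'"

# Right: $(...) substitutes the file contents into the header value
GOOD_HEADER="Authorization: Bearer $(cat "$TOKEN_DIR/.idp_token")"
# curl -s --header "$GOOD_HEADER" "$SP_ENDPOINT" > "$TOKEN_DIR/.id_token"
echo "$GOOD_HEADER"
```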
> >> > >>>>> REQUIRED COMPONENTS for REST USECASES:
> >> > >>>>> COMP-1. REST client - cURL or similar
> >> > >>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP
> >> > >>>>> endpoint example - returning hadoop id_token
> >> > >>>>> COMP-3. REST endpoint for federation with SAML Bearer token -
> >> > >>> shibboleth
> >> > >>>> SP?|OpenSAML? - returning hadoop id_token
> >> > >>>>> COMP-4. REST TokenGrantingServer endpoint for acquiring
> >> > >>>>> hadoop
> >> access
> >> > >>>> tokens from hadoop id_tokens
> >> > >>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop
> >> > >>>>> access
> >> > >>>> tokens
> >> > >>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
> >> > >>>>> COMP-7. hadoop token and authority implementations
> >> > >>>>> COMP-8. core services for crypto support for signing, verifying
> >> > >>>>> and PKI management
> >> > >>>>>
> >> > >>>>> CLI
> >> > >>>>> USECASE CLI-1 Authentication/LDAP:
> >> > >>>>> For CLI/RPC clients, we will provide the ability to:
> >> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
> >> exposed
> >> > >> by
> >> > >>>> an AuthenticationServer instance via REST calls to:
> >> > >>>>>   a. authenticate - passing username/password returning a
> >> > >>>>> hadoop
> >> > >>>> id_token
> >> > >>>>>      - for RPC clients we need to persist the returned hadoop
> >> > >> identity
> >> > >>>> token in a file protected by fs permissions so that it may be
> >> > leveraged
> >> > >>>> until expiry
> >> > >>>>>      - directing the returned response to a file may suffice
> >> > >>>>> for
> >> now
> >> > >>>> something like ">~/.hadoop_tokens/.id_token"
> >> > >>>>> 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> >> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL layer,
> >> > >>>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and
> >> > >>>>> is passed as Authorization: Bearer token to the get-access-token
> >> > >>>>> REST endpoint exposed by TokenGrantingService returning a hadoop
> >> > >>>>> access token
> >> > >>>>>   b. RPC server side validates the presented hadoop access
> >> > >>>>> token
> >> and
> >> > >>>> continues to serve request
> >> > >>>>>   c. Successfully invoke a hadoop service RPC API
> >> > >>>>>
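On persisting the returned id_token (CLI-1 step 1a): a bare ">" redirect creates the file with the caller's default umask, which is often world-readable. Tightening the umask first (or chmod'ing afterwards) keeps the token owner-only. A sketch - the curl line is commented since the endpoint is hypothetical:

```shell
#!/bin/sh
# Sketch of persisting a hadoop id_token under fs permissions (CLI-1 1a).
TOKEN_DIR="$HOME/.hadoop_tokens"
mkdir -p "$TOKEN_DIR" && chmod 700 "$TOKEN_DIR"

# Restrict the umask so the redirect below creates an owner-only (600) file
umask 077
rm -f "$TOKEN_DIR/.id_token"
printf 'example-id-token' > "$TOKEN_DIR/.id_token"   # stand-in for curl output
# Real flow would be something like:
# curl -s -u "$USER" "$AUTH_SERVER/authenticate" > "$TOKEN_DIR/.id_token"

ls -l "$TOKEN_DIR/.id_token"
```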
> >> > >>>>> USECASE CLI-2 Federation/SAML:
> >> > >>>>> For CLI/RPC clients, we will provide the ability to:
> >> > >>>>> 1. acquire SAML assertion token from a trusted IdP
> >> > >>>>> (shibboleth?)
> >> and
> >> > >>>> persist in a permissions protected file - ie.
> >> > >> ~/.hadoop_tokens/.idp_token
> >> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an
> >> > >>>>> SP
> >> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> >> > instance
> >> > >>> via
> >> > >>>> REST calls to:
> >> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
> >> Bearer
> >> > >>>> token returning a hadoop id_token
> >> > >>>>>      - can copy and paste from commandline or use cat to include
> >> > >>>>> previously persisted token through --header "Authorization:
> >> > >>>>> Bearer $(cat ~/.hadoop_tokens/.id_token)"
> >> > >>>>> 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> >> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL layer,
> >> > >>>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and
> >> > >>>>> is passed as Authorization: Bearer token to the get-access-token
> >> > >>>>> REST endpoint exposed by TokenGrantingService returning a hadoop
> >> > >>>>> access token
> >> > >>>>>   b. RPC server side validates the presented hadoop access
> >> > >>>>> token
> >> and
> >> > >>>> continues to serve request
> >> > >>>>>   c. Successfully invoke a hadoop service RPC API
> >> > >>>>>
> >> > >>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required
> >> > >>>>> for
> >> > >>> REST):
> >> > >>>>> COMP-9. TokenAuth Method negotiation, etc
> >> > >>>>> COMP-10. Client side implementation to leverage REST endpoint for
> >> > >>>>> acquiring hadoop access tokens given a hadoop id_token
> >> > >>>>> COMP-11. Server side implementation to validate incoming
> >> > >>>>> hadoop
> >> > >> access
> >> > >>>> tokens
> >> > >>>>>
> >> > >>>>> UI
> >> > >>>>> Various Hadoop services have their own web UI consoles for
> >> > >>>> administration and end user interactions. These consoles need
> >> > >>>> to
> >> also
> >> > >>>> benefit from the pluggability of authentication mechanisms to
> >> > >>>> be on
> >> > par
> >> > >>>> with the access control of the cluster REST and RPC APIs.
> >> > >>>>> Web consoles are protected with a
> >> > >>>>> WebSSOAuthenticationHandler
> >> which
> >> > >>>> will be configured for either authentication or federation.
> >> > >>>>>
> >> > >>>>> USECASE UI-1 Authentication/LDAP:
> >> > >>>>> For the authentication usecase:
> >> > >>>>> 1. User's browser requests access to a UI console page
> >> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
> >> > >>>>> redirects the browser to an IdP web endpoint exposed by the
> >> > >>>>> AuthenticationServer passing the requested url as the redirect_url
> >> > >>>>> 3. IdP web endpoint presents the user with a FORM over https
> >> > >>>>>   a. user provides username/password and submits the FORM
> >> > >>>>> 4. AuthenticationServer authenticates the user with provided
> >> > >>>>> credentials against the configured LDAP server and:
> >> > >>>>>   a. leverages a servlet filter or other authentication
> >> > >>>>> mechanism
> >> > >> for
> >> > >>>> the endpoint and authenticates the user with a simple LDAP
> >> > >>>> bind with username and password
> >> > >>>>>   b. acquires a hadoop id_token and uses it to acquire the
> >> > >>>>> required
> >> > >>>> hadoop access token which is added as a cookie
> >> > >>>>>   c. redirects the browser to the original service UI
> >> > >>>>> resource via
> >> > >> the
> >> > >>>> provided redirect_url
> >> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >> > >>> interrogates
> >> > >>>> the incoming request again for an authcookie that contains an
> >> > >>>> access
> >> > >>> token
> >> > >>>> upon finding one:
> >> > >>>>>   a. validates the incoming token
> >> > >>>>>   b. returns the AuthenticationToken as per
> >> > >>>>> AuthenticationHandler
> >> > >>>> contract
> >> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with
> >> > >>>>> the
> >> > >>> expected
> >> > >>>> token
> >> > >>>>>   d. serves requested resource for valid tokens
> >> > >>>>>   e. subsequent requests are handled by the
> >> > >>>>> AuthenticationFilter
> >> > >>>> recognition of the hadoop auth cookie
> >> > >>>>>
> >> > >>>>> USECASE UI-2 Federation/SAML:
> >> > >>>>> For the federation usecase:
> >> > >>>>> 1. User's browser requests access to a UI console page
> >> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
> >> > >>>>> redirects the browser to an SP web endpoint exposed by the
> >> > >>>>> AuthenticationServer passing the requested url as the
> >> > >>>>> redirect_url. This endpoint:
> >> > >>>>>   a. is dedicated to redirecting to the external IdP passing
> >> > >>>>> the
> >> > >>>> required parameters which may include a redirect_url back to
> >> > >>>> itself
> >> as
> >> > >>> well
> >> > >>>> as encoding the original redirect_url so that it can determine
> >> > >>>> it on
> >> > >> the
> >> > >>>> way back to the client
> >> > >>>>> 3. the IdP:
> >> > >>>>>   a. challenges the user for credentials and authenticates the
> user
> >> > >>>>>   b. creates appropriate token/cookie and redirects back to
> >> > >>>>> the
> >> > >>>> AuthenticationServer endpoint
> >> > >>>>> 4. AuthenticationServer endpoint:
> >> > >>>>>   a. extracts the expected token/cookie from the incoming
> >> > >>>>> request
> >> > >> and
> >> > >>>> validates it
> >> > >>>>>   b. creates a hadoop id_token
> >> > >>>>>   c. acquires a hadoop access token for the id_token
> >> > >>>>>   d. creates appropriate cookie and redirects back to the
> >> > >>>>> original
> >> > >>>> redirect_url - being the requested resource
> >> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >> > >>> interrogates
> >> > >>>> the incoming request again for an authcookie that contains an
> >> > >>>> access
> >> > >>> token
> >> > >>>> upon finding one:
> >> > >>>>>   a. validates the incoming token
> >> > >>>>>   b. returns the AuthenticationToken as per
> >> > >>>>> AuthenticationHandler
> >> > >>>> contract
> >> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with
> >> > >>>>> the
> >> > >>> expected
> >> > >>>> token
> >> > >>>>>   d. serves requested resource for valid tokens
> >> > >>>>>   e. subsequent requests are handled by the
> >> > >>>>> AuthenticationFilter
> >> > >>>> recognition of the hadoop auth cookie
> >> > >>>>>
> >> > >>>>> REQUIRED COMPONENTS for UI USECASES:
> >> > >>>>> COMP-12. WebSSOAuthenticationHandler
> >> > >>>>> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM
> >> > >>>>> based login
> >> > >>>>> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd
> >> > >>>>> party
> >> > >>> token
> >> > >>>> federation
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
> >> > >> Brian.Swan@microsoft.com>
> >> > >>>> wrote:
> >> > >>>>> Thanks, Larry. That is what I was trying to say, but you've
> >> > >>>>> said it
> >> > >>>> better and in more detail. :-) To extract from what you are
> saying:
> >> > "If
> >> > >>> we
> >> > >>>> were to reframe the immediate scope to the lowest common
> >> > >>>> denominator
> >> > of
> >> > >>>> what is needed for accepting tokens in authentication plugins
> >> > >>>> then
> >> we
> >> > >>>> gain... an end-state for the lowest common denominator that
> >> > >>>> enables
> >> > >> code
> >> > >>>> patches in the near-term is the best of both worlds."
> >> > >>>>>
> >> > >>>>> -Brian
> >> > >>>>>
> >> > >>>>> -----Original Message-----
> >> > >>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
> >> > >>>>> Sent: Wednesday, July 10, 2013 10:40 AM
> >> > >>>>> To: common-dev@hadoop.apache.org
> >> > >>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> >> > >>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >> > >>>>>
> >> > >>>>> It seems to me that we can have the best of both worlds
> >> > >>>>> here...it's
> >> > >> all
> >> > >>>> about the scoping.
> >> > >>>>>
> >> > >>>>> If we were to reframe the immediate scope to the lowest
> >> > >>>>> common
> >> > >>>> denominator of what is needed for accepting tokens in
> >> > >>>> authentication plugins then we gain:
> >> > >>>>>
> >> > >>>>> 1. a very manageable scope to define and agree upon
> >> > >>>>> 2. a deliverable that should be useful in and of itself
> >> > >>>>> 3. a foundation for community collaboration that we build on for
> >> > >>>>> higher level solutions built on this lowest common denominator
> >> > >>>>> and experience as a working community
> >> > >>>>>
> >> > >>>>> So, to Alejandro's point, perhaps we need to define what
> >> > >>>>> would make
> >> > >> #2
> >> > >>>> above true - this could serve as the "what" we are building
> >> > >>>> instead
> >> of
> >> > >>> the
> >> > >>>> "how" to build it.
> >> > >>>>> Including:
> >> > >>>>> a. project structure within
> >> > >>>>> hadoop-common-project/common-security or the like
> >> > >>>>> b. the usecases that would need to be enabled to make it a self
> >> > >>>>> contained and useful contribution - without higher level solutions
> >> > >>>>> c. the JIRA/s for contributing patches
> >> > >>>>> d. what specific patches will be needed to accomplish the
> >> > >>>>> usecases in #b
> >> > >>>>>
> >> > >>>>> In other words, an end-state for the lowest common
> >> > >>>>> denominator that
> >> > >>>> enables code patches in the near-term is the best of both worlds.
> >> > >>>>>
> >> > >>>>> I think this may be a good way to bootstrap the collaboration
> >> process
> >> > >>>> for our emerging security community rather than trying to
> >> > >>>> tackle a
> >> > huge
> >> > >>>> vision all at once.
> >> > >>>>>
> >> > >>>>> @Alejandro - if you have something else in mind that would
> >> bootstrap
> >> > >>>> this process - that would great - please advise.
> >> > >>>>>
> >> > >>>>> thoughts?
> >> > >>>>>
> >> > >>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan
> >> > >>>>> <Br...@microsoft.com>
> >> > >>>> wrote:
> >> > >>>>>
> >> > >>>>>> Hi Alejandro, all-
> >> > >>>>>>
> >> > >>>>>> There seems to be agreement on the broad stroke description
> >> > >>>>>> of the
> >> > >>>> components needed to achieve pluggable token authentication
> >> > >>>> (I'm
> >> sure
> >> > >>> I'll
> >> > >>>> be corrected if that isn't the case). However, discussion of
> >> > >>>> the
> >> > >> details
> >> > >>> of
> >> > >>>> those components doesn't seem to be moving forward. I think
> >> > >>>> this is
> >> > >>> because
> >> > >>>> the details are really best understood through code. I also
> >> > >>>> see *a*
> >> > >> (i.e.
> >> > >>>> one of many possible) token format and pluggable
> >> > >>>> authentication
> >> > >>> mechanisms
> >> > >>>> within the RPC layer as components that can have immediate
> >> > >>>> benefit
> >> to
> >> > >>>> Hadoop users AND still allow flexibility in the larger design.
> >> > >>>> So, I
> >> > >>> think
> >> > >>>> the best way to move the conversation of "what we are aiming for"
> >> > >> forward
> >> > >>>> is to start looking at code for these components. I am
> >> > >>>> especially interested in moving forward with pluggable
> >> > >>>> authentication
> >> mechanisms
> >> > >>>> within the RPC layer and would love to see what others have
> >> > >>>> done in
> >> > >> this
> >> > >>>> area (if anything).
> >> > >>>>>>
> >> > >>>>>> Thanks.
> >> > >>>>>>
> >> > >>>>>> -Brian
> >> > >>>>>>
> >> > >>>>>> -----Original Message-----
> >> > >>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> >> > >>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
> >> > >>>>>> To: Larry McCay
> >> > >>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai
> >> > >>>>>> Zheng
> >> > >>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >> > >>>>>>
> >> > >>>>>> Larry, all,
> >> > >>>>>>
> >> > >>>>>> Still is not clear to me what is the end state we are aiming
> >> > >>>>>> for,
> >> > >> or
> >> > >>>> that we even agree on that.
> >> > >>>>>>
> >> > >>>>>> IMO, instead of trying to agree on what to do, we should first
> >> > >>>>>> agree on the final state, then we see what should be changed to
> >> > >>>>>> get there, then we see how we change things to get there.
> >> > >>>>>>
> >> > >>>>>> The different documents out there focus more on how.
> >> > >>>>>>
> >> > >>>>>> We should not try to say how before we know what.
> >> > >>>>>>
> >> > >>>>>> Thx.
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
> >> > >> lmccay@hortonworks.com
> >> > >>>>
> >> > >>>> wrote:
> >> > >>>>>>
> >> > >>>>>>> All -
> >> > >>>>>>>
> >> > >>>>>>> After combing through this thread - as well as the summit
> >> > >>>>>>> session summary thread, I think that we have the following
> >> > >>>>>>> two items that
> >> > >> we
> >> > >>>>>>> can probably move forward with:
> >> > >>>>>>>
> >> > >>>>>>> 1. TokenAuth method - assuming this means the pluggable
> >> > >>>>>>> authentication mechanisms within the RPC layer (2 votes:
> >> > >>>>>>> Kai and
> >> > >>>>>>> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and
> >> myself)
> >> > >>>>>>>
> >> > >>>>>>> I propose that we attack both of these aspects as one.
> >> > >>>>>>> Let's
> >> > >> provide
> >> > >>>>>>> the structure and interfaces of the pluggable framework for
> >> > >>>>>>> use
> >> in
> >> > >>>>>>> the RPC layer through leveraging Daryn's pluggability work
> >> > >>>>>>> and
> >> POC
> >> > >>> it
> >> > >>>>>>> with a particular token format (not necessarily the only
> >> > >>>>>>> format
> >> > >> ever
> >> > >>>>>>> supported - we just need one to start). If there has
> >> > >>>>>>> already been work done in this area by anyone then please
> >> > >>>>>>> speak up and commit
> >> > >> to
> >> > >>>>>>> providing a patch - so that we don't duplicate effort.
> >> > >>>>>>>
> >> > >>>>>>> @Daryn - is there a particular Jira or set of Jiras that we
> >> > >>>>>>> can
> >> > >> look
> >> > >>>>>>> at to discern the pluggability mechanism details?
> >> > >>>>>>> Documentation
> >> of
> >> > >>> it
> >> > >>>>>>> would be great as well.
> >> > >>>>>>> @Kai - do you have existing code for the pluggable token
> >> > >>>>>>> authentication mechanism - if not, we can take a stab at
> >> > >>> representing
> >> > >>>>>>> it with interfaces and/or POC code.
> >> > >>>>>>> I can standup and say that we have a token format that we
> >> > >>>>>>> have
> >> > >> been
> >> > >>>>>>> working with already and can provide a patch that
> >> > >>>>>>> represents it
> >> > >> as a
> >> > >>>>>>> contribution to test out the pluggable tokenAuth.
> >> > >>>>>>>
> >> > >>>>>>> These patches will provide progress toward code being the
> >> > >>>>>>> central discussion vehicle. As a community, we can then
> >> > >>>>>>> incrementally
> >> > >> build
> >> > >>>>>>> on that foundation in order to collaboratively deliver the
> >> > >>>>>>> common
> >> > >>>> vision.
> >> > >>>>>>>
> >> > >>>>>>> In the absence of any other home for posting such patches,
> >> > >>>>>>> let's assume that they will be attached to HADOOP-9392 - or
> >> > >>>>>>> a dedicated subtask for this particular aspect/s - I will
> >> > >>>>>>> leave that detail
> >> to
> >> > >>>> Kai.
> >> > >>>>>>>
> >> > >>>>>>> @Alejandro, being the only voice on this thread that isn't
> >> > >>>>>>> represented in the votes above, please feel free to agree
> >> > >>>>>>> or
> >> > >>> disagree
> >> > >>>> with this direction.
> >> > >>>>>>>
> >> > >>>>>>> thanks,
> >> > >>>>>>>
> >> > >>>>>>> --larry
> >> > >>>>>>>
> >> > >>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay
> >> > >>>>>>> <lm...@hortonworks.com>
> >> > >>>> wrote:
> >> > >>>>>>>
> >> > >>>>>>>> Hi Andy -
> >> > >>>>>>>>
> >> > >>>>>>>>> Happy Fourth of July to you and yours.
> >> > >>>>>>>>
> >> > >>>>>>>> Same to you and yours. :-) We had some fun in the sun for
> >> > >>>>>>>> a change - we've had nothing but
> >> > >>> rain
> >> > >>>>>>>> on
> >> > >>>>>>> the east coast lately.
> >> > >>>>>>>>
> >> > >>>>>>>>> My concern here is there may have been a
> >> > >>>>>>>>> misinterpretation or
> >> > >> lack
> >> > >>>>>>>>> of consensus on what is meant by "clean slate"
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>> Apparently so.
> >> > >>>>>>>> On the pre-summit call, I stated that I was interested in
> >> > >>>>>>>> reconciling
> >> > >>>>>>> the jiras so that we had one to work from.
> >> > >>>>>>>>
> >> > >>>>>>>> You recommended that we set them aside for the time being
> >> > >>>>>>>> - with
> >> > >>> the
> >> > >>>>>>> understanding that work would continue on your side (and
> >> > >>>>>>> our's as
> >> > >>>>>>> well) - and approach the community discussion from a clean
> slate.
> >> > >>>>>>>> We seemed to do this at the summit session quite well.
> >> > >>>>>>>> It was my understanding that this community discussion
> >> > >>>>>>>> would
> >> live
> >> > >>>>>>>> beyond
> >> > >>>>>>> the summit and continue on this list.
> >> > >>>>>>>>
> >> > >>>>>>>> While closing the summit session we agreed to follow up on
> >> > >>>>>>>> common-dev
> >> > >>>>>>> with first a summary then a discussion of the moving parts.
> >> > >>>>>>>>
> >> > >>>>>>>> I never expected the previous work to be abandoned and
> >> > >>>>>>>> fully expected it
> >> > >>>>>>> to inform the discussion that happened here.
> >> > >>>>>>>>
> >> > >>>>>>>> If you would like to reframe what clean slate was supposed
> >> > >>>>>>>> to
> >> > >> mean
> >> > >>>>>>>> or
> >> > >>>>>>> describe what it means now - that would be welcome - before
> >> > >>>>>>> I
> >> > >> waste
> >> > >>>>>>> anymore time trying to facilitate a community discussion
> >> > >>>>>>> that is apparently not wanted.
> >> > >>>>>>>>
> >> > >>>>>>>>> Nowhere in this
> >> > >>>>>>>>> picture are self appointed "master JIRAs" and such, which
> >> > >>>>>>>>> have
> >> > >>> been
> >> > >>>>>>>>> disappointing to see crop up, we should be
> >> > >>>>>>>>> collaboratively
> >> > >> coding
> >> > >>>>>>>>> not planting flags.
> >> > >>>>>>>>
> >> > >>>>>>>> I don't know what you mean by self-appointed master JIRAs.
> >> > >>>>>>>> It has certainly not been anyone's intention to disappoint.
> >> > >>>>>>>> Any mention of a new JIRA was just to have a clear context
> >> > >>>>>>>> to
> >> > >>> gather
> >> > >>>>>>>> the
> >> > >>>>>>> agreed upon points - previous and/or existing JIRAs would
> >> > >>>>>>> easily
> >> > >> be
> >> > >>>> linked.
> >> > >>>>>>>>
> >> > >>>>>>>> Planting flags... I need to go back and read my discussion
> >> > >>>>>>>> point about the
> >> > >>>>>>> JIRA and see how this is the impression that was made.
> >> > >>>>>>>> That is not how I define success. The only flags that
> >> > >>>>>>>> count are
> >> > >>> code.
> >> > >>>>>>> What we are lacking is the roadmap on which to put the code.
> >> > >>>>>>>>
> >> > >>>>>>>>> I read Kai's latest document as something approaching
> >> > >>>>>>>>> today's consensus
> >> > >>>>>>> (or
> >> > >>>>>>>>> at least a common point of view?) rather than a
> >> > >>>>>>>>> historical
> >> > >>> document.
> >> > >>>>>>>>> Perhaps he and it can be given equal share of the
> >> consideration.
> >> > >>>>>>>>
> >> > >>>>>>>> I definitely read it as something that has evolved into
> >> something
> >> > >>>>>>> approaching what we have been talking about so far. There
> >> > >>>>>>> has not however been enough discussion anywhere near the
> >> > >>>>>>> level of detail
> >> > >> in
> >> > >>>>>>> that document and more details are needed for each
> >> > >>>>>>> component in
> >> > >> the
> >> > >>>> design.
> >> > >>>>>>>> Why the work in that document should not be fed into the
> >> > >> community
> >> > >>>>>>> discussion as anyone else's would be - I fail to understand.
> >> > >>>>>>>>
> >> > >>>>>>>> My suggestion continues to be that you should take that
> >> > >>>>>>>> document
> >> > >>> and
> >> > >>>>>>> speak to the inventory of moving parts as we agreed.
> >> > >>>>>>>> As these are agreed upon, we will ensure that the
> >> > >>>>>>>> appropriate subtasks
> >> > >>>>>>> are filed against whatever JIRA is to host them - don't
> >> > >>>>>>> really
> >> > >> care
> >> > >>>>>>> much which it is.
> >> > >>>>>>>>
> >> > >>>>>>>> I don't really want to continue with two separate JIRAs -
> >> > >>>>>>>> as I stated
> >> > >>>>>>> long ago - but until we understand what the pieces are and
> >> > >>>>>>> how
> >> > >> they
> >> > >>>>>>> relate then they can't be consolidated.
> >> > >>>>>>>> Even if 9533 ended up being repurposed as the server
> >> > >>>>>>>> instance of
> >> > >>> the
> >> > >>>>>>> work - it should be a subtask of a larger one - if that is
> >> > >>>>>>> to be 9392, so be it.
> >> > >>>>>>>> We still need to define all the pieces of the larger
> >> > >>>>>>>> picture
> >> > >> before
> >> > >>>>>>>> that
> >> > >>>>>>> can be done.
> >> > >>>>>>>>
> >> > >>>>>>>> What I thought was the clean slate approach to the
> >> > >>>>>>>> discussion
> >> > >>> seemed
> >> > >>>>>>>> a
> >> > >>>>>>> very reasonable way to make all this happen.
> >> > >>>>>>>> If you would like to restate what you intended by it or
> >> something
> >> > >>>>>>>> else
> >> > >>>>>>> equally as reasonable as a way to move forward that would
> >> > >>>>>>> be
> >> > >>> awesome.
> >> > >>>>>>>>
> >> > >>>>>>>> I will be happy to work toward the roadmap with everyone
> >> > >>>>>>>> once it
> >> > >> is
> >> > >>>>>>> articulated, understood and actionable.
> >> > >>>>>>>> In the meantime, I have work to do.
> >> > >>>>>>>>
> >> > >>>>>>>> thanks,
> >> > >>>>>>>>
> >> > >>>>>>>> --larry
> >> > >>>>>>>>
> >> > >>>>>>>> BTW - I meant to quote you in an earlier response and
> >> > >>>>>>>> ended up saying it
> >> > >>>>>>> was Aaron instead. Not sure what happened there. :-)
> >> > >>>>>>>>
> >> > >>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell
> >> > >>>>>>>> <apurtell@apache.org
> >> >
> >> > >>>> wrote:
> >> > >>>>>>>>
> >> > >>>>>>>>> Hi Larry (and all),
> >> > >>>>>>>>>
> >> > >>>>>>>>> Happy Fourth of July to you and yours.
> >> > >>>>>>>>>
> >> > >>>>>>>>> In our shop Kai and Tianyou are already doing the coding,
> >> > >>>>>>>>> so
> >> I'd
> >> > >>>>>>>>> defer
> >> > >>>>>>> to
> >> > >>>>>>>>> them on the detailed points.
> >> > >>>>>>>>>
> >> > >>>>>>>>> My concern here is there may have been a
> >> > >>>>>>>>> misinterpretation or
> >> > >> lack
> >> > >>>>>>>>> of consensus on what is meant by "clean slate". Hopefully
> >> > >>>>>>>>> that
> >> > >> can
> >> > >>>>>>>>> be
> >> > >>>>>>> quickly
> >> > >>>>>>>>> cleared up. Certainly we did not mean ignore all that
> >> > >>>>>>>>> came
> >> > >> before.
> >> > >>>>>>>>> The
> >> > >>>>>>> idea
> >> > >>>>>>>>> was to reset discussions to find common ground and new
> >> direction
> >> > >>>>>>>>> where
> >> > >>>>>>> we
> >> > >>>>>>>>> are working together, not in conflict, on an agreed upon
> >> > >>>>>>>>> set of design points and tasks. There's been a lot of
> >> > >>>>>>>>> good discussion
> >> > >> and
> >> > >>>>>>>>> design preceding that we should figure out how to port
> over.
> >> > >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs"
> >> > >>>>>>>>> and
> >> > >>> such,
> >> > >>>>>>>>> which have been disappointing to see crop up, we should
> >> > >>>>>>>>> be collaboratively coding not planting flags.
> >> > >>>>>>>>>
> >> > >>>>>>>>> I read Kai's latest document as something approaching
> >> > >>>>>>>>> today's consensus
> >> > >>>>>>> (or
> >> > >>>>>>>>> at least a common point of view?) rather than a
> >> > >>>>>>>>> historical
> >> > >>> document.
> >> > >>>>>>>>> Perhaps he and it can be given equal share of the
> >> consideration.
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> >> > >>>>>>>>>
> >> > >>>>>>>>>> Hey Andrew -
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> I largely agree with that statement.
> >> > >>>>>>>>>> My intention was to let the differences be worked out
> >> > >>>>>>>>>> within
> >> > >> the
> >> > >>>>>>>>>> individual components once they were identified and
> >> > >>>>>>>>>> subtasks
> >> > >>>> created.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> My reference to HSSO was really referring to a SSO
> >> > >>>>>>>>>> *server*
> >> > >> based
> >> > >>>>>>> design
> >> > >>>>>>>>>> which was not clearly articulated in the earlier documents.
> >> > >>>>>>>>>> We aren't trying to compare and contrast one design over
> >> > >> another
> >> > >>>>>>> anymore.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> Let's move this collaboration along as we've mapped out
> >> > >>>>>>>>>> and
> >> the
> >> > >>>>>>>>>> differences in the details will reveal themselves and be
> >> > >>> addressed
> >> > >>>>>>> within
> >> > >>>>>>>>>> their components.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> I've actually been looking forward to you weighing in on
> >> > >>>>>>>>>> the actual discussion points in this thread.
> >> > >>>>>>>>>> Could you do that?
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> At this point, I am most interested in your thoughts on
> >> > >>>>>>>>>> a
> >> > >> single
> >> > >>>>>>>>>> jira
> >> > >>>>>>> to
> >> > >>>>>>>>>> represent all of this work and whether we should start
> >> > >> discussing
> >> > >>>>>>>>>> the
> >> > >>>>>>> SSO
> >> > >>>>>>>>>> Tokens.
> >> > >>>>>>>>>> If you think there are discussion points missing from
> >> > >>>>>>>>>> that
> >> > >> list,
> >> > >>>>>>>>>> feel
> >> > >>>>>>> free
> >> > >>>>>>>>>> to add to it.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> thanks,
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> --larry
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
> >> > >> apurtell@apache.org>
> >> > >>>>>>> wrote:
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>> Hi Larry,
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> Of course I'll let Kai speak for himself. However, let
> >> > >>>>>>>>>>> me
> >> > >> point
> >> > >>>>>>>>>>> out
> >> > >>>>>>> that,
> >> > >>>>>>>>>>> while the differences between the competing JIRAs have
> >> > >>>>>>>>>>> been reduced
> >> > >>>>>>> for
> >> > >>>>>>>>>>> sure, there were some key differences that didn't just
> >> > >>> disappear.
> >> > >>>>>>>>>>> Subsequent discussion will make that clear. I also
> >> > >>>>>>>>>>> disagree
> >> > >> with
> >> > >>>>>>>>>>> your characterization that we have simply endorsed all
> >> > >>>>>>>>>>> of the design
> >> > >>>>>>> decisions
> >> > >>>>>>>>>>> of the so-called HSSO, this is taking a mile from an
> >> > >>>>>>>>>>> inch. We
> >> > >>> are
> >> > >>>>>>> here to
> >> > >>>>>>>>>>> engage in a collaborative process as peers. I've been
> >> > >> encouraged
> >> > >>>>>>>>>>> by
> >> > >>>>>>> the
> >> > >>>>>>>>>>> spirit of the discussions up to this point and hope
> >> > >>>>>>>>>>> that can continue beyond one design summit.
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> >> > >>>>>>>>>>> <lm...@hortonworks.com>
> >> > >>>>>>>>>> wrote:
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>>> Hi Kai -
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> I think that I need to clarify something...
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> This is not an update for 9533 but a continuation of
> >> > >>>>>>>>>>>> the discussions
> >> > >>>>>>>>>> that
> >> > >>>>>>>>>>>> are focused on a fresh look at a SSO for Hadoop.
> >> > >>>>>>>>>>>> We've agreed to leave our previous designs behind and
> >> > >> therefore
> >> > >>>>>>>>>>>> we
> >> > >>>>>>>>>> aren't
> >> > >>>>>>>>>>>> really seeing it as an HSSO layered on top of TAS
> >> > >>>>>>>>>>>> approach
> >> or
> >> > >>> an
> >> > >>>>>>> HSSO vs
> >> > >>>>>>>>>>>> TAS discussion.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> Your latest design revision actually makes it clear
> >> > >>>>>>>>>>>> that you
> >> > >>> are
> >> > >>>>>>>>>>>> now targeting exactly what was described as HSSO - so
> >> > >> comparing
> >> > >>>>>>>>>>>> and
> >> > >>>>>>>>>> contrasting
> >> > >>>>>>>>>>>> is not going to add any value.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> What we need you to do at this point, is to look at
> >> > >>>>>>>>>>>> those high-level components described on this thread
> >> > >>>>>>>>>>>> and comment
> >> on
> >> > >>>>>>>>>>>> whether we need additional components or any that are
> >> > >>>>>>>>>>>> listed that don't seem
> >> > >>>>>>> necessary
> >> > >>>>>>>>>> to
> >> > >>>>>>>>>>>> you and why.
> >> > >>>>>>>>>>>> In other words, we need to define and agree on the
> >> > >>>>>>>>>>>> work that
> >> > >>> has
> >> > >>>>>>>>>>>> to
> >> > >>>>>>> be
> >> > >>>>>>>>>>>> done.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> We also need to determine those components that need
> >> > >>>>>>>>>>>> to be
> >> > >> done
> >> > >>>>>>> before
> >> > >>>>>>>>>>>> anything else can be started.
> >> > >>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens
> >> > >>>>>>>>>>>> are central to
> >> > >>>>>>>>>> all
> >> > >>>>>>>>>>>> the other components and should probably be defined
> >> > >>>>>>>>>>>> and
> >> POC'd
> >> > >>> in
> >> > >>>>>>> short
> >> > >>>>>>>>>>>> order.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> Personally, I think that continuing the separation of
> >> > >>>>>>>>>>>> 9533
> >> > >> and
> >> > >>>>>>>>>>>> 9392
> >> > >>>>>>> will
> >> > >>>>>>>>>>>> do this effort a disservice. There doesn't seem to be
> >> > >>>>>>>>>>>> enough
> >> > >>>>>>> differences
> >> > >>>>>>>>>>>> between the two to justify separate jiras anymore. It
> >> > >>>>>>>>>>>> may be best to
> >> > >>>>>>>>>> file a
> >> > >>>>>>>>>>>> new one that reflects a single vision without the
> >> > >>>>>>>>>>>> extra
> >> cruft
> >> > >>>>>>>>>>>> that
> >> > >>>>>>> has
> >> > >>>>>>>>>>>> built up in either of the existing ones. We would
> >> > >>>>>>>>>>>> certainly reference
> >> > >>>>>>>>>> the
> >> > >>>>>>>>>>>> existing ones within the new one. This approach would
> >> > >>>>>>>>>>>> align
> >> > >>> with
> >> > >>>>>>>>>>>> the
> >> > >>>>>>>>>> spirit
> >> > >>>>>>>>>>>> of the discussions up to this point.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> I am prepared to start a discussion around the shape
> >> > >>>>>>>>>>>> of the
> >> > >> two
> >> > >>>>>>> Hadoop
> >> > >>>>>>>>>> SSO
> >> > >>>>>>>>>>>> tokens: identity and access. If this is what others
> >> > >>>>>>>>>>>> feel the next
> >> > >>>>>>> topic
> >> > >>>>>>>>>>>> should be.
> >> > >>>>>>>>>>>> If we can identify a jira home for it, we can do it
> >> > >>>>>>>>>>>> there -
> >> > >>>>>>> otherwise we
> >> > >>>>>>>>>>>> can create another DISCUSS thread for it.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> thanks,
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> --larry
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
> >> > >> kai.zheng@intel.com>
> >> > >>>>>>> wrote:
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>>> Hi Larry,
> >> > >>>>>>>>>>>>>
> >> > >>>>>>>>>>>>> Thanks for the update. Good to see that with this
> >> > >>>>>>>>>>>>> update we
> >> > >>> are
> >> > >>>>>>>>>>>>> now
> >> > >>>>>>>>>>>> aligned on most points.
> >> > >>>>>>>>>>>>>
> >> > >>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392.
> >> The
> >> > >>>>>>>>>>>>> new
> >> > >>>>>>>>>>>> revision incorporates feedback and suggestions in
> >> > >>>>>>>>>>>> related discussion
> >> > >>>>>>>>>> with
> >> > >>>>>>>>>>>> the community, particularly from Microsoft and others
> >> > >> attending
> >> > >>>>>>>>>>>> the Security design lounge session at the Hadoop summit.
> >> > >>> Summary
> >> > >>>>>>>>>>>> of the
> >> > >>>>>>>>>> changes:
> >> > >>>>>>>>>>>>> 1.    Revised the approach to now use two tokens,
> Identity
> >> > >>> Token
> >> > >>>>>>> plus
> >> > >>>>>>>>>>>> Access Token, particularly considering our
> >> > >>>>>>>>>>>> authorization framework
> >> > >>>>>>> and
> >> > >>>>>>>>>>>> compatibility with HSSO;
> >> > >>>>>>>>>>>>> 2.    Introduced Authorization Server (AS) from our
> >> > >>>> authorization
> >> > >>>>>>>>>>>> framework into the flow that issues access tokens for
> >> clients
> >> > >>>>>>>>>>>> with
> >> > >>>>>>>>>> identity
> >> > >>>>>>>>>>>> tokens to access services;
> >> > >>>>>>>>>>>>> 3.    Refined proxy access token and the
> >> proxy/impersonation
> >> > >>>> flow;
> >> > >>>>>>>>>>>>> 4.    Refined the browser web SSO flow regarding access
> to
> >> > >>>> Hadoop
> >> > >>>>>>> web
> >> > >>>>>>>>>>>> services;
> >> > >>>>>>>>>>>>> 5.    Added Hadoop RPC access flow regard
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>> --
> >> > >>>>>>>>> Best regards,
> >> > >>>>>>>>>
> >> > >>>>>>>>> - Andy
> >> > >>>>>>>>>
> >> > >>>>>>>>> Problems worthy of attack prove their worth by hitting
> >> > >>>>>>>>> back. -
> >> > >>> Piet
> >> > >>>>>>>>> Hein (via Tom White)
> >> > >>>>>>>>
> >> > >>>>>>>
> >> > >>>>>>>
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>> --
> >> > >>>>>> Alejandro
> >> > >>>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
> >> > >>>>
> >> > >>>>
> >> > >>>
> >> > >>>
> >> > >>> --
> >> > >>> Alejandro
> >> > >>>
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Alejandro
> >> >
> >> >
> >>
> >
> > --
> > CONFIDENTIALITY NOTICE
> > NOTICE: This message is intended for the use of the individual or
> > entity to which it is addressed and may contain information that is
> > confidential, privileged and exempt from disclosure under applicable
> > law. If the reader of this message is not the intended recipient, you
> > are hereby notified that any printing, copying, dissemination,
> > distribution, disclosure or forwarding of this communication is
> > strictly prohibited. If you have received this communication in error,
> > please contact the sender immediately and delete it from your system.
> Thank You.
>


RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by "Zheng, Kai" <ka...@intel.com>.
This looks good and reasonable to me. Thanks Chris.

-----Original Message-----
From: Chris Douglas [mailto:cdouglas@apache.org] 
Sent: Wednesday, September 04, 2013 6:45 AM
To: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

On Tue, Sep 3, 2013 at 5:20 AM, Larry McCay <lm...@hortonworks.com> wrote:
> One outstanding question for me - how do we go about getting the 
> branches created?

Once a group has converged on a purpose- ideally with some initial code from JIRA- please go ahead and create the feature branch in svn.
There's no ceremony. -C

> On Tue, Aug 6, 2013 at 6:22 PM, Chris Nauroth <cn...@hortonworks.com>wrote:
>
>> Near the bottom of the bylaws, it states that addition of a "New 
>> Branch Committer" requires "Lazy consensus of active PMC members."  I 
>> think this means that you'll need to get a PMC member to sponsor the vote for you.
>>  Regular committer votes happen on the private PMC mailing list, and 
>> I assume it would be the same for a branch committer vote.
>>
>> http://hadoop.apache.org/bylaws.html
>>
>> Chris Nauroth
>> Hortonworks
>> http://hortonworks.com/
>>
>>
>>
>> On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay <lm...@hortonworks.com>
>> wrote:
>>
>> > That sounds perfect!
>> > I have been thinking of late that we would maybe need an incubator
>> project
>> > or something for this - which would be unfortunate.
>> >
>> > This would allow us to move much more quickly with a set of patches
>> broken
>> > up into consumable/understandable chunks that are made functional 
>> > more easily within the branch.
>> > I assume that we need to start a separate thread for DISCUSS or 
>> > VOTE to start that process - correct?
>> >
>> > On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com>
>> wrote:
>> >
>> > > yep, that is what I meant. Thanks Chris
>> > >
>> > >
>> > > On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <
>> cnauroth@hortonworks.com
>> > >wrote:
>> > >
>> > >> Perhaps this is also a good opportunity to try out the new 
>> > >> "branch committers" clause in the bylaws, enabling 
>> > >> non-committers who are
>> > working
>> > >> on this to commit to the feature branch.
>> > >>
>> > >>
>> > >>
>> >
>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%
>> 3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%
>> 3E
>> > >>
>> > >> Chris Nauroth
>> > >> Hortonworks
>> > >> http://hortonworks.com/
>> > >>
>> > >>
>> > >>
>> > >> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur 
>> > >> <tucu@cloudera.com
>> > >>> wrote:
>> > >>
>> > >>> Larry,
>> > >>>
>> > >>> Sorry for the delay answering. Thanks for laying down things, 
>> > >>> yes, it
>> > >> makes
>> > >>> sense.
>> > >>>
>> > >>> Given the large scope of the changes, number of JIRAs and 
>> > >>> number of developers involved, wouldn't make sense to create a 
>> > >>> feature branch
>> for
>> > >> all
>> > >>> this work not to destabilize (more ;) trunk?
>> > >>>
>> > >>> Thanks again.
>> > >>>
>> > >>>
>> > >>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay 
>> > >>> <lmccay@hortonworks.com
>> >
>> > >>> wrote:
>> > >>>
>> > >>>> The following JIRA was filed to provide a token and basic 
>> > >>>> authority implementation for this effort:
>> > >>>> https://issues.apache.org/jira/browse/HADOOP-9781
>> > >>>>
>> > >>>> I have attached an initial patch though have yet to submit it 
>> > >>>> as one
>> > >>> since
>> > >>>> it is dependent on the patch for CMF that was posted to:
>> > >>>> https://issues.apache.org/jira/browse/HADOOP-9534
>> > >>>> and this patch still has a couple outstanding issues - javac
>> warnings
>> > >> for
>> > >>>> com.sun classes for certificate generation and 11 javadoc
>> warnings.
>> > >>>>
>> > >>>> Please feel free to review the patches and raise any questions 
>> > >>>> or
>> > >>> concerns
>> > >>>> related to them.
>> > >>>>
>> > >>>> On Jul 26, 2013, at 8:59 PM, Larry McCay 
>> > >>>> <lm...@hortonworks.com>
>> > >> wrote:
>> > >>>>
>> > >>>>> Hello All -
>> > >>>>>
>> > >>>>> In an effort to scope an initial iteration that provides 
>> > >>>>> value to
>> the
>> > >>>> community while focusing on the pluggable authentication 
>> > >>>> aspects,
>> I've
>> > >>>> written a description for "Iteration 1". It identifies the 
>> > >>>> goal of
>> the
>> > >>>> iteration, the endstate and a set of initial usecases. It also
>> > >> enumerates
>> > >>>> the components that are required for each usecase. There is a 
>> > >>>> scope
>> > >>> section
>> > >>>> that details specific things that should be kept out of the 
>> > >>>> first iteration. This is certainly up for discussion. There 
>> > >>>> may be some of
>> > >>> these
>> > >>>> things that can be contributed in short order. If we can add 
>> > >>>> some
>> > >> things
>> > >>> in
>> > >>>> without unnecessary complexity for the identified usecases 
>> > >>>> then we
>> > >>> should.
>> > >>>>>
>> > >>>>> @Alejandro - please review this and see whether it satisfies 
>> > >>>>> your
>> > >> point
>> > >>>> for a definition of what we are building.
>> > >>>>>
>> > >>>>> In addition to the document that I will paste here as text 
>> > >>>>> and
>> > >> attach a
>> > >>>> pdf version, we have a couple patches for components that are
>> > >> identified
>> > >>> in
>> > >>>> the document.
>> > >>>>> Specifically, COMP-7 and COMP-8.
>> > >>>>>
>> > >>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which 
>> > >>>>> was
>> > >> filed
>> > >>>> specifically for that functionality.
>> > >>>>> COMP-7 is a small set of classes to introduce JsonWebToken as 
>> > >>>>> the
>> > >> token
>> > >>>> format and a basic JsonWebTokenAuthority that can issue and 
>> > >>>> verify
>> > >> these
>> > >>>> tokens.
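[Editor's sketch: the COMP-7 patch described above is Java and its actual API is not shown in this thread; the following is a minimal, illustrative Python analogue of the two roles a JsonWebTokenAuthority plays - issuing and verifying signed JWT-style tokens. All names (issue, verify, the claim fields) are assumptions for illustration, not the patch's interfaces.]

```python
# Illustrative only: a toy token authority that issues and verifies
# HMAC-signed JWT-shaped tokens (header.claims.signature). Real Hadoop
# code would use the Java classes contributed under COMP-7.
import base64
import hashlib
import hmac
import json
import time

SECRET = b"authority-signing-key"  # hypothetical shared signing key


def _b64(data: bytes) -> str:
    # URL-safe base64 without padding, as JWTs use
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def issue(subject: str, audience: str, ttl_secs: int = 3600) -> str:
    """Issue a signed token with subject, audience and expiry claims."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = _b64(json.dumps({
        "sub": subject,
        "aud": audience,
        "exp": int(time.time()) + ttl_secs,
    }).encode())
    signing_input = f"{header}.{claims}".encode()
    sig = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{claims}.{sig}"


def verify(token: str) -> dict:
    """Check signature and expiry; return the claims or raise ValueError."""
    header, claims, sig = token.split(".")
    signing_input = f"{header}.{claims}".encode()
    expected = _b64(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    padded = claims + "=" * (-len(claims) % 4)
    payload = json.loads(base64.urlsafe_b64decode(padded))
    if payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload
```

A caller would round-trip a token with `verify(issue("alice", "hdfs"))`; any tampering with the claims or signature makes verification fail.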
>> > >>>>>
>> > >>>>> Since there is no JIRA for this yet, I will likely file a new 
>> > >>>>> JIRA
>> > >> for
>> > >>> a
>> > >>>> SSO token implementation.
>> > >>>>>
>> > >>>>> Both of these patches assume to be modules within
>> > >>>> hadoop-common/hadoop-common-project.
>> > >>>>> While they are relatively small, I think that they will be 
>> > >>>>> pulled
>> in
>> > >> by
>> > >>>> other modules such as hadoop-auth which would likely not want 
>> > >>>> a
>> > >>> dependency
>> > >>>> on something larger like
>> > >>> hadoop-common/hadoop-common-project/hadoop-common.
>> > >>>>>
>> > >>>>> This is certainly something that we should discuss within the
>> > >> community
>> > >>>> for this effort though - that being, exactly how to add these
>> > libraries
>> > >>> so
>> > >>>> that they are most easily consumed by existing projects.
>> > >>>>>
>> > >>>>> Anyway, the following is the Iteration-1 document - it is 
>> > >>>>> also
>> > >> attached
>> > >>>> as a pdf:
>> > >>>>>
>> > >>>>> Iteration 1: Pluggable User Authentication and Federation
>> > >>>>>
>> > >>>>> Introduction
>> > >>>>> The intent of this effort is to bootstrap the development of
>> > >> pluggable
>> > >>>> token-based authentication mechanisms to support certain goals 
>> > >>>> of enterprise authentication integrations. By restricting the 
>> > >>>> scope of
>> > >> this
>> > >>>> effort, we hope to provide immediate benefit to the community 
>> > >>>> while
>> > >>> keeping
>> > >>>> the initial contribution to a manageable size that can be 
>> > >>>> easily
>> > >>> reviewed,
>> > >>>> understood and extended with further development through 
>> > >>>> follow up
>> > >> JIRAs
>> > >>>> and related iterations.
>> > >>>>>
>> > >>>>> Iteration Endstate
>> > >>>>> Once complete, this effort will have extended the 
>> > >>>>> authentication
>> > >>>> mechanisms - for all client types - from the existing: Simple,
>> > Kerberos
>> > >>> and
>> > >>>> Plain (for RPC) to include LDAP authentication and SAML based
>> > >> federation.
>> > >>>> In addition, the ability to provide additional/custom 
>> > >>>> authentication mechanisms will be enabled for users to plug in 
>> > >>>> their preferred
>> > >>> mechanisms.
>> > >>>>>
>> > >>>>> Project Scope
>> > >>>>> The scope of this effort is a subset of the features covered 
>> > >>>>> by the
>> > >>>> overviews of HADOOP-9392 and HADOOP-9533. This effort 
>> > >>>> concentrates
>> on
>> > >>>> enabling Hadoop to issue, accept/validate SSO tokens of its 
>> > >>>> own. The pluggable authentication mechanism within SASL/RPC 
>> > >>>> layer and the authentication filter pluggability for REST and 
>> > >>>> UI components will
>> be
>> > >>>> leveraged and extended to support the results of this effort.
>> > >>>>>
>> > >>>>> Out of Scope
>> > >>>>> In order to scope the initial deliverable as the minimally 
>> > >>>>> viable
>> > >>>> product, a handful of things have been simplified or left out 
>> > >>>> of
>> scope
>> > >>> for
>> > >>>> this effort. This is not meant to say that these aspects are 
>> > >>>> not
>> > useful
>> > >>> or
>> > >>>> not needed but that they are not necessary for this iteration. 
>> > >>>> We do however need to ensure that we don't do anything to 
>> > >>>> preclude adding
>> > >> them
>> > >>> in
>> > >>>> future iterations.
>> > >>>>> 1. Additional Attributes - the result of authentication will
>> continue
>> > >>> to
>> > >>>> use the existing hadoop tokens and identity representations.
>> > Additional
>> > >>>> attributes used for finer grained authorization decisions will 
>> > >>>> be
>> > added
>> > >>>> through follow-up efforts.
>> > >>>>> 2. Token revocation - the ability to revoke issued identity 
>> > >>>>> tokens
>> > >> will
>> > >>>> be added later
>> > >>>>> 3. Multi-factor authentication - this will likely require
>> additional
>> > >>>> attributes and is not necessary for this iteration.
>> > >>>>> 4. Authorization changes - we will require additional 
>> > >>>>> attributes
>> for
>> > >>> the
>> > >>>> fine-grained access control plans. This is not needed for this
>> > >> iteration.
>> > >>>>> 5. Domains - we assume a single flat domain for all users
>> > >>>>> 6. Kinit alternative - we can leverage existing REST clients such
>> as
>> > >>>> cURL to retrieve tokens through authentication and federation 
>> > >>>> for
>> the
>> > >>> time
>> > >>>> being
>> > >>>>> 7. A specific authentication framework isn't really necessary
>> within
>> > >>> the
>> > >>>> REST endpoints for this iteration. If one is available then we 
>> > >>>> can
>> use
>> > >> it
>> > >>>> otherwise we can leverage existing things like Apache Shiro 
>> > >>>> within a servlet filter.
>> > >>>>>
>> > >>>>> In Scope
>> > >>>>> What is in scope for this effort is defined by the usecases
>> described
>> > >>>> below. Components required for supporting the usecases are
>> summarized
>> > >> for
>> > >>>> each client type. Each component is a candidate for a JIRA 
>> > >>>> subtask -
>> > >>> though
>> > >>>> multiple components are likely to be included in a JIRA to
>> represent a
>> > >>> set
>> > >>>> of functionality rather than individual JIRAs per component.
>> > >>>>>
>> > >>>>> Terminology and Naming
>> > >>>>> The terms and names of components within this document are 
>> > >>>>> merely
>> > >>>> descriptive of the functionality that they represent. Any 
>> > >>>> similarity
>> > or
>> > >>>> difference in names or terms from those that are found in 
>> > >>>> other
>> > >> documents
>> > >>>> are not intended to make any statement about those other 
>> > >>>> documents
>> or
>> > >> the
>> > >>>> descriptions within. This document represents the pluggable
>> > >>> authentication
>> > >>>> mechanisms and server functionality required to replace Kerberos.
>> > >>>>>
>> > >>>>> Ultimately, the naming of the implementation classes will be 
>> > >>>>> a
>> > >> product
>> > >>>> of the patches accepted by the community.
>> > >>>>>
>> > >>>>> Usecases:
>> > >>>>> client types: REST, CLI, UI
>> > >>>>> authentication types: Simple, Kerberos, authentication/LDAP,
>> > >>>> federation/SAML
>> > >>>>>
>> > >>>>> Simple and Kerberos
>> > >>>>> Simple and Kerberos usecases continue to work as they do 
>> > >>>>> today. The
>> > >>>> addition of Authentication/LDAP and Federation/SAML are added
>> through
>> > >> the
>> > >>>> existing pluggability points either as they are or with 
>> > >>>> required
>> > >>> extension.
>> > >>>> Either way, continued support for Simple and Kerberos must not
>> require
>> > >>>> changes to existing deployments in the field as a result of 
>> > >>>> this
>> > >> effort.
>> > >>>>>
>> > >>>>> REST
>> > >>>>> USECASE REST-1 Authentication/LDAP:
>> > >>>>> For REST clients, we will provide the ability to:
>> > >>>>> 1. use cURL to authenticate via LDAP through an IdP endpoint
>> > >>>>> exposed by an AuthenticationServer instance via REST calls to:
>> > >>>>>   a. authenticate - passing username/password, returning a hadoop
>> > >>>>> id_token
>> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
>> > >>>>> the hadoop id_token as an Authorization: Bearer token along with
>> > >>>>> the desired service name (master service name), returning a hadoop
>> > >>>>> access token
>> > >>>>> 2. successfully invoke a hadoop service REST API, passing the
>> > >>>>> hadoop access token through an HTTP header as an Authorization:
>> > >>>>> Bearer token
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>>> accomplished by an SSOAuthenticationHandler
>> > >>>>> 3. successfully block access to a REST resource when presenting a
>> > >>>>> hadoop access token intended for a different service
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>>> accomplished by an SSOAuthenticationHandler
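To make the handshake above concrete, here is a minimal in-memory sketch of the authenticate / get-access-token / validate sequence. Everything in it (claim names, the HMAC signing scheme, function names) is an illustrative assumption for discussion, not a proposed wire format:

```python
import base64, hashlib, hmac, json, time

# Illustrative stand-ins for the proposed AuthenticationServer,
# TokenGrantingService and SSOAuthenticationHandler - not a spec.
SECRET = b"demo-signing-key"

def _sign(payload: dict) -> str:
    # Serialize claims and append an HMAC so tampering is detectable.
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode()).decode()
    return body + "." + hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()

def _verify(token: str) -> dict:
    body, sig = token.split(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(body))

def authenticate(username: str, password: str) -> str:
    # 1a: IdP endpoint stand-in; the actual LDAP bind is elided.
    return _sign({"typ": "id", "sub": username, "exp": time.time() + 3600})

def get_access_token(id_token: str, service: str) -> str:
    # 1b: TokenGrantingService stand-in; scopes the access token to one service.
    claims = _verify(id_token)
    return _sign({"typ": "access", "sub": claims["sub"], "aud": service,
                  "exp": time.time() + 600})

def sso_handler_accepts(access_token: str, service: str) -> bool:
    # 2a/3a: SSOAuthenticationHandler stand-in; rejects tokens minted for
    # other services as well as forged or expired ones.
    try:
        claims = _verify(access_token)
    except ValueError:
        return False
    return claims.get("aud") == service and claims["exp"] > time.time()
```

A token granted for one service being refused by another service's handler is exactly usecase 3 above.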
>> > >>>>>
>> > >>>>> USECASE REST-2 Federation/SAML:
>> > >>>>> We will also provide federation capabilities for REST clients such
>> > >>>>> that:
>> > >>>>> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?)
>> > >>>>> and persist it in a permissions protected file - ie.
>> > >>>>> ~/.hadoop_tokens/.idp_token
>> > >>>>> 2. use cURL to federate a token from a trusted IdP through an SP
>> > >>>>> endpoint exposed by an AuthenticationServer (FederationServer?)
>> > >>>>> instance via REST calls to:
>> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
>> > >>>>> Bearer token, returning a hadoop id_token
>> > >>>>>      - can copy and paste from the commandline or use cat to
>> > >>>>> include the persisted token through "--Header Authorization: Bearer
>> > >>>>> 'cat ~/.hadoop_tokens/.id_token'"
>> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
>> > >>>>> the hadoop id_token as an Authorization: Bearer token along with
>> > >>>>> the desired service name (master service name), returning a hadoop
>> > >>>>> access token
>> > >>>>> 3. successfully invoke a hadoop service REST API, passing the
>> > >>>>> hadoop access token through an HTTP header as an Authorization:
>> > >>>>> Bearer token
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>>> accomplished by an SSOAuthenticationHandler
>> > >>>>> 4. successfully block access to a REST resource when presenting a
>> > >>>>> hadoop access token intended for a different service
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>>> accomplished by an SSOAuthenticationHandler
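The federate step (2a) differs from REST-1 only in what establishes identity: trust in the IdP that issued the assertion rather than an LDAP bind. A stand-in sketch, with a hypothetical issuer check in place of real SAML signature validation (which an SP library such as OpenSAML would perform):

```python
import base64, hashlib, hmac, json, time

SECRET = b"demo-signing-key"                    # token authority key stand-in
TRUSTED_ISSUERS = {"https://idp.example.org"}   # hypothetical trusted IdP set

def federate(saml_assertion: dict) -> str:
    """SP endpoint stand-in: accepts an assertion from a trusted issuer and
    mints a hadoop id_token. Real code would verify the assertion's XML
    signature rather than trust a plain issuer field."""
    if saml_assertion["issuer"] not in TRUSTED_ISSUERS:
        raise PermissionError("untrusted IdP")
    payload = json.dumps({"typ": "id",
                          "sub": saml_assertion["subject"],
                          "exp": time.time() + 3600}, sort_keys=True)
    body = base64.urlsafe_b64encode(payload.encode()).decode()
    return body + "." + hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
```

From here the flow rejoins REST-1: the returned id_token goes to get-access-token as an Authorization: Bearer token.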
>> > >>>>>
>> > >>>>> REQUIRED COMPONENTS for REST USECASES:
>> > >>>>> COMP-1. REST client - cURL or similar
>> > >>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP
>> > >>>>> endpoint example - returning a hadoop id_token
>> > >>>>> COMP-3. REST endpoint for federation with a SAML Bearer token -
>> > >>>>> shibboleth SP?|OpenSAML? - returning a hadoop id_token
>> > >>>>> COMP-4. REST TokenGrantingService endpoint for acquiring hadoop
>> > >>>>> access tokens from hadoop id_tokens
>> > >>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop
>> > >>>>> access tokens
>> > >>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
>> > >>>>> COMP-7. hadoop token and authority implementations
>> > >>>>> COMP-8. core services for crypto support for signing, verifying
>> > >>>>> and PKI management
>> > >>>>>
>> > >>>>> CLI
>> > >>>>> USECASE CLI-1 Authentication/LDAP:
>> > >>>>> For CLI/RPC clients, we will provide the ability to:
>> > >>>>> 1. use cURL to authenticate via LDAP through an IdP endpoint
>> > >>>>> exposed by an AuthenticationServer instance via REST calls to:
>> > >>>>>   a. authenticate - passing username/password, returning a hadoop
>> > >>>>> id_token
>> > >>>>>      - for RPC clients we need to persist the returned hadoop
>> > >>>>> identity token in a file protected by fs permissions so that it may
>> > >>>>> be leveraged until expiry
>> > >>>>>      - directing the returned response to a file may suffice for
>> > >>>>> now - something like ">~/.hadoop_tokens/.id_token"
>> > >>>>> 2. use the hadoop CLI to invoke an RPC API on a specific hadoop
>> > >>>>> service
>> > >>>>>   a. the RPC client negotiates a TokenAuth method through the SASL
>> > >>>>> layer; the hadoop id_token retrieved from
>> > >>>>> ~/.hadoop_tokens/.id_token is passed as an Authorization: Bearer
>> > >>>>> token to the get-access-token REST endpoint exposed by the
>> > >>>>> TokenGrantingService, returning a hadoop access token
>> > >>>>>   b. the RPC server side validates the presented hadoop access
>> > >>>>> token and continues to serve the request
>> > >>>>>   c. successfully invoke a hadoop service RPC API
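The persistence requirement in step 1a (an id_token reusable until expiry, protected by fs permissions) can be sketched as follows; the directory layout mirrors the ~/.hadoop_tokens proposal but the helper names are hypothetical:

```python
import os

def persist_id_token(token: str, token_dir: str) -> str:
    """Write the hadoop id_token to a permissions-protected file (owner
    read/write only), so RPC clients can reuse it until expiry.
    Returns the file path."""
    os.makedirs(token_dir, mode=0o700, exist_ok=True)
    path = os.path.join(token_dir, ".id_token")
    # Create the file with mode 0600 atomically rather than chmod-ing after.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)
    return path

def load_id_token(token_dir: str) -> str:
    """Read the persisted token back for the SASL/TokenAuth negotiation."""
    with open(os.path.join(token_dir, ".id_token")) as f:
        return f.read()
```

In practice `token_dir` would be `os.path.expanduser("~/.hadoop_tokens")`, matching the redirection shorthand in step 1a.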
>> > >>>>>
>> > >>>>> USECASE CLI-2 Federation/SAML:
>> > >>>>> For CLI/RPC clients, we will provide the ability to:
>> > >>>>> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?)
>> > >>>>> and persist it in a permissions protected file - ie.
>> > >>>>> ~/.hadoop_tokens/.idp_token
>> > >>>>> 2. use cURL to federate a token from a trusted IdP through an SP
>> > >>>>> endpoint exposed by an AuthenticationServer (FederationServer?)
>> > >>>>> instance via REST calls to:
>> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
>> > >>>>> Bearer token, returning a hadoop id_token
>> > >>>>>      - can copy and paste from the commandline or use cat to
>> > >>>>> include the previously persisted token through "--Header
>> > >>>>> Authorization: Bearer 'cat ~/.hadoop_tokens/.id_token'"
>> > >>>>> 3. use the hadoop CLI to invoke an RPC API on a specific hadoop
>> > >>>>> service
>> > >>>>>   a. the RPC client negotiates a TokenAuth method through the SASL
>> > >>>>> layer; the hadoop id_token retrieved from
>> > >>>>> ~/.hadoop_tokens/.id_token is passed as an Authorization: Bearer
>> > >>>>> token to the get-access-token REST endpoint exposed by the
>> > >>>>> TokenGrantingService, returning a hadoop access token
>> > >>>>>   b. the RPC server side validates the presented hadoop access
>> > >>>>> token and continues to serve the request
>> > >>>>>   c. successfully invoke a hadoop service RPC API
>> > >>>>>
>> > >>>>> REQUIRED COMPONENTS for CLI USECASES (beyond those required for
>> > >>>>> REST):
>> > >>>>> COMP-9. TokenAuth method negotiation, etc.
>> > >>>>> COMP-10. client side implementation to leverage the REST endpoint
>> > >>>>> for acquiring hadoop access tokens given a hadoop id_token
>> > >>>>> COMP-11. server side implementation to validate incoming hadoop
>> > >>>>> access tokens
>> > >>>>>
>> > >>>>> UI
>> > >>>>> Various Hadoop services have their own web UI consoles for
>> > >>>>> administration and end user interactions. These consoles also need
>> > >>>>> to benefit from the pluggability of authentication mechanisms in
>> > >>>>> order to be on par with the access control of the cluster REST and
>> > >>>>> RPC APIs.
>> > >>>>> Web consoles are protected with a WebSSOAuthenticationHandler,
>> > >>>>> which will be configured for either authentication or federation.
>> > >>>>>
>> > >>>>> USECASE UI-1 Authentication/LDAP:
>> > >>>>> For the authentication usecase:
>> > >>>>> 1. the user's browser requests access to a UI console page
>> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
>> > >>>>> redirects the browser to an IdP web endpoint exposed by the
>> > >>>>> AuthenticationServer, passing the requested url as the
>> > >>>>> redirect_url
>> > >>>>> 3. the IdP web endpoint presents the user with a FORM over https
>> > >>>>>   a. the user provides username/password and submits the FORM
>> > >>>>> 4. the AuthenticationServer authenticates the user with the
>> > >>>>> provided credentials against the configured LDAP server and:
>> > >>>>>   a. leverages a servlet filter or other authentication mechanism
>> > >>>>> for the endpoint and authenticates the user with a simple LDAP
>> > >>>>> bind with username and password
>> > >>>>>   b. acquires a hadoop id_token and uses it to acquire the
>> > >>>>> required hadoop access token, which is added as a cookie
>> > >>>>>   c. redirects the browser to the original service UI resource via
>> > >>>>> the provided redirect_url
>> > >>>>> 5. the WebSSOAuthenticationHandler for the original UI resource
>> > >>>>> interrogates the incoming request again for an authcookie that
>> > >>>>> contains an access token and, upon finding one:
>> > >>>>>   a. validates the incoming token
>> > >>>>>   b. returns the AuthenticationToken as per the
>> > >>>>> AuthenticationHandler contract
>> > >>>>>   c. the AuthenticationFilter adds the hadoop auth cookie with the
>> > >>>>> expected token
>> > >>>>>   d. serves the requested resource for valid tokens
>> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
>> > >>>>> recognition of the hadoop auth cookie
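The two branches the WebSSOAuthenticationHandler takes above - serve when a valid authcookie is present, otherwise redirect to the IdP carrying the original URL - reduce to a few lines. The endpoint URL and cookie name are assumptions, and token validation is stubbed with a set lookup where real code would verify signature and expiry:

```python
from urllib.parse import urlencode

IDP_LOGIN_URL = "https://authserver.example.com/idp/login"  # hypothetical IdP FORM endpoint
VALID_TOKENS = {"tok-abc"}  # stand-in for real signature/expiry validation

def handle(requested_url: str, cookies: dict):
    """WebSSOAuthenticationHandler stand-in: serve when the auth cookie
    carries a valid access token, else redirect to the IdP with the
    requested url as redirect_url (steps 2 and 5 above)."""
    token = cookies.get("hadoop-auth")
    if token in VALID_TOKENS:
        return ("serve", requested_url)
    return ("redirect",
            IDP_LOGIN_URL + "?" + urlencode({"redirect_url": requested_url}))
```

A request with no cookie (or an invalid one) falls through to the redirect branch, which is what bootstraps the FORM login in step 3.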
>> > >>>>>
>> > >>>>> USECASE UI-2 Federation/SAML:
>> > >>>>> For the federation usecase:
>> > >>>>> 1. the user's browser requests access to a UI console page
>> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
>> > >>>>> redirects the browser to an SP web endpoint exposed by the
>> > >>>>> AuthenticationServer, passing the requested url as the
>> > >>>>> redirect_url. This endpoint:
>> > >>>>>   a. is dedicated to redirecting to the external IdP, passing the
>> > >>>>> required parameters, which may include a redirect_url back to
>> > >>>>> itself as well as an encoding of the original redirect_url so that
>> > >>>>> it can determine it on the way back to the client
>> > >>>>> 3. the IdP:
>> > >>>>>   a. challenges the user for credentials and authenticates the
>> > >>>>> user
>> > >>>>>   b. creates the appropriate token/cookie and redirects back to
>> > >>>>> the AuthenticationServer endpoint
>> > >>>>> 4. the AuthenticationServer endpoint:
>> > >>>>>   a. extracts the expected token/cookie from the incoming request
>> > >>>>> and validates it
>> > >>>>>   b. creates a hadoop id_token
>> > >>>>>   c. acquires a hadoop access token for the id_token
>> > >>>>>   d. creates the appropriate cookie and redirects back to the
>> > >>>>> original redirect_url - being the requested resource
>> > >>>>> 5. the WebSSOAuthenticationHandler for the original UI resource
>> > >>>>> interrogates the incoming request again for an authcookie that
>> > >>>>> contains an access token and, upon finding one:
>> > >>>>>   a. validates the incoming token
>> > >>>>>   b. returns the AuthenticationToken as per the
>> > >>>>> AuthenticationHandler contract
>> > >>>>>   c. the AuthenticationFilter adds the hadoop auth cookie with the
>> > >>>>> expected token
>> > >>>>>   d. serves the requested resource for valid tokens
>> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
>> > >>>>> recognition of the hadoop auth cookie
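Step 2a's double encoding of the original redirect_url - once for the hop back to the SP endpoint, and once inside it for the final hop to the resource - can be sketched with standard URL encoding; both endpoint URLs here are hypothetical:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

SP_ENDPOINT = "https://authserver.example.com/sp"  # hypothetical AuthenticationServer SP endpoint
IDP_SSO_URL = "https://idp.example.org/sso"        # hypothetical external IdP

def to_idp(original_url: str) -> str:
    """Step 2a: redirect to the external IdP, asking it to come back to the
    SP endpoint while carrying the original URL so it survives the round
    trip."""
    return_to = SP_ENDPOINT + "?" + urlencode({"redirect_url": original_url})
    return IDP_SSO_URL + "?" + urlencode({"redirect_url": return_to})

def original_url_from(request_url: str) -> str:
    """Steps 4a-4d: recover the nested redirect_url from an incoming
    request so the browser can be sent on to the requested resource."""
    return parse_qs(urlsplit(request_url).query)["redirect_url"][0]
```

Unwrapping `original_url_from` twice - once at the SP endpoint, once on the recovered inner URL - yields the originally requested resource, which is what step 4d redirects to.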
>> > >>>>>
>> > >>>>> REQUIRED COMPONENTS for UI USECASES:
>> > >>>>> COMP-12. WebSSOAuthenticationHandler
>> > >>>>> COMP-13. IdP web endpoint within the AuthenticationServer for FORM
>> > >>>>> based login
>> > >>>>> COMP-14. SP web endpoint within the AuthenticationServer for 3rd
>> > >>>>> party token federation
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan
>> > >>>>> <Brian.Swan@microsoft.com> wrote:
>> > >>>>> Thanks, Larry. That is what I was trying to say, but you've said
>> > >>>>> it better and in more detail. :-) To extract from what you are
>> > >>>>> saying: "If we were to reframe the immediate scope to the lowest
>> > >>>>> common denominator of what is needed for accepting tokens in
>> > >>>>> authentication plugins then we gain... an end-state for the lowest
>> > >>>>> common denominator that enables code patches in the near-term is
>> > >>>>> the best of both worlds."
>> > >>>>>
>> > >>>>> -Brian
>> > >>>>>
>> > >>>>> -----Original Message-----
>> > >>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>> > >>>>> Sent: Wednesday, July 10, 2013 10:40 AM
>> > >>>>> To: common-dev@hadoop.apache.org
>> > >>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
>> > >>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>> > >>>>>
>> > >>>>> It seems to me that we can have the best of both worlds here...
>> > >>>>> it's all about the scoping.
>> > >>>>>
>> > >>>>> If we were to reframe the immediate scope to the lowest common
>> > >>>>> denominator of what is needed for accepting tokens in
>> > >>>>> authentication plugins then we gain:
>> > >>>>>
>> > >>>>> 1. a very manageable scope to define and agree upon
>> > >>>>> 2. a deliverable that should be useful in and of itself
>> > >>>>> 3. a foundation for community collaboration that we build on for
>> > >>>>> higher level solutions built on this lowest common denominator and
>> > >>>>> experience as a working community
>> > >>>>>
>> > >>>>> So, to Alejandro's point, perhaps we need to define what would
>> > >>>>> make #2 above true - this could serve as the "what" we are
>> > >>>>> building instead of the "how" to build it.
>> > >>>>> Including:
>> > >>>>> a. project structure within hadoop-common-project/common-security
>> > >>>>> or the like
>> > >>>>> b. the usecases that would need to be enabled to make it a self
>> > >>>>> contained and useful contribution - without higher level solutions
>> > >>>>> c. the JIRA/s for contributing patches
>> > >>>>> d. what specific patches will be needed to accomplish the usecases
>> > >>>>> in #b
>> > >>>>>
>> > >>>>> In other words, an end-state for the lowest common denominator
>> > >>>>> that enables code patches in the near-term is the best of both
>> > >>>>> worlds.
>> > >>>>>
>> > >>>>> I think this may be a good way to bootstrap the collaboration
>> > >>>>> process for our emerging security community rather than trying to
>> > >>>>> tackle a huge vision all at once.
>> > >>>>>
>> > >>>>> @Alejandro - if you have something else in mind that would
>> > >>>>> bootstrap this process - that would be great - please advise.
>> > >>>>>
>> > >>>>> thoughts?
>> > >>>>>
>> > >>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
>> > >>>>> wrote:
>> > >>>>>
>> > >>>>>> Hi Alejandro, all-
>> > >>>>>>
>> > >>>>>> There seems to be agreement on the broad stroke description of
>> > >>>>>> the components needed to achieve pluggable token authentication
>> > >>>>>> (I'm sure I'll be corrected if that isn't the case). However,
>> > >>>>>> discussion of the details of those components doesn't seem to be
>> > >>>>>> moving forward. I think this is because the details are really
>> > >>>>>> best understood through code. I also see *a* (i.e. one of many
>> > >>>>>> possible) token format and pluggable authentication mechanisms
>> > >>>>>> within the RPC layer as components that can have immediate
>> > >>>>>> benefit to Hadoop users AND still allow flexibility in the larger
>> > >>>>>> design. So, I think the best way to move the conversation of
>> > >>>>>> "what we are aiming for" forward is to start looking at code for
>> > >>>>>> these components. I am especially interested in moving forward
>> > >>>>>> with pluggable authentication mechanisms within the RPC layer and
>> > >>>>>> would love to see what others have done in this area (if
>> > >>>>>> anything).
>> > >>>>>>
>> > >>>>>> Thanks.
>> > >>>>>>
>> > >>>>>> -Brian
>> > >>>>>>
>> > >>>>>> -----Original Message-----
>> > >>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
>> > >>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
>> > >>>>>> To: Larry McCay
>> > >>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai 
>> > >>>>>> Zheng
>> > >>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>> > >>>>>>
>> > >>>>>> Larry, all,
>> > >>>>>>
>> > >>>>>> It is still not clear to me what end state we are aiming for, or
>> > >>>>>> that we even agree on that.
>> > >>>>>>
>> > >>>>>> IMO, instead of trying to agree on what to do, we should first
>> > >>>>>> agree on the final state, then we see what should be changed to
>> > >>>>>> get there, then we see how we change things to get there.
>> > >>>>>>
>> > >>>>>> The different documents out there focus more on how.
>> > >>>>>>
>> > >>>>>> We should not try to say how before we know what.
>> > >>>>>>
>> > >>>>>> Thx.
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay
>> > >>>>>> <lmccay@hortonworks.com> wrote:
>> > >>>>>>
>> > >>>>>>> All -
>> > >>>>>>>
>> > >>>>>>> After combing through this thread - as well as the summit
>> > >>>>>>> session summary thread - I think that we have the following two
>> > >>>>>>> items that we can probably move forward with:
>> > >>>>>>>
>> > >>>>>>> 1. TokenAuth method - assuming this means the pluggable
>> > >>>>>>> authentication mechanisms within the RPC layer (2 votes: Kai and
>> > >>>>>>> Kyle)
>> > >>>>>>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>> > >>>>>>>
>> > >>>>>>> I propose that we attack both of these aspects as one. Let's
>> > >>>>>>> provide the structure and interfaces of the pluggable framework
>> > >>>>>>> for use in the RPC layer by leveraging Daryn's pluggability work
>> > >>>>>>> and POC it with a particular token format (not necessarily the
>> > >>>>>>> only format ever supported - we just need one to start). If
>> > >>>>>>> there has already been work done in this area by anyone then
>> > >>>>>>> please speak up and commit to providing a patch - so that we
>> > >>>>>>> don't duplicate effort.
>> > >>>>>>>
>> > >>>>>>> @Daryn - is there a particular Jira or set of Jiras that we can
>> > >>>>>>> look at to discern the pluggability mechanism details?
>> > >>>>>>> Documentation of it would be great as well.
>> > >>>>>>> @Kai - do you have existing code for the pluggable token
>> > >>>>>>> authentication mechanism - if not, we can take a stab at
>> > >>>>>>> representing it with interfaces and/or POC code.
>> > >>>>>>> I can stand up and say that we have a token format that we have
>> > >>>>>>> been working with already and can provide a patch that
>> > >>>>>>> represents it as a contribution to test out the pluggable
>> > >>>>>>> TokenAuth.
>> > >>>>>>>
>> > >>>>>>> These patches will provide progress toward code being the
>> > >>>>>>> central discussion vehicle. As a community, we can then
>> > >>>>>>> incrementally build on that foundation in order to
>> > >>>>>>> collaboratively deliver the common vision.
>> > >>>>>>>
>> > >>>>>>> In the absence of any other home for posting such patches,
>> > >>>>>>> let's assume that they will be attached to HADOOP-9392 - or a
>> > >>>>>>> dedicated subtask for this particular aspect/s - I will leave
>> > >>>>>>> that detail to Kai.
>> > >>>>>>>
>> > >>>>>>> @Alejandro, being the only voice on this thread that isn't
>> > >>>>>>> represented in the votes above, please feel free to agree or
>> > >>>>>>> disagree with this direction.
>> > >>>>>>>
>> > >>>>>>> thanks,
>> > >>>>>>>
>> > >>>>>>> --larry
>> > >>>>>>>
>> > >>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
>> > >>>>>>> wrote:
>> > >>>>>>>
>> > >>>>>>>> Hi Andy -
>> > >>>>>>>>
>> > >>>>>>>>> Happy Fourth of July to you and yours.
>> > >>>>>>>>
>> > >>>>>>>> Same to you and yours. :-)
>> > >>>>>>>> We had some fun in the sun for a change - we've had nothing but
>> > >>>>>>>> rain on the east coast lately.
>> > >>>>>>>>
>> > >>>>>>>>> My concern here is there may have been a misinterpretation or
>> > >>>>>>>>> lack of consensus on what is meant by "clean slate"
>> > >>>>>>>>
>> > >>>>>>>> Apparently so.
>> > >>>>>>>> On the pre-summit call, I stated that I was interested in
>> > >>>>>>>> reconciling the jiras so that we had one to work from.
>> > >>>>>>>>
>> > >>>>>>>> You recommended that we set them aside for the time being -
>> > >>>>>>>> with the understanding that work would continue on your side
>> > >>>>>>>> (and ours as well) - and approach the community discussion from
>> > >>>>>>>> a clean slate.
>> > >>>>>>>> We seemed to do this at the summit session quite well.
>> > >>>>>>>> It was my understanding that this community discussion would
>> > >>>>>>>> live beyond the summit and continue on this list.
>> > >>>>>>>>
>> > >>>>>>>> While closing the summit session we agreed to follow up on
>> > >>>>>>>> common-dev with first a summary then a discussion of the moving
>> > >>>>>>>> parts.
>> > >>>>>>>>
>> > >>>>>>>> I never expected the previous work to be abandoned and fully
>> > >>>>>>>> expected it to inform the discussion that happened here.
>> > >>>>>>>>
>> > >>>>>>>> If you would like to reframe what clean slate was supposed to
>> > >>>>>>>> mean or describe what it means now - that would be welcome -
>> > >>>>>>>> before I waste any more time trying to facilitate a community
>> > >>>>>>>> discussion that is apparently not wanted.
>> > >>>>>>>>
>> > >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and
>> > >>>>>>>>> such, which have been disappointing to see crop up, we should
>> > >>>>>>>>> be collaboratively coding not planting flags.
>> > >>>>>>>>
>> > >>>>>>>> I don't know what you mean by self-appointed master JIRAs.
>> > >>>>>>>> It has certainly not been anyone's intention to disappoint.
>> > >>>>>>>> Any mention of a new JIRA was just to have a clear context to
>> > >>>>>>>> gather the agreed upon points - previous and/or existing JIRAs
>> > >>>>>>>> would easily be linked.
>> > >>>>>>>>
>> > >>>>>>>> Planting flags... I need to go back and read my discussion
>> > >>>>>>>> point about the JIRA and see how this is the impression that
>> > >>>>>>>> was made.
>> > >>>>>>>> That is not how I define success. The only flags that count are
>> > >>>>>>>> code. What we are lacking is the roadmap on which to put the
>> > >>>>>>>> code.
>> > >>>>>>>>
>> > >>>>>>>>> I read Kai's latest document as something approaching today's
>> > >>>>>>>>> consensus (or at least a common point of view?) rather than a
>> > >>>>>>>>> historical document. Perhaps he and it can be given equal
>> > >>>>>>>>> share of the consideration.
>> > >>>>>>>>
>> > >>>>>>>> I definitely read it as something that has evolved into
>> > >>>>>>>> something approaching what we have been talking about so far.
>> > >>>>>>>> There has not, however, been enough discussion anywhere near
>> > >>>>>>>> the level of detail in that document, and more details are
>> > >>>>>>>> needed for each component in the design.
>> > >>>>>>>> Why the work in that document should not be fed into the
>> > >>>>>>>> community discussion as anyone else's would be - I fail to
>> > >>>>>>>> understand.
>> > >>>>>>>>
>> > >>>>>>>> My suggestion continues to be that you should take that
>> > >>>>>>>> document and speak to the inventory of moving parts as we
>> > >>>>>>>> agreed.
>> > >>>>>>>> As these are agreed upon, we will ensure that the appropriate
>> > >>>>>>>> subtasks are filed against whatever JIRA is to host them -
>> > >>>>>>>> don't really care much which it is.
>> > >>>>>>>>
>> > >>>>>>>> I don't really want to continue with two separate JIRAs - as I
>> > >>>>>>>> stated long ago - but until we understand what the pieces are
>> > >>>>>>>> and how they relate then they can't be consolidated.
>> > >>>>>>>> Even if 9533 ended up being repurposed as the server instance
>> > >>>>>>>> of the work - it should be a subtask of a larger one - if that
>> > >>>>>>>> is to be 9392, so be it.
>> > >>>>>>>> We still need to define all the pieces of the larger picture
>> > >>>>>>>> before that can be done.
>> > >>>>>>>>
>> > >>>>>>>> What I thought was the clean slate approach to the discussion
>> > >>>>>>>> seemed a very reasonable way to make all this happen.
>> > >>>>>>>> If you would like to restate what you intended by it, or
>> > >>>>>>>> something else equally as reasonable as a way to move forward,
>> > >>>>>>>> that would be awesome.
>> > >>>>>>>>
>> > >>>>>>>> I will be happy to work toward the roadmap with everyone once
>> > >>>>>>>> it is articulated, understood and actionable.
>> > >>>>>>>> In the meantime, I have work to do.
>> > >>>>>>>>
>> > >>>>>>>> thanks,
>> > >>>>>>>>
>> > >>>>>>>> --larry
>> > >>>>>>>>
>> > >>>>>>>> BTW - I meant to quote you in an earlier response and ended up
>> > >>>>>>>> saying it was Aaron instead. Not sure what happened there. :-)
>> > >>>>>>>>
>> > >>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell
>> > >>>>>>>> <apurtell@apache.org> wrote:
>> > >>>>>>>>
>> > >>>>>>>>> Hi Larry (and all),
>> > >>>>>>>>>
>> > >>>>>>>>> Happy Fourth of July to you and yours.
>> > >>>>>>>>>
>> > >>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so
>> > >>>>>>>>> I'd defer to them on the detailed points.
>> > >>>>>>>>>
>> > >>>>>>>>> My concern here is there may have been a misinterpretation or
>> > >>>>>>>>> lack of consensus on what is meant by "clean slate". Hopefully
>> > >>>>>>>>> that can be quickly cleared up. Certainly we did not mean
>> > >>>>>>>>> ignore all that came before. The idea was to reset discussions
>> > >>>>>>>>> to find common ground and new direction where we are working
>> > >>>>>>>>> together, not in conflict, on an agreed upon set of design
>> > >>>>>>>>> points and tasks. There's been a lot of good discussion and
>> > >>>>>>>>> design preceding that we should figure out how to port over.
>> > >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and
>> > >>>>>>>>> such, which have been disappointing to see crop up; we should
>> > >>>>>>>>> be collaboratively coding, not planting flags.
>> > >>>>>>>>>
>> > >>>>>>>>> I read Kai's latest document as something approaching today's
>> > >>>>>>>>> consensus (or at least a common point of view?) rather than a
>> > >>>>>>>>> historical document. Perhaps he and it can be given equal
>> > >>>>>>>>> share of the consideration.
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>> > >>>>>>>>>
>> > >>>>>>>>>> Hey Andrew -
>> > >>>>>>>>>>
>> > >>>>>>>>>> I largely agree with that statement.
>> > >>>>>>>>>> My intention was to let the differences be worked out within
>> > >>>>>>>>>> the individual components once they were identified and
>> > >>>>>>>>>> subtasks created.
>> > >>>>>>>>>>
>> > >>>>>>>>>> My reference to HSSO was really referring to an SSO *server*
>> > >>>>>>>>>> based design which was not clearly articulated in the earlier
>> > >>>>>>>>>> documents.
>> > >>>>>>>>>> We aren't trying to compare and contrast one design over
>> > >>>>>>>>>> another anymore.
>> > >>>>>>>>>>
>> > >>>>>>>>>> Let's move this collaboration along as we've mapped out, and
>> > >>>>>>>>>> the differences in the details will reveal themselves and be
>> > >>>>>>>>>> addressed within their components.
>> > >>>>>>>>>>
>> > >>>>>>>>>> I've actually been looking forward to you weighing in on the
>> > >>>>>>>>>> actual discussion points in this thread.
>> > >>>>>>>>>> Could you do that?
>> > >>>>>>>>>>
>> > >>>>>>>>>> At this point, I am most interested in your thoughts on a
>> > >>>>>>>>>> single jira to represent all of this work and whether we
>> > >>>>>>>>>> should start discussing the SSO Tokens.
>> > >>>>>>>>>> If you think there are discussion points missing from that
>> > >>>>>>>>>> list, feel free to add to it.
>> > >>>>>>>>>>
>> > >>>>>>>>>> thanks,
>> > >>>>>>>>>>
>> > >>>>>>>>>> --larry
>> > >>>>>>>>>>
>> > >>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell
>> > >>>>>>>>>> <apurtell@apache.org> wrote:
>> > >>>>>>>>>>
>> > >>>>>>>>>>> Hi Larry,
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me
>> > >>>>>>>>>>> point out that, while the differences between the competing
>> > >>>>>>>>>>> JIRAs have been reduced for sure, there were some key
>> > >>>>>>>>>>> differences that didn't just disappear. Subsequent
>> > >>>>>>>>>>> discussion will make that clear. I also disagree with your
>> > >>>>>>>>>>> characterization that we have simply endorsed all of the
>> > >>>>>>>>>>> design decisions of the so-called HSSO; this is taking a
>> > >>>>>>>>>>> mile from an inch. We are here to engage in a collaborative
>> > >>>>>>>>>>> process as peers. I've been encouraged by the spirit of the
>> > >>>>>>>>>>> discussions up to this point and hope that can continue
>> > >>>>>>>>>>> beyond one design summit.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
>> > >>>>>>>>>>> <lm...@hortonworks.com> wrote:
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>> Hi Kai -
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> I think that I need to clarify something...
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> This is not an update for 9533 but a continuation of the
>> > >>>>>>>>>>>> discussions that are focused on a fresh look at an SSO for
>> > >>>>>>>>>>>> Hadoop.
>> > >>>>>>>>>>>> We've agreed to leave our previous designs behind and
>> > >>>>>>>>>>>> therefore we aren't really seeing it as an HSSO layered on
>> > >>>>>>>>>>>> top of TAS approach or an HSSO vs TAS discussion.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> Your latest design revision actually makes it clear that
>> > >>>>>>>>>>>> you are now targeting exactly what was described as HSSO -
>> > >>>>>>>>>>>> so comparing and contrasting is not going to add any value.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> What we need you to do at this point is to look at those
>> > >>>>>>>>>>>> high-level components described on this thread and comment
>> > >>>>>>>>>>>> on whether we need additional components or any that are
>> > >>>>>>>>>>>> listed that don't seem necessary to you, and why.
>> > >>>>>>>>>>>> In other words, we need to define and agree on the work
>> > >>>>>>>>>>>> that has to be done.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> We also need to determine those components that need to be
>> > >>>>>>>>>>>> done before anything else can be started.
>> > >>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
>> > >>>>>>>>>>>> central to all the other components and should probably be
>> > >>>>>>>>>>>> defined and POC'd in short order.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> Personally, I think that continuing the separation of 9533
>> > >>>>>>>>>>>> and 9392 will do this effort a disservice. There don't
>> > >>>>>>>>>>>> seem to be enough differences between the two to justify
>> > >>>>>>>>>>>> separate jiras anymore. It may be best to file a new one
>> > >>>>>>>>>>>> that reflects a single vision without the extra cruft that
>> > >>>>>>>>>>>> has built up in either of the existing ones. We would
>> > >>>>>>>>>>>> certainly reference the existing ones within the new one.
>> > >>>>>>>>>>>> This approach would align with the spirit of the
>> > >>>>>>>>>>>> discussions up to this point.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> I am prepared to start a discussion around the shape of
>> > >>>>>>>>>>>> the two Hadoop SSO tokens: identity and access - if this is
>> > >>>>>>>>>>>> what others feel the next topic should be.
>> > >>>>>>>>>>>> If we can identify a jira home for it, we can do it there -
>> > >>>>>>>>>>>> otherwise we can create another DISCUSS thread for it.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> thanks,
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> --larry
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
>> > >> kai.zheng@intel.com>
>> > >>>>>>> wrote:
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>> Hi Larry,
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> Thanks for the update. Good to see that with this 
>> > >>>>>>>>>>>>> update we
>> > >>> are
>> > >>>>>>>>>>>>> now
>> > >>>>>>>>>>>> aligned on most points.
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392.
>> The
>> > >>>>>>>>>>>>> new
>> > >>>>>>>>>>>> revision incorporates feedback and suggestions in 
>> > >>>>>>>>>>>> related discussion
>> > >>>>>>>>>> with
>> > >>>>>>>>>>>> the community, particularly from Microsoft and others
>> > >> attending
>> > >>>>>>>>>>>> the Security design lounge session at the Hadoop summit.
>> > >>> Summary
>> > >>>>>>>>>>>> of the
>> > >>>>>>>>>> changes:
>> > >>>>>>>>>>>>> 1.    Revised the approach to now use two tokens, Identity
>> > >>> Token
>> > >>>>>>> plus
>> > >>>>>>>>>>>> Access Token, particularly considering our 
>> > >>>>>>>>>>>> authorization framework
>> > >>>>>>> and
>> > >>>>>>>>>>>> compatibility with HSSO;
>> > >>>>>>>>>>>>> 2.    Introduced Authorization Server (AS) from our
>> > >>>> authorization
>> > >>>>>>>>>>>> framework into the flow that issues access tokens for
>> clients
>> > >>>>>>>>>>>> with
>> > >>>>>>>>>> identity
>> > >>>>>>>>>>>> tokens to access services;
>> > >>>>>>>>>>>>> 3.    Refined proxy access token and the
>> proxy/impersonation
>> > >>>> flow;
>> > >>>>>>>>>>>>> 4.    Refined the browser web SSO flow regarding access to
>> > >>>> Hadoop
>> > >>>>>>> web
>> > >>>>>>>>>>>> services;
>> > >>>>>>>>>>>>> 5.    Added Hadoop RPC access flow regard
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>> --
>> > >>>>>>>>> Best regards,
>> > >>>>>>>>>
>> > >>>>>>>>> - Andy
>> > >>>>>>>>>
>> > >>>>>>>>> Problems worthy of attack prove their worth by hitting 
>> > >>>>>>>>> back. -
>> > >>> Piet
>> > >>>>>>>>> Hein (via Tom White)
>> > >>>>>>>>
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> --
>> > >>>>>> Alejandro
>> > >>>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
>> > >>>>
>> > >>>>
>> > >>>
>> > >>>
>> > >>> --
>> > >>> Alejandro
>> > >>>
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Alejandro
>> >
>> >
>>
>

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Very good.
Thank you, Chris!


On Tue, Sep 3, 2013 at 6:44 PM, Chris Douglas <cd...@apache.org> wrote:

> On Tue, Sep 3, 2013 at 5:20 AM, Larry McCay <lm...@hortonworks.com>
> wrote:
> > One outstanding question for me - how do we go about getting the branches
> > created?
>
> Once a group has converged on a purpose- ideally with some initial
> code from JIRA- please go ahead and create the feature branch in svn.
> There's no ceremony. -C
>
> > On Tue, Aug 6, 2013 at 6:22 PM, Chris Nauroth <cnauroth@hortonworks.com
> >wrote:
> >
> >> Near the bottom of the bylaws, it states that addition of a "New Branch
> >> Committer" requires "Lazy consensus of active PMC members."  I think
> this
> >> means that you'll need to get a PMC member to sponsor the vote for you.
> >>  Regular committer votes happen on the private PMC mailing list, and I
> >> assume it would be the same for a branch committer vote.
> >>
> >> http://hadoop.apache.org/bylaws.html
> >>
> >> Chris Nauroth
> >> Hortonworks
> >> http://hortonworks.com/
> >>
> >>
> >>
> >> On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay <lm...@hortonworks.com>
> >> wrote:
> >>
> >> > That sounds perfect!
> >> > I have been thinking of late that we would maybe need an incubator
> >> project
> >> > or something for this - which would be unfortunate.
> >> >
> >> > This would allow us to move much more quickly with a set of patches
> >> broken
> >> > up into consumable/understandable chunks that are made functional more
> >> > easily within the branch.
> >> > I assume that we need to start a separate thread for DISCUSS or VOTE
> to
> >> > start that process - correct?
> >> >
> >> > On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com>
> >> wrote:
> >> >
> >> > > yep, that is what I meant. Thanks Chris
> >> > >
> >> > >
> >> > > On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <
> >> cnauroth@hortonworks.com
> >> > >wrote:
> >> > >
> >> > >> Perhaps this is also a good opportunity to try out the new "branch
> >> > >> committers" clause in the bylaws, enabling non-committers who are
> >> > working
> >> > >> on this to commit to the feature branch.
> >> > >>
> >> > >>
> >> > >>
> >> >
> >>
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
> >> > >>
> >> > >> Chris Nauroth
> >> > >> Hortonworks
> >> > >> http://hortonworks.com/
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur <
> tucu@cloudera.com
> >> > >>> wrote:
> >> > >>
> >> > >>> Larry,
> >> > >>>
> >> > >>> Sorry for the delay answering. Thanks for laying down things,
> yes, it
> >> > >> makes
> >> > >>> sense.
> >> > >>>
> >> > >>> Given the large scope of the changes, number of JIRAs and number
> of
> >> > >>> developers involved, wouldn't it make sense to create a feature
> branch
> >> for
> >> > >> all
> >> > >>> this work so as not to destabilize (more ;) trunk?
> >> > >>>
> >> > >>> Thanks again.
> >> > >>>
> >> > >>>
> >> > >>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <
> lmccay@hortonworks.com
> >> >
> >> > >>> wrote:
> >> > >>>
> >> > >>>> The following JIRA was filed to provide a token and basic
> authority
> >> > >>>> implementation for this effort:
> >> > >>>> https://issues.apache.org/jira/browse/HADOOP-9781
> >> > >>>>
> >> > >>>> I have attached an initial patch though have yet to submit it as
> one
> >> > >>> since
> >> > >>>> it is dependent on the patch for CMF that was posted to:
> >> > >>>> https://issues.apache.org/jira/browse/HADOOP-9534
> >> > >>>> and this patch still has a couple outstanding issues - javac
> >> warnings
> >> > >> for
> >> > >>>> com.sun classes for certificate generation and 11 javadoc
> >> warnings.
> >> > >>>>
> >> > >>>> Please feel free to review the patches and raise any questions or
> >> > >>> concerns
> >> > >>>> related to them.
> >> > >>>>
> >> > >>>> On Jul 26, 2013, at 8:59 PM, Larry McCay <lmccay@hortonworks.com
> >
> >> > >> wrote:
> >> > >>>>
> >> > >>>>> Hello All -
> >> > >>>>>
> >> > >>>>> In an effort to scope an initial iteration that provides value
> to
> >> the
> >> > >>>> community while focusing on the pluggable authentication aspects,
> >> I've
> >> > >>>> written a description for "Iteration 1". It identifies the goal
> of
> >> the
> >> > >>>> iteration, the endstate and a set of initial usecases. It also
> >> > >> enumerates
> >> > >>>> the components that are required for each usecase. There is a
> scope
> >> > >>> section
> >> > >>>> that details specific things that should be kept out of the first
> >> > >>>> iteration. This is certainly up for discussion. There may be
> some of
> >> > >>> these
> >> > >>>> things that can be contributed in short order. If we can add some
> >> > >> things
> >> > >>> in
> >> > >>>> without unnecessary complexity for the identified usecases then
> we
> >> > >>> should.
> >> > >>>>>
> >> > >>>>> @Alejandro - please review this and see whether it satisfies
> your
> >> > >> point
> >> > >>>> for a definition of what we are building.
> >> > >>>>>
> >> > >>>>> In addition to the document that I will paste here as text and
> >> > >> attach a
> >> > >>>> pdf version, we have a couple patches for components that are
> >> > >> identified
> >> > >>> in
> >> > >>>> the document.
> >> > >>>>> Specifically, COMP-7 and COMP-8.
> >> > >>>>>
> >> > >>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was
> >> > >> filed
> >> > >>>> specifically for that functionality.
> >> > >>>>> COMP-7 is a small set of classes to introduce JsonWebToken as
> the
> >> > >> token
> >> > >>>> format and a basic JsonWebTokenAuthority that can issue and
> verify
> >> > >> these
> >> > >>>> tokens.
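
[Editor's illustration] The COMP-7 shape - a JsonWebToken format plus an authority that can issue and verify tokens - can be sketched in a few lines. This is a minimal Python sketch of the concept only (the actual patch is Java, and the class and claim names here are illustrative, not the proposed Hadoop API); it signs a base64url-encoded header/claims pair with HMAC-SHA256 and checks signature and expiry on verify:

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """base64url without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


class JsonWebTokenAuthority:
    """Illustrative authority that issues and verifies HS256-signed tokens."""

    def __init__(self, secret: bytes):
        self.secret = secret

    def _sign(self, signing_input: bytes) -> str:
        return b64url(hmac.new(self.secret, signing_input, hashlib.sha256).digest())

    def issue(self, subject: str, audience: str, ttl_secs: int = 3600) -> str:
        header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
        claims = b64url(json.dumps({"sub": subject, "aud": audience,
                                    "exp": int(time.time()) + ttl_secs}).encode())
        signing_input = f"{header}.{claims}".encode()
        return f"{header}.{claims}.{self._sign(signing_input)}"

    def verify(self, token: str) -> dict:
        header, claims, sig = token.split(".")
        signing_input = f"{header}.{claims}".encode()
        if not hmac.compare_digest(sig, self._sign(signing_input)):
            raise ValueError("bad signature")
        # restore base64 padding before decoding the claims segment
        payload = json.loads(base64.urlsafe_b64decode(claims + "=" * (-len(claims) % 4)))
        if payload["exp"] < time.time():
            raise ValueError("token expired")
        return payload
```

A real implementation would use asymmetric signing backed by the COMP-8 PKI services rather than a shared secret, so that services can verify tokens without being able to mint them.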
> >> > >>>>>
> >> > >>>>> Since there is no JIRA for this yet, I will likely file a new
> JIRA
> >> > >> for
> >> > >>> a
> >> > >>>> SSO token implementation.
> >> > >>>>>
> >> > >>>>> Both of these patches assume to be modules within
> >> > >>>> hadoop-common/hadoop-common-project.
> >> > >>>>> While they are relatively small, I think that they will be
> pulled
> >> in
> >> > >> by
> >> > >>>> other modules such as hadoop-auth which would likely not want a
> >> > >>> dependency
> >> > >>>> on something larger like
> >> > >>> hadoop-common/hadoop-common-project/hadoop-common.
> >> > >>>>>
> >> > >>>>> This is certainly something that we should discuss within the
> >> > >> community
> >> > >>>> for this effort though - that being, exactly how to add these
> >> > libraries
> >> > >>> so
> >> > >>>> that they are most easily consumed by existing projects.
> >> > >>>>>
> >> > >>>>> Anyway, the following is the Iteration-1 document - it is also
> >> > >> attached
> >> > >>>> as a pdf:
> >> > >>>>>
> >> > >>>>> Iteration 1: Pluggable User Authentication and Federation
> >> > >>>>>
> >> > >>>>> Introduction
> >> > >>>>> The intent of this effort is to bootstrap the development of
> >> > >> pluggable
> >> > >>>> token-based authentication mechanisms to support certain goals of
> >> > >>>> enterprise authentication integrations. By restricting the scope
> of
> >> > >> this
> >> > >>>> effort, we hope to provide immediate benefit to the community
> while
> >> > >>> keeping
> >> > >>>> the initial contribution to a manageable size that can be easily
> >> > >>> reviewed,
> >> > >>>> understood and extended with further development through follow
> up
> >> > >> JIRAs
> >> > >>>> and related iterations.
> >> > >>>>>
> >> > >>>>> Iteration Endstate
> >> > >>>>> Once complete, this effort will have extended the authentication
> >> > >>>> mechanisms - for all client types - from the existing: Simple,
> >> > Kerberos
> >> > >>> and
> >> > >>>> Plain (for RPC) to include LDAP authentication and SAML based
> >> > >> federation.
> >> > >>>> In addition, the ability to provide additional/custom
> authentication
> >> > >>>> mechanisms will be enabled for users to plug in their preferred
> >> > >>> mechanisms.
> >> > >>>>>
> >> > >>>>> Project Scope
> >> > >>>>> The scope of this effort is a subset of the features covered by
> the
> >> > >>>> overviews of HADOOP-9392 and HADOOP-9533. This effort
> concentrates
> >> on
> >> > >>>> enabling Hadoop to issue, accept/validate SSO tokens of its own.
> The
> >> > >>>> pluggable authentication mechanism within SASL/RPC layer and the
> >> > >>>> authentication filter pluggability for REST and UI components
> will
> >> be
> >> > >>>> leveraged and extended to support the results of this effort.
> >> > >>>>>
> >> > >>>>> Out of Scope
> >> > >>>>> In order to scope the initial deliverable as the minimally
> viable
> >> > >>>> product, a handful of things have been simplified or left out of
> >> scope
> >> > >>> for
> >> > >>>> this effort. This is not meant to say that these aspects are not
> >> > useful
> >> > >>> or
> >> > >>>> not needed but that they are not necessary for this iteration.
> We do
> >> > >>>> however need to ensure that we don’t do anything to preclude
> adding
> >> > >> them
> >> > >>> in
> >> > >>>> future iterations.
> >> > >>>>> 1. Additional Attributes - the result of authentication will
> >> continue
> >> > >>> to
> >> > >>>> use the existing hadoop tokens and identity representations.
> >> > Additional
> >> > >>>> attributes used for finer grained authorization decisions will be
> >> > added
> >> > >>>> through follow-up efforts.
> >> > >>>>> 2. Token revocation - the ability to revoke issued identity
> tokens
> >> > >> will
> >> > >>>> be added later
> >> > >>>>> 3. Multi-factor authentication - this will likely require
> >> additional
> >> > >>>> attributes and is not necessary for this iteration.
> >> > >>>>> 4. Authorization changes - we will require additional attributes
> >> for
> >> > >>> the
> >> > >>>> fine-grained access control plans. This is not needed for this
> >> > >> iteration.
> >> > >>>>> 5. Domains - we assume a single flat domain for all users
> >> > >>>>> 6. Kinit alternative - we can leverage existing REST clients
> such
> >> as
> >> > >>>> cURL to retrieve tokens through authentication and federation for
> >> the
> >> > >>> time
> >> > >>>> being
> >> > >>>>> 7. A specific authentication framework isn’t really necessary
> >> within
> >> > >>> the
> >> > >>>> REST endpoints for this iteration. If one is available then we
> can
> >> use
> >> > >> it
> >> > >>>> otherwise we can leverage existing things like Apache Shiro
> within a
> >> > >>>> servlet filter.
> >> > >>>>>
> >> > >>>>> In Scope
> >> > >>>>> What is in scope for this effort is defined by the usecases
> >> described
> >> > >>>> below. Components required for supporting the usecases are
> >> summarized
> >> > >> for
> >> > >>>> each client type. Each component is a candidate for a JIRA
> subtask -
> >> > >>> though
> >> > >>>> multiple components are likely to be included in a JIRA to
> >> represent a
> >> > >>> set
> >> > >>>> of functionality rather than individual JIRAs per component.
> >> > >>>>>
> >> > >>>>> Terminology and Naming
> >> > >>>>> The terms and names of components within this document are
> merely
> >> > >>>> descriptive of the functionality that they represent. Any
> similarity
> >> > or
> >> > >>>> difference in names or terms from those that are found in other
> >> > >> documents
> >> > >>>> are not intended to make any statement about those other
> documents
> >> or
> >> > >> the
> >> > >>>> descriptions within. This document represents the pluggable
> >> > >>> authentication
> >> > >>>> mechanisms and server functionality required to replace Kerberos.
> >> > >>>>>
> >> > >>>>> Ultimately, the naming of the implementation classes will be a
> >> > >> product
> >> > >>>> of the patches accepted by the community.
> >> > >>>>>
> >> > >>>>> Usecases:
> >> > >>>>> client types: REST, CLI, UI
> >> > >>>>> authentication types: Simple, Kerberos, authentication/LDAP,
> >> > >>>> federation/SAML
> >> > >>>>>
> >> > >>>>> Simple and Kerberos
> >> > >>>>> Simple and Kerberos usecases continue to work as they do today.
> The
> >> > >>>> addition of Authentication/LDAP and Federation/SAML are added
> >> through
> >> > >> the
> >> > >>>> existing pluggability points either as they are or with required
> >> > >>> extension.
> >> > >>>> Either way, continued support for Simple and Kerberos must not
> >> require
> >> > >>>> changes to existing deployments in the field as a result of this
> >> > >> effort.
> >> > >>>>>
> >> > >>>>> REST
> >> > >>>>> USECASE REST-1 Authentication/LDAP:
> >> > >>>>> For REST clients, we will provide the ability to:
> >> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
> >> exposed
> >> > >> by
> >> > >>>> an AuthenticationServer instance via REST calls to:
> >> > >>>>>   a. authenticate - passing username/password returning a hadoop
> >> > >>>> id_token
> >> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
> >> the
> >> > >>>> hadoop id_token as an Authorization: Bearer token along with the
> >> > >> desired
> >> > >>>> service name (master service name) returning a hadoop access
> token
> >> > >>>>> 2. Successfully invoke a hadoop service REST API passing the
> hadoop
> >> > >>>> access token through an HTTP header as an Authorization Bearer
> token
> >> > >>>>>   a. validation of the incoming token on the service endpoint is
> >> > >>>> accomplished by an SSOAuthenticationHandler
> >> > >>>>> 3. Successfully block access to a REST resource when presenting
> a
> >> > >>> hadoop
> >> > >>>> access token intended for a different service
> >> > >>>>>   a. validation of the incoming token on the service endpoint is
> >> > >>>> accomplished by an SSOAuthenticationHandler
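
[Editor's illustration] The three REST-1 steps above can be sketched as an in-memory simulation. All names, endpoints, and token formats here are illustrative only (not the proposed wire protocol); the point is the flow: authenticate yields an id_token, the TokenGrantingService exchanges it for a service-scoped access token, and the SSOAuthenticationHandler rejects tokens scoped to a different service:

```python
import secrets


class AuthenticationServer:
    """Simulated IdP endpoint: username/password -> hadoop id_token."""

    def __init__(self, ldap_users):
        self.ldap_users = ldap_users  # dict standing in for the LDAP bind
        self.id_tokens = {}

    def authenticate(self, username, password):
        if self.ldap_users.get(username) != password:
            raise PermissionError("401: bad credentials")
        token = secrets.token_hex(16)
        self.id_tokens[token] = username
        return token


class TokenGrantingService:
    """Simulated endpoint: id_token + master service name -> access token."""

    def __init__(self, authn_server):
        self.authn = authn_server
        self.access_tokens = {}

    def get_access_token(self, id_token, service):
        user = self.authn.id_tokens.get(id_token)
        if user is None:
            raise PermissionError("401: unknown id_token")
        token = secrets.token_hex(16)
        self.access_tokens[token] = (user, service)
        return token


class SSOAuthenticationHandler:
    """Validates incoming access tokens on a service endpoint."""

    def __init__(self, tgs, service):
        self.tgs, self.service = tgs, service

    def validate(self, access_token):
        entry = self.tgs.access_tokens.get(access_token)
        if entry is None or entry[1] != self.service:
            raise PermissionError("403: token not valid for this service")
        return entry[0]  # the authenticated user
```

In the real design the handler would validate a signed token locally instead of sharing state with the TokenGrantingService; the shared dict is purely a stand-in for signature verification.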
> >> > >>>>>
> >> > >>>>> USECASE REST-2 Federation/SAML:
> >> > >>>>> We will also provide federation capabilities for REST clients
> such
> >> > >>> that:
> >> > >>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?)
> >> and
> >> > >>>> persist in a permissions protected file - ie.
> >> > >> ~/.hadoop_tokens/.idp_token
> >> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
> >> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> >> > instance
> >> > >>> via
> >> > >>>> REST calls to:
> >> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
> >> Bearer
> >> > >>>> token returning a hadoop id_token
> >> > >>>>>      - can copy and paste from commandline or use cat to include
> >> > >>>> persisted token through "--Header Authorization: Bearer 'cat
> >> > >>>> ~/.hadoop_tokens/.id_token'"
> >> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
> >> the
> >> > >>>> hadoop id_token as an Authorization: Bearer token along with the
> >> > >> desired
> >> > >>>> service name (master service name) to the TokenGrantingService
> >> > >> returning
> >> > >>> a
> >> > >>>> hadoop access token
> >> > >>>>> 3. Successfully invoke a hadoop service REST API passing the
> hadoop
> >> > >>>> access token through an HTTP header as an Authorization Bearer
> token
> >> > >>>>>   a. validation of the incoming token on the service endpoint is
> >> > >>>> accomplished by an SSOAuthenticationHandler
> >> > >>>>> 4. Successfully block access to a REST resource when presenting
> a
> >> > >>> hadoop
> >> > >>>> access token intended for a different service
> >> > >>>>>   a. validation of the incoming token on the service endpoint is
> >> > >>>> accomplished by an SSOAuthenticationHandler
> >> > >>>>>
> >> > >>>>> REQUIRED COMPONENTS for REST USECASES:
> >> > >>>>> COMP-1. REST client - cURL or similar
> >> > >>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP
> >> endpoint
> >> > >>>> example - returning hadoop id_token
> >> > >>>>> COMP-3. REST endpoint for federation with SAML Bearer token -
> >> > >>> shibboleth
> >> > >>>> SP?|OpenSAML? - returning hadoop id_token
> >> > >>>>> COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop
> >> access
> >> > >>>> tokens from hadoop id_tokens
> >> > >>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop
> access
> >> > >>>> tokens
> >> > >>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
> >> > >>>>> COMP-7. hadoop token and authority implementations
> >> > >>>>> COMP-8. core services for crypto support for signing, verifying
> and
> >> > >> PKI
> >> > >>>> management
> >> > >>>>>
> >> > >>>>> CLI
> >> > >>>>> USECASE CLI-1 Authentication/LDAP:
> >> > >>>>> For CLI/RPC clients, we will provide the ability to:
> >> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
> >> exposed
> >> > >> by
> >> > >>>> an AuthenticationServer instance via REST calls to:
> >> > >>>>>   a. authenticate - passing username/password returning a hadoop
> >> > >>>> id_token
> >> > >>>>>      - for RPC clients we need to persist the returned hadoop
> >> > >> identity
> >> > >>>> token in a file protected by fs permissions so that it may be
> >> > leveraged
> >> > >>>> until expiry
> >> > >>>>>      - directing the returned response to a file may suffice for
> >> now
> >> > >>>> something like ">~/.hadoop_tokens/.id_token"
> >> > >>>>> 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> >> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL
> layer,
> >> > >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and is
> >> passed
> >> > >> as
> >> > >>>> Authorization: Bearer token to the get-access-token REST endpoint
> >> > >> exposed
> >> > >>>> by TokenGrantingService returning a hadoop access token
> >> > >>>>>   b. RPC server side validates the presented hadoop access token
> >> and
> >> > >>>> continues to serve request
> >> > >>>>>   c. Successfully invoke a hadoop service RPC API
> >> > >>>>>
> >> > >>>>> USECASE CLI-2 Federation/SAML:
> >> > >>>>> For CLI/RPC clients, we will provide the ability to:
> >> > >>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?)
> >> and
> >> > >>>> persist in a permissions protected file - ie.
> >> > >> ~/.hadoop_tokens/.idp_token
> >> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
> >> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> >> > instance
> >> > >>> via
> >> > >>>> REST calls to:
> >> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
> >> Bearer
> >> > >>>> token returning a hadoop id_token
> >> > >>>>>      - can copy and paste from commandline or use cat to include
> >> > >>>> previously persisted token through "--Header Authorization:
> Bearer
> >> > 'cat
> >> > >>>> ~/.hadoop_tokens/.id_token'"
> >> > >>>>> 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> >> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL
> layer,
> >> > >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and is
> >> passed
> >> > >> as
> >> > >>>> Authorization: Bearer token to the get-access-token REST endpoint
> >> > >> exposed
> >> > >>>> by TokenGrantingService returning a hadoop access token
> >> > >>>>>   b. RPC server side validates the presented hadoop access token
> >> and
> >> > >>>> continues to serve request
> >> > >>>>>   c. Successfully invoke a hadoop service RPC API
> >> > >>>>>
> >> > >>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required
> for
> >> > >>> REST):
> >> > >>>>> COMP-9. TokenAuth Method negotiation, etc
> >> > >>>>> COMP-10. Client side implementation to leverage REST endpoint
> for
> >> > >>>> acquiring hadoop access tokens given a hadoop id_token
> >> > >>>>> COMP-11. Server side implementation to validate incoming hadoop
> >> > >> access
> >> > >>>> tokens
> >> > >>>>>
> >> > >>>>> UI
> >> > >>>>> Various Hadoop services have their own web UI consoles for
> >> > >>>> administration and end user interactions. These consoles need to
> >> also
> >> > >>>> benefit from the pluggability of authentication mechanisms to be
> on
> >> > par
> >> > >>>> with the access control of the cluster REST and RPC APIs.
> >> > >>>>> Web consoles are protected with an WebSSOAuthenticationHandler
> >> which
> >> > >>>> will be configured for either authentication or federation.
> >> > >>>>>
> >> > >>>>> USECASE UI-1 Authentication/LDAP:
> >> > >>>>> For the authentication usecase:
> >> > >>>>> 1. User’s browser requests access to a UI console page
> >> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
> redirects
> >> > >> the
> >> > >>>> browser to an IdP web endpoint exposed by the
> AuthenticationServer
> >> > >>> passing
> >> > >>>> the requested url as the redirect_url
> >> > >>>>> 3. IdP web endpoint presents the user with a FORM over https
> >> > >>>>>   a. user provides username/password and submits the FORM
> >> > >>>>> 4. AuthenticationServer authenticates the user with provided
> >> > >>> credentials
> >> > >>>> against the configured LDAP server and:
> >> > >>>>>   a. leverages a servlet filter or other authentication
> mechanism
> >> > >> for
> >> > >>>> the endpoint and authenticates the user with a simple LDAP bind
> with
> >> > >>>> username and password
> >> > >>>>>   b. acquires a hadoop id_token and uses it to acquire the
> required
> >> > >>>> hadoop access token which is added as a cookie
> >> > >>>>>   c. redirects the browser to the original service UI resource
> via
> >> > >> the
> >> > >>>> provided redirect_url
> >> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >> > >>> interrogates
> >> > >>>> the incoming request again for an authcookie that contains an
> access
> >> > >>> token
> >> > >>>> upon finding one:
> >> > >>>>>   a. validates the incoming token
> >> > >>>>>   b. returns the AuthenticationToken as per
> AuthenticationHandler
> >> > >>>> contract
> >> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
> >> > >>> expected
> >> > >>>> token
> >> > >>>>>   d. serves requested resource for valid tokens
> >> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter
> >> > >>>> recognition of the hadoop auth cookie
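
[Editor's illustration] The core decision the WebSSOAuthenticationHandler makes in steps 2 and 5 above - redirect to the IdP when no valid auth cookie is present, otherwise validate and serve - can be sketched as follows. The handler class, cookie name, and IdP URL are all hypothetical placeholders, not the names that would land in a patch:

```python
from urllib.parse import urlencode


class WebSSOAuthenticationHandler:
    """Illustrative decision logic for the UI-1 flow: bounce unauthenticated
    browsers to the IdP FORM endpoint, serve resources for valid tokens."""

    IDP_URL = "https://authserver.example.com/idp/login"  # hypothetical

    def __init__(self, validate_token):
        # callback: token -> authenticated user, or None if invalid
        self.validate_token = validate_token

    def handle(self, requested_url, cookies):
        token = cookies.get("hadoop-auth")
        user = self.validate_token(token) if token else None
        if user is None:
            # step 2: redirect to the IdP, passing the original URL back
            # as redirect_url so the browser can return after login
            query = urlencode({"redirect_url": requested_url})
            return ("302", f"{self.IDP_URL}?{query}")
        # step 5d: valid token in the auth cookie -> serve the resource
        return ("200", f"serving {requested_url} to {user}")
```

Subsequent requests (step 5e) hit the same code path but short-circuit to the 200 branch because the hadoop auth cookie now carries a valid token.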
> >> > >>>>>
> >> > >>>>> USECASE UI-2 Federation/SAML:
> >> > >>>>> For the federation usecase:
> >> > >>>>> 1. User’s browser requests access to a UI console page
> >> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and
> redirects
> >> > >> the
> >> > >>>> browser to an SP web endpoint exposed by the AuthenticationServer
> >> > >> passing
> >> > >>>> the requested url as the redirect_url. This endpoint:
> >> > >>>>>   a. is dedicated to redirecting to the external IdP passing the
> >> > >>>> required parameters which may include a redirect_url back to
> itself
> >> as
> >> > >>> well
> >> > >>>> as encoding the original redirect_url so that it can determine
> it on
> >> > >> the
> >> > >>>> way back to the client
> >> > >>>>> 3. the IdP:
> >> > >>>>>   a. challenges the user for credentials and authenticates the
> user
> >> > >>>>>   b. creates appropriate token/cookie and redirects back to the
> >> > >>>> AuthenticationServer endpoint
> >> > >>>>> 4. AuthenticationServer endpoint:
> >> > >>>>>   a. extracts the expected token/cookie from the incoming
> request
> >> > >> and
> >> > >>>> validates it
> >> > >>>>>   b. creates a hadoop id_token
> >> > >>>>>   c. acquires a hadoop access token for the id_token
> >> > >>>>>   d. creates appropriate cookie and redirects back to the
> original
> >> > >>>> redirect_url - being the requested resource
> >> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >> > >>> interrogates
> >> > >>>> the incoming request again for an authcookie that contains an
> access
> >> > >>> token
> >> > >>>> upon finding one:
> >> > >>>>>   a. validates the incoming token
> >> > >>>>>   b. returns the AuthenticationToken as per
> AuthenticationHandler
> >> > >>>> contract
> >> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
> >> > >>> expected
> >> > >>>> token
> >> > >>>>>   d. serves requested resource for valid tokens
> >> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter
> >> > >>>> recognition of the hadoop auth cookie
> >> > >>>>> REQUIRED COMPONENTS for UI USECASES:
> >> > >>>>> COMP-12. WebSSOAuthenticationHandler
> >> > >>>>> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM
> >> based
> >> > >>>> login
> >> > >>>>> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd
> party
> >> > >>> token
> >> > >>>> federation
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
> >> > >> Brian.Swan@microsoft.com>
> >> > >>>> wrote:
> >> > >>>>> Thanks, Larry. That is what I was trying to say, but you've
> said it
> >> > >>>> better and in more detail. :-) To extract from what you are
> saying:
> >> > "If
> >> > >>> we
> >> > >>>> were to reframe the immediate scope to the lowest common
> denominator
> >> > of
> >> > >>>> what is needed for accepting tokens in authentication plugins
> then
> >> we
> >> > >>>> gain... an end-state for the lowest common denominator that
> enables
> >> > >> code
> >> > >>>> patches in the near-term is the best of both worlds."
> >> > >>>>>
> >> > >>>>> -Brian
> >> > >>>>>
> >> > >>>>> -----Original Message-----
> >> > >>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
> >> > >>>>> Sent: Wednesday, July 10, 2013 10:40 AM
> >> > >>>>> To: common-dev@hadoop.apache.org
> >> > >>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> >> > >>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >> > >>>>>
> >> > >>>>> It seems to me that we can have the best of both worlds
> here...it's
> >> > >> all
> >> > >>>> about the scoping.
> >> > >>>>>
> >> > >>>>> If we were to reframe the immediate scope to the lowest common
> >> > >>>> denominator of what is needed for accepting tokens in
> authentication
> >> > >>>> plugins then we gain:
> >> > >>>>>
> >> > >>>>> 1. a very manageable scope to define and agree upon 2. a
> >> deliverable
> >> > >>>> that should be useful in and of itself 3. a foundation for
> community
> >> > >>>> collaboration that we build on for higher level solutions built
> on
> >> > this
> >> > >>>> lowest common denominator and experience as a working community
> >> > >>>>>
> >> > >>>>> So, to Alejandro's point, perhaps we need to define what would
> make
> >> > >> #2
> >> > >>>> above true - this could serve as the "what" we are building
> instead
> >> of
> >> > >>> the
> >> > >>>> "how" to build it.
> >> > >>>>> Including:
> >> > >>>>> a. project structure within
> hadoop-common-project/common-security
> >> or
> >> > >>> the
> >> > >>>> like b. the usecases that would need to be enabled to make it a
> self
> >> > >>>> contained and useful contribution - without higher level
> solutions
> >> c.
> >> > >> the
> >> > >>>> JIRA/s for contributing patches d. what specific patches will be
> >> > needed
> >> > >>> to
> >> > >>>> accomplish the usecases in #b
> >> > >>>>>
> >> > >>>>> In other words, an end-state for the lowest common denominator
> that
> >> > >>>> enables code patches in the near-term is the best of both worlds.
> >> > >>>>>
> >> > >>>>> I think this may be a good way to bootstrap the collaboration
> >> process
> >> > >>>> for our emerging security community rather than trying to tackle
> a
> >> > huge
> >> > >>>> vision all at once.
> >> > >>>>>
> >> > >>>>> @Alejandro - if you have something else in mind that would
> >> bootstrap
> >> > >>>> this process - that would great - please advise.
> >> > >>>>>
> >> > >>>>> thoughts?
> >> > >>>>>
> >> > >>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan <
> Brian.Swan@microsoft.com>
> >> > >>>> wrote:
> >> > >>>>>
> >> > >>>>>> Hi Alejandro, all-
> >> > >>>>>>
> >> > >>>>>> There seems to be agreement on the broad stroke description of
> the
> >> > >>>> components needed to achieve pluggable token authentication (I'm
> >> sure
> >> > >>> I'll
> >> > >>>> be corrected if that isn't the case). However, discussion of the
> >> > >> details
> >> > >>> of
> >> > >>>> those components doesn't seem to be moving forward. I think this
> is
> >> > >>> because
> >> > >>>> the details are really best understood through code. I also see
> *a*
> >> > >> (i.e.
> >> > >>>> one of many possible) token format and pluggable authentication
> >> > >>> mechanisms
> >> > >>>> within the RPC layer as components that can have immediate
> benefit
> >> to
> >> > >>>> Hadoop users AND still allow flexibility in the larger design.
> So, I
> >> > >>> think
> >> > >>>> the best way to move the conversation of "what we are aiming for"
> >> > >> forward
> >> > >>>> is to start looking at code for these components. I am especially
> >> > >>>> interested in moving forward with pluggable authentication
> >> mechanisms
> >> > >>>> within the RPC layer and would love to see what others have done
> in
> >> > >> this
> >> > >>>> area (if anything).
> >> > >>>>>>
> >> > >>>>>> Thanks.
> >> > >>>>>>
> >> > >>>>>> -Brian
> >> > >>>>>>
> >> > >>>>>> -----Original Message-----
> >> > >>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> >> > >>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
> >> > >>>>>> To: Larry McCay
> >> > >>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai
> Zheng
> >> > >>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >> > >>>>>>
> >> > >>>>>> Larry, all,
> >> > >>>>>>
> >> > >>>>>> Still is not clear to me what is the end state we are aiming
> for,
> >> > >> or
> >> > >>>> that we even agree on that.
> >> > >>>>>>
> >> > >>>>>> IMO, instead of trying to agree on what to do, we should first
> >> > >>>>>> agree on the final state, then see what needs to change to get
> >> > >>>>>> there, then see how we make those changes.
> >> > >>>>>>
> >> > >>>>>> The different documents out there focus more on how.
> >> > >>>>>>
> >> > >>>>>> We should not try to say how before we know what.
> >> > >>>>>>
> >> > >>>>>> Thx.
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
> >> > >> lmccay@hortonworks.com
> >> > >>>>
> >> > >>>> wrote:
> >> > >>>>>>
> >> > >>>>>>> All -
> >> > >>>>>>>
> >> > >>>>>>> After combing through this thread - as well as the summit
> session
> >> > >>>>>>> summary thread, I think that we have the following two items
> that
> >> > >> we
> >> > >>>>>>> can probably move forward with:
> >> > >>>>>>>
> >> > >>>>>>> 1. TokenAuth method - assuming this means the pluggable
> >> > >>>>>>> authentication mechanisms within the RPC layer (2 votes: Kai
> and
> >> > >>>>>>> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and
> >> myself)
> >> > >>>>>>>
> >> > >>>>>>> I propose that we attack both of these aspects as one. Let's
> >> > >> provide
> >> > >>>>>>> the structure and interfaces of the pluggable framework for
> use
> >> in
> >> > >>>>>>> the RPC layer through leveraging Daryn's pluggability work and
> >> POC
> >> > >>> it
> >> > >>>>>>> with a particular token format (not necessarily the only
> format
> >> > >> ever
> >> > >>>>>>> supported - we just need one to start). If there has already
> been
> >> > >>>>>>> work done in this area by anyone then please speak up and
> commit
> >> > >> to
> >> > >>>>>>> providing a patch - so that we don't duplicate effort.
> >> > >>>>>>>
> >> > >>>>>>> @Daryn - is there a particular Jira or set of Jiras that we
> can
> >> > >> look
> >> > >>>>>>> at to discern the pluggability mechanism details?
> Documentation
> >> of
> >> > >>> it
> >> > >>>>>>> would be great as well.
> >> > >>>>>>> @Kai - do you have existing code for the pluggable token
> >> > >>>>>>> authentication mechanism - if not, we can take a stab at
> >> > >>> representing
> >> > >>>>>>> it with interfaces and/or POC code.
> >> > >>>>>>> I can standup and say that we have a token format that we have
> >> > >> been
> >> > >>>>>>> working with already and can provide a patch that represents
> it
> >> > >> as a
> >> > >>>>>>> contribution to test out the pluggable tokenAuth.
> >> > >>>>>>>
> >> > >>>>>>> These patches will provide progress toward code being the
> central
> >> > >>>>>>> discussion vehicle. As a community, we can then incrementally
> >> > >> build
> >> > >>>>>>> on that foundation in order to collaboratively deliver the
> common
> >> > >>>> vision.
> >> > >>>>>>>
> >> > >>>>>>> In the absence of any other home for posting such patches,
> let's
> >> > >>>>>>> assume that they will be attached to HADOOP-9392 - or a
> dedicated
> >> > >>>>>>> subtask for this particular aspect/s - I will leave that
> detail
> >> to
> >> > >>>> Kai.
> >> > >>>>>>>
> >> > >>>>>>> @Alejandro, being the only voice on this thread that isn't
> >> > >>>>>>> represented in the votes above, please feel free to agree or
> >> > >>> disagree
> >> > >>>> with this direction.
> >> > >>>>>>>
> >> > >>>>>>> thanks,
> >> > >>>>>>>
> >> > >>>>>>> --larry
> >> > >>>>>>>
> >> > >>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay <
> lmccay@hortonworks.com>
> >> > >>>> wrote:
> >> > >>>>>>>
> >> > >>>>>>>> Hi Andy -
> >> > >>>>>>>>
> >> > >>>>>>>>> Happy Fourth of July to you and yours.
> >> > >>>>>>>>
> >> > >>>>>>>> Same to you and yours. :-)
> >> > >>>>>>>> We had some fun in the sun for a change - we've had nothing
> but
> >> > >>> rain
> >> > >>>>>>>> on
> >> > >>>>>>> the east coast lately.
> >> > >>>>>>>>
> >> > >>>>>>>>> My concern here is there may have been a misinterpretation
> or
> >> > >> lack
> >> > >>>>>>>>> of consensus on what is meant by "clean slate"
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>> Apparently so.
> >> > >>>>>>>> On the pre-summit call, I stated that I was interested in
> >> > >>>>>>>> reconciling
> >> > >>>>>>> the jiras so that we had one to work from.
> >> > >>>>>>>>
> >> > >>>>>>>> You recommended that we set them aside for the time being -
> with
> >> > >>> the
> >> > >>>>>>> understanding that work would continue on your side (and
> our's as
> >> > >>>>>>> well) - and approach the community discussion from a clean
> slate.
> >> > >>>>>>>> We seemed to do this at the summit session quite well.
> >> > >>>>>>>> It was my understanding that this community discussion would
> >> live
> >> > >>>>>>>> beyond
> >> > >>>>>>> the summit and continue on this list.
> >> > >>>>>>>>
> >> > >>>>>>>> While closing the summit session we agreed to follow up on
> >> > >>>>>>>> common-dev
> >> > >>>>>>> with first a summary then a discussion of the moving parts.
> >> > >>>>>>>>
> >> > >>>>>>>> I never expected the previous work to be abandoned and fully
> >> > >>>>>>>> expected it
> >> > >>>>>>> to inform the discussion that happened here.
> >> > >>>>>>>>
> >> > >>>>>>>> If you would like to reframe what clean slate was supposed to
> >> > >> mean
> >> > >>>>>>>> or
> >> > >>>>>>> describe what it means now - that would be welcome - before I
> >> > >> waste
> >> > >>>>>>> anymore time trying to facilitate a community discussion that
> is
> >> > >>>>>>> apparently not wanted.
> >> > >>>>>>>>
> >> > >>>>>>>>> Nowhere in this
> >> > >>>>>>>>> picture are self appointed "master JIRAs" and such, which
> have
> >> > >>> been
> >> > >>>>>>>>> disappointing to see crop up, we should be collaboratively
> >> > >> coding
> >> > >>>>>>>>> not planting flags.
> >> > >>>>>>>>
> >> > >>>>>>>> I don't know what you mean by self-appointed master JIRAs.
> >> > >>>>>>>> It has certainly not been anyone's intention to disappoint.
> >> > >>>>>>>> Any mention of a new JIRA was just to have a clear context to
> >> > >>> gather
> >> > >>>>>>>> the
> >> > >>>>>>> agreed upon points - previous and/or existing JIRAs would
> easily
> >> > >> be
> >> > >>>> linked.
> >> > >>>>>>>>
> >> > >>>>>>>> Planting flags... I need to go back and read my discussion
> point
> >> > >>>>>>>> about the
> >> > >>>>>>> JIRA and see how this is the impression that was made.
> >> > >>>>>>>> That is not how I define success. The only flag that counts is
> >> > >>>>>>>> code.
> >> > >>>>>>> What we are lacking is the roadmap on which to put the code.
> >> > >>>>>>>>
> >> > >>>>>>>>> I read Kai's latest document as something approaching
> today's
> >> > >>>>>>>>> consensus
> >> > >>>>>>> (or
> >> > >>>>>>>>> at least a common point of view?) rather than a historical
> >> > >>> document.
> >> > >>>>>>>>> Perhaps he and it can be given equal share of the
> >> consideration.
> >> > >>>>>>>>
> >> > >>>>>>>> I definitely read it as something that has evolved into
> >> something
> >> > >>>>>>> approaching what we have been talking about so far. There has
> not
> >> > >>>>>>> however been enough discussion anywhere near the level of
> detail
> >> > >> in
> >> > >>>>>>> that document and more details are needed for each component
> in
> >> > >> the
> >> > >>>> design.
> >> > >>>>>>>> Why the work in that document should not be fed into the
> >> > >> community
> >> > >>>>>>> discussion as anyone else's would be - I fail to understand.
> >> > >>>>>>>>
> >> > >>>>>>>> My suggestion continues to be that you should take that
> document
> >> > >>> and
> >> > >>>>>>> speak to the inventory of moving parts as we agreed.
> >> > >>>>>>>> As these are agreed upon, we will ensure that the appropriate
> >> > >>>>>>>> subtasks
> >> > >>>>>>> are filed against whatever JIRA is to host them - don't really
> >> > >> care
> >> > >>>>>>> much which it is.
> >> > >>>>>>>>
> >> > >>>>>>>> I don't really want to continue with two separate JIRAs - as
> I
> >> > >>>>>>>> stated
> >> > >>>>>>> long ago - but until we understand what the pieces are and how
> >> > >> they
> >> > >>>>>>> relate then they can't be consolidated.
> >> > >>>>>>>> Even if 9533 ended up being repurposed as the server
> instance of
> >> > >>> the
> >> > >>>>>>> work - it should be a subtask of a larger one - if that is to
> be
> >> > >>>>>>> 9392, so be it.
> >> > >>>>>>>> We still need to define all the pieces of the larger picture
> >> > >> before
> >> > >>>>>>>> that
> >> > >>>>>>> can be done.
> >> > >>>>>>>>
> >> > >>>>>>>> What I thought was the clean slate approach to the discussion
> >> > >>> seemed
> >> > >>>>>>>> a
> >> > >>>>>>> very reasonable way to make all this happen.
> >> > >>>>>>>> If you would like to restate what you intended by it or
> >> something
> >> > >>>>>>>> else
> >> > >>>>>>> equally as reasonable as a way to move forward that would be
> >> > >>> awesome.
> >> > >>>>>>>>
> >> > >>>>>>>> I will be happy to work toward the roadmap with everyone
> once it
> >> > >> is
> >> > >>>>>>> articulated, understood and actionable.
> >> > >>>>>>>> In the meantime, I have work to do.
> >> > >>>>>>>>
> >> > >>>>>>>> thanks,
> >> > >>>>>>>>
> >> > >>>>>>>> --larry
> >> > >>>>>>>>
> >> > >>>>>>>> BTW - I meant to quote you in an earlier response and ended
> up
> >> > >>>>>>>> saying it
> >> > >>>>>>> was Aaron instead. Not sure what happened there. :-)
> >> > >>>>>>>>
> >> > >>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <
> apurtell@apache.org
> >> >
> >> > >>>> wrote:
> >> > >>>>>>>>
> >> > >>>>>>>>> Hi Larry (and all),
> >> > >>>>>>>>>
> >> > >>>>>>>>> Happy Fourth of July to you and yours.
> >> > >>>>>>>>>
> >> > >>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so
> >> I'd
> >> > >>>>>>>>> defer
> >> > >>>>>>> to
> >> > >>>>>>>>> them on the detailed points.
> >> > >>>>>>>>>
> >> > >>>>>>>>> My concern here is there may have been a misinterpretation
> or
> >> > >> lack
> >> > >>>>>>>>> of consensus on what is meant by "clean slate". Hopefully
> that
> >> > >> can
> >> > >>>>>>>>> be
> >> > >>>>>>> quickly
> >> > >>>>>>>>> cleared up. Certainly we did not mean ignore all that came
> >> > >> before.
> >> > >>>>>>>>> The
> >> > >>>>>>> idea
> >> > >>>>>>>>> was to reset discussions to find common ground and new
> >> direction
> >> > >>>>>>>>> where
> >> > >>>>>>> we
> >> > >>>>>>>>> are working together, not in conflict, on an agreed upon
> set of
> >> > >>>>>>>>> design points and tasks. There's been a lot of good
> discussion
> >> > >> and
> >> > >>>>>>>>> design preceding that we should figure out how to port
> over.
> >> > >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs"
> and
> >> > >>> such,
> >> > >>>>>>>>> which have been disappointing to see crop up, we should be
> >> > >>>>>>>>> collaboratively coding not planting flags.
> >> > >>>>>>>>>
> >> > >>>>>>>>> I read Kai's latest document as something approaching
> today's
> >> > >>>>>>>>> consensus
> >> > >>>>>>> (or
> >> > >>>>>>>>> at least a common point of view?) rather than a historical
> >> > >>> document.
> >> > >>>>>>>>> Perhaps he and it can be given equal share of the
> >> consideration.
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> >> > >>>>>>>>>
> >> > >>>>>>>>>> Hey Andrew -
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> I largely agree with that statement.
> >> > >>>>>>>>>> My intention was to let the differences be worked out
> within
> >> > >> the
> >> > >>>>>>>>>> individual components once they were identified and
> subtasks
> >> > >>>> created.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> My reference to HSSO was really referring to a SSO *server*
> >> > >> based
> >> > >>>>>>> design
> >> > >>>>>>>>>> which was not clearly articulated in the earlier documents.
> >> > >>>>>>>>>> We aren't trying to compare and contrast one design over
> >> > >> another
> >> > >>>>>>> anymore.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> Let's move this collaboration along as we've mapped out and
> >> the
> >> > >>>>>>>>>> differences in the details will reveal themselves and be
> >> > >>> addressed
> >> > >>>>>>> within
> >> > >>>>>>>>>> their components.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> I've actually been looking forward to you weighing in on
> the
> >> > >>>>>>>>>> actual discussion points in this thread.
> >> > >>>>>>>>>> Could you do that?
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> At this point, I am most interested in your thoughts on a
> >> > >> single
> >> > >>>>>>>>>> jira
> >> > >>>>>>> to
> >> > >>>>>>>>>> represent all of this work and whether we should start
> >> > >> discussing
> >> > >>>>>>>>>> the
> >> > >>>>>>> SSO
> >> > >>>>>>>>>> Tokens.
> >> > >>>>>>>>>> If you think there are discussion points missing from that
> >> > >> list,
> >> > >>>>>>>>>> feel
> >> > >>>>>>> free
> >> > >>>>>>>>>> to add to it.
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> thanks,
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> --larry
> >> > >>>>>>>>>>
> >> > >>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
> >> > >> apurtell@apache.org>
> >> > >>>>>>> wrote:
> >> > >>>>>>>>>>
> >> > >>>>>>>>>>> Hi Larry,
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me
> >> > >> point
> >> > >>>>>>>>>>> out
> >> > >>>>>>> that,
> >> > >>>>>>>>>>> while the differences between the competing JIRAs have
> been
> >> > >>>>>>>>>>> reduced
> >> > >>>>>>> for
> >> > >>>>>>>>>>> sure, there were some key differences that didn't just
> >> > >>> disappear.
> >> > >>>>>>>>>>> Subsequent discussion will make that clear. I also
> disagree
> >> > >> with
> >> > >>>>>>>>>>> your characterization that we have simply endorsed all of
> the
> >> > >>>>>>>>>>> design
> >> > >>>>>>> decisions
> >> > >>>>>>>>>>> of the so-called HSSO, this is taking a mile from an
> inch. We
> >> > >>> are
> >> > >>>>>>> here to
> >> > >>>>>>>>>>> engage in a collaborative process as peers. I've been
> >> > >> encouraged
> >> > >>>>>>>>>>> by
> >> > >>>>>>> the
> >> > >>>>>>>>>>> spirit of the discussions up to this point and hope that
> can
> >> > >>>>>>>>>>> continue beyond one design summit.
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> >> > >>>>>>>>>>> <lm...@hortonworks.com>
> >> > >>>>>>>>>> wrote:
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>>> Hi Kai -
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> I think that I need to clarify something...
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> This is not an update for 9533 but a continuation of the
> >> > >>>>>>>>>>>> discussions
> >> > >>>>>>>>>> that
> >> > >>>>>>>>>>>> are focused on a fresh look at a SSO for Hadoop.
> >> > >>>>>>>>>>>> We've agreed to leave our previous designs behind and
> >> > >> therefore
> >> > >>>>>>>>>>>> we
> >> > >>>>>>>>>> aren't
> >> > >>>>>>>>>>>> really seeing it as an HSSO layered on top of TAS
> approach
> >> or
> >> > >>> an
> >> > >>>>>>> HSSO vs
> >> > >>>>>>>>>>>> TAS discussion.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> Your latest design revision actually makes it clear that
> you
> >> > >>> are
> >> > >>>>>>>>>>>> now targeting exactly what was described as HSSO - so
> >> > >> comparing
> >> > >>>>>>>>>>>> and
> >> > >>>>>>>>>> contrasting
> >> > >>>>>>>>>>>> is not going to add any value.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> What we need you to do at this point, is to look at those
> >> > >>>>>>>>>>>> high-level components described on this thread and
> comment
> >> on
> >> > >>>>>>>>>>>> whether we need additional components or any that are
> listed
> >> > >>>>>>>>>>>> that don't seem
> >> > >>>>>>> necessary
> >> > >>>>>>>>>> to
> >> > >>>>>>>>>>>> you and why.
> >> > >>>>>>>>>>>> In other words, we need to define and agree on the work
> that
> >> > >>> has
> >> > >>>>>>>>>>>> to
> >> > >>>>>>> be
> >> > >>>>>>>>>>>> done.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> We also need to determine those components that need to
> be
> >> > >> done
> >> > >>>>>>> before
> >> > >>>>>>>>>>>> anything else can be started.
> >> > >>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens
> are
> >> > >>>>>>>>>>>> central to
> >> > >>>>>>>>>> all
> >> > >>>>>>>>>>>> the other components and should probably be defined and
> >> POC'd
> >> > >>> in
> >> > >>>>>>> short
> >> > >>>>>>>>>>>> order.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> Personally, I think that continuing the separation of
> 9533
> >> > >> and
> >> > >>>>>>>>>>>> 9392
> >> > >>>>>>> will
> >> > >>>>>>>>>>>> do this effort a disservice. There don't seem to be
> enough
> >> > >>>>>>> differences
> >> > >>>>>>>>>>>> between the two to justify separate jiras anymore. It
> may be
> >> > >>>>>>>>>>>> best to
> >> > >>>>>>>>>> file a
> >> > >>>>>>>>>>>> new one that reflects a single vision without the extra
> >> cruft
> >> > >>>>>>>>>>>> that
> >> > >>>>>>> has
> >> > >>>>>>>>>>>> built up in either of the existing ones. We would
> certainly
> >> > >>>>>>>>>>>> reference
> >> > >>>>>>>>>> the
> >> > >>>>>>>>>>>> existing ones within the new one. This approach would
> align
> >> > >>> with
> >> > >>>>>>>>>>>> the
> >> > >>>>>>>>>> spirit
> >> > >>>>>>>>>>>> of the discussions up to this point.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> I am prepared to start a discussion around the shape of
> the
> >> > >> two
> >> > >>>>>>> Hadoop
> >> > >>>>>>>>>> SSO
> >> > >>>>>>>>>>>> tokens: identity and access. If this is what others feel
> the
> >> > >>>>>>>>>>>> next
> >> > >>>>>>> topic
> >> > >>>>>>>>>>>> should be.
> >> > >>>>>>>>>>>> If we can identify a jira home for it, we can do it
> there -
> >> > >>>>>>> otherwise we
> >> > >>>>>>>>>>>> can create another DISCUSS thread for it.
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> thanks,
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> --larry
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
> >> > >> kai.zheng@intel.com>
> >> > >>>>>>> wrote:
> >> > >>>>>>>>>>>>
> >> > >>>>>>>>>>>>> Hi Larry,
> >> > >>>>>>>>>>>>>
> >> > >>>>>>>>>>>>> Thanks for the update. Good to see that with this
> update we
> >> > >>> are
> >> > >>>>>>>>>>>>> now
> >> > >>>>>>>>>>>> aligned on most points.
> >> > >>>>>>>>>>>>>
> >> > >>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392.
> >> The
> >> > >>>>>>>>>>>>> new
> >> > >>>>>>>>>>>> revision incorporates feedback and suggestions in related
> >> > >>>>>>>>>>>> discussion
> >> > >>>>>>>>>> with
> >> > >>>>>>>>>>>> the community, particularly from Microsoft and others
> >> > >> attending
> >> > >>>>>>>>>>>> the Security design lounge session at the Hadoop summit.
> >> > >>> Summary
> >> > >>>>>>>>>>>> of the
> >> > >>>>>>>>>> changes:
> >> > >>>>>>>>>>>>> 1.    Revised the approach to now use two tokens,
> Identity
> >> > >>> Token
> >> > >>>>>>> plus
> >> > >>>>>>>>>>>> Access Token, particularly considering our authorization
> >> > >>>>>>>>>>>> framework
> >> > >>>>>>> and
> >> > >>>>>>>>>>>> compatibility with HSSO;
> >> > >>>>>>>>>>>>> 2.    Introduced Authorization Server (AS) from our
> >> > >>>> authorization
> >> > >>>>>>>>>>>> framework into the flow that issues access tokens for
> >> clients
> >> > >>>>>>>>>>>> with
> >> > >>>>>>>>>> identity
> >> > >>>>>>>>>>>> tokens to access services;
> >> > >>>>>>>>>>>>> 3.    Refined proxy access token and the
> >> proxy/impersonation
> >> > >>>> flow;
> >> > >>>>>>>>>>>>> 4.    Refined the browser web SSO flow regarding access
> to
> >> > >>>> Hadoop
> >> > >>>>>>> web
> >> > >>>>>>>>>>>> services;
> >> > >>>>>>>>>>>>> 5.    Added Hadoop RPC access flow regard
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>>
> >> > >>>>>>>>> --
> >> > >>>>>>>>> Best regards,
> >> > >>>>>>>>>
> >> > >>>>>>>>> - Andy
> >> > >>>>>>>>>
> >> > >>>>>>>>> Problems worthy of attack prove their worth by hitting
> back. -
> >> > >>> Piet
> >> > >>>>>>>>> Hein (via Tom White)
> >> > >>>>>>>>
> >> > >>>>>>>
> >> > >>>>>>>
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>> --
> >> > >>>>>> Alejandro
> >> > >>>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>>
> >> > >>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
> >> > >>>>
> >> > >>>>
> >> > >>>
> >> > >>>
> >> > >>> --
> >> > >>> Alejandro
> >> > >>>
> >> > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Alejandro
> >> >
> >> >
> >>
> >
> > --
> > CONFIDENTIALITY NOTICE
> > NOTICE: This message is intended for the use of the individual or entity
> to
> > which it is addressed and may contain information that is confidential,
> > privileged and exempt from disclosure under applicable law. If the reader
> > of this message is not the intended recipient, you are hereby notified
> that
> > any printing, copying, dissemination, distribution, disclosure or
> > forwarding of this communication is strictly prohibited. If you have
> > received this communication in error, please contact the sender
> immediately
> > and delete it from your system. Thank You.
>


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Chris Douglas <cd...@apache.org>.
On Tue, Sep 3, 2013 at 5:20 AM, Larry McCay <lm...@hortonworks.com> wrote:
> One outstanding question for me - how do we go about getting the branches
> created?

Once a group has converged on a purpose- ideally with some initial
code from JIRA- please go ahead and create the feature branch in svn.
There's no ceremony. -C

> On Tue, Aug 6, 2013 at 6:22 PM, Chris Nauroth <cn...@hortonworks.com>wrote:
>
>> Near the bottom of the bylaws, it states that addition of a "New Branch
>> Committer" requires "Lazy consensus of active PMC members."  I think this
>> means that you'll need to get a PMC member to sponsor the vote for you.
>>  Regular committer votes happen on the private PMC mailing list, and I
>> assume it would be the same for a branch committer vote.
>>
>> http://hadoop.apache.org/bylaws.html
>>
>> Chris Nauroth
>> Hortonworks
>> http://hortonworks.com/
>>
>>
>>
>> On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay <lm...@hortonworks.com>
>> wrote:
>>
>> > That sounds perfect!
>> > I have been thinking of late that we would maybe need an incubator
>> project
>> > or something for this - which would be unfortunate.
>> >
>> > This would allow us to move much more quickly with a set of patches
>> broken
>> > up into consumable/understandable chunks that are made functional more
>> > easily within the branch.
>> > I assume that we need to start a separate thread for DISCUSS or VOTE to
>> > start that process - correct?
>> >
>> > On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com>
>> wrote:
>> >
>> > > yep, that is what I meant. Thanks Chris
>> > >
>> > >
>> > > On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <
>> cnauroth@hortonworks.com
>> > >wrote:
>> > >
>> > >> Perhaps this is also a good opportunity to try out the new "branch
>> > >> committers" clause in the bylaws, enabling non-committers who are
>> > working
>> > >> on this to commit to the feature branch.
>> > >>
>> > >>
>> > >>
>> >
>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
>> > >>
>> > >> Chris Nauroth
>> > >> Hortonworks
>> > >> http://hortonworks.com/
>> > >>
>> > >>
>> > >>
>> > >> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur <tucu@cloudera.com
>> > >>> wrote:
>> > >>
>> > >>> Larry,
>> > >>>
>> > >>> Sorry for the delay answering. Thanks for laying things out; yes, it
>> > >> makes
>> > >>> sense.
>> > >>>
>> > >>> Given the large scope of the changes, number of JIRAs and number of
>> > >>> developers involved, wouldn't it make sense to create a feature
>> > >>> branch for all this work so as not to destabilize (more ;) trunk?
>> > >>>
>> > >>> Thanks again.
>> > >>>
>> > >>>
>> > >>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <lmccay@hortonworks.com
>> >
>> > >>> wrote:
>> > >>>
>> > >>>> The following JIRA was filed to provide a token and basic authority
>> > >>>> implementation for this effort:
>> > >>>> https://issues.apache.org/jira/browse/HADOOP-9781
>> > >>>>
>> > >>>> I have attached an initial patch though have yet to submit it as one
>> > >>> since
>> > >>>> it is dependent on the patch for CMF that was posted to:
>> > >>>> https://issues.apache.org/jira/browse/HADOOP-9534
>> > >>>> and this patch still has a couple outstanding issues - javac
>> warnings
>> > >> for
>> > >>>> com.sun classes for certificate generation and 11 javadoc
>> warnings.
>> > >>>>
>> > >>>> Please feel free to review the patches and raise any questions or
>> > >>> concerns
>> > >>>> related to them.
>> > >>>>
>> > >>>> On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com>
>> > >> wrote:
>> > >>>>
>> > >>>>> Hello All -
>> > >>>>>
>> > >>>>> In an effort to scope an initial iteration that provides value to
>> the
>> > >>>> community while focusing on the pluggable authentication aspects,
>> I've
>> > >>>> written a description for "Iteration 1". It identifies the goal of
>> the
>> > >>>> iteration, the endstate and a set of initial usecases. It also
>> > >> enumerates
>> > >>>> the components that are required for each usecase. There is a scope
>> > >>> section
>> > >>>> that details specific things that should be kept out of the first
>> > >>>> iteration. This is certainly up for discussion. There may be some of
>> > >>> these
>> > >>>> things that can be contributed in short order. If we can add some
>> > >> things
>> > >>> in
>> > >>>> without unnecessary complexity for the identified usecases then we
>> > >>> should.
>> > >>>>>
>> > >>>>> @Alejandro - please review this and see whether it satisfies your
>> > >> point
>> > >>>> for a definition of what we are building.
>> > >>>>>
>> > >>>>> In addition to the document that I will paste here as text and
>> > >> attach a
>> > >>>> pdf version, we have a couple patches for components that are
>> > >> identified
>> > >>> in
>> > >>>> the document.
>> > >>>>> Specifically, COMP-7 and COMP-8.
>> > >>>>>
>> > >>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was
>> > >> filed
>> > >>>> specifically for that functionality.
>> > >>>>> COMP-7 is a small set of classes to introduce JsonWebToken as the
>> > >> token
>> > >>>> format and a basic JsonWebTokenAuthority that can issue and verify
>> > >> these
>> > >>>> tokens.
>> > >>>>>
>> > >>>>> Since there is no JIRA for this yet, I will likely file a new JIRA
>> > >> for
>> > >>> a
>> > >>>> SSO token implementation.
>> > >>>>>
>> > >>>>> Both of these patches assume to be modules within
>> > >>>> hadoop-common/hadoop-common-project.
>> > >>>>> While they are relatively small, I think that they will be pulled
>> in
>> > >> by
>> > >>>> other modules such as hadoop-auth which would likely not want a
>> > >>> dependency
>> > >>>> on something larger like
>> > >>> hadoop-common/hadoop-common-project/hadoop-common.
>> > >>>>>
>> > >>>>> This is certainly something that we should discuss within the
>> > >> community
>> > >>>> for this effort though - that being, exactly how to add these
>> > libraries
>> > >>> so
>> > >>>> that they are most easily consumed by existing projects.
>> > >>>>>
>> > >>>>> Anyway, the following is the Iteration-1 document - it is also
>> > >> attached
>> > >>>> as a pdf:
>> > >>>>>
>> > >>>>> Iteration 1: Pluggable User Authentication and Federation
>> > >>>>>
>> > >>>>> Introduction
>> > >>>>> The intent of this effort is to bootstrap the development of
>> > >> pluggable
>> > >>>> token-based authentication mechanisms to support certain goals of
>> > >>>> enterprise authentication integrations. By restricting the scope of
>> > >> this
>> > >>>> effort, we hope to provide immediate benefit to the community while
>> > >>> keeping
>> > >>>> the initial contribution to a manageable size that can be easily
>> > >>> reviewed,
>> > >>>> understood and extended with further development through follow up
>> > >> JIRAs
>> > >>>> and related iterations.
>> > >>>>>
>> > >>>>> Iteration Endstate
>> > >>>>> Once complete, this effort will have extended the authentication
>> > >>>> mechanisms - for all client types - from the existing: Simple,
>> > Kerberos
>> > >>> and
>> > >>>> Plain (for RPC) to include LDAP authentication and SAML based
>> > >> federation.
>> > >>>> In addition, the ability to provide additional/custom authentication
>> > >>>> mechanisms will be enabled for users to plug in their preferred
>> > >>> mechanisms.
>> > >>>>>
>> > >>>>> Project Scope
>> > >>>>> The scope of this effort is a subset of the features covered by the
>> > >>>> overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates
>> on
>> > >>>> enabling Hadoop to issue, accept/validate SSO tokens of its own. The
>> > >>>> pluggable authentication mechanism within SASL/RPC layer and the
>> > >>>> authentication filter pluggability for REST and UI components will
>> be
>> > >>>> leveraged and extended to support the results of this effort.
>> > >>>>>
>> > >>>>> Out of Scope
>> > >>>>> In order to scope the initial deliverable as the minimally viable
>> > >>>> product, a handful of things have been simplified or left out of
>> scope
>> > >>> for
>> > >>>> this effort. This is not meant to say that these aspects are not
>> > useful
>> > >>> or
>> > >>>> not needed but that they are not necessary for this iteration. We do
>> > >>>> however need to ensure that we don’t do anything to preclude adding
>> > >> them
>> > >>> in
>> > >>>> future iterations.
>> > >>>>> 1. Additional Attributes - the result of authentication will
>> continue
>> > >>> to
>> > >>>> use the existing hadoop tokens and identity representations.
>> > Additional
>> > >>>> attributes used for finer grained authorization decisions will be
>> > added
>> > >>>> through follow-up efforts.
>> > >>>>> 2. Token revocation - the ability to revoke issued identity tokens
>> > >> will
>> > >>>> be added later
>> > >>>>> 3. Multi-factor authentication - this will likely require
>> additional
>> > >>>> attributes and is not necessary for this iteration.
>> > >>>>> 4. Authorization changes - we will require additional attributes
>> for
>> > >>> the
>> > >>>> fine-grained access control plans. This is not needed for this
>> > >> iteration.
>> > >>>>> 5. Domains - we assume a single flat domain for all users
>> > >>>>> 6. Kinit alternative - we can leverage existing REST clients such
>> as
>> > >>>> cURL to retrieve tokens through authentication and federation for
>> the
>> > >>> time
>> > >>>> being
>> > >>>>> 7. A specific authentication framework isn’t really necessary
>> within
>> > >>> the
>> > >>>> REST endpoints for this iteration. If one is available then we can
>> > >>>> use it; otherwise we can leverage existing things like Apache Shiro
>> > >>>> within a
>> > >>>> servlet filter.
>> > >>>>>
>> > >>>>> In Scope
>> > >>>>> What is in scope for this effort is defined by the usecases
>> described
>> > >>>> below. Components required for supporting the usecases are
>> summarized
>> > >> for
>> > >>>> each client type. Each component is a candidate for a JIRA subtask -
>> > >>> though
>> > >>>> multiple components are likely to be included in a JIRA to
>> represent a
>> > >>> set
>> > >>>> of functionality rather than individual JIRAs per component.
>> > >>>>>
>> > >>>>> Terminology and Naming
>> > >>>>> The terms and names of components within this document are merely
>> > >>>> descriptive of the functionality that they represent. Any similarity
>> > >>>> or difference in names or terms from those found in other documents
>> > >>>> is not intended to make any statement about those other documents or
>> > >>>> the descriptions within. This document represents the pluggable
>> > >>> authentication
>> > >>>> mechanisms and server functionality required to replace Kerberos.
>> > >>>>>
>> > >>>>> Ultimately, the naming of the implementation classes will be a
>> > >> product
>> > >>>> of the patches accepted by the community.
>> > >>>>>
>> > >>>>> Usecases:
>> > >>>>> client types: REST, CLI, UI
>> > >>>>> authentication types: Simple, Kerberos, authentication/LDAP,
>> > >>>> federation/SAML
>> > >>>>>
>> > >>>>> Simple and Kerberos
>> > >>>>> Simple and Kerberos usecases continue to work as they do today. The
>> > >>>> addition of Authentication/LDAP and Federation/SAML are added
>> through
>> > >> the
>> > >>>> existing pluggability points either as they are or with required
>> > >>> extension.
>> > >>>> Either way, continued support for Simple and Kerberos must not
>> require
>> > >>>> changes to existing deployments in the field as a result of this
>> > >> effort.
>> > >>>>>
>> > >>>>> REST
>> > >>>>> USECASE REST-1 Authentication/LDAP:
>> > >>>>> For REST clients, we will provide the ability to:
>> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
>> exposed
>> > >> by
>> > >>>> an AuthenticationServer instance via REST calls to:
>> > >>>>>   a. authenticate - passing username/password returning a hadoop
>> > >>>> id_token
>> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
>> the
>> > >>>> hadoop id_token as an Authorization: Bearer token along with the
>> > >> desired
>> > >>>> service name (master service name) returning a hadoop access token
>> > >>>>> 2. Successfully invoke a hadoop service REST API passing the hadoop
>> > >>>> access token through an HTTP header as an Authorization Bearer token
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>> accomplished by an SSOAuthenticationHandler
>> > >>>>> 3. Successfully block access to a REST resource when presenting a
>> > >>> hadoop
>> > >>>> access token intended for a different service
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>> accomplished by an SSOAuthenticationHandler
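[Editorial sketch] The REST-1 steps above can be illustrated in shell with cURL as the REST client. The host names, endpoint paths and query parameter below (sso.example.com, /authenticate, /get-access-token, the WebHDFS call) are illustrative assumptions, not an API this document defines; the network calls are left commented out so the sketch stands alone:

```shell
# Hedged sketch of USECASE REST-1; all URLs and paths are assumptions.
SSO="https://sso.example.com:8443"
SERVICE="webhdfs"   # desired (master) service name

# 1a. authenticate - username/password in, hadoop id_token out
#ID_TOKEN=$(curl -s -u alice "$SSO/authenticate")
ID_TOKEN="example-id-token"          # stand-in so the sketch runs offline

# 1b. get-access-token - id_token as Bearer token in, access token out
#ACCESS_TOKEN=$(curl -s -H "Authorization: Bearer $ID_TOKEN" \
#  "$SSO/get-access-token?service=$SERVICE")
ACCESS_TOKEN="example-access-token"  # stand-in for the returned token

# 2. invoke a hadoop service REST API with the access token as Bearer
AUTH_HEADER="Authorization: Bearer $ACCESS_TOKEN"
#curl -s -H "$AUTH_HEADER" \
#  "https://namenode.example.com:50070/webhdfs/v1/tmp?op=LISTSTATUS"
echo "$AUTH_HEADER"
```

Presenting the same access token to a different service would then be rejected by that service's SSOAuthenticationHandler, per step 3.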
>> > >>>>>
>> > >>>>> USECASE REST-2 Federation/SAML:
>> > >>>>> We will also provide federation capabilities for REST clients such
>> > >>> that:
>> > >>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?)
>> and
>> > >>>> persist in a permissions-protected file - i.e.
>> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
>> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
>> > instance
>> > >>> via
>> > >>>> REST calls to:
>> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
>> Bearer
>> > >>>> token returning a hadoop id_token
>> > >>>>>      - can copy and paste from the command line or use cat to
>> > >>>> include the persisted token through '-H "Authorization: Bearer
>> > >>>> $(cat ~/.hadoop_tokens/.idp_token)"'
>> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
>> > >>>> the hadoop id_token as an Authorization: Bearer token along with the
>> > >>>> desired service name (master service name), returning a hadoop
>> > >>>> access token
>> > >>>>> 3. Successfully invoke a hadoop service REST API passing the hadoop
>> > >>>> access token through an HTTP header as an Authorization Bearer token
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>> accomplished by an SSOAuthenticationHandler
>> > >>>>> 4. Successfully block access to a REST resource when presenting a
>> > >>> hadoop
>> > >>>> access token intended for a different service
>> > >>>>>   a. validation of the incoming token on the service endpoint is
>> > >>>> accomplished by an SSOAuthenticationHandler
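[Editorial sketch] The federate call in step 2a above can be written with cURL's `-H` flag and command substitution to inline the SAML assertion persisted in step 1. The SP endpoint URL is an assumption; the actual curl invocation is commented out so the sketch stands alone:

```shell
# Hedged sketch of the USECASE REST-2 federate call; the SP endpoint
# URL is an assumption. The Bearer value is the SAML assertion that
# step 1 persisted to a permissions-protected file.
TOKEN_DIR="$HOME/.hadoop_tokens"
mkdir -p "$TOKEN_DIR" && chmod 700 "$TOKEN_DIR"
printf 'example-saml-assertion' > "$TOKEN_DIR/.idp_token"  # stand-in for step 1
chmod 600 "$TOKEN_DIR/.idp_token"

# 2a. federate - SAML assertion as Bearer token in, hadoop id_token out
SAML_HEADER="Authorization: Bearer $(cat "$TOKEN_DIR/.idp_token")"
#ID_TOKEN=$(curl -s -H "$SAML_HEADER" "https://sso.example.com:8443/federate")
echo "$SAML_HEADER"
```

From here the flow is identical to REST-1: exchange the id_token for an access token and present it as a Bearer token to the service REST API.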
>> > >>>>>
>> > >>>>> REQUIRED COMPONENTS for REST USECASES:
>> > >>>>> COMP-1. REST client - cURL or similar
>> > >>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP
>> endpoint
>> > >>>> example - returning hadoop id_token
>> > >>>>> COMP-3. REST endpoint for federation with SAML Bearer token -
>> > >>> shibboleth
>> > >>>> SP?|OpenSAML? - returning hadoop id_token
>> > >>>>> COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop
>> access
>> > >>>> tokens from hadoop id_tokens
>> > >>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop access
>> > >>>> tokens
>> > >>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
>> > >>>>> COMP-7. hadoop token and authority implementations
>> > >>>>> COMP-8. core services for crypto support for signing, verifying and
>> > >> PKI
>> > >>>> management
>> > >>>>>
>> > >>>>> CLI
>> > >>>>> USECASE CLI-1 Authentication/LDAP:
>> > >>>>> For CLI/RPC clients, we will provide the ability to:
>> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
>> exposed
>> > >> by
>> > >>>> an AuthenticationServer instance via REST calls to:
>> > >>>>>   a. authenticate - passing username/password returning a hadoop
>> > >>>> id_token
>> > >>>>>      - for RPC clients we need to persist the returned hadoop
>> > >> identity
>> > >>>> token in a file protected by fs permissions so that it may be
>> > leveraged
>> > >>>> until expiry
>> > >>>>>      - directing the returned response to a file may suffice for
>> > >>>> now, e.g. something like ">~/.hadoop_tokens/.id_token"
>> > >>>>> 2. use hadoop CLI to invoke RPC API on a specific hadoop service
>> > >>>>>   a. RPC client negotiates a TokenAuth method through the SASL
>> > >>>> layer; the hadoop id_token is retrieved from
>> > >>>> ~/.hadoop_tokens/.id_token and passed as an Authorization: Bearer
>> > >>>> token to the get-access-token REST endpoint exposed by the
>> > >>>> TokenGrantingService, returning a hadoop access token
>> > >>>>>   b. RPC server side validates the presented hadoop access token
>> and
>> > >>>> continues to serve request
>> > >>>>>   c. Successfully invoke a hadoop service RPC API
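[Editorial sketch] Step 1a above can be illustrated in shell: persist the returned hadoop id_token in a file protected by fs permissions so the RPC client can reuse it until expiry. The authenticate URL is an assumption, and the curl invocation is commented out so the sketch stands alone:

```shell
# Hedged sketch of CLI-1 step 1a; the SSO endpoint URL is an assumption.
TOKEN_DIR="$HOME/.hadoop_tokens"
mkdir -p "$TOKEN_DIR" && chmod 700 "$TOKEN_DIR"   # owner-only directory

# direct the authenticate response to the token file
#curl -s -u alice "https://sso.example.com:8443/authenticate" \
#  > "$TOKEN_DIR/.id_token"
printf 'example-id-token' > "$TOKEN_DIR/.id_token"  # stand-in for cURL output
chmod 600 "$TOKEN_DIR/.id_token"                    # owner-only read/write
```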
>> > >>>>>
>> > >>>>> USECASE CLI-2 Federation/SAML:
>> > >>>>> For CLI/RPC clients, we will provide the ability to:
>> > >>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?)
>> and
>> > >>>> persist in a permissions-protected file - i.e.
>> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
>> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
>> > instance
>> > >>> via
>> > >>>> REST calls to:
>> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
>> Bearer
>> > >>>> token returning a hadoop id_token
>> > >>>>>      - can copy and paste from the command line or use cat to
>> > >>>> include the previously persisted token through '-H "Authorization:
>> > >>>> Bearer $(cat ~/.hadoop_tokens/.idp_token)"'
>> > >>>>> 3. use hadoop CLI to invoke RPC API on a specific hadoop service
>> > >>>>>   a. RPC client negotiates a TokenAuth method through the SASL
>> > >>>> layer; the hadoop id_token is retrieved from
>> > >>>> ~/.hadoop_tokens/.id_token and passed as an Authorization: Bearer
>> > >>>> token to the get-access-token REST endpoint exposed by the
>> > >>>> TokenGrantingService, returning a hadoop access token
>> > >>>>>   b. RPC server side validates the presented hadoop access token
>> and
>> > >>>> continues to serve request
>> > >>>>>   c. Successfully invoke a hadoop service RPC API
>> > >>>>>
>> > >>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required for
>> > >>> REST):
>> > >>>>> COMP-9. TokenAuth Method negotiation, etc
>> > >>>>> COMP-10. Client side implementation to leverage REST endpoint for
>> > >>>> acquiring hadoop access tokens given a hadoop id_token
>> > >>>>> COMP-11. Server side implementation to validate incoming hadoop
>> > >> access
>> > >>>> tokens
>> > >>>>>
>> > >>>>> UI
>> > >>>>> Various Hadoop services have their own web UI consoles for
>> > >>>> administration and end user interactions. These consoles need to
>> also
>> > >>>> benefit from the pluggability of authentication mechanisms to be on
>> > par
>> > >>>> with the access control of the cluster REST and RPC APIs.
>> > >>>>> Web consoles are protected with a WebSSOAuthenticationHandler
>> which
>> > >>>> will be configured for either authentication or federation.
>> > >>>>>
>> > >>>>> USECASE UI-1 Authentication/LDAP:
>> > >>>>> For the authentication usecase:
>> > >>>>> 1. User’s browser requests access to a UI console page
>> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
>> > >> the
>> > >>>> browser to an IdP web endpoint exposed by the AuthenticationServer
>> > >>> passing
>> > >>>> the requested url as the redirect_url
>> > >>>>> 3. IdP web endpoint presents the user with a FORM over https
>> > >>>>>   a. user provides username/password and submits the FORM
>> > >>>>> 4. AuthenticationServer authenticates the user with provided
>> > >>> credentials
>> > >>>> against the configured LDAP server and:
>> > >>>>>   a. leverages a servlet filter or other authentication mechanism
>> > >> for
>> > >>>> the endpoint and authenticates the user with a simple LDAP bind with
>> > >>>> username and password
>> > >>>>>   b. acquires a hadoop id_token and uses it to acquire the required
>> > >>>> hadoop access token which is added as a cookie
>> > >>>>>   c. redirects the browser to the original service UI resource via
>> > >> the
>> > >>>> provided redirect_url
>> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
>> > >>>> interrogates the incoming request again for an authcookie that
>> > >>>> contains an access token and, upon finding one:
>> > >>>>>   a. validates the incoming token
>> > >>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
>> > >>>> contract
>> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
>> > >>> expected
>> > >>>> token
>> > >>>>>   d. serves requested resource for valid tokens
>> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
>> > >>>> recognition of the hadoop auth cookie
>> > >>>>>
>> > >>>>> USECASE UI-2 Federation/SAML:
>> > >>>>> For the federation usecase:
>> > >>>>> 1. User’s browser requests access to a UI console page
>> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
>> > >> the
>> > >>>> browser to an SP web endpoint exposed by the AuthenticationServer
>> > >> passing
>> > >>>> the requested url as the redirect_url. This endpoint:
>> > >>>>>   a. is dedicated to redirecting to the external IdP passing the
>> > >>>> required parameters which may include a redirect_url back to itself
>> as
>> > >>> well
>> > >>>> as encoding the original redirect_url so that it can determine it on
>> > >> the
>> > >>>> way back to the client
>> > >>>>> 3. the IdP:
>> > >>>>>   a. challenges the user for credentials and authenticates the user
>> > >>>>>   b. creates appropriate token/cookie and redirects back to the
>> > >>>> AuthenticationServer endpoint
>> > >>>>> 4. AuthenticationServer endpoint:
>> > >>>>>   a. extracts the expected token/cookie from the incoming request
>> > >> and
>> > >>>> validates it
>> > >>>>>   b. creates a hadoop id_token
>> > >>>>>   c. acquires a hadoop access token for the id_token
>> > >>>>>   d. creates appropriate cookie and redirects back to the original
>> > >>>> redirect_url - being the requested resource
>> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
>> > >>>> interrogates the incoming request again for an authcookie that
>> > >>>> contains an access token and, upon finding one:
>> > >>>>>   a. validates the incoming token
>> > >>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
>> > >>>> contract
>> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
>> > >>> expected
>> > >>>> token
>> > >>>>>   d. serves requested resource for valid tokens
>> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
>> > >>>> recognition of the hadoop auth cookie
>> > >>>>>
>> > >>>>> REQUIRED COMPONENTS for UI USECASES:
>> > >>>>> COMP-12. WebSSOAuthenticationHandler
>> > >>>>> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM
>> based
>> > >>>> login
>> > >>>>> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party
>> > >>> token
>> > >>>> federation
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
>> > >> Brian.Swan@microsoft.com>
>> > >>>> wrote:
>> > >>>>> Thanks, Larry. That is what I was trying to say, but you've said it
>> > >>>> better and in more detail. :-) To extract from what you are saying:
>> > "If
>> > >>> we
>> > >>>> were to reframe the immediate scope to the lowest common denominator
>> > of
>> > >>>> what is needed for accepting tokens in authentication plugins then
>> we
>> > >>>> gain... an end-state for the lowest common denominator that enables
>> > >> code
>> > >>>> patches in the near-term is the best of both worlds."
>> > >>>>>
>> > >>>>> -Brian
>> > >>>>>
>> > >>>>> -----Original Message-----
>> > >>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>> > >>>>> Sent: Wednesday, July 10, 2013 10:40 AM
>> > >>>>> To: common-dev@hadoop.apache.org
>> > >>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
>> > >>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>> > >>>>>
>> > >>>>> It seems to me that we can have the best of both worlds here...it's
>> > >> all
>> > >>>> about the scoping.
>> > >>>>>
>> > >>>>> If we were to reframe the immediate scope to the lowest common
>> > >>>> denominator of what is needed for accepting tokens in authentication
>> > >>>> plugins then we gain:
>> > >>>>>
>> > >>>>> 1. a very manageable scope to define and agree upon
>> > >>>>> 2. a deliverable that should be useful in and of itself
>> > >>>>> 3. a foundation for community collaboration that we build on for
>> > >>>> higher level solutions built on this lowest common denominator and
>> > >>>> experience as a working community
>> > >>>>>
>> > >>>>> So, to Alejandro's point, perhaps we need to define what would make
>> > >> #2
>> > >>>> above true - this could serve as the "what" we are building instead
>> of
>> > >>> the
>> > >>>> "how" to build it.
>> > >>>>> Including:
>> > >>>>> a. project structure within hadoop-common-project/common-security
>> or
>> > >>> the
>> > >>>> like
>> > >>>>> b. the usecases that would need to be enabled to make it a
>> > >>>> self-contained and useful contribution - without higher level
>> > >>>> solutions
>> > >>>>> c. the JIRA/s for contributing patches
>> > >>>>> d. what specific patches will be needed to accomplish the usecases
>> > >>>> in #b
>> > >>>>>
>> > >>>>> In other words, an end-state for the lowest common denominator that
>> > >>>> enables code patches in the near-term is the best of both worlds.
>> > >>>>>
>> > >>>>> I think this may be a good way to bootstrap the collaboration
>> process
>> > >>>> for our emerging security community rather than trying to tackle a
>> > huge
>> > >>>> vision all at once.
>> > >>>>>
>> > >>>>> @Alejandro - if you have something else in mind that would
>> bootstrap
>> > >>>> this process - that would great - please advise.
>> > >>>>>
>> > >>>>> thoughts?
>> > >>>>>
>> > >>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
>> > >>>> wrote:
>> > >>>>>
>> > >>>>>> Hi Alejandro, all-
>> > >>>>>>
>> > >>>>>> There seems to be agreement on the broad stroke description of the
>> > >>>> components needed to achieve pluggable token authentication (I'm
>> sure
>> > >>> I'll
>> > >>>> be corrected if that isn't the case). However, discussion of the
>> > >> details
>> > >>> of
>> > >>>> those components doesn't seem to be moving forward. I think this is
>> > >>> because
>> > >>>> the details are really best understood through code. I also see *a*
>> > >> (i.e.
>> > >>>> one of many possible) token format and pluggable authentication
>> > >>> mechanisms
>> > >>>> within the RPC layer as components that can have immediate benefit
>> to
>> > >>>> Hadoop users AND still allow flexibility in the larger design. So, I
>> > >>> think
>> > >>>> the best way to move the conversation of "what we are aiming for"
>> > >> forward
>> > >>>> is to start looking at code for these components. I am especially
>> > >>>> interested in moving forward with pluggable authentication
>> mechanisms
>> > >>>> within the RPC layer and would love to see what others have done in
>> > >> this
>> > >>>> area (if anything).
>> > >>>>>>
>> > >>>>>> Thanks.
>> > >>>>>>
>> > >>>>>> -Brian
>> > >>>>>>
>> > >>>>>> -----Original Message-----
>> > >>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
>> > >>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
>> > >>>>>> To: Larry McCay
>> > >>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
>> > >>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>> > >>>>>>
>> > >>>>>> Larry, all,
>> > >>>>>>
>> > >>>>>> It is still not clear to me what end state we are aiming for, or
>> > >>>> that we even agree on one.
>> > >>>>>>
>> > >>>>>> IMO, instead of trying to agree on what to do, we should first
>> > >>>> agree on the final state, then see what should change to get there,
>> > >>>> and then see how we change things to get there.
>> > >>>>>>
>> > >>>>>> The different documents out there focus more on how.
>> > >>>>>>
>> > >>>>>> We should not try to say how before we know what.
>> > >>>>>>
>> > >>>>>> Thx.
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
>> > >> lmccay@hortonworks.com
>> > >>>>
>> > >>>> wrote:
>> > >>>>>>
>> > >>>>>>> All -
>> > >>>>>>>
>> > >>>>>>> After combing through this thread - as well as the summit session
>> > >>>>>>> summary thread, I think that we have the following two items that
>> > >> we
>> > >>>>>>> can probably move forward with:
>> > >>>>>>>
>> > >>>>>>> 1. TokenAuth method - assuming this means the pluggable
>> > >>>>>>> authentication mechanisms within the RPC layer (2 votes: Kai and
>> > >>>>>>> Kyle)
>> > >>>>>>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>> > >>>>>>>
>> > >>>>>>> I propose that we attack both of these aspects as one. Let's
>> > >> provide
>> > >>>>>>> the structure and interfaces of the pluggable framework for use
>> in
>> > >>>>>>> the RPC layer through leveraging Daryn's pluggability work and
>> POC
>> > >>> it
>> > >>>>>>> with a particular token format (not necessarily the only format
>> > >> ever
>> > >>>>>>> supported - we just need one to start). If there has already been
>> > >>>>>>> work done in this area by anyone then please speak up and commit
>> > >> to
>> > >>>>>>> providing a patch - so that we don't duplicate effort.
>> > >>>>>>>
>> > >>>>>>> @Daryn - is there a particular Jira or set of Jiras that we can
>> > >> look
>> > >>>>>>> at to discern the pluggability mechanism details? Documentation
>> of
>> > >>> it
>> > >>>>>>> would be great as well.
>> > >>>>>>> @Kai - do you have existing code for the pluggable token
>> > >>>>>>> authentication mechanism - if not, we can take a stab at
>> > >>> representing
>> > >>>>>>> it with interfaces and/or POC code.
>> > >>>>>>> I can standup and say that we have a token format that we have
>> > >> been
>> > >>>>>>> working with already and can provide a patch that represents it
>> > >> as a
>> > >>>>>>> contribution to test out the pluggable tokenAuth.
>> > >>>>>>>
>> > >>>>>>> These patches will provide progress toward code being the central
>> > >>>>>>> discussion vehicle. As a community, we can then incrementally
>> > >> build
>> > >>>>>>> on that foundation in order to collaboratively deliver the common
>> > >>>> vision.
>> > >>>>>>>
>> > >>>>>>> In the absence of any other home for posting such patches, let's
>> > >>>>>>> assume that they will be attached to HADOOP-9392 - or a dedicated
>> > >>>>>>> subtask for this particular aspect/s - I will leave that detail
>> to
>> > >>>> Kai.
>> > >>>>>>>
>> > >>>>>>> @Alejandro, being the only voice on this thread that isn't
>> > >>>>>>> represented in the votes above, please feel free to agree or
>> > >>> disagree
>> > >>>> with this direction.
>> > >>>>>>>
>> > >>>>>>> thanks,
>> > >>>>>>>
>> > >>>>>>> --larry
>> > >>>>>>>
>> > >>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
>> > >>>> wrote:
>> > >>>>>>>
>> > >>>>>>>> Hi Andy -
>> > >>>>>>>>
>> > >>>>>>>>> Happy Fourth of July to you and yours.
>> > >>>>>>>>
>> > >>>>>>>> Same to you and yours. :-)
>> > >>>>>>>> We had some fun in the sun for a change - we've had nothing but
>> > >>> rain
>> > >>>>>>>> on
>> > >>>>>>> the east coast lately.
>> > >>>>>>>>
>> > >>>>>>>>> My concern here is there may have been a misinterpretation or
>> > >> lack
>> > >>>>>>>>> of consensus on what is meant by "clean slate"
>> > >>>>>>>>
>> > >>>>>>>>
>> > >>>>>>>> Apparently so.
>> > >>>>>>>> On the pre-summit call, I stated that I was interested in
>> > >>>>>>>> reconciling
>> > >>>>>>> the jiras so that we had one to work from.
>> > >>>>>>>>
>> > >>>>>>>> You recommended that we set them aside for the time being - with
>> > >>> the
>> > >>>>>> understanding that work would continue on your side (and ours as
>> > >>>>>>> well) - and approach the community discussion from a clean slate.
>> > >>>>>>>> We seemed to do this at the summit session quite well.
>> > >>>>>>>> It was my understanding that this community discussion would
>> live
>> > >>>>>>>> beyond
>> > >>>>>>> the summit and continue on this list.
>> > >>>>>>>>
>> > >>>>>>>> While closing the summit session we agreed to follow up on
>> > >>>>>>>> common-dev
>> > >>>>>>> with first a summary then a discussion of the moving parts.
>> > >>>>>>>>
>> > >>>>>>>> I never expected the previous work to be abandoned and fully
>> > >>>>>>>> expected it
>> > >>>>>>> to inform the discussion that happened here.
>> > >>>>>>>>
>> > >>>>>>>> If you would like to reframe what clean slate was supposed to
>> > >> mean
>> > >>>>>>>> or
>> > >>>>>>> describe what it means now - that would be welcome - before I
>> > >> waste
>> > >>>>>> any more time trying to facilitate a community discussion that is
>> > >>>>>>> apparently not wanted.
>> > >>>>>>>>
>> > >>>>>>>>> Nowhere in this
>> > >>>>>>>>> picture are self appointed "master JIRAs" and such, which have
>> > >>> been
>> > >>>>>>>>> disappointing to see crop up, we should be collaboratively
>> > >> coding
>> > >>>>>>>>> not planting flags.
>> > >>>>>>>>
>> > >>>>>>>> I don't know what you mean by self-appointed master JIRAs.
>> > >>>>>>>> It has certainly not been anyone's intention to disappoint.
>> > >>>>>>>> Any mention of a new JIRA was just to have a clear context to
>> > >>> gather
>> > >>>>>>>> the
>> > >>>>>>> agreed upon points - previous and/or existing JIRAs would easily
>> > >> be
>> > >>>> linked.
>> > >>>>>>>>
>> > >>>>>>>> Planting flags... I need to go back and read my discussion point
>> > >>>>>>>> about the
>> > >>>>>>> JIRA and see how this is the impression that was made.
>> > >>>>>>>> That is not how I define success. The only flags that count are
>> > >>> code.
>> > >>>>>>> What we are lacking is the roadmap on which to put the code.
>> > >>>>>>>>
>> > >>>>>>>>> I read Kai's latest document as something approaching today's
>> > >>>>>>>>> consensus
>> > >>>>>>> (or
>> > >>>>>>>>> at least a common point of view?) rather than a historical
>> > >>> document.
>> > >>>>>>>>> Perhaps he and it can be given equal share of the
>> consideration.
>> > >>>>>>>>
>> > >>>>>>>> I definitely read it as something that has evolved into
>> something
>> > >>>>>>> approaching what we have been talking about so far. There has not
>> > >>>>>>> however been enough discussion anywhere near the level of detail
>> > >> in
>> > >>>>>>> that document and more details are needed for each component in
>> > >> the
>> > >>>> design.
>> > >>>>>>>> Why the work in that document should not be fed into the
>> > >> community
>> > >>>>>>> discussion as anyone else's would be - I fail to understand.
>> > >>>>>>>>
>> > >>>>>>>> My suggestion continues to be that you should take that document
>> > >>> and
>> > >>>>>>> speak to the inventory of moving parts as we agreed.
>> > >>>>>>>> As these are agreed upon, we will ensure that the appropriate
>> > >>>>>>>> subtasks
>> > >>>>>>> are filed against whatever JIRA is to host them - don't really
>> > >> care
>> > >>>>>>> much which it is.
>> > >>>>>>>>
>> > >>>>>>>> I don't really want to continue with two separate JIRAs - as I
>> > >>>>>>>> stated
>> > >>>>>>> long ago - but until we understand what the pieces are and how
>> > >> they
>> > >>>>>>> relate then they can't be consolidated.
>> > >>>>>>>> Even if 9533 ended up being repurposed as the server instance of
>> > >>> the
>> > >>>>>>> work - it should be a subtask of a larger one - if that is to be
>> > >>>>>>> 9392, so be it.
>> > >>>>>>>> We still need to define all the pieces of the larger picture
>> > >> before
>> > >>>>>>>> that
>> > >>>>>>> can be done.
>> > >>>>>>>>
>> > >>>>>>>> What I thought was the clean slate approach to the discussion
>> > >>> seemed
>> > >>>>>>>> a
>> > >>>>>>> very reasonable way to make all this happen.
>> > >>>>>>>> If you would like to restate what you intended by it or
>> something
>> > >>>>>>>> else
>> > >>>>>>> equally as reasonable as a way to move forward that would be
>> > >>> awesome.
>> > >>>>>>>>
>> > >>>>>>>> I will be happy to work toward the roadmap with everyone once it
>> > >> is
>> > >>>>>>> articulated, understood and actionable.
>> > >>>>>>>> In the meantime, I have work to do.
>> > >>>>>>>>
>> > >>>>>>>> thanks,
>> > >>>>>>>>
>> > >>>>>>>> --larry
>> > >>>>>>>>
>> > >>>>>>>> BTW - I meant to quote you in an earlier response and ended up
>> > >>>>>>>> saying it
>> > >>>>>>> was Aaron instead. Not sure what happened there. :-)
>> > >>>>>>>>
>> > >>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <apurtell@apache.org
>> >
>> > >>>> wrote:
>> > >>>>>>>>
>> > >>>>>>>>> Hi Larry (and all),
>> > >>>>>>>>>
>> > >>>>>>>>> Happy Fourth of July to you and yours.
>> > >>>>>>>>>
>> > >>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so
>> I'd
>> > >>>>>>>>> defer
>> > >>>>>>> to
>> > >>>>>>>>> them on the detailed points.
>> > >>>>>>>>>
>> > >>>>>>>>> My concern here is there may have been a misinterpretation or
>> > >> lack
>> > >>>>>>>>> of consensus on what is meant by "clean slate". Hopefully that
>> > >> can
>> > >>>>>>>>> be
>> > >>>>>>> quickly
>> > >>>>>>>>> cleared up. Certainly we did not mean ignore all that came
>> > >> before.
>> > >>>>>>>>> The
>> > >>>>>>> idea
>> > >>>>>>>>> was to reset discussions to find common ground and new
>> direction
>> > >>>>>>>>> where
>> > >>>>>>> we
>> > >>>>>>>>> are working together, not in conflict, on an agreed upon set of
>> > >>>>>>>>> design points and tasks. There's been a lot of good discussion
>> > >> and
>> > >>>>>>>>> design preceeding that we should figure out how to port over.
>> > >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and
>> > >>> such,
>> > >>>>>>>>> which have been disappointing to see crop up, we should be
>> > >>>>>>>>> collaboratively coding not planting flags.
>> > >>>>>>>>>
>> > >>>>>>>>> I read Kai's latest document as something approaching today's
>> > >>>>>>>>> consensus
>> > >>>>>>> (or
>> > >>>>>>>>> at least a common point of view?) rather than a historical
>> > >>> document.
>> > >>>>>>>>> Perhaps he and it can be given equal share of the
>> consideration.
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>> > >>>>>>>>>
>> > >>>>>>>>>> Hey Andrew -
>> > >>>>>>>>>>
>> > >>>>>>>>>> I largely agree with that statement.
>> > >>>>>>>>>> My intention was to let the differences be worked out within
>> > >> the
>> > >>>>>>>>>> individual components once they were identified and subtasks
>> > >>>> created.
>> > >>>>>>>>>>
>> > >>>>>>>>>> My reference to HSSO was really referring to a SSO *server*
>> > >> based
>> > >>>>>>> design
>> > >>>>>>>>>> which was not clearly articulated in the earlier documents.
>> > >>>>>>>>>> We aren't trying to compare and contrast one design over
>> > >> another
>> > >>>>>>> anymore.
>> > >>>>>>>>>>
>> > >>>>>>>>>> Let's move this collaboration along as we've mapped out and
>> the
>> > >>>>>>>>>> differences in the details will reveal themselves and be
>> > >>> addressed
>> > >>>>>>> within
>> > >>>>>>>>>> their components.
>> > >>>>>>>>>>
>> > >>>>>>>>>> I've actually been looking forward to you weighing in on the
>> > >>>>>>>>>> actual discussion points in this thread.
>> > >>>>>>>>>> Could you do that?
>> > >>>>>>>>>>
>> > >>>>>>>>>> At this point, I am most interested in your thoughts on a
>> > >> single
>> > >>>>>>>>>> jira
>> > >>>>>>> to
>> > >>>>>>>>>> represent all of this work and whether we should start
>> > >> discussing
>> > >>>>>>>>>> the
>> > >>>>>>> SSO
>> > >>>>>>>>>> Tokens.
>> > >>>>>>>>>> If you think there are discussion points missing from that
>> > >> list,
>> > >>>>>>>>>> feel
>> > >>>>>>> free
>> > >>>>>>>>>> to add to it.
>> > >>>>>>>>>>
>> > >>>>>>>>>> thanks,
>> > >>>>>>>>>>
>> > >>>>>>>>>> --larry
>> > >>>>>>>>>>
>> > >>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
>> > >> apurtell@apache.org>
>> > >>>>>>> wrote:
>> > >>>>>>>>>>
>> > >>>>>>>>>>> Hi Larry,
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me
>> > >> point
>> > >>>>>>>>>>> out
>> > >>>>>>> that,
>> > >>>>>>>>>>> while the differences between the competing JIRAs have been
>> > >>>>>>>>>>> reduced
>> > >>>>>>> for
>> > >>>>>>>>>>> sure, there were some key differences that didn't just
>> > >>> disappear.
>> > >>>>>>>>>>> Subsequent discussion will make that clear. I also disagree
>> > >> with
>> > >>>>>>>>>>> your characterization that we have simply endorsed all of the
>> > >>>>>>>>>>> design
>> > >>>>>>> decisions
>> > >>>>>>>>>>> of the so-called HSSO, this is taking a mile from an inch. We
>> > >>> are
>> > >>>>>>> here to
>> > >>>>>>>>>>> engage in a collaborative process as peers. I've been
>> > >> encouraged
>> > >>>>>>>>>>> by
>> > >>>>>>> the
>> > >>>>>>>>>>> spirit of the discussions up to this point and hope that can
>> > >>>>>>>>>>> continue beyond one design summit.
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>
>> > >>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
>> > >>>>>>>>>>> <lm...@hortonworks.com>
>> > >>>>>>>>>> wrote:
>> > >>>>>>>>>>>
>> > >>>>>>>>>>>> Hi Kai -
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> I think that I need to clarify something...
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> This is not an update for 9533 but a continuation of the
>> > >>>>>>>>>>>> discussions
>> > >>>>>>>>>> that
>> > >>>>>>>>>>>> are focused on a fresh look at a SSO for Hadoop.
>> > >>>>>>>>>>>> We've agreed to leave our previous designs behind and
>> > >> therefore
>> > >>>>>>>>>>>> we
>> > >>>>>>>>>> aren't
>> > >>>>>>>>>>>> really seeing it as an HSSO layered on top of TAS approach
>> or
>> > >>> an
>> > >>>>>>> HSSO vs
>> > >>>>>>>>>>>> TAS discussion.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> Your latest design revision actually makes it clear that you
>> > >>> are
>> > >>>>>>>>>>>> now targeting exactly what was described as HSSO - so
>> > >> comparing
>> > >>>>>>>>>>>> and
>> > >>>>>>>>>> contrasting
>> > >>>>>>>>>>>> is not going to add any value.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> What we need you to do at this point, is to look at those
>> > >>>>>>>>>>>> high-level components described on this thread and comment
>> on
>> > >>>>>>>>>>>> whether we need additional components or any that are listed
>> > >>>>>>>>>>>> that don't seem
>> > >>>>>>> necessary
>> > >>>>>>>>>> to
>> > >>>>>>>>>>>> you and why.
>> > >>>>>>>>>>>> In other words, we need to define and agree on the work that
>> > >>> has
>> > >>>>>>>>>>>> to
>> > >>>>>>> be
>> > >>>>>>>>>>>> done.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> We also need to determine those components that need to be
>> > >> done
>> > >>>>>>> before
>> > >>>>>>>>>>>> anything else can be started.
>> > >>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
>> > >>>>>>>>>>>> central to
>> > >>>>>>>>>> all
>> > >>>>>>>>>>>> the other components and should probably be defined and
>> POC'd
>> > >>> in
>> > >>>>>>> short
>> > >>>>>>>>>>>> order.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> Personally, I think that continuing the separation of 9533
>> > >> and
>> > >>>>>>>>>>>> 9392
>> > >>>>>>> will
>> > >>>>>>>>>>>> do this effort a disservice. There doesn't seem to be enough
>> > >>>>>>> differences
>> > >>>>>>>>>>>> between the two to justify separate jiras anymore. It may be
>> > >>>>>>>>>>>> best to
>> > >>>>>>>>>> file a
>> > >>>>>>>>>>>> new one that reflects a single vision without the extra
>> cruft
>> > >>>>>>>>>>>> that
>> > >>>>>>> has
>> > >>>>>>>>>>>> built up in either of the existing ones. We would certainly
>> > >>>>>>>>>>>> reference
>> > >>>>>>>>>> the
>> > >>>>>>>>>>>> existing ones within the new one. This approach would align
>> > >>> with
>> > >>>>>>>>>>>> the
>> > >>>>>>>>>> spirit
>> > >>>>>>>>>>>> of the discussions up to this point.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> I am prepared to start a discussion around the shape of the
>> > >> two
>> > >>>>>>> Hadoop
>> > >>>>>>>>>> SSO
>> > >>>>>>>>>>>> tokens: identity and access. If this is what others feel the
>> > >>>>>>>>>>>> next
>> > >>>>>>> topic
>> > >>>>>>>>>>>> should be.
>> > >>>>>>>>>>>> If we can identify a jira home for it, we can do it there -
>> > >>>>>>> otherwise we
>> > >>>>>>>>>>>> can create another DISCUSS thread for it.
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> thanks,
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> --larry
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
>> > >> kai.zheng@intel.com>
>> > >>>>>>> wrote:
>> > >>>>>>>>>>>>
>> > >>>>>>>>>>>>> Hi Larry,
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> Thanks for the update. Good to see that with this update we
>> > >>> are
>> > >>>>>>>>>>>>> now
>> > >>>>>>>>>>>> aligned on most points.
>> > >>>>>>>>>>>>>
>> > >>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392.
>> The
>> > >>>>>>>>>>>>> new
>> > >>>>>>>>>>>> revision incorporates feedback and suggestions in related
>> > >>>>>>>>>>>> discussion
>> > >>>>>>>>>> with
>> > >>>>>>>>>>>> the community, particularly from Microsoft and others
>> > >> attending
>> > >>>>>>>>>>>> the Security design lounge session at the Hadoop summit.
>> > >>> Summary
>> > >>>>>>>>>>>> of the
>> > >>>>>>>>>> changes:
>> > >>>>>>>>>>>>> 1.    Revised the approach to now use two tokens, Identity
>> > >>> Token
>> > >>>>>>> plus
>> > >>>>>>>>>>>> Access Token, particularly considering our authorization
>> > >>>>>>>>>>>> framework
>> > >>>>>>> and
>> > >>>>>>>>>>>> compatibility with HSSO;
>> > >>>>>>>>>>>>> 2.    Introduced Authorization Server (AS) from our
>> > >>>> authorization
>> > >>>>>>>>>>>> framework into the flow that issues access tokens for
>> clients
>> > >>>>>>>>>>>> with
>> > >>>>>>>>>> identity
>> > >>>>>>>>>>>> tokens to access services;
>> > >>>>>>>>>>>>> 3.    Refined proxy access token and the
>> proxy/impersonation
>> > >>>> flow;
>> > >>>>>>>>>>>>> 4.    Refined the browser web SSO flow regarding access to
>> > >>>> Hadoop
>> > >>>>>>> web
>> > >>>>>>>>>>>> services;
>> > >>>>>>>>>>>>> 5.    Added Hadoop RPC access flow regard
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>>
>> > >>>>>>>>> --
>> > >>>>>>>>> Best regards,
>> > >>>>>>>>>
>> > >>>>>>>>> - Andy
>> > >>>>>>>>>
>> > >>>>>>>>> Problems worthy of attack prove their worth by hitting back. -
>> > >>> Piet
>> > >>>>>>>>> Hein (via Tom White)
>> > >>>>>>>>
>> > >>>>>>>
>> > >>>>>>>
>> > >>>>>>
>> > >>>>>>
>> > >>>>>> --
>> > >>>>>> Alejandro
>> > >>>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>>
>> > >>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
>> > >>>>
>> > >>>>
>> > >>>
>> > >>>
>> > >>> --
>> > >>> Alejandro
>> > >>>
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Alejandro
>> >
>> >
>>
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
All -

Given that we have moved forward with the branch committerships for the
initial set of security branch contributors, I think that we should propose
a branch for iteration-1 as described in this thread.

My proposal is that we limit the scope of this initial branch to be only
that which is required for the pluggable authentication mechanism as
described in iteration-1. We will then create a separate branch in order to
introduce whole new services - such as TAS Server Instances and a Key
Management Service.

This will make the ability to review each branch easier and the merging of
each into trunk less destabilizing/risky.

In terms of check-in philosophy, we should take a review-then-check-in
approach to the branch with lazy consensus - wherein we do not need to
explicitly +1 every check-in to the branch but we will honor any -1's with
discussion to resolve before checking in. This will provide us each with
the opportunity to track the work being done and ensure that we understand
it and find that it meets the intended goals.

I am excited to get this work really moving and look forward to working on
it with you all.

One outstanding question for me - how do we go about getting the branches
created?

Off the top of my head, I believe there to be a need for 3 branches for the
related security efforts: pluggable authentication/sso, security services
and cryptographic filesystem.

thanks!

--larry


On Tue, Aug 6, 2013 at 6:22 PM, Chris Nauroth <cn...@hortonworks.com> wrote:

> Near the bottom of the bylaws, it states that addition of a "New Branch
> Committer" requires "Lazy consensus of active PMC members."  I think this
> means that you'll need to get a PMC member to sponsor the vote for you.
>  Regular committer votes happen on the private PMC mailing list, and I
> assume it would be the same for a branch committer vote.
>
> http://hadoop.apache.org/bylaws.html
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay <lm...@hortonworks.com>
> wrote:
>
> > That sounds perfect!
> > I have been thinking of late that we would maybe need an incubator
> project
> > or something for this - which would be unfortunate.
> >
> > This would allow us to move much more quickly with a set of patches
> broken
> > up into consumable/understandable chunks that are made functional more
> > easily within the branch.
> > I assume that we need to start a separate thread for DISCUSS or VOTE to
> > start that process - correct?
> >
> > On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com>
> wrote:
> >
> > > yep, that is what I meant. Thanks Chris
> > >
> > >
> > > On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <
> cnauroth@hortonworks.com
> > >wrote:
> > >
> > >> Perhaps this is also a good opportunity to try out the new "branch
> > >> committers" clause in the bylaws, enabling non-committers who are
> > working
> > >> on this to commit to the feature branch.
> > >>
> > >>
> > >>
> >
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
> > >>
> > >> Chris Nauroth
> > >> Hortonworks
> > >> http://hortonworks.com/
> > >>
> > >>
> > >>
> > >> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur <tucu@cloudera.com
> > >>> wrote:
> > >>
> > >>> Larry,
> > >>>
> > >>> Sorry for the delay answering. Thanks for laying down things, yes, it
> > >> makes
> > >>> sense.
> > >>>
> > >>> Given the large scope of the changes, number of JIRAs and number of
> > >>> developers involved, wouldn't make sense to create a feature branch
> for
> > >> all
> > >>> this work not to destabilize (more ;) trunk?
> > >>>
> > >>> Thanks again.
> > >>>
> > >>>
> > >>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <lmccay@hortonworks.com
> >
> > >>> wrote:
> > >>>
> > >>>> The following JIRA was filed to provide a token and basic authority
> > >>>> implementation for this effort:
> > >>>> https://issues.apache.org/jira/browse/HADOOP-9781
> > >>>>
> > >>>> I have attached an initial patch though have yet to submit it as one
> > >>> since
> > >>>> it is dependent on the patch for CMF that was posted to:
> > >>>> https://issues.apache.org/jira/browse/HADOOP-9534
> > >>>> and this patch still has a couple outstanding issues - javac
> warnings
> > >> for
> > >>>> com.sun classes for certificate generation and 11 javadoc
> warnings.
> > >>>>
> > >>>> Please feel free to review the patches and raise any questions or
> > >>> concerns
> > >>>> related to them.
> > >>>>
> > >>>> On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com>
> > >> wrote:
> > >>>>
> > >>>>> Hello All -
> > >>>>>
> > >>>>> In an effort to scope an initial iteration that provides value to
> the
> > >>>> community while focusing on the pluggable authentication aspects,
> I've
> > >>>> written a description for "Iteration 1". It identifies the goal of
> the
> > >>>> iteration, the endstate and a set of initial usecases. It also
> > >> enumerates
> > >>>> the components that are required for each usecase. There is a scope
> > >>> section
> > >>>> that details specific things that should be kept out of the first
> > >>>> iteration. This is certainly up for discussion. There may be some of
> > >>> these
> > >>>> things that can be contributed in short order. If we can add some
> > >> things
> > >>> in
> > >>>> without unnecessary complexity for the identified usecases then we
> > >>> should.
> > >>>>>
> > >>>>> @Alejandro - please review this and see whether it satisfies your
> > >> point
> > >>>> for a definition of what we are building.
> > >>>>>
> > >>>>> In addition to the document that I will paste here as text and
> > >> attach a
> > >>>> pdf version, we have a couple patches for components that are
> > >> identified
> > >>> in
> > >>>> the document.
> > >>>>> Specifically, COMP-7 and COMP-8.
> > >>>>>
> > >>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was
> > >> filed
> > >>>> specifically for that functionality.
> > >>>>> COMP-7 is a small set of classes to introduce JsonWebToken as the
> > >> token
> > >>>> format and a basic JsonWebTokenAuthority that can issue and verify
> > >> these
> > >>>> tokens.
> > >>>>>
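The COMP-7 idea - a token format plus an authority that can issue and verify tokens - can be sketched minimally as below. This is a hypothetical illustration only: it uses HMAC for brevity, whereas the actual COMP-7/COMP-8 classes sign and verify with PKI, and the class and claim names here are assumptions, not the patch's API.

```python
import base64, hmac, hashlib, json, time

def b64url(data: bytes) -> str:
    # JWT-style base64url without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

class TokenAuthority:
    """Toy stand-in for a JsonWebTokenAuthority: issues and verifies
    compact three-part tokens. The real code would sign with PKI."""
    def __init__(self, secret: bytes):
        self.secret = secret

    def issue(self, subject: str, ttl_secs: int = 3600) -> str:
        header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
        claims = b64url(json.dumps(
            {"sub": subject, "exp": int(time.time()) + ttl_secs}).encode())
        signing_input = f"{header}.{claims}".encode()
        sig = b64url(hmac.new(self.secret, signing_input,
                              hashlib.sha256).digest())
        return f"{header}.{claims}.{sig}"

    def verify(self, token: str) -> bool:
        header, claims, sig = token.split(".")
        signing_input = f"{header}.{claims}".encode()
        expected = b64url(hmac.new(self.secret, signing_input,
                                   hashlib.sha256).digest())
        return hmac.compare_digest(sig, expected)

authority = TokenAuthority(b"shared-secret")
tok = authority.issue("alice")
print(authority.verify(tok))        # True
print(authority.verify(tok + "A"))  # False - tampered signature
```

The same issue/verify split is what lets a TokenGrantingService issue tokens that service endpoints can validate independently.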
> > >>>>> Since there is no JIRA for this yet, I will likely file a new JIRA
> > >> for
> > >>> a
> > >>>> SSO token implementation.
> > >>>>>
> > >>>>> Both of these patches assume to be modules within
> > >>>> hadoop-common/hadoop-common-project.
> > >>>>> While they are relatively small, I think that they will be pulled
> in
> > >> by
> > >>>> other modules such as hadoop-auth which would likely not want a
> > >>> dependency
> > >>>> on something larger like
> > >>> hadoop-common/hadoop-common-project/hadoop-common.
> > >>>>>
> > >>>>> This is certainly something that we should discuss within the
> > >> community
> > >>>> for this effort though - that being, exactly how to add these
> > libraries
> > >>> so
> > >>>> that they are most easily consumed by existing projects.
> > >>>>>
> > >>>>> Anyway, the following is the Iteration-1 document - it is also
> > >> attached
> > >>>> as a pdf:
> > >>>>>
> > >>>>> Iteration 1: Pluggable User Authentication and Federation
> > >>>>>
> > >>>>> Introduction
> > >>>>> The intent of this effort is to bootstrap the development of
> > >> pluggable
> > >>>> token-based authentication mechanisms to support certain goals of
> > >>>> enterprise authentication integrations. By restricting the scope of
> > >> this
> > >>>> effort, we hope to provide immediate benefit to the community while
> > >>> keeping
> > >>>> the initial contribution to a manageable size that can be easily
> > >>> reviewed,
> > >>>> understood and extended with further development through follow up
> > >> JIRAs
> > >>>> and related iterations.
> > >>>>>
> > >>>>> Iteration Endstate
> > >>>>> Once complete, this effort will have extended the authentication
> > >>>> mechanisms - for all client types - from the existing: Simple,
> > Kerberos
> > >>> and
> > >>>> Plain (for RPC) to include LDAP authentication and SAML based
> > >> federation.
> > >>>> In addition, the ability to provide additional/custom authentication
> > >>>> mechanisms will be enabled for users to plug in their preferred
> > >>> mechanisms.
> > >>>>>
> > >>>>> Project Scope
> > >>>>> The scope of this effort is a subset of the features covered by the
> > >>>> overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates
> on
> > >>>> enabling Hadoop to issue, accept/validate SSO tokens of its own. The
> > >>>> pluggable authentication mechanism within SASL/RPC layer and the
> > >>>> authentication filter pluggability for REST and UI components will
> be
> > >>>> leveraged and extended to support the results of this effort.
> > >>>>>
> > >>>>> Out of Scope
> > >>>>> In order to scope the initial deliverable as the minimally viable
> > >>>> product, a handful of things have been simplified or left out of
> scope
> > >>> for
> > >>>> this effort. This is not meant to say that these aspects are not
> > useful
> > >>> or
> > >>>> not needed but that they are not necessary for this iteration. We do
> > >>>> however need to ensure that we don’t do anything to preclude adding
> > >> them
> > >>> in
> > >>>> future iterations.
> > >>>>> 1. Additional Attributes - the result of authentication will
> continue
> > >>> to
> > >>>> use the existing hadoop tokens and identity representations.
> > Additional
> > >>>> attributes used for finer grained authorization decisions will be
> > added
> > >>>> through follow-up efforts.
> > >>>>> 2. Token revocation - the ability to revoke issued identity tokens
> > >> will
> > >>>> be added later
> > >>>>> 3. Multi-factor authentication - this will likely require
> additional
> > >>>> attributes and is not necessary for this iteration.
> > >>>>> 4. Authorization changes - we will require additional attributes
> for
> > >>> the
> > >>>> fine-grained access control plans. This is not needed for this
> > >> iteration.
> > >>>>> 5. Domains - we assume a single flat domain for all users
> > >>>>> 6. Kinit alternative - we can leverage existing REST clients such
> as
> > >>>> cURL to retrieve tokens through authentication and federation for
> the
> > >>> time
> > >>>> being
> > >>>>> 7. A specific authentication framework isn’t really necessary
> within
> > >>> the
> > >>>> REST endpoints for this iteration. If one is available then we can
> use
> > >> it
> > >>>> otherwise we can leverage existing things like Apache Shiro within a
> > >>>> servlet filter.
> > >>>>>
> > >>>>> In Scope
> > >>>>> What is in scope for this effort is defined by the usecases
> described
> > >>>> below. Components required for supporting the usecases are
> summarized
> > >> for
> > >>>> each client type. Each component is a candidate for a JIRA subtask -
> > >>> though
> > >>>> multiple components are likely to be included in a JIRA to
> represent a
> > >>> set
> > >>>> of functionality rather than individual JIRAs per component.
> > >>>>>
> > >>>>> Terminology and Naming
> > >>>>> The terms and names of components within this document are merely
> > >>>> descriptive of the functionality that they represent. Any similarity
> > or
> > >>>> difference in names or terms from those that are found in other
> > >> documents
> > >>>> are not intended to make any statement about those other documents
> or
> > >> the
> > >>>> descriptions within. This document represents the pluggable
> > >>> authentication
> > >>>> mechanisms and server functionality required to replace Kerberos.
> > >>>>>
> > >>>>> Ultimately, the naming of the implementation classes will be a
> > >> product
> > >>>> of the patches accepted by the community.
> > >>>>>
> > >>>>> Usecases:
> > >>>>> client types: REST, CLI, UI
> > >>>>> authentication types: Simple, Kerberos, authentication/LDAP,
> > >>>> federation/SAML
> > >>>>>
> > >>>>> Simple and Kerberos
> > >>>>> Simple and Kerberos usecases continue to work as they do today. The
> > >>>> addition of Authentication/LDAP and Federation/SAML are added
> through
> > >> the
> > >>>> existing pluggability points either as they are or with required
> > >>> extension.
> > >>>> Either way, continued support for Simple and Kerberos must not
> require
> > >>>> changes to existing deployments in the field as a result of this
> > >> effort.
> > >>>>>
> > >>>>> REST
> > >>>>> USECASE REST-1 Authentication/LDAP:
> > >>>>> For REST clients, we will provide the ability to:
> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
> exposed
> > >> by
> > >>>> an AuthenticationServer instance via REST calls to:
> > >>>>>   a. authenticate - passing username/password returning a hadoop
> > >>>> id_token
> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
> the
> > >>>> hadoop id_token as an Authorization: Bearer token along with the
> > >> desired
> > >>>> service name (master service name) returning a hadoop access token
> > >>>>> 2. Successfully invoke a hadoop service REST API passing the hadoop
> > >>>> access token through an HTTP header as an Authorization Bearer token
> > >>>>>   a. validation of the incoming token on the service endpoint is
> > >>>> accomplished by an SSOAuthenticationHandler
> > >>>>> 3. Successfully block access to a REST resource when presenting a
> > >>> hadoop
> > >>>> access token intended for a different service
> > >>>>>   a. validation of the incoming token on the service endpoint is
> > >>>> accomplished by an SSOAuthenticationHandler
> > >>>>>
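For concreteness, the two REST calls in steps 1a/1b above can be sketched as plain HTTP request construction. The endpoint paths, header layout and server URL below are illustrative assumptions, not the final API:

```python
import base64
from urllib.request import Request

AUTH_SERVER = "https://authserver.example.com"  # hypothetical endpoint

def authenticate_request(username: str, password: str) -> Request:
    # Step 1a: BASIC-authenticate to the IdP endpoint; the response body
    # would carry the hadoop id_token.
    creds = base64.b64encode(f"{username}:{password}".encode()).decode()
    req = Request(f"{AUTH_SERVER}/authenticate")
    req.add_header("Authorization", f"Basic {creds}")
    return req

def access_token_request(id_token: str, service: str) -> Request:
    # Step 1b: present the id_token as a Bearer token to the
    # TokenGrantingService, naming the desired (master) service.
    req = Request(f"{AUTH_SERVER}/get-access-token?service={service}")
    req.add_header("Authorization", f"Bearer {id_token}")
    return req

r1 = authenticate_request("alice", "secret")
r2 = access_token_request("opaque-id-token", "namenode")
print(r1.get_header("Authorization"))  # Basic YWxpY2U6c2VjcmV0
print(r2.get_header("Authorization"))  # Bearer opaque-id-token
```

The same header shapes map directly onto the cURL invocations described in the usecase.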
> > >>>>> USECASE REST-2 Federation/SAML:
> > >>>>> We will also provide federation capabilities for REST clients such
> > >>> that:
> > >>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?)
> and
> > >>>> persist in a permissions protected file - ie.
> > >> ~/.hadoop_tokens/.idp_token
> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> > instance
> > >>> via
> > >>>> REST calls to:
> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
> Bearer
> > >>>> token returning a hadoop id_token
> > >>>>>      - can copy and paste from commandline or use cat to include
> > >>>> persisted token through "--Header Authorization: Bearer 'cat
> > >>>> ~/.hadoop_tokens/.id_token'"
> > >>>>>   b. get-access-token - from the TokenGrantingService by passing
> the
> > >>>> hadoop id_token as an Authorization: Bearer token along with the
> > >> desired
> > >>>> service name (master service name) to the TokenGrantingService
> > >> returning
> > >>> a
> > >>>> hadoop access token
> > >>>>> 3. Successfully invoke a hadoop service REST API passing the hadoop
> > >>>> access token through an HTTP header as an Authorization Bearer token
> > >>>>>   a. validation of the incoming token on the service endpoint is
> > >>>> accomplished by an SSOAuthenticationHandler
> > >>>>> 4. Successfully block access to a REST resource when presenting a
> > >>> hadoop
> > >>>> access token intended for a different service
> > >>>>>   a. validation of the incoming token on the service endpoint is
> > >>>> accomplished by an SSOAuthenticationHandler
> > >>>>>
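The federate call in step 2a - reading the persisted IdP token and sending it as a Bearer header (the 'cat ~/.hadoop_tokens/.idp_token' trick) - might look like this. The SP URL and file paths are illustrative assumptions:

```python
import tempfile
from pathlib import Path
from urllib.request import Request

def federate_request(idp_token_path: Path, sp_url: str) -> Request:
    # Read the persisted SAML assertion and present it as an
    # Authorization: Bearer token to the SP endpoint; the response body
    # would carry the hadoop id_token.
    assertion = idp_token_path.read_text().strip()
    req = Request(sp_url)
    req.add_header("Authorization", f"Bearer {assertion}")
    return req

# In practice the path would be ~/.hadoop_tokens/.idp_token; a temp
# directory stands in for it here.
idp_token = Path(tempfile.mkdtemp()) / ".idp_token"
idp_token.write_text("saml-assertion-base64\n")
req = federate_request(idp_token, "https://authserver.example.com/federate")
print(req.get_header("Authorization"))  # Bearer saml-assertion-base64
```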
> > >>>>> REQUIRED COMPONENTS for REST USECASES:
> > >>>>> COMP-1. REST client - cURL or similar
> > >>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP
> endpoint
> > >>>> example - returning hadoop id_token
> > >>>>> COMP-3. REST endpoint for federation with SAML Bearer token -
> > >>> shibboleth
> > >>>> SP?|OpenSAML? - returning hadoop id_token
> > >>>>> COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop
> access
> > >>>> tokens from hadoop id_tokens
> > >>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop access
> > >>>> tokens
> > >>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
> > >>>>> COMP-7. hadoop token and authority implementations
> > >>>>> COMP-8. core services for crypto support for signing, verifying and
> > >> PKI
> > >>>> management
> > >>>>>
> > >>>>> CLI
> > >>>>> USECASE CLI-1 Authentication/LDAP:
> > >>>>> For CLI/RPC clients, we will provide the ability to:
> > >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint
> exposed
> > >> by
> > >>>> an AuthenticationServer instance via REST calls to:
> > >>>>>   a. authenticate - passing username/password returning a hadoop
> > >>>> id_token
> > >>>>>      - for RPC clients we need to persist the returned hadoop
> > >> identity
> > >>>> token in a file protected by fs permissions so that it may be
> > leveraged
> > >>>> until expiry
> > >>>>>      - directing the returned response to a file may suffice for
> now
> > >>>> - something like ">~/.hadoop_tokens/.id_token"
> > >>>>> 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL layer,
> > >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and
> passed
> > >> as
> > >>>> Authorization: Bearer token to the get-access-token REST endpoint
> > >> exposed
> > >>>> by TokenGrantingService returning a hadoop access token
> > >>>>>   b. RPC server side validates the presented hadoop access token
> and
> > >>>> continues to serve request
> > >>>>>   c. Successfully invoke a hadoop service RPC API
> > >>>>>
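The persistence step in 1a - keeping the returned id_token in a file protected by fs permissions so the RPC client can reuse it until expiry - could be sketched as follows. A temp directory stands in for ~/.hadoop_tokens, and the exact layout is an assumption:

```python
import os, stat, tempfile
from pathlib import Path

def persist_id_token(token: str, path: Path) -> None:
    # Owner-only directory and file, per CLI-1 step 1a; in the usecase
    # the path would be ~/.hadoop_tokens/.id_token.
    path.parent.mkdir(mode=0o700, parents=True, exist_ok=True)
    path.write_text(token)
    path.chmod(0o600)  # readable/writable by the owner only

base = Path(tempfile.mkdtemp())
token_file = base / ".hadoop_tokens" / ".id_token"
persist_id_token("opaque-id-token", token_file)
print(oct(stat.S_IMODE(os.stat(token_file).st_mode)))  # 0o600
print(token_file.read_text())  # opaque-id-token
```

Simply redirecting the cURL response to the file, as the usecase suggests, achieves the same end provided the directory permissions are tightened first.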
> > >>>>> USECASE CLI-2 Federation/SAML:
> > >>>>> For CLI/RPC clients, we will provide the ability to:
> > >>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?)
> and
> > >>>> persist in a permissions protected file - ie.
> > >> ~/.hadoop_tokens/.idp_token
> > >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
> > >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> > instance
> > >>> via
> > >>>> REST calls to:
> > >>>>>   a. federate - passing a SAML assertion as an Authorization:
> Bearer
> > >>>> token returning a hadoop id_token
> > >>>>>      - can copy and paste from commandline or use cat to include
> > >>>> previously persisted token through "--Header Authorization: Bearer
> > 'cat
> > >>>> ~/.hadoop_tokens/.id_token'"
> > >>>>> 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> > >>>>>   a. RPC client negotiates a TokenAuth method through SASL layer,
> > >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and
> passed
> > >> as
> > >>>> Authorization: Bearer token to the get-access-token REST endpoint
> > >> exposed
> > >>>> by TokenGrantingService returning a hadoop access token
> > >>>>>   b. RPC server side validates the presented hadoop access token
> and
> > >>>> continues to serve request
> > >>>>>   c. Successfully invoke a hadoop service RPC API
> > >>>>>
> > >>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required for
> > >>> REST):
> > >>>>> COMP-9. TokenAuth Method negotiation, etc
> > >>>>> COMP-10. Client side implementation to leverage REST endpoint for
> > >>>> acquiring hadoop access tokens given a hadoop id_token
> > >>>>> COMP-11. Server side implementation to validate incoming hadoop
> > >> access
> > >>>> tokens
> > >>>>>
> > >>>>> UI
> > >>>>> Various Hadoop services have their own web UI consoles for
> > >>>> administration and end user interactions. These consoles need to
> also
> > >>>> benefit from the pluggability of authentication mechanisms to be on
> > par
> > >>>> with the access control of the cluster REST and RPC APIs.
> > >>>>> Web consoles are protected with a WebSSOAuthenticationHandler
> which
> > >>>> will be configured for either authentication or federation.
> > >>>>>
> > >>>>> USECASE UI-1 Authentication/LDAP:
> > >>>>> For the authentication usecase:
> > >>>>> 1. User’s browser requests access to a UI console page
> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
> > >> the
> > >>>> browser to an IdP web endpoint exposed by the AuthenticationServer
> > >>> passing
> > >>>> the requested url as the redirect_url
> > >>>>> 3. IdP web endpoint presents the user with a FORM over https
> > >>>>>   a. user provides username/password and submits the FORM
> > >>>>> 4. AuthenticationServer authenticates the user with provided
> > >>> credentials
> > >>>> against the configured LDAP server and:
> > >>>>>   a. leverages a servlet filter or other authentication mechanism
> > >> for
> > >>>> the endpoint and authenticates the user with a simple LDAP bind with
> > >>>> username and password
> > >>>>>   b. acquires a hadoop id_token and uses it to acquire the required
> > >>>> hadoop access token which is added as a cookie
> > >>>>>   c. redirects the browser to the original service UI resource via
> > >> the
> > >>>> provided redirect_url
> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> > >>> interrogates
> > >>>> the incoming request again for an authcookie that contains an access
> > >>> token
> > >>>> upon finding one:
> > >>>>>   a. validates the incoming token
> > >>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
> > >>>> contract
> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
> > >>> expected
> > >>>> token
> > >>>>>   d. serves requested resource for valid tokens
> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
> > >>>> recognition of the hadoop auth cookie
> > >>>>>
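The WebSSOAuthenticationHandler decision in step 5 - serve when a valid access token is found in the auth cookie, otherwise redirect to the IdP endpoint with the original URL as redirect_url - can be sketched as below. The cookie name, IdP URL and validation callback are assumptions for illustration:

```python
from urllib.parse import quote

IDP_LOGIN_URL = "https://authserver.example.com/idp/login"  # hypothetical

def handle_request(cookies: dict, requested_url: str, validate) -> dict:
    # Look for the auth cookie carrying a hadoop access token.
    token = cookies.get("hadoop-auth")
    if token and validate(token):
        # Valid token: serve the originally requested UI resource.
        return {"action": "serve", "url": requested_url}
    # No/invalid token: redirect the browser to the IdP endpoint,
    # passing the requested url as redirect_url.
    redirect = f"{IDP_LOGIN_URL}?redirect_url={quote(requested_url, safe='')}"
    return {"action": "redirect", "url": redirect}

valid = lambda t: t == "good-access-token"
print(handle_request({}, "https://nn.example.com/ui", valid)["action"])
# redirect
print(handle_request({"hadoop-auth": "good-access-token"},
                     "https://nn.example.com/ui", valid)["action"])
# serve
```

The federation variant (UI-2) differs only in where the redirect points: the SP endpoint, which in turn bounces to the external IdP.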
> > >>>>> USECASE UI-2 Federation/SAML:
> > >>>>> For the federation usecase:
> > >>>>> 1. User’s browser requests access to a UI console page
> > >>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
> > >> the
> > >>>> browser to an SP web endpoint exposed by the AuthenticationServer
> > >> passing
> > >>>> the requested url as the redirect_url. This endpoint:
> > >>>>>   a. is dedicated to redirecting to the external IdP passing the
> > >>>> required parameters which may include a redirect_url back to itself
> as
> > >>> well
> > >>>> as encoding the original redirect_url so that it can determine it on
> > >> the
> > >>>> way back to the client
> > >>>>> 3. the IdP:
> > >>>>>   a. challenges the user for credentials and authenticates the user
> > >>>>>   b. creates appropriate token/cookie and redirects back to the
> > >>>> AuthenticationServer endpoint
> > >>>>> 4. AuthenticationServer endpoint:
> > >>>>>   a. extracts the expected token/cookie from the incoming request
> > >> and
> > >>>> validates it
> > >>>>>   b. creates a hadoop id_token
> > >>>>>   c. acquires a hadoop access token for the id_token
> > >>>>>   d. creates appropriate cookie and redirects back to the original
> > >>>> redirect_url - being the requested resource
> > >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> > >>> interrogates
> > >>>> the incoming request again for an authcookie that contains an access
> > >>> token
> > >>>> upon finding one:
> > >>>>>   a. validates the incoming token
> > >>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
> > >>>> contract
> > >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
> > >>> expected
> > >>>> token
> > >>>>>   d. serves requested resource for valid tokens
> > >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
> > >>>> recognition of the hadoop auth cookie
> > >>>>> REQUIRED COMPONENTS for UI USECASES:
> > >>>>> COMP-12. WebSSOAuthenticationHandler
> > >>>>> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM
> based
> > >>>> login
> > >>>>> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party
> > >>> token
> > >>>> federation
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
> > >> Brian.Swan@microsoft.com>
> > >>>> wrote:
> > >>>>> Thanks, Larry. That is what I was trying to say, but you've said it
> > >>>> better and in more detail. :-) To extract from what you are saying:
> > "If
> > >>> we
> > >>>> were to reframe the immediate scope to the lowest common denominator
> > of
> > >>>> what is needed for accepting tokens in authentication plugins then
> we
> > >>>> gain... an end-state for the lowest common denominator that enables
> > >> code
> > >>>> patches in the near-term is the best of both worlds."
> > >>>>>
> > >>>>> -Brian
> > >>>>>
> > >>>>> -----Original Message-----
> > >>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
> > >>>>> Sent: Wednesday, July 10, 2013 10:40 AM
> > >>>>> To: common-dev@hadoop.apache.org
> > >>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> > >>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> > >>>>>
> > >>>>> It seems to me that we can have the best of both worlds here...it's
> > >> all
> > >>>> about the scoping.
> > >>>>>
> > >>>>> If we were to reframe the immediate scope to the lowest common
> > >>>> denominator of what is needed for accepting tokens in authentication
> > >>>> plugins then we gain:
> > >>>>>
> > >>>>> 1. a very manageable scope to define and agree upon
> > >>>>> 2. a deliverable that should be useful in and of itself
> > >>>>> 3. a foundation for community collaboration that we build on for
> > >>>>>    higher level solutions built on this lowest common denominator
> > >>>>>    and experience as a working community
> > >>>>>
> > >>>>> So, to Alejandro's point, perhaps we need to define what would make
> > >> #2
> > >>>> above true - this could serve as the "what" we are building instead
> of
> > >>> the
> > >>>> "how" to build it.
> > >>>>> Including:
> > >>>>> a. project structure within hadoop-common-project/common-security
> > >>>>>    or the like
> > >>>>> b. the usecases that would need to be enabled to make it a
> > >>>>>    self-contained and useful contribution - without higher level
> > >>>>>    solutions
> > >>>>> c. the JIRA/s for contributing patches
> > >>>>> d. what specific patches will be needed to accomplish the usecases
> > >>>>>    in #b
> > >>>>>
> > >>>>> In other words, an end-state for the lowest common denominator that
> > >>>> enables code patches in the near-term is the best of both worlds.
> > >>>>>
> > >>>>> I think this may be a good way to bootstrap the collaboration
> process
> > >>>> for our emerging security community rather than trying to tackle a
> > huge
> > >>>> vision all at once.
> > >>>>>
> > >>>>> @Alejandro - if you have something else in mind that would
> bootstrap
> > >>>> this process - that would great - please advise.
> > >>>>>
> > >>>>> thoughts?
> > >>>>>
> > >>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
> > >>>> wrote:
> > >>>>>
> > >>>>>> Hi Alejandro, all-
> > >>>>>>
> > >>>>>> There seems to be agreement on the broad stroke description of the
> > >>>> components needed to achieve pluggable token authentication (I'm
> sure
> > >>> I'll
> > >>>> be corrected if that isn't the case). However, discussion of the
> > >> details
> > >>> of
> > >>>> those components doesn't seem to be moving forward. I think this is
> > >>> because
> > >>>> the details are really best understood through code. I also see *a*
> > >> (i.e.
> > >>>> one of many possible) token format and pluggable authentication
> > >>> mechanisms
> > >>>> within the RPC layer as components that can have immediate benefit
> to
> > >>>> Hadoop users AND still allow flexibility in the larger design. So, I
> > >>> think
> > >>>> the best way to move the conversation of "what we are aiming for"
> > >> forward
> > >>>> is to start looking at code for these components. I am especially
> > >>>> interested in moving forward with pluggable authentication
> mechanisms
> > >>>> within the RPC layer and would love to see what others have done in
> > >> this
> > >>>> area (if anything).
> > >>>>>>
> > >>>>>> Thanks.
> > >>>>>>
> > >>>>>> -Brian
> > >>>>>>
> > >>>>>> -----Original Message-----
> > >>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> > >>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
> > >>>>>> To: Larry McCay
> > >>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> > >>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> > >>>>>>
> > >>>>>> Larry, all,
> > >>>>>>
> > >>>>>> It is still not clear to me what end state we are aiming for, or
> > >>>>>> whether we even agree on that.
> > >>>>>>
> > >>>>>> IMO, instead of trying to agree on what to do, we should first
> > >>>>>> agree on the final state, then see what should be changed to get
> > >>>>>> there, then see how we change things to get there.
> > >>>>>>
> > >>>>>> The different documents out there focus more on how.
> > >>>>>>
> > >>>>>> We should not try to say how before we know what.
> > >>>>>>
> > >>>>>> Thx.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
> > >> lmccay@hortonworks.com
> > >>>>
> > >>>> wrote:
> > >>>>>>
> > >>>>>>> All -
> > >>>>>>>
> > >>>>>>> After combing through this thread - as well as the summit session
> > >>>>>>> summary thread, I think that we have the following two items that
> > >> we
> > >>>>>>> can probably move forward with:
> > >>>>>>>
> > >>>>>>> 1. TokenAuth method - assuming this means the pluggable
> > >>>>>>>    authentication mechanisms within the RPC layer (2 votes: Kai
> > >>>>>>>    and Kyle)
> > >>>>>>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
> > >>>>>>>
> > >>>>>>> I propose that we attack both of these aspects as one. Let's
> > >> provide
> > >>>>>>> the structure and interfaces of the pluggable framework for use
> in
> > >>>>>>> the RPC layer through leveraging Daryn's pluggability work and
> POC
> > >>> it
> > >>>>>>> with a particular token format (not necessarily the only format
> > >> ever
> > >>>>>>> supported - we just need one to start). If there has already been
> > >>>>>>> work done in this area by anyone then please speak up and commit
> > >> to
> > >>>>>>> providing a patch - so that we don't duplicate effort.
> > >>>>>>>
> > >>>>>>> @Daryn - is there a particular Jira or set of Jiras that we can
> > >> look
> > >>>>>>> at to discern the pluggability mechanism details? Documentation
> of
> > >>> it
> > >>>>>>> would be great as well.
> > >>>>>>> @Kai - do you have existing code for the pluggable token
> > >>>>>>> authentication mechanism - if not, we can take a stab at
> > >>> representing
> > >>>>>>> it with interfaces and/or POC code.
> > >>>>>>> I can standup and say that we have a token format that we have
> > >> been
> > >>>>>>> working with already and can provide a patch that represents it
> > >> as a
> > >>>>>>> contribution to test out the pluggable tokenAuth.
> > >>>>>>>
> > >>>>>>> These patches will provide progress toward code being the central
> > >>>>>>> discussion vehicle. As a community, we can then incrementally
> > >> build
> > >>>>>>> on that foundation in order to collaboratively deliver the common
> > >>>> vision.
> > >>>>>>>
> > >>>>>>> In the absence of any other home for posting such patches, let's
> > >>>>>>> assume that they will be attached to HADOOP-9392 - or a dedicated
> > >>>>>>> subtask for this particular aspect/s - I will leave that detail
> to
> > >>>> Kai.
> > >>>>>>>
> > >>>>>>> @Alejandro, being the only voice on this thread that isn't
> > >>>>>>> represented in the votes above, please feel free to agree or
> > >>> disagree
> > >>>> with this direction.
> > >>>>>>>
> > >>>>>>> thanks,
> > >>>>>>>
> > >>>>>>> --larry
> > >>>>>>>
> > >>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
> > >>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Andy -
> > >>>>>>>>
> > >>>>>>>>> Happy Fourth of July to you and yours.
> > >>>>>>>>
> > >>>>>>>> Same to you and yours. :-)
> > >>>>>>>> We had some fun in the sun for a change - we've had nothing but
> > >>> rain
> > >>>>>>>> on
> > >>>>>>> the east coast lately.
> > >>>>>>>>
> > >>>>>>>>> My concern here is there may have been a misinterpretation or
> > >> lack
> > >>>>>>>>> of consensus on what is meant by "clean slate"
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Apparently so.
> > >>>>>>>> On the pre-summit call, I stated that I was interested in
> > >>>>>>>> reconciling
> > >>>>>>> the jiras so that we had one to work from.
> > >>>>>>>>
> > >>>>>>>> You recommended that we set them aside for the time being - with
> > >>> the
> > >>>>>>> understanding that work would continue on your side (and our's as
> > >>>>>>> well) - and approach the community discussion from a clean slate.
> > >>>>>>>> We seemed to do this at the summit session quite well.
> > >>>>>>>> It was my understanding that this community discussion would
> live
> > >>>>>>>> beyond
> > >>>>>>> the summit and continue on this list.
> > >>>>>>>>
> > >>>>>>>> While closing the summit session we agreed to follow up on
> > >>>>>>>> common-dev
> > >>>>>>> with first a summary then a discussion of the moving parts.
> > >>>>>>>>
> > >>>>>>>> I never expected the previous work to be abandoned and fully
> > >>>>>>>> expected it
> > >>>>>>> to inform the discussion that happened here.
> > >>>>>>>>
> > >>>>>>>> If you would like to reframe what clean slate was supposed to
> > >> mean
> > >>>>>>>> or
> > >>>>>>> describe what it means now - that would be welcome - before I
> > >> waste
> > >>>>>>> anymore time trying to facilitate a community discussion that is
> > >>>>>>> apparently not wanted.
> > >>>>>>>>
> > >>>>>>>>> Nowhere in this
> > >>>>>>>>> picture are self appointed "master JIRAs" and such, which have
> > >>> been
> > >>>>>>>>> disappointing to see crop up, we should be collaboratively
> > >> coding
> > >>>>>>>>> not planting flags.
> > >>>>>>>>
> > >>>>>>>> I don't know what you mean by self-appointed master JIRAs.
> > >>>>>>>> It has certainly not been anyone's intention to disappoint.
> > >>>>>>>> Any mention of a new JIRA was just to have a clear context to
> > >>> gather
> > >>>>>>>> the
> > >>>>>>> agreed upon points - previous and/or existing JIRAs would easily
> > >> be
> > >>>> linked.
> > >>>>>>>>
> > >>>>>>>> Planting flags... I need to go back and read my discussion point
> > >>>>>>>> about the
> > >>>>>>> JIRA and see how this is the impression that was made.
> > >>>>>>>> That is not how I define success. The only flags that count is
> > >>> code.
> > >>>>>>> What we are lacking is the roadmap on which to put the code.
> > >>>>>>>>
> > >>>>>>>>> I read Kai's latest document as something approaching today's
> > >>>>>>>>> consensus
> > >>>>>>> (or
> > >>>>>>>>> at least a common point of view?) rather than a historical
> > >>> document.
> > >>>>>>>>> Perhaps he and it can be given equal share of the
> consideration.
> > >>>>>>>>
> > >>>>>>>> I definitely read it as something that has evolved into
> something
> > >>>>>>> approaching what we have been talking about so far. There has not
> > >>>>>>> however been enough discussion anywhere near the level of detail
> > >> in
> > >>>>>>> that document and more details are needed for each component in
> > >> the
> > >>>> design.
> > >>>>>>>> Why the work in that document should not be fed into the
> > >> community
> > >>>>>>> discussion as anyone else's would be - I fail to understand.
> > >>>>>>>>
> > >>>>>>>> My suggestion continues to be that you should take that document
> > >>> and
> > >>>>>>> speak to the inventory of moving parts as we agreed.
> > >>>>>>>> As these are agreed upon, we will ensure that the appropriate
> > >>>>>>>> subtasks
> > >>>>>>> are filed against whatever JIRA is to host them - don't really
> > >> care
> > >>>>>>> much which it is.
> > >>>>>>>>
> > >>>>>>>> I don't really want to continue with two separate JIRAs - as I
> > >>>>>>>> stated
> > >>>>>>> long ago - but until we understand what the pieces are and how
> > >> they
> > >>>>>>> relate then they can't be consolidated.
> > >>>>>>>> Even if 9533 ended up being repurposed as the server instance of
> > >>> the
> > >>>>>>> work - it should be a subtask of a larger one - if that is to be
> > >>>>>>> 9392, so be it.
> > >>>>>>>> We still need to define all the pieces of the larger picture
> > >> before
> > >>>>>>>> that
> > >>>>>>> can be done.
> > >>>>>>>>
> > >>>>>>>> What I thought was the clean slate approach to the discussion
> > >>> seemed
> > >>>>>>>> a
> > >>>>>>> very reasonable way to make all this happen.
> > >>>>>>>> If you would like to restate what you intended by it or
> something
> > >>>>>>>> else
> > >>>>>>> equally as reasonable as a way to move forward that would be
> > >>> awesome.
> > >>>>>>>>
> > >>>>>>>> I will be happy to work toward the roadmap with everyone once it
> > >> is
> > >>>>>>> articulated, understood and actionable.
> > >>>>>>>> In the meantime, I have work to do.
> > >>>>>>>>
> > >>>>>>>> thanks,
> > >>>>>>>>
> > >>>>>>>> --larry
> > >>>>>>>>
> > >>>>>>>> BTW - I meant to quote you in an earlier response and ended up
> > >>>>>>>> saying it
> > >>>>>>> was Aaron instead. Not sure what happened there. :-)
> > >>>>>>>>
> > >>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <apurtell@apache.org
> >
> > >>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi Larry (and all),
> > >>>>>>>>>
> > >>>>>>>>> Happy Fourth of July to you and yours.
> > >>>>>>>>>
> > >>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so
> I'd
> > >>>>>>>>> defer
> > >>>>>>> to
> > >>>>>>>>> them on the detailed points.
> > >>>>>>>>>
> > >>>>>>>>> My concern here is there may have been a misinterpretation or
> > >> lack
> > >>>>>>>>> of consensus on what is meant by "clean slate". Hopefully that
> > >> can
> > >>>>>>>>> be
> > >>>>>>> quickly
> > >>>>>>>>> cleared up. Certainly we did not mean ignore all that came
> > >> before.
> > >>>>>>>>> The
> > >>>>>>> idea
> > >>>>>>>>> was to reset discussions to find common ground and new
> direction
> > >>>>>>>>> where
> > >>>>>>> we
> > >>>>>>>>> are working together, not in conflict, on an agreed upon set of
> > >>>>>>>>> design points and tasks. There's been a lot of good discussion
> > >> and
> > >>>>>>>>> design preceding that we should figure out how to port over.
> > >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and
> > >>> such,
> > >>>>>>>>> which have been disappointing to see crop up, we should be
> > >>>>>>>>> collaboratively coding not planting flags.
> > >>>>>>>>>
> > >>>>>>>>> I read Kai's latest document as something approaching today's
> > >>>>>>>>> consensus
> > >>>>>>> (or
> > >>>>>>>>> at least a common point of view?) rather than a historical
> > >>> document.
> > >>>>>>>>> Perhaps he and it can be given equal share of the
> consideration.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hey Andrew -
> > >>>>>>>>>>
> > >>>>>>>>>> I largely agree with that statement.
> > >>>>>>>>>> My intention was to let the differences be worked out within
> > >> the
> > >>>>>>>>>> individual components once they were identified and subtasks
> > >>>> created.
> > >>>>>>>>>>
> > >>>>>>>>>> My reference to HSSO was really referring to a SSO *server*
> > >> based
> > >>>>>>> design
> > >>>>>>>>>> which was not clearly articulated in the earlier documents.
> > >>>>>>>>>> We aren't trying to compare and contrast one design over
> > >> another
> > >>>>>>> anymore.
> > >>>>>>>>>>
> > >>>>>>>>>> Let's move this collaboration along as we've mapped out and
> the
> > >>>>>>>>>> differences in the details will reveal themselves and be
> > >>> addressed
> > >>>>>>> within
> > >>>>>>>>>> their components.
> > >>>>>>>>>>
> > >>>>>>>>>> I've actually been looking forward to you weighing in on the
> > >>>>>>>>>> actual discussion points in this thread.
> > >>>>>>>>>> Could you do that?
> > >>>>>>>>>>
> > >>>>>>>>>> At this point, I am most interested in your thoughts on a
> > >> single
> > >>>>>>>>>> jira
> > >>>>>>> to
> > >>>>>>>>>> represent all of this work and whether we should start
> > >> discussing
> > >>>>>>>>>> the
> > >>>>>>> SSO
> > >>>>>>>>>> Tokens.
> > >>>>>>>>>> If you think there are discussion points missing from that
> > >> list,
> > >>>>>>>>>> feel
> > >>>>>>> free
> > >>>>>>>>>> to add to it.
> > >>>>>>>>>>
> > >>>>>>>>>> thanks,
> > >>>>>>>>>>
> > >>>>>>>>>> --larry
> > >>>>>>>>>>
> > >>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
> > >> apurtell@apache.org>
> > >>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Larry,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me
> > >> point
> > >>>>>>>>>>> out
> > >>>>>>> that,
> > >>>>>>>>>>> while the differences between the competing JIRAs have been
> > >>>>>>>>>>> reduced
> > >>>>>>> for
> > >>>>>>>>>>> sure, there were some key differences that didn't just
> > >>> disappear.
> > >>>>>>>>>>> Subsequent discussion will make that clear. I also disagree
> > >> with
> > >>>>>>>>>>> your characterization that we have simply endorsed all of the
> > >>>>>>>>>>> design
> > >>>>>>> decisions
> > >>>>>>>>>>> of the so-called HSSO, this is taking a mile from an inch. We
> > >>> are
> > >>>>>>> here to
> > >>>>>>>>>>> engage in a collaborative process as peers. I've been
> > >> encouraged
> > >>>>>>>>>>> by
> > >>>>>>> the
> > >>>>>>>>>>> spirit of the discussions up to this point and hope that can
> > >>>>>>>>>>> continue beyond one design summit.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> > >>>>>>>>>>> <lm...@hortonworks.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi Kai -
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I think that I need to clarify something...
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This is not an update for 9533 but a continuation of the
> > >>>>>>>>>>>> discussions
> > >>>>>>>>>> that
> > >>>>>>>>>>>> are focused on a fresh look at a SSO for Hadoop.
> > >>>>>>>>>>>> We've agreed to leave our previous designs behind and
> > >> therefore
> > >>>>>>>>>>>> we
> > >>>>>>>>>> aren't
> > >>>>>>>>>>>> really seeing it as an HSSO layered on top of TAS approach
> or
> > >>> an
> > >>>>>>> HSSO vs
> > >>>>>>>>>>>> TAS discussion.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Your latest design revision actually makes it clear that you
> > >>> are
> > >>>>>>>>>>>> now targeting exactly what was described as HSSO - so
> > >> comparing
> > >>>>>>>>>>>> and
> > >>>>>>>>>> contrasting
> > >>>>>>>>>>>> is not going to add any value.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> What we need you to do at this point, is to look at those
> > >>>>>>>>>>>> high-level components described on this thread and comment
> on
> > >>>>>>>>>>>> whether we need additional components or any that are listed
> > >>>>>>>>>>>> that don't seem
> > >>>>>>> necessary
> > >>>>>>>>>> to
> > >>>>>>>>>>>> you and why.
> > >>>>>>>>>>>> In other words, we need to define and agree on the work that
> > >>> has
> > >>>>>>>>>>>> to
> > >>>>>>> be
> > >>>>>>>>>>>> done.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> We also need to determine those components that need to be
> > >> done
> > >>>>>>> before
> > >>>>>>>>>>>> anything else can be started.
> > >>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
> > >>>>>>>>>>>> central to
> > >>>>>>>>>> all
> > >>>>>>>>>>>> the other components and should probably be defined and
> POC'd
> > >>> in
> > >>>>>>> short
> > >>>>>>>>>>>> order.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Personally, I think that continuing the separation of 9533
> > >> and
> > >>>>>>>>>>>> 9392
> > >>>>>>> will
> > >>>>>>>>>>>> do this effort a disservice. There doesn't seem to be enough
> > >>>>>>> differences
> > >>>>>>>>>>>> between the two to justify separate jiras anymore. It may be
> > >>>>>>>>>>>> best to
> > >>>>>>>>>> file a
> > >>>>>>>>>>>> new one that reflects a single vision without the extra
> cruft
> > >>>>>>>>>>>> that
> > >>>>>>> has
> > >>>>>>>>>>>> built up in either of the existing ones. We would certainly
> > >>>>>>>>>>>> reference
> > >>>>>>>>>> the
> > >>>>>>>>>>>> existing ones within the new one. This approach would align
> > >>> with
> > >>>>>>>>>>>> the
> > >>>>>>>>>> spirit
> > >>>>>>>>>>>> of the discussions up to this point.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I am prepared to start a discussion around the shape of the
> > >> two
> > >>>>>>> Hadoop
> > >>>>>>>>>> SSO
> > >>>>>>>>>>>> tokens: identity and access. If this is what others feel the
> > >>>>>>>>>>>> next
> > >>>>>>> topic
> > >>>>>>>>>>>> should be.
> > >>>>>>>>>>>> If we can identify a jira home for it, we can do it there -
> > >>>>>>> otherwise we
> > >>>>>>>>>>>> can create another DISCUSS thread for it.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> thanks,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> --larry
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
> > >> kai.zheng@intel.com>
> > >>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi Larry,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks for the update. Good to see that with this update we
> > >>> are
> > >>>>>>>>>>>>> now
> > >>>>>>>>>>>> aligned on most points.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392.
> The
> > >>>>>>>>>>>>> new
> > >>>>>>>>>>>> revision incorporates feedback and suggestions in related
> > >>>>>>>>>>>> discussion
> > >>>>>>>>>> with
> > >>>>>>>>>>>> the community, particularly from Microsoft and others
> > >> attending
> > >>>>>>>>>>>> the Security design lounge session at the Hadoop summit.
> > >>> Summary
> > >>>>>>>>>>>> of the
> > >>>>>>>>>> changes:
> > >>>>>>>>>>>>> 1.    Revised the approach to now use two tokens, Identity
> > >>> Token
> > >>>>>>> plus
> > >>>>>>>>>>>> Access Token, particularly considering our authorization
> > >>>>>>>>>>>> framework
> > >>>>>>> and
> > >>>>>>>>>>>> compatibility with HSSO;
> > >>>>>>>>>>>>> 2.    Introduced Authorization Server (AS) from our
> > >>>> authorization
> > >>>>>>>>>>>> framework into the flow that issues access tokens for
> clients
> > >>>>>>>>>>>> with
> > >>>>>>>>>> identity
> > >>>>>>>>>>>> tokens to access services;
> > >>>>>>>>>>>>> 3.    Refined proxy access token and the
> proxy/impersonation
> > >>>> flow;
> > >>>>>>>>>>>>> 4.    Refined the browser web SSO flow regarding access to
> > >>>> Hadoop
> > >>>>>>> web
> > >>>>>>>>>>>> services;
> > >>>>>>>>>>>>> 5.    Added Hadoop RPC access flow regard
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> --
> > >>>>>>>>> Best regards,
> > >>>>>>>>>
> > >>>>>>>>> - Andy
> > >>>>>>>>>
> > >>>>>>>>> Problems worthy of attack prove their worth by hitting back. -
> > >>> Piet
> > >>>>>>>>> Hein (via Tom White)
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> Alejandro
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> Alejandro
> > >>>
> > >>
> > >
> > >
> > >
> > > --
> > > Alejandro
> >
> >
>


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Chris Nauroth <cn...@hortonworks.com>.
Near the bottom of the bylaws, it states that addition of a "New Branch
Committer" requires "Lazy consensus of active PMC members."  I think this
means that you'll need to get a PMC member to sponsor the vote for you.
 Regular committer votes happen on the private PMC mailing list, and I
assume it would be the same for a branch committer vote.

http://hadoop.apache.org/bylaws.html

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Tue, Aug 6, 2013 at 2:48 PM, Larry McCay <lm...@hortonworks.com> wrote:

> That sounds perfect!
> I have been thinking of late that we would maybe need an incubator project
> or something for this - which would be unfortunate.
>
> This would allow us to move much more quickly with a set of patches broken
> up into consumable/understandable chunks that are made functional more
> easily within the branch.
> I assume that we need to start a separate thread for DISCUSS or VOTE to
> start that process - correct?
>
> On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:
>
> > yep, that is what I meant. Thanks Chris
> >
> >
> > On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <cnauroth@hortonworks.com
> >wrote:
> >
> >> Perhaps this is also a good opportunity to try out the new "branch
> >> committers" clause in the bylaws, enabling non-committers who are
> working
> >> on this to commit to the feature branch.
> >>
> >>
> >>
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
> >>
> >> Chris Nauroth
> >> Hortonworks
> >> http://hortonworks.com/
> >>
> >>
> >>
> >> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur <tucu@cloudera.com
> >>> wrote:
> >>
> >>> Larry,
> >>>
> >>> Sorry for the delay answering. Thanks for laying down things, yes, it
> >> makes
> >>> sense.
> >>>
> >>> Given the large scope of the changes, number of JIRAs and number of
> >>> developers involved, wouldn't make sense to create a feature branch for
> >> all
> >>> this work not to destabilize (more ;) trunk?
> >>>
> >>> Thanks again.
> >>>
> >>>
> >>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <lm...@hortonworks.com>
> >>> wrote:
> >>>
> >>>> The following JIRA was filed to provide a token and basic authority
> >>>> implementation for this effort:
> >>>> https://issues.apache.org/jira/browse/HADOOP-9781
> >>>>
> >>>> I have attached an initial patch though have yet to submit it as one
> >>> since
> >>>> it is dependent on the patch for CMF that was posted to:
> >>>> https://issues.apache.org/jira/browse/HADOOP-9534
> >>>> and this patch still has a couple outstanding issues - javac warnings
> >> for
> >>>> com.sun classes for certification generation and 11 javadoc warnings.
> >>>>
> >>>> Please feel free to review the patches and raise any questions or
> >>> concerns
> >>>> related to them.
> >>>>
> >>>> On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com>
> >> wrote:
> >>>>
> >>>>> Hello All -
> >>>>>
> >>>>> In an effort to scope an initial iteration that provides value to the
> >>>> community while focusing on the pluggable authentication aspects, I've
> >>>> written a description for "Iteration 1". It identifies the goal of the
> >>>> iteration, the endstate and a set of initial usecases. It also
> >> enumerates
> >>>> the components that are required for each usecase. There is a scope
> >>> section
> >>>> that details specific things that should be kept out of the first
> >>>> iteration. This is certainly up for discussion. There may be some of
> >>> these
> >>>> things that can be contributed in short order. If we can add some
> >> things
> >>> in
> >>>> without unnecessary complexity for the identified usecases then we
> >>> should.
> >>>>>
> >>>>> @Alejandro - please review this and see whether it satisfies your
> >> point
> >>>> for a definition of what we are building.
> >>>>>
> >>>>> In addition to the document that I will paste here as text and
> >> attach a
> >>>> pdf version, we have a couple patches for components that are
> >> identified
> >>> in
> >>>> the document.
> >>>>> Specifically, COMP-7 and COMP-8.
> >>>>>
> >>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was
> >> filed
> >>>> specifically for that functionality.
> >>>>> COMP-7 is a small set of classes to introduce JsonWebToken as the
> >> token
> >>>> format and a basic JsonWebTokenAuthority that can issue and verify
> >> these
> >>>> tokens.
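[Editor's note: the issue/verify responsibilities of the JsonWebTokenAuthority described above can be sketched with nothing more than base64url and HMAC-SHA256. This is a hedged illustration, not the patch's actual classes (those are Java); the claim names and "HS256" choice are assumptions.]

```python
# Minimal JWT-style issue/verify sketch (HMAC-SHA256, base64url), meant only
# to illustrate what a token authority does: sign claims on issue, then on
# verify recompute and compare the signature and check expiry.
import base64, hashlib, hmac, json, time

def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue(claims: dict, secret: bytes, ttl: int = 3600) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = dict(claims, exp=int(time.time()) + ttl)
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def verify(token: str, secret: bytes):
    header, payload, sig = token.split(".")
    signing_input = f"{header}.{payload}".encode()
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # signature mismatch: tampered token or wrong key
    pad = payload + "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(pad))
    if claims.get("exp", 0) < time.time():
        return None  # expired
    return claims
```

A real authority would sign with an asymmetric key so services can verify without sharing the issuing secret, which is one of the design points still open in the thread.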
> >>>>>
> >>>>> Since there is no JIRA for this yet, I will likely file a new JIRA
> >> for
> >>> a
> >>>> SSO token implementation.
> >>>>>
> >>>>> Both of these patches assume to be modules within
> >>>> hadoop-common/hadoop-common-project.
> >>>>> While they are relatively small, I think that they will be pulled in
> >> by
> >>>> other modules such as hadoop-auth which would likely not want a
> >>> dependency
> >>>> on something larger like
> >>> hadoop-common/hadoop-common-project/hadoop-common.
> >>>>>
> >>>>> This is certainly something that we should discuss within the
> >> community
> >>>> for this effort though - that being, exactly how to add these
> libraries
> >>> so
> >>>> that they are most easily consumed by existing projects.
> >>>>>
> >>>>> Anyway, the following is the Iteration-1 document - it is also
> >> attached
> >>>> as a pdf:
> >>>>>
> >>>>> Iteration 1: Pluggable User Authentication and Federation
> >>>>>
> >>>>> Introduction
> >>>>> The intent of this effort is to bootstrap the development of
> >> pluggable
> >>>> token-based authentication mechanisms to support certain goals of
> >>>> enterprise authentication integrations. By restricting the scope of
> >> this
> >>>> effort, we hope to provide immediate benefit to the community while
> >>> keeping
> >>>> the initial contribution to a manageable size that can be easily
> >>> reviewed,
> >>>> understood and extended with further development through follow up
> >> JIRAs
> >>>> and related iterations.
> >>>>>
> >>>>> Iteration Endstate
> >>>>> Once complete, this effort will have extended the authentication
> >>>> mechanisms - for all client types - from the existing: Simple,
> Kerberos
> >>> and
> >>>> Plain (for RPC) to include LDAP authentication and SAML based
> >> federation.
> >>>> In addition, the ability to provide additional/custom authentication
> >>>> mechanisms will be enabled for users to plug in their preferred
> >>> mechanisms.
> >>>>>
> >>>>> Project Scope
> >>>>> The scope of this effort is a subset of the features covered by the
> >>>> overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on
> >>>> enabling Hadoop to issue, accept/validate SSO tokens of its own. The
> >>>> pluggable authentication mechanism within SASL/RPC layer and the
> >>>> authentication filter pluggability for REST and UI components will be
> >>>> leveraged and extended to support the results of this effort.
> >>>>>
> >>>>> Out of Scope
> >>>>> In order to scope the initial deliverable as the minimally viable
> >>>> product, a handful of things have been simplified or left out of scope
> >>> for
> >>>> this effort. This is not meant to say that these aspects are not
> useful
> >>> or
> >>>> not needed but that they are not necessary for this iteration. We do
> >>>> however need to ensure that we don’t do anything to preclude adding
> >> them
> >>> in
> >>>> future iterations.
> >>>>> 1. Additional Attributes - the result of authentication will continue
> >>> to
> >>>> use the existing hadoop tokens and identity representations.
> Additional
> >>>> attributes used for finer grained authorization decisions will be
> added
> >>>> through follow-up efforts.
> >>>>> 2. Token revocation - the ability to revoke issued identity tokens
> >> will
> >>>> be added later
> >>>>> 3. Multi-factor authentication - this will likely require additional
> >>>> attributes and is not necessary for this iteration.
> >>>>> 4. Authorization changes - we will require additional attributes for
> >>> the
> >>>> fine-grained access control plans. This is not needed for this
> >> iteration.
> >>>>> 5. Domains - we assume a single flat domain for all users
> >>>>> 6. Kinit alternative - we can leverage existing REST clients such as
> >>>> cURL to retrieve tokens through authentication and federation for the
> >>> time
> >>>> being
> >>>>> 7. A specific authentication framework isn’t really necessary within
> >>> the
> >>>> REST endpoints for this iteration. If one is available then we can use
> >> it
> >>>> otherwise we can leverage existing things like Apache Shiro within a
> >>>> servlet filter.
> >>>>>
> >>>>> In Scope
> >>>>> What is in scope for this effort is defined by the usecases described
> >>>> below. Components required for supporting the usecases are summarized
> >> for
> >>>> each client type. Each component is a candidate for a JIRA subtask -
> >>> though
> >>>> multiple components are likely to be included in a JIRA to represent a
> >>> set
> >>>> of functionality rather than individual JIRAs per component.
> >>>>>
> >>>>> Terminology and Naming
> >>>>> The terms and names of components within this document are merely
> >>>> descriptive of the functionality that they represent. Any similarity
> or
> >>>> difference in names or terms from those that are found in other
> >> documents
> >>>> are not intended to make any statement about those other documents or
> >> the
> >>>> descriptions within. This document represents the pluggable
> >>> authentication
> >>>> mechanisms and server functionality required to replace Kerberos.
> >>>>>
> >>>>> Ultimately, the naming of the implementation classes will be a
> >> product
> >>>> of the patches accepted by the community.
> >>>>>
> >>>>> Usecases:
> >>>>> client types: REST, CLI, UI
> >>>>> authentication types: Simple, Kerberos, authentication/LDAP,
> >>>> federation/SAML
> >>>>>
> >>>>> Simple and Kerberos
> >>>>> Simple and Kerberos usecases continue to work as they do today.
> >>>> Authentication/LDAP and Federation/SAML are added through
> >> the
> >>>> existing pluggability points either as they are or with required
> >>> extension.
> >>>> Either way, continued support for Simple and Kerberos must not require
> >>>> changes to existing deployments in the field as a result of this
> >> effort.
> >>>>>
> >>>>> REST
> >>>>> USECASE REST-1 Authentication/LDAP:
> >>>>> For REST clients, we will provide the ability to:
> >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed
> >> by
> >>>> an AuthenticationServer instance via REST calls to:
> >>>>>   a. authenticate - passing username/password returning a hadoop
> >>>> id_token
> >>>>>   b. get-access-token - from the TokenGrantingService by passing the
> >>>> hadoop id_token as an Authorization: Bearer token along with the
> >> desired
> >>>> service name (master service name) returning a hadoop access token
> >>>>> 2. Successfully invoke a hadoop service REST API passing the hadoop
> >>>> access token through an HTTP header as an Authorization Bearer token
> >>>>>   a. validation of the incoming token on the service endpoint is
> >>>> accomplished by an SSOAuthenticationHandler
> >>>>> 3. Successfully block access to a REST resource when presenting a
> >>> hadoop
> >>>> access token intended for a different service
> >>>>>   a. validation of the incoming token on the service endpoint is
> >>>> accomplished by an SSOAuthenticationHandler
> >>>>>
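The REST-1 steps above reduce to three HTTP requests. A minimal sketch of how a client would construct them — the endpoint paths (/authenticate, /get-access-token), the service query parameter, and the helper names are illustrative assumptions, since the actual API is not yet defined:

```python
import base64

def authenticate_request(sso_base, username, password):
    # Step 1a: BASIC auth to the IdP endpoint, which returns a hadoop id_token.
    creds = base64.b64encode(("%s:%s" % (username, password)).encode()).decode()
    return {"method": "POST",
            "url": sso_base + "/authenticate",
            "headers": {"Authorization": "Basic " + creds}}

def access_token_request(sso_base, id_token, service_name):
    # Step 1b: present the id_token as a Bearer token to the
    # TokenGrantingService along with the desired (master) service name.
    return {"method": "POST",
            "url": sso_base + "/get-access-token?service=" + service_name,
            "headers": {"Authorization": "Bearer " + id_token}}

def service_request(service_url, access_token):
    # Step 2: invoke a hadoop service REST API with the access token.
    return {"method": "GET",
            "url": service_url,
            "headers": {"Authorization": "Bearer " + access_token}}

req = access_token_request("https://sso.example.com", "<id_token>", "webhdfs")
print(req["headers"]["Authorization"])  # Bearer <id_token>
```

The same shape applies whether the client is cURL or a programmatic caller; only the Authorization header changes between the two token types.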
> >>>>> USECASE REST-2 Federation/SAML:
> >>>>> We will also provide federation capabilities for REST clients such
> >>> that:
> >>>>> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?) and
> >>>> persist it in a permissions-protected file - i.e.
> >> ~/.hadoop_tokens/.idp_token
> >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
> >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> instance
> >>> via
> >>>> REST calls to:
> >>>>>   a. federate - passing a SAML assertion as an Authorization: Bearer
> >>>> token returning a hadoop id_token
> >>>>>      - can copy and paste from the command line or use cat to include the
> >>>> persisted token through --header "Authorization: Bearer $(cat
> >>>> ~/.hadoop_tokens/.id_token)"
> >>>>>   b. get-access-token - from the TokenGrantingService by passing the
> >>>> hadoop id_token as an Authorization: Bearer token along with the
> >> desired
> >>>> service name (master service name) to the TokenGrantingService
> >> returning
> >>> a
> >>>> hadoop access token
> >>>>> 3. Successfully invoke a hadoop service REST API passing the hadoop
> >>>> access token through an HTTP header as an Authorization Bearer token
> >>>>>   a. validation of the incoming token on the service endpoint is
> >>>> accomplished by an SSOAuthenticationHandler
> >>>>> 4. Successfully block access to a REST resource when presenting a
> >>> hadoop
> >>>> access token intended for a different service
> >>>>>   a. validation of the incoming token on the service endpoint is
> >>>> accomplished by an SSOAuthenticationHandler
> >>>>>
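Step 1's "permissions protected file" and step 2's Bearer presentation can be sketched as follows; the helper names and the group/other permission check are illustrative assumptions on top of the flow above:

```python
import os
import stat
import tempfile

def persist_token(path, token):
    # Create the file with mode 0600 so only the owner can read the assertion.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)

def federate_request(sso_base, idp_token_path):
    # Refuse to use a token file that group/other can read.
    if stat.S_IMODE(os.stat(idp_token_path).st_mode) & 0o077:
        raise RuntimeError("token file is readable by group/other")
    with open(idp_token_path) as f:
        assertion = f.read().strip()
    # Step 2a: present the SAML assertion as an Authorization: Bearer token.
    return {"method": "POST",
            "url": sso_base + "/federate",
            "headers": {"Authorization": "Bearer " + assertion}}

path = os.path.join(tempfile.mkdtemp(), ".idp_token")
persist_token(path, "saml-assertion")
print(federate_request("https://sso.example.com", path)["headers"]["Authorization"])
# Bearer saml-assertion
```

From there, steps 2b onward are identical to the REST-1 flow: exchange the returned hadoop id_token for an access token and present it to the service.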
> >>>>> REQUIRED COMPONENTS for REST USECASES:
> >>>>> COMP-1. REST client - cURL or similar
> >>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint
> >>>> example - returning hadoop id_token
> >>>>> COMP-3. REST endpoint for federation with SAML Bearer token -
> >>> shibboleth
> >>>> SP?|OpenSAML? - returning hadoop id_token
> >>>>> COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop access
> >>>> tokens from hadoop id_tokens
> >>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop access
> >>>> tokens
> >>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
> >>>>> COMP-7. hadoop token and authority implementations
> >>>>> COMP-8. core services for crypto support for signing, verifying and
> >> PKI
> >>>> management
> >>>>>
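COMP-5, COMP-7 and COMP-8 fit together roughly as follows: a token authority signs an access token that names its target service, and the SSOAuthenticationHandler on each endpoint verifies the signature, the service name, and expiry. A toy sketch using an HMAC shared secret in place of the real signing/PKI support of COMP-8 — the token layout and function names are illustrative assumptions, not a proposed format:

```python
import base64
import hashlib
import hmac
import json
import time

def issue_access_token(secret, user, service, ttl=3600):
    # Token authority (COMP-7): sign claims naming the target service.
    body = json.dumps({"user": user, "service": service,
                       "expires": time.time() + ttl}).encode()
    sig = hmac.new(secret, body, hashlib.sha256).digest()
    return base64.b64encode(body).decode() + "." + base64.b64encode(sig).decode()

def validate_access_token(secret, token, expected_service):
    # SSOAuthenticationHandler (COMP-5): return claims, or None if invalid.
    try:
        body_b64, sig_b64 = token.split(".")
        body = base64.b64decode(body_b64)
        sig = base64.b64decode(sig_b64)
    except (ValueError, TypeError):
        return None
    if not hmac.compare_digest(sig, hmac.new(secret, body, hashlib.sha256).digest()):
        return None  # signature check - stands in for COMP-8 crypto support
    claims = json.loads(body)
    if claims["service"] != expected_service:
        return None  # blocks a token intended for a different service
    if claims["expires"] <= time.time():
        return None  # expired
    return claims

secret = b"shared-secret"
token = issue_access_token(secret, "alice", "webhdfs")
print(validate_access_token(secret, token, "webhdfs")["user"])  # alice
print(validate_access_token(secret, token, "oozie"))            # None
```

The second print illustrates the REST-1/REST-2 requirement that an access token scoped to one service is rejected by another.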
> >>>>> CLI
> >>>>> USECASE CLI-1 Authentication/LDAP:
> >>>>> For CLI/RPC clients, we will provide the ability to:
> >>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed
> >> by
> >>>> an AuthenticationServer instance via REST calls to:
> >>>>>   a. authenticate - passing username/password returning a hadoop
> >>>> id_token
> >>>>>      - for RPC clients we need to persist the returned hadoop
> >> identity
> >>>> token in a file protected by fs permissions so that it may be
> leveraged
> >>>> until expiry
> >>>>>      - directing the returned response to a file may suffice for now
> >>>> something like ">~/.hadoop_tokens/.id_token"
> >>>>> 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> >>>>>   a. RPC client negotiates a TokenAuth method through SASL layer,
> >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed
> >> as
> >>>> Authorization: Bearer token to the get-access-token REST endpoint
> >> exposed
> >>>> by TokenGrantingService returning a hadoop access token
> >>>>>   b. RPC server side validates the presented hadoop access token and
> >>>> continues to serve request
> >>>>>   c. Successfully invoke a hadoop service RPC API
> >>>>>
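The persisted id_token above is meant to "be leveraged until expiry". A sketch of that reuse decision, assuming the token is stored as JSON with an expires timestamp — the field name and the clock-skew margin are illustrative assumptions:

```python
import json
import os
import tempfile
import time

SKEW_SECONDS = 60  # assumed margin: re-authenticate a little before expiry

def cached_token(path, now=None):
    # Return the persisted id_token if still valid; None means the client
    # must re-authenticate (step 1) before invoking the RPC API (step 2).
    now = time.time() if now is None else now
    try:
        with open(path) as f:
            token = json.load(f)
    except (OSError, ValueError):
        return None
    if token.get("expires", 0) - SKEW_SECONDS <= now:
        return None
    return token

path = os.path.join(tempfile.mkdtemp(), ".id_token")
with open(path, "w") as f:
    json.dump({"token": "opaque-id-token", "expires": time.time() + 3600}, f)
print(cached_token(path) is not None)  # True: token is still usable
```

A missing, unreadable, or expired file simply triggers re-authentication, so the cache never blocks the flow.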
> >>>>> USECASE CLI-2 Federation/SAML:
> >>>>> For CLI/RPC clients, we will provide the ability to:
> >>>>> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?) and
> >>>> persist it in a permissions-protected file - i.e.
> >> ~/.hadoop_tokens/.idp_token
> >>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
> >>>> endpoint exposed by an AuthenticationServer(FederationServer?)
> instance
> >>> via
> >>>> REST calls to:
> >>>>>   a. federate - passing a SAML assertion as an Authorization: Bearer
> >>>> token returning a hadoop id_token
> >>>>>      - can copy and paste from the command line or use cat to include the
> >>>> previously persisted token through --header "Authorization: Bearer $(cat
> >>>> ~/.hadoop_tokens/.id_token)"
> >>>>> 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> >>>>>   a. RPC client negotiates a TokenAuth method through SASL layer,
> >>>> hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed
> >> as
> >>>> Authorization: Bearer token to the get-access-token REST endpoint
> >> exposed
> >>>> by TokenGrantingService returning a hadoop access token
> >>>>>   b. RPC server side validates the presented hadoop access token and
> >>>> continues to serve request
> >>>>>   c. Successfully invoke a hadoop service RPC API
> >>>>>
> >>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required for
> >>> REST):
> >>>>> COMP-9. TokenAuth Method negotiation, etc
> >>>>> COMP-10. Client side implementation to leverage REST endpoint for
> >>>> acquiring hadoop access tokens given a hadoop id_token
> >>>>> COMP-11. Server side implementation to validate incoming hadoop
> >> access
> >>>> tokens
> >>>>>
> >>>>> UI
> >>>>> Various Hadoop services have their own web UI consoles for
> >>>> administration and end user interactions. These consoles need to also
> >>>> benefit from the pluggability of authentication mechanisms to be on
> par
> >>>> with the access control of the cluster REST and RPC APIs.
> >>>>> Web consoles are protected with a WebSSOAuthenticationHandler which
> >>>> will be configured for either authentication or federation.
> >>>>>
> >>>>> USECASE UI-1 Authentication/LDAP:
> >>>>> For the authentication usecase:
> >>>>> 1. User’s browser requests access to a UI console page
> >>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
> >> the
> >>>> browser to an IdP web endpoint exposed by the AuthenticationServer
> >>> passing
> >>>> the requested url as the redirect_url
> >>>>> 3. IdP web endpoint presents the user with a FORM over https
> >>>>>   a. user provides username/password and submits the FORM
> >>>>> 4. AuthenticationServer authenticates the user with provided
> >>> credentials
> >>>> against the configured LDAP server and:
> >>>>>   a. leverages a servlet filter or other authentication mechanism
> >> for
> >>>> the endpoint and authenticates the user with a simple LDAP bind with
> >>>> username and password
> >>>>>   b. acquires a hadoop id_token and uses it to acquire the required
> >>>> hadoop access token which is added as a cookie
> >>>>>   c. redirects the browser to the original service UI resource via
> >> the
> >>>> provided redirect_url
> >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >>> interrogates
> >>>> the incoming request again for an authcookie that contains an access
> >>> token
> >>>> upon finding one:
> >>>>>   a. validates the incoming token
> >>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
> >>>> contract
> >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
> >>> expected
> >>>> token
> >>>>>   d. serves requested resource for valid tokens
> >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
> >>>> recognition of the hadoop auth cookie
> >>>>>
> >>>>> USECASE UI-2 Federation/SAML:
> >>>>> For the federation usecase:
> >>>>> 1. User’s browser requests access to a UI console page
> >>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
> >> the
> >>>> browser to an SP web endpoint exposed by the AuthenticationServer
> >> passing
> >>>> the requested url as the redirect_url. This endpoint:
> >>>>>   a. is dedicated to redirecting to the external IdP passing the
> >>>> required parameters which may include a redirect_url back to itself as
> >>> well
> >>>> as encoding the original redirect_url so that it can determine it on
> >> the
> >>>> way back to the client
> >>>>> 3. the IdP:
> >>>>>   a. challenges the user for credentials and authenticates the user
> >>>>>   b. creates appropriate token/cookie and redirects back to the
> >>>> AuthenticationServer endpoint
> >>>>> 4. AuthenticationServer endpoint:
> >>>>>   a. extracts the expected token/cookie from the incoming request
> >> and
> >>>> validates it
> >>>>>   b. creates a hadoop id_token
> >>>>>   c. acquires a hadoop access token for the id_token
> >>>>>   d. creates appropriate cookie and redirects back to the original
> >>>> redirect_url - being the requested resource
> >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >>> interrogates
> >>>> the incoming request again for an authcookie that contains an access
> >>> token
> >>>> upon finding one:
> >>>>>   a. validates the incoming token
> >>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
> >>>> contract
> >>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
> >>> expected
> >>>> token
> >>>>>   d. serves requested resource for valid tokens
> >>>>>   e. subsequent requests are handled by the AuthenticationFilter's
> >>>> recognition of the hadoop auth cookie
> >>>>> REQUIRED COMPONENTS for UI USECASES:
> >>>>> COMP-12. WebSSOAuthenticationHandler
> >>>>> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based
> >>>> login
> >>>>> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party
> >>> token
> >>>> federation
> >>>>>
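Step 2a of the UI-2 flow — encoding the original redirect_url so it survives the round trip through the external IdP — can be sketched as follows; the parameter name redirect_url matches the text above, while the URLs and helper names are illustrative assumptions:

```python
import urllib.parse as up

def idp_redirect(idp_url, sp_callback, original_url):
    # Fold the original URL into the SP's own callback so the SP can recover
    # it when the external IdP redirects back (UI-2, step 2a).
    callback = sp_callback + "?" + up.urlencode({"redirect_url": original_url})
    return idp_url + "?" + up.urlencode({"redirect_url": callback})

def recover_original(url):
    # Pull the redirect_url parameter back out of a URL's query string.
    return up.parse_qs(up.urlparse(url).query)["redirect_url"][0]

loc = idp_redirect("https://idp.example.com/sso",
                   "https://authserver.example.com/sp",
                   "https://namenode:50070/dfshealth.html")
sp_callback = recover_original(loc)   # what the IdP redirects back to
print(recover_original(sp_callback))  # https://namenode:50070/dfshealth.html
```

Percent-encoding the nested URL is what keeps the outer and inner query strings from colliding; the SP unwraps one layer, the AuthenticationServer endpoint unwraps the other.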
> >>>>>
> >>>>>
> >>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
> >> Brian.Swan@microsoft.com>
> >>>> wrote:
> >>>>> Thanks, Larry. That is what I was trying to say, but you've said it
> >>>> better and in more detail. :-) To extract from what you are saying:
> "If
> >>> we
> >>>> were to reframe the immediate scope to the lowest common denominator
> of
> >>>> what is needed for accepting tokens in authentication plugins then we
> >>>> gain... an end-state for the lowest common denominator that enables
> >> code
> >>>> patches in the near-term is the best of both worlds."
> >>>>>
> >>>>> -Brian
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
> >>>>> Sent: Wednesday, July 10, 2013 10:40 AM
> >>>>> To: common-dev@hadoop.apache.org
> >>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> >>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >>>>>
> >>>>> It seems to me that we can have the best of both worlds here...it's
> >> all
> >>>> about the scoping.
> >>>>>
> >>>>> If we were to reframe the immediate scope to the lowest common
> >>>> denominator of what is needed for accepting tokens in authentication
> >>>> plugins then we gain:
> >>>>>
> >>>>> 1. a very manageable scope to define and agree upon
> >>>>> 2. a deliverable that should be useful in and of itself
> >>>>> 3. a foundation for community collaboration that we build on for higher
> >>>> level solutions built on this lowest common denominator and experience as a
> >>>> working community
> >>>>>
> >>>>> So, to Alejandro's point, perhaps we need to define what would make
> >> #2
> >>>> above true - this could serve as the "what" we are building instead of
> >>> the
> >>>> "how" to build it.
> >>>>> Including:
> >>>>> a. project structure within hadoop-common-project/common-security or the like
> >>>>> b. the usecases that would need to be enabled to make it a self-contained
> >>>> and useful contribution - without higher level solutions
> >>>>> c. the JIRA/s for contributing patches
> >>>>> d. what specific patches will be needed to accomplish the usecases in #b
> >>>>>
> >>>>> In other words, an end-state for the lowest common denominator that
> >>>> enables code patches in the near-term is the best of both worlds.
> >>>>>
> >>>>> I think this may be a good way to bootstrap the collaboration process
> >>>> for our emerging security community rather than trying to tackle a
> huge
> >>>> vision all at once.
> >>>>>
> >>>>> @Alejandro - if you have something else in mind that would bootstrap
> >>>> this process - that would great - please advise.
> >>>>>
> >>>>> thoughts?
> >>>>>
> >>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
> >>>> wrote:
> >>>>>
> >>>>>> Hi Alejandro, all-
> >>>>>>
> >>>>>> There seems to be agreement on the broad stroke description of the
> >>>> components needed to achieve pluggable token authentication (I'm sure
> >>> I'll
> >>>> be corrected if that isn't the case). However, discussion of the
> >> details
> >>> of
> >>>> those components doesn't seem to be moving forward. I think this is
> >>> because
> >>>> the details are really best understood through code. I also see *a*
> >> (i.e.
> >>>> one of many possible) token format and pluggable authentication
> >>> mechanisms
> >>>> within the RPC layer as components that can have immediate benefit to
> >>>> Hadoop users AND still allow flexibility in the larger design. So, I
> >>> think
> >>>> the best way to move the conversation of "what we are aiming for"
> >> forward
> >>>> is to start looking at code for these components. I am especially
> >>>> interested in moving forward with pluggable authentication mechanisms
> >>>> within the RPC layer and would love to see what others have done in
> >> this
> >>>> area (if anything).
> >>>>>>
> >>>>>> Thanks.
> >>>>>>
> >>>>>> -Brian
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> >>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
> >>>>>> To: Larry McCay
> >>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> >>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >>>>>>
> >>>>>> Larry, all,
> >>>>>>
> >>>>>> Still is not clear to me what is the end state we are aiming for,
> >> or
> >>>> that we even agree on that.
> >>>>>>
> >>>>>> IMO, instead of trying to agree on what to do, we should first agree on
> >>>>>> the final state, then see what should be changed, then see how we change
> >>>>>> things to get there.
> >>>>>>
> >>>>>> The different documents out there focus more on how.
> >>>>>>
> >>>>>> We should not try to say how before we know what.
> >>>>>>
> >>>>>> Thx.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
> >> lmccay@hortonworks.com
> >>>>
> >>>> wrote:
> >>>>>>
> >>>>>>> All -
> >>>>>>>
> >>>>>>> After combing through this thread - as well as the summit session
> >>>>>>> summary thread, I think that we have the following two items that
> >> we
> >>>>>>> can probably move forward with:
> >>>>>>>
> >>>>>>> 1. TokenAuth method - assuming this means the pluggable
> >>>>>>> authentication mechanisms within the RPC layer (2 votes: Kai and
> >>>>>>> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself)
> >>>>>>>
> >>>>>>> I propose that we attack both of these aspects as one. Let's
> >> provide
> >>>>>>> the structure and interfaces of the pluggable framework for use in
> >>>>>>> the RPC layer through leveraging Daryn's pluggability work and POC
> >>> it
> >>>>>>> with a particular token format (not necessarily the only format
> >> ever
> >>>>>>> supported - we just need one to start). If there has already been
> >>>>>>> work done in this area by anyone then please speak up and commit
> >> to
> >>>>>>> providing a patch - so that we don't duplicate effort.
> >>>>>>>
> >>>>>>> @Daryn - is there a particular Jira or set of Jiras that we can
> >> look
> >>>>>>> at to discern the pluggability mechanism details? Documentation of
> >>> it
> >>>>>>> would be great as well.
> >>>>>>> @Kai - do you have existing code for the pluggable token
> >>>>>>> authentication mechanism - if not, we can take a stab at
> >>> representing
> >>>>>>> it with interfaces and/or POC code.
> >>>>>>> I can standup and say that we have a token format that we have
> >> been
> >>>>>>> working with already and can provide a patch that represents it
> >> as a
> >>>>>>> contribution to test out the pluggable tokenAuth.
> >>>>>>>
> >>>>>>> These patches will provide progress toward code being the central
> >>>>>>> discussion vehicle. As a community, we can then incrementally
> >> build
> >>>>>>> on that foundation in order to collaboratively deliver the common
> >>>> vision.
> >>>>>>>
> >>>>>>> In the absence of any other home for posting such patches, let's
> >>>>>>> assume that they will be attached to HADOOP-9392 - or a dedicated
> >>>>>>> subtask for this particular aspect/s - I will leave that detail to
> >>>> Kai.
> >>>>>>>
> >>>>>>> @Alejandro, being the only voice on this thread that isn't
> >>>>>>> represented in the votes above, please feel free to agree or
> >>> disagree
> >>>> with this direction.
> >>>>>>>
> >>>>>>> thanks,
> >>>>>>>
> >>>>>>> --larry
> >>>>>>>
> >>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Andy -
> >>>>>>>>
> >>>>>>>>> Happy Fourth of July to you and yours.
> >>>>>>>>
> >>>>>>>> Same to you and yours. :-)
> >>>>>>>> We had some fun in the sun for a change - we've had nothing but
> >>> rain
> >>>>>>>> on
> >>>>>>> the east coast lately.
> >>>>>>>>
> >>>>>>>>> My concern here is there may have been a misinterpretation or
> >> lack
> >>>>>>>>> of consensus on what is meant by "clean slate"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Apparently so.
> >>>>>>>> On the pre-summit call, I stated that I was interested in
> >>>>>>>> reconciling
> >>>>>>> the jiras so that we had one to work from.
> >>>>>>>>
> >>>>>>>> You recommended that we set them aside for the time being - with
> >>> the
> >>>>>>> understanding that work would continue on your side (and our's as
> >>>>>>> well) - and approach the community discussion from a clean slate.
> >>>>>>>> We seemed to do this at the summit session quite well.
> >>>>>>>> It was my understanding that this community discussion would live
> >>>>>>>> beyond
> >>>>>>> the summit and continue on this list.
> >>>>>>>>
> >>>>>>>> While closing the summit session we agreed to follow up on
> >>>>>>>> common-dev
> >>>>>>> with first a summary then a discussion of the moving parts.
> >>>>>>>>
> >>>>>>>> I never expected the previous work to be abandoned and fully
> >>>>>>>> expected it
> >>>>>>> to inform the discussion that happened here.
> >>>>>>>>
> >>>>>>>> If you would like to reframe what clean slate was supposed to
> >> mean
> >>>>>>>> or
> >>>>>>> describe what it means now - that would be welcome - before I
> >> waste
> >>>>>>> anymore time trying to facilitate a community discussion that is
> >>>>>>> apparently not wanted.
> >>>>>>>>
> >>>>>>>>> Nowhere in this
> >>>>>>>>> picture are self appointed "master JIRAs" and such, which have
> >>> been
> >>>>>>>>> disappointing to see crop up, we should be collaboratively
> >> coding
> >>>>>>>>> not planting flags.
> >>>>>>>>
> >>>>>>>> I don't know what you mean by self-appointed master JIRAs.
> >>>>>>>> It has certainly not been anyone's intention to disappoint.
> >>>>>>>> Any mention of a new JIRA was just to have a clear context to
> >>> gather
> >>>>>>>> the
> >>>>>>> agreed upon points - previous and/or existing JIRAs would easily
> >> be
> >>>> linked.
> >>>>>>>>
> >>>>>>>> Planting flags... I need to go back and read my discussion point
> >>>>>>>> about the
> >>>>>>> JIRA and see how this is the impression that was made.
> >>>>>>>> That is not how I define success. The only flags that count is
> >>> code.
> >>>>>>> What we are lacking is the roadmap on which to put the code.
> >>>>>>>>
> >>>>>>>>> I read Kai's latest document as something approaching today's
> >>>>>>>>> consensus
> >>>>>>> (or
> >>>>>>>>> at least a common point of view?) rather than a historical
> >>> document.
> >>>>>>>>> Perhaps he and it can be given equal share of the consideration.
> >>>>>>>>
> >>>>>>>> I definitely read it as something that has evolved into something
> >>>>>>> approaching what we have been talking about so far. There has not
> >>>>>>> however been enough discussion anywhere near the level of detail
> >> in
> >>>>>>> that document and more details are needed for each component in
> >> the
> >>>> design.
> >>>>>>>> Why the work in that document should not be fed into the
> >> community
> >>>>>>> discussion as anyone else's would be - I fail to understand.
> >>>>>>>>
> >>>>>>>> My suggestion continues to be that you should take that document
> >>> and
> >>>>>>> speak to the inventory of moving parts as we agreed.
> >>>>>>>> As these are agreed upon, we will ensure that the appropriate
> >>>>>>>> subtasks
> >>>>>>> are filed against whatever JIRA is to host them - don't really
> >> care
> >>>>>>> much which it is.
> >>>>>>>>
> >>>>>>>> I don't really want to continue with two separate JIRAs - as I
> >>>>>>>> stated
> >>>>>>> long ago - but until we understand what the pieces are and how
> >> they
> >>>>>>> relate then they can't be consolidated.
> >>>>>>>> Even if 9533 ended up being repurposed as the server instance of
> >>> the
> >>>>>>> work - it should be a subtask of a larger one - if that is to be
> >>>>>>> 9392, so be it.
> >>>>>>>> We still need to define all the pieces of the larger picture
> >> before
> >>>>>>>> that
> >>>>>>> can be done.
> >>>>>>>>
> >>>>>>>> What I thought was the clean slate approach to the discussion
> >>> seemed
> >>>>>>>> a
> >>>>>>> very reasonable way to make all this happen.
> >>>>>>>> If you would like to restate what you intended by it or something
> >>>>>>>> else
> >>>>>>> equally as reasonable as a way to move forward that would be
> >>> awesome.
> >>>>>>>>
> >>>>>>>> I will be happy to work toward the roadmap with everyone once it
> >> is
> >>>>>>> articulated, understood and actionable.
> >>>>>>>> In the meantime, I have work to do.
> >>>>>>>>
> >>>>>>>> thanks,
> >>>>>>>>
> >>>>>>>> --larry
> >>>>>>>>
> >>>>>>>> BTW - I meant to quote you in an earlier response and ended up
> >>>>>>>> saying it
> >>>>>>> was Aaron instead. Not sure what happened there. :-)
> >>>>>>>>
> >>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Larry (and all),
> >>>>>>>>>
> >>>>>>>>> Happy Fourth of July to you and yours.
> >>>>>>>>>
> >>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so I'd
> >>>>>>>>> defer
> >>>>>>> to
> >>>>>>>>> them on the detailed points.
> >>>>>>>>>
> >>>>>>>>> My concern here is there may have been a misinterpretation or
> >> lack
> >>>>>>>>> of consensus on what is meant by "clean slate". Hopefully that
> >> can
> >>>>>>>>> be
> >>>>>>> quickly
> >>>>>>>>> cleared up. Certainly we did not mean ignore all that came
> >> before.
> >>>>>>>>> The
> >>>>>>> idea
> >>>>>>>>> was to reset discussions to find common ground and new direction
> >>>>>>>>> where
> >>>>>>> we
> >>>>>>>>> are working together, not in conflict, on an agreed upon set of
> >>>>>>>>> design points and tasks. There's been a lot of good discussion
> >> and
> >>>>>>>>> design preceeding that we should figure out how to port over.
> >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and
> >>> such,
> >>>>>>>>> which have been disappointing to see crop up, we should be
> >>>>>>>>> collaboratively coding not planting flags.
> >>>>>>>>>
> >>>>>>>>> I read Kai's latest document as something approaching today's
> >>>>>>>>> consensus
> >>>>>>> (or
> >>>>>>>>> at least a common point of view?) rather than a historical
> >>> document.
> >>>>>>>>> Perhaps he and it can be given equal share of the consideration.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> >>>>>>>>>
> >>>>>>>>>> Hey Andrew -
> >>>>>>>>>>
> >>>>>>>>>> I largely agree with that statement.
> >>>>>>>>>> My intention was to let the differences be worked out within
> >> the
> >>>>>>>>>> individual components once they were identified and subtasks
> >>>> created.
> >>>>>>>>>>
> >>>>>>>>>> My reference to HSSO was really referring to a SSO *server*
> >> based
> >>>>>>> design
> >>>>>>>>>> which was not clearly articulated in the earlier documents.
> >>>>>>>>>> We aren't trying to compare and contrast one design over
> >> another
> >>>>>>> anymore.
> >>>>>>>>>>
> >>>>>>>>>> Let's move this collaboration along as we've mapped out and the
> >>>>>>>>>> differences in the details will reveal themselves and be
> >>> addressed
> >>>>>>> within
> >>>>>>>>>> their components.
> >>>>>>>>>>
> >>>>>>>>>> I've actually been looking forward to you weighing in on the
> >>>>>>>>>> actual discussion points in this thread.
> >>>>>>>>>> Could you do that?
> >>>>>>>>>>
> >>>>>>>>>> At this point, I am most interested in your thoughts on a
> >> single
> >>>>>>>>>> jira
> >>>>>>> to
> >>>>>>>>>> represent all of this work and whether we should start
> >> discussing
> >>>>>>>>>> the
> >>>>>>> SSO
> >>>>>>>>>> Tokens.
> >>>>>>>>>> If you think there are discussion points missing from that
> >> list,
> >>>>>>>>>> feel
> >>>>>>> free
> >>>>>>>>>> to add to it.
> >>>>>>>>>>
> >>>>>>>>>> thanks,
> >>>>>>>>>>
> >>>>>>>>>> --larry
> >>>>>>>>>>
> >>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
> >> apurtell@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Larry,
> >>>>>>>>>>>
> >>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me
> >> point
> >>>>>>>>>>> out
> >>>>>>> that,
> >>>>>>>>>>> while the differences between the competing JIRAs have been
> >>>>>>>>>>> reduced
> >>>>>>> for
> >>>>>>>>>>> sure, there were some key differences that didn't just
> >>> disappear.
> >>>>>>>>>>> Subsequent discussion will make that clear. I also disagree
> >> with
> >>>>>>>>>>> your characterization that we have simply endorsed all of the
> >>>>>>>>>>> design
> >>>>>>> decisions
> >>>>>>>>>>> of the so-called HSSO, this is taking a mile from an inch. We
> >>> are
> >>>>>>> here to
> >>>>>>>>>>> engage in a collaborative process as peers. I've been
> >> encouraged
> >>>>>>>>>>> by
> >>>>>>> the
> >>>>>>>>>>> spirit of the discussions up to this point and hope that can
> >>>>>>>>>>> continue beyond one design summit.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> >>>>>>>>>>> <lm...@hortonworks.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Kai -
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think that I need to clarify something...
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is not an update for 9533 but a continuation of the
> >>>>>>>>>>>> discussions
> >>>>>>>>>> that
> >>>>>>>>>>>> are focused on a fresh look at a SSO for Hadoop.
> >>>>>>>>>>>> We've agreed to leave our previous designs behind and
> >> therefore
> >>>>>>>>>>>> we
> >>>>>>>>>> aren't
> >>>>>>>>>>>> really seeing it as an HSSO layered on top of TAS approach or
> >>> an
> >>>>>>> HSSO vs
> >>>>>>>>>>>> TAS discussion.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Your latest design revision actually makes it clear that you
> >>> are
> >>>>>>>>>>>> now targeting exactly what was described as HSSO - so
> >> comparing
> >>>>>>>>>>>> and
> >>>>>>>>>> contrasting
> >>>>>>>>>>>> is not going to add any value.
> >>>>>>>>>>>>
> >>>>>>>>>>>> What we need you to do at this point, is to look at those
> >>>>>>>>>>>> high-level components described on this thread and comment on
> >>>>>>>>>>>> whether we need additional components or any that are listed
> >>>>>>>>>>>> that don't seem
> >>>>>>> necessary
> >>>>>>>>>> to
> >>>>>>>>>>>> you and why.
> >>>>>>>>>>>> In other words, we need to define and agree on the work that
> >>> has
> >>>>>>>>>>>> to
> >>>>>>> be
> >>>>>>>>>>>> done.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We also need to determine those components that need to be
> >> done
> >>>>>>> before
> >>>>>>>>>>>> anything else can be started.
> >>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
> >>>>>>>>>>>> central to
> >>>>>>>>>> all
> >>>>>>>>>>>> the other components and should probably be defined and POC'd
> >>> in
> >>>>>>> short
> >>>>>>>>>>>> order.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Personally, I think that continuing the separation of 9533
> >> and
> >>>>>>>>>>>> 9392
> >>>>>>> will
> >>>>>>>>>>>> do this effort a disservice. There doesn't seem to be enough
> >>>>>>> differences
> >>>>>>>>>>>> between the two to justify separate jiras anymore. It may be
> >>>>>>>>>>>> best to
> >>>>>>>>>> file a
> >>>>>>>>>>>> new one that reflects a single vision without the extra cruft
> >>>>>>>>>>>> that
> >>>>>>> has
> >>>>>>>>>>>> built up in either of the existing ones. We would certainly
> >>>>>>>>>>>> reference
> >>>>>>>>>> the
> >>>>>>>>>>>> existing ones within the new one. This approach would align
> >>> with
> >>>>>>>>>>>> the
> >>>>>>>>>> spirit
> >>>>>>>>>>>> of the discussions up to this point.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am prepared to start a discussion around the shape of the
> >> two
> >>>>>>> Hadoop
> >>>>>>>>>> SSO
> >>>>>>>>>>>> tokens: identity and access. If this is what others feel the
> >>>>>>>>>>>> next
> >>>>>>> topic
> >>>>>>>>>>>> should be.
> >>>>>>>>>>>> If we can identify a jira home for it, we can do it there -
> >>>>>>> otherwise we
> >>>>>>>>>>>> can create another DISCUSS thread for it.
> >>>>>>>>>>>>
> >>>>>>>>>>>> thanks,
> >>>>>>>>>>>>
> >>>>>>>>>>>> --larry
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <kai.zheng@intel.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Larry,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the update. Good to see that with this update we are
> >>>>>>>>>>>>> now aligned on most points.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new
> >>>>>>>>>>>>> revision incorporates feedback and suggestions in related
> >>>>>>>>>>>>> discussion with the community, particularly from Microsoft and
> >>>>>>>>>>>>> others attending the Security design lounge session at the
> >>>>>>>>>>>>> Hadoop summit. Summary of the changes:
> >>>>>>>>>>>>> 1. Revised the approach to now use two tokens, Identity Token
> >>>>>>>>>>>>> plus Access Token, particularly considering our authorization
> >>>>>>>>>>>>> framework and compatibility with HSSO;
> >>>>>>>>>>>>> 2. Introduced Authorization Server (AS) from our authorization
> >>>>>>>>>>>>> framework into the flow that issues access tokens for clients
> >>>>>>>>>>>>> with identity tokens to access services;
> >>>>>>>>>>>>> 3. Refined proxy access token and the proxy/impersonation flow;
> >>>>>>>>>>>>> 4. Refined the browser web SSO flow regarding access to Hadoop
> >>>>>>>>>>>>> web services;
> >>>>>>>>>>>>> 5. Added Hadoop RPC access flow regard
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Best regards,
> >>>>>>>>>
> >>>>>>>>> - Andy
> >>>>>>>>>
> >>>>>>>>> Problems worthy of attack prove their worth by hitting back. -
> >>>>>>>>> Piet Hein (via Tom White)
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Alejandro
> >>>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
> >>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> Alejandro
> >>>
> >>
> >
> >
> >
> > --
> > Alejandro
>
>

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
That sounds perfect!
I have been thinking of late that we would maybe need an incubator project or something for this - which would be unfortunate.

This would allow us to move much more quickly with a set of patches broken up into consumable/understandable chunks that are made functional more easily within the branch.
I assume that we need to start a separate thread for DISCUSS or VOTE to start that process - correct?

On Aug 6, 2013, at 4:15 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:

> yep, that is what I meant. Thanks Chris
> 
> 
> On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <cn...@hortonworks.com>wrote:
> 
>> Perhaps this is also a good opportunity to try out the new "branch
>> committers" clause in the bylaws, enabling non-committers who are working
>> on this to commit to the feature branch.
>> 
>> 
>> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
>> 
>> Chris Nauroth
>> Hortonworks
>> http://hortonworks.com/
>> 
>> 
>> 
>> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur <tucu@cloudera.com
>>> wrote:
>> 
>>> Larry,
>>> 
>>> Sorry for the delay answering. Thanks for laying down things, yes, it
>> makes
>>> sense.
>>> 
>>> Given the large scope of the changes, number of JIRAs and number of
>>> developers involved, wouldn't make sense to create a feature branch for
>> all
>>> this work not to destabilize (more ;) trunk?
>>> 
>>> Thanks again.
>>> 
>>> 
>>> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <lm...@hortonworks.com>
>>> wrote:
>>> 
>>>> The following JIRA was filed to provide a token and basic authority
>>>> implementation for this effort:
>>>> https://issues.apache.org/jira/browse/HADOOP-9781
>>>> 
>>>> I have attached an initial patch though have yet to submit it as one
>>> since
>>>> it is dependent on the patch for CMF that was posted to:
>>>> https://issues.apache.org/jira/browse/HADOOP-9534
>>>> and this patch still has a couple outstanding issues - javac warnings
>> for
> >>>> com.sun classes for certificate generation and 11 javadoc warnings.
>>>> 
>>>> Please feel free to review the patches and raise any questions or
>>> concerns
>>>> related to them.
>>>> 
>>>> On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com>
>> wrote:
>>>> 
>>>>> Hello All -
>>>>> 
>>>>> In an effort to scope an initial iteration that provides value to the
>>>> community while focusing on the pluggable authentication aspects, I've
>>>> written a description for "Iteration 1". It identifies the goal of the
>>>> iteration, the endstate and a set of initial usecases. It also
>> enumerates
>>>> the components that are required for each usecase. There is a scope
>>> section
>>>> that details specific things that should be kept out of the first
>>>> iteration. This is certainly up for discussion. There may be some of
>>> these
>>>> things that can be contributed in short order. If we can add some
>> things
>>> in
>>>> without unnecessary complexity for the identified usecases then we
>>> should.
>>>>> 
>>>>> @Alejandro - please review this and see whether it satisfies your
>> point
>>>> for a definition of what we are building.
>>>>> 
>>>>> In addition to the document that I will paste here as text and
>> attach a
>>>> pdf version, we have a couple patches for components that are
>> identified
>>> in
>>>> the document.
>>>>> Specifically, COMP-7 and COMP-8.
>>>>> 
>>>>> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was
>> filed
>>>> specifically for that functionality.
>>>>> COMP-7 is a small set of classes to introduce JsonWebToken as the
>> token
>>>> format and a basic JsonWebTokenAuthority that can issue and verify
>> these
>>>> tokens.
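For COMP-7, a minimal sketch may help ground the discussion. The following illustrates, using only JDK crypto, what an issue/verify cycle for a compact header.payload.signature token could look like. The class name, method names and payload contents are placeholders rather than the proposed Hadoop API, and a real JsonWebTokenAuthority would add claims handling, expiry checks and the key management covered by COMP-8.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical COMP-7 style authority: issues a compact
// header.payload.signature token and verifies it with an RSA key pair.
public class SimpleTokenAuthority {
    private final PrivateKey signingKey;
    private final PublicKey verifyingKey;

    public SimpleTokenAuthority(KeyPair keyPair) {
        this.signingKey = keyPair.getPrivate();
        this.verifyingKey = keyPair.getPublic();
    }

    // Issue a token over an opaque JSON payload (e.g. subject, expiry).
    public String issue(String jsonPayload) throws Exception {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        String header = enc.encodeToString(
            "{\"alg\":\"RS256\"}".getBytes(StandardCharsets.UTF_8));
        String payload = enc.encodeToString(
            jsonPayload.getBytes(StandardCharsets.UTF_8));
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(signingKey);
        sig.update((header + "." + payload).getBytes(StandardCharsets.UTF_8));
        return header + "." + payload + "." + enc.encodeToString(sig.sign());
    }

    // Verify the signature over header.payload; false on tampering.
    public boolean verify(String token) throws Exception {
        String[] parts = token.split("\\.");
        if (parts.length != 3) return false;
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(verifyingKey);
        sig.update((parts[0] + "." + parts[1]).getBytes(StandardCharsets.UTF_8));
        return sig.verify(Base64.getUrlDecoder().decode(parts[2]));
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        SimpleTokenAuthority authority =
            new SimpleTokenAuthority(kpg.generateKeyPair());
        String token = authority.issue("{\"sub\":\"alice\"}");
        System.out.println("verified: " + authority.verify(token));
    }
}
```

A production authority would of course source its keys from the credential management facility rather than generating them in place.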
>>>>> 
>>>>> Since there is no JIRA for this yet, I will likely file a new JIRA
>> for
>>> a
>>>> SSO token implementation.
>>>>> 
>>>>> Both of these patches assume to be modules within
>>>> hadoop-common/hadoop-common-project.
>>>>> While they are relatively small, I think that they will be pulled in
>> by
>>>> other modules such as hadoop-auth which would likely not want a
>>> dependency
>>>> on something larger like
>>> hadoop-common/hadoop-common-project/hadoop-common.
>>>>> 
>>>>> This is certainly something that we should discuss within the
>> community
>>>> for this effort though - that being, exactly how to add these libraries
>>> so
>>>> that they are most easily consumed by existing projects.
>>>>> 
>>>>> Anyway, the following is the Iteration-1 document - it is also
>> attached
>>>> as a pdf:
>>>>> 
>>>>> Iteration 1: Pluggable User Authentication and Federation
>>>>> 
>>>>> Introduction
>>>>> The intent of this effort is to bootstrap the development of
>> pluggable
>>>> token-based authentication mechanisms to support certain goals of
>>>> enterprise authentication integrations. By restricting the scope of
>> this
>>>> effort, we hope to provide immediate benefit to the community while
>>> keeping
>>>> the initial contribution to a manageable size that can be easily
>>> reviewed,
>>>> understood and extended with further development through follow up
>> JIRAs
>>>> and related iterations.
>>>>> 
>>>>> Iteration Endstate
>>>>> Once complete, this effort will have extended the authentication
>>>> mechanisms - for all client types - from the existing: Simple, Kerberos
>>> and
>>>> Plain (for RPC) to include LDAP authentication and SAML based
>> federation.
>>>> In addition, the ability to provide additional/custom authentication
>>>> mechanisms will be enabled for users to plug in their preferred
>>> mechanisms.
>>>>> 
>>>>> Project Scope
>>>>> The scope of this effort is a subset of the features covered by the
>>>> overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on
> >>>> enabling Hadoop to issue and accept/validate SSO tokens of its own. The
>>>> pluggable authentication mechanism within SASL/RPC layer and the
>>>> authentication filter pluggability for REST and UI components will be
>>>> leveraged and extended to support the results of this effort.
>>>>> 
>>>>> Out of Scope
>>>>> In order to scope the initial deliverable as the minimally viable
>>>> product, a handful of things have been simplified or left out of scope
>>> for
>>>> this effort. This is not meant to say that these aspects are not useful
>>> or
>>>> not needed but that they are not necessary for this iteration. We do
>>>> however need to ensure that we don’t do anything to preclude adding
>> them
>>> in
>>>> future iterations.
>>>>> 1. Additional Attributes - the result of authentication will continue
>>> to
>>>> use the existing hadoop tokens and identity representations. Additional
>>>> attributes used for finer grained authorization decisions will be added
>>>> through follow-up efforts.
>>>>> 2. Token revocation - the ability to revoke issued identity tokens
>> will
>>>> be added later
>>>>> 3. Multi-factor authentication - this will likely require additional
>>>> attributes and is not necessary for this iteration.
>>>>> 4. Authorization changes - we will require additional attributes for
>>> the
>>>> fine-grained access control plans. This is not needed for this
>> iteration.
>>>>> 5. Domains - we assume a single flat domain for all users
>>>>> 6. Kinit alternative - we can leverage existing REST clients such as
>>>> cURL to retrieve tokens through authentication and federation for the
>>> time
>>>> being
>>>>> 7. A specific authentication framework isn’t really necessary within
>>> the
> >>>> REST endpoints for this iteration. If one is available then we can use
> >>>> it; otherwise we can leverage existing things like Apache Shiro within a
>>>> servlet filter.
>>>>> 
>>>>> In Scope
>>>>> What is in scope for this effort is defined by the usecases described
>>>> below. Components required for supporting the usecases are summarized
>> for
>>>> each client type. Each component is a candidate for a JIRA subtask -
>>> though
>>>> multiple components are likely to be included in a JIRA to represent a
>>> set
>>>> of functionality rather than individual JIRAs per component.
>>>>> 
>>>>> Terminology and Naming
>>>>> The terms and names of components within this document are merely
>>>> descriptive of the functionality that they represent. Any similarity or
>>>> difference in names or terms from those that are found in other
>> documents
> >>>> is not intended to make any statement about those other documents or
>> the
>>>> descriptions within. This document represents the pluggable
>>> authentication
>>>> mechanisms and server functionality required to replace Kerberos.
>>>>> 
>>>>> Ultimately, the naming of the implementation classes will be a
>> product
>>>> of the patches accepted by the community.
>>>>> 
>>>>> Usecases:
>>>>> client types: REST, CLI, UI
>>>>> authentication types: Simple, Kerberos, authentication/LDAP,
>>>> federation/SAML
>>>>> 
>>>>> Simple and Kerberos
>>>>> Simple and Kerberos usecases continue to work as they do today. The
>>>> addition of Authentication/LDAP and Federation/SAML are added through
>> the
>>>> existing pluggability points either as they are or with required
>>> extension.
>>>> Either way, continued support for Simple and Kerberos must not require
>>>> changes to existing deployments in the field as a result of this
>> effort.
>>>>> 
>>>>> REST
>>>>> USECASE REST-1 Authentication/LDAP:
>>>>> For REST clients, we will provide the ability to:
>>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed
>> by
>>>> an AuthenticationServer instance via REST calls to:
>>>>>   a. authenticate - passing username/password returning a hadoop
>>>> id_token
>>>>>   b. get-access-token - from the TokenGrantingService by passing the
>>>> hadoop id_token as an Authorization: Bearer token along with the
>> desired
>>>> service name (master service name) returning a hadoop access token
>>>>> 2. Successfully invoke a hadoop service REST API passing the hadoop
>>>> access token through an HTTP header as an Authorization Bearer token
>>>>>   a. validation of the incoming token on the service endpoint is
>>>> accomplished by an SSOAuthenticationHandler
>>>>> 3. Successfully block access to a REST resource when presenting a
>>> hadoop
>>>> access token intended for a different service
>>>>>   a. validation of the incoming token on the service endpoint is
>>>> accomplished by an SSOAuthenticationHandler
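To make the REST-1 steps above concrete, here is a rough sketch of how a Java client could construct the two calls. The endpoint paths, the query parameter name and the use of HTTP BASIC for the authenticate step are illustrative assumptions; only the Bearer token usage comes from the usecase. The requests are built here but never sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of the two REST-1 client calls; endpoint shapes are assumptions.
public class Rest1Flow {
    // Step 1a: authenticate with username/password, returning a hadoop id_token.
    static HttpRequest authenticate(String base, String user, String password) {
        String basic = Base64.getEncoder()
            .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(base + "/authenticate"))
            .header("Authorization", "Basic " + basic)
            .GET()
            .build();
    }

    // Step 1b: present the hadoop id_token as a Bearer token along with the
    // desired (master) service name, returning a hadoop access token.
    static HttpRequest getAccessToken(String base, String idToken, String service) {
        return HttpRequest.newBuilder(
                URI.create(base + "/get-access-token?service=" + service))
            .header("Authorization", "Bearer " + idToken)
            .GET()
            .build();
    }
}
```

Step 2 of the usecase is then the same Bearer pattern against the target service's REST API, carrying the access token instead of the id_token.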
>>>>> 
>>>>> USECASE REST-2 Federation/SAML:
>>>>> We will also provide federation capabilities for REST clients such
>>> that:
>>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
>>>> persist in a permissions protected file - ie.
>> ~/.hadoop_tokens/.idp_token
>>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
>>>> endpoint exposed by an AuthenticationServer(FederationServer?) instance
>>> via
>>>> REST calls to:
>>>>>   a. federate - passing a SAML assertion as an Authorization: Bearer
>>>> token returning a hadoop id_token
> >>>>>      - can copy and paste from commandline or use cat to include the
> >>>> persisted token through --header "Authorization: Bearer $(cat
> >>>> ~/.hadoop_tokens/.id_token)"
>>>>>   b. get-access-token - from the TokenGrantingService by passing the
>>>> hadoop id_token as an Authorization: Bearer token along with the
>> desired
>>>> service name (master service name) to the TokenGrantingService
>> returning
>>> a
>>>> hadoop access token
>>>>> 3. Successfully invoke a hadoop service REST API passing the hadoop
>>>> access token through an HTTP header as an Authorization Bearer token
>>>>>   a. validation of the incoming token on the service endpoint is
>>>> accomplished by an SSOAuthenticationHandler
>>>>> 4. Successfully block access to a REST resource when presenting a
>>> hadoop
>>>> access token intended for a different service
>>>>>   a. validation of the incoming token on the service endpoint is
>>>> accomplished by an SSOAuthenticationHandler
>>>>> 
>>>>> REQUIRED COMPONENTS for REST USECASES:
>>>>> COMP-1. REST client - cURL or similar
>>>>> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint
>>>> example - returning hadoop id_token
>>>>> COMP-3. REST endpoint for federation with SAML Bearer token -
>>> shibboleth
>>>> SP?|OpenSAML? - returning hadoop id_token
>>>>> COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop access
>>>> tokens from hadoop id_tokens
>>>>> COMP-5. SSOAuthenticationHandler to validate incoming hadoop access
>>>> tokens
>>>>> COMP-6. some source of a SAML assertion - shibboleth IdP?
>>>>> COMP-7. hadoop token and authority implementations
>>>>> COMP-8. core services for crypto support for signing, verifying and
>> PKI
>>>> management
>>>>> 
>>>>> CLI
>>>>> USECASE CLI-1 Authentication/LDAP:
>>>>> For CLI/RPC clients, we will provide the ability to:
>>>>> 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed
>> by
>>>> an AuthenticationServer instance via REST calls to:
>>>>>   a. authenticate - passing username/password returning a hadoop
>>>> id_token
>>>>>      - for RPC clients we need to persist the returned hadoop
>> identity
>>>> token in a file protected by fs permissions so that it may be leveraged
>>>> until expiry
>>>>>      - directing the returned response to a file may suffice for now
>>>> something like ">~/.hadoop_tokens/.id_token"
>>>>> 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> >>>>>   a. RPC client negotiates a TokenAuth method through the SASL layer;
> >>>> the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and
> >>>> passed as an Authorization: Bearer token to the get-access-token REST
> >>>> endpoint exposed by the TokenGrantingService, returning a hadoop access
> >>>> token
>>>>>   b. RPC server side validates the presented hadoop access token and
>>>> continues to serve request
>>>>>   c. Successfully invoke a hadoop service RPC API
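The token persistence called out above (a file protected by fs permissions so the id_token can be leveraged until expiry) could be sketched as follows. The directory layout follows the ~/.hadoop_tokens example in the usecase, and the 0600 mode is an assumption about what "protected" should mean.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Sketch of persisting a returned hadoop id_token under a
// ~/.hadoop_tokens style directory, readable by the owner only.
public class TokenStore {
    static Path persist(Path dir, String idToken) throws Exception {
        Files.createDirectories(dir);  // a real store might also chmod the dir to 0700
        Path tokenFile = dir.resolve(".id_token");
        Files.writeString(tokenFile, idToken);
        // owner read/write only (0600), so other local users cannot read the token
        Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rw-------");
        Files.setPosixFilePermissions(tokenFile, ownerOnly);
        return tokenFile;
    }
}
```

This is the Java equivalent of the ">~/.hadoop_tokens/.id_token" redirection suggested for cURL, with the permission tightening made explicit.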
>>>>> 
>>>>> USECASE CLI-2 Federation/SAML:
>>>>> For CLI/RPC clients, we will provide the ability to:
>>>>> 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
>>>> persist in a permissions protected file - ie.
>> ~/.hadoop_tokens/.idp_token
>>>>> 2. use cURL to Federate a token from a trusted IdP through an SP
>>>> endpoint exposed by an AuthenticationServer(FederationServer?) instance
>>> via
>>>> REST calls to:
>>>>>   a. federate - passing a SAML assertion as an Authorization: Bearer
>>>> token returning a hadoop id_token
> >>>>>      - can copy and paste from commandline or use cat to include the
> >>>> previously persisted token through --header "Authorization: Bearer
> >>>> $(cat ~/.hadoop_tokens/.id_token)"
>>>>> 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> >>>>>   a. RPC client negotiates a TokenAuth method through the SASL layer;
> >>>> the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and
> >>>> passed as an Authorization: Bearer token to the get-access-token REST
> >>>> endpoint exposed by the TokenGrantingService, returning a hadoop access
> >>>> token
>>>>>   b. RPC server side validates the presented hadoop access token and
>>>> continues to serve request
>>>>>   c. Successfully invoke a hadoop service RPC API
>>>>> 
>>>>> REQUIRED COMPONENTS for CLI USECASES - (beyond those required for
>>> REST):
>>>>> COMP-9. TokenAuth Method negotiation, etc
>>>>> COMP-10. Client side implementation to leverage REST endpoint for
>>>> acquiring hadoop access tokens given a hadoop id_token
>>>>> COMP-11. Server side implementation to validate incoming hadoop
>> access
>>>> tokens
>>>>> 
>>>>> UI
>>>>> Various Hadoop services have their own web UI consoles for
>>>> administration and end user interactions. These consoles need to also
> >>>> benefit from the pluggability of authentication mechanisms to be on par
>>>> with the access control of the cluster REST and RPC APIs.
> >>>>> Web consoles are protected with a WebSSOAuthenticationHandler which
>>>> will be configured for either authentication or federation.
>>>>> 
>>>>> USECASE UI-1 Authentication/LDAP:
>>>>> For the authentication usecase:
>>>>> 1. User’s browser requests access to a UI console page
>>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
>> the
>>>> browser to an IdP web endpoint exposed by the AuthenticationServer
>>> passing
>>>> the requested url as the redirect_url
>>>>> 3. IdP web endpoint presents the user with a FORM over https
>>>>>   a. user provides username/password and submits the FORM
>>>>> 4. AuthenticationServer authenticates the user with provided
>>> credentials
>>>> against the configured LDAP server and:
>>>>>   a. leverages a servlet filter or other authentication mechanism
>> for
>>>> the endpoint and authenticates the user with a simple LDAP bind with
>>>> username and password
>>>>>   b. acquires a hadoop id_token and uses it to acquire the required
>>>> hadoop access token which is added as a cookie
>>>>>   c. redirects the browser to the original service UI resource via
>> the
>>>> provided redirect_url
> >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >>>> interrogates the incoming request again for an authcookie that contains
> >>>> an access token; upon finding one:
>>>>>   a. validates the incoming token
>>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
>>>> contract
>>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
>>> expected
>>>> token
>>>>>   d. serves requested resource for valid tokens
>>>>>   e. subsequent requests are handled by the AuthenticationFilter
>>>> recognition of the hadoop auth cookie
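The validation in step 5a, together with the earlier requirement to block tokens intended for a different service, reduces to a check along these lines. The parameters are assumptions about what a verified token payload would carry; in practice they would be read from the token only after its signature has been checked.

```java
// Minimal sketch of the acceptance check an SSOAuthenticationHandler or
// WebSSOAuthenticationHandler would perform on an incoming access token.
public class AccessTokenCheck {
    // True only for an unexpired token minted for this service.
    static boolean accepts(String tokenService, long tokenExpiryMillis,
                           String thisService, long nowMillis) {
        if (nowMillis >= tokenExpiryMillis) {
            return false;                        // token expired
        }
        return thisService.equals(tokenService); // wrong audience is rejected
    }
}
```

A token minted for one master service name therefore fails on any other service, which is exactly the blocking behavior the REST usecases call for.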
>>>>> 
>>>>> USECASE UI-2 Federation/SAML:
>>>>> For the federation usecase:
>>>>> 1. User’s browser requests access to a UI console page
>>>>> 2. WebSSOAuthenticationHandler intercepts the request and redirects
>> the
>>>> browser to an SP web endpoint exposed by the AuthenticationServer
>> passing
>>>> the requested url as the redirect_url. This endpoint:
>>>>>   a. is dedicated to redirecting to the external IdP passing the
>>>> required parameters which may include a redirect_url back to itself as
>>> well
>>>> as encoding the original redirect_url so that it can determine it on
>> the
>>>> way back to the client
>>>>> 3. the IdP:
>>>>>   a. challenges the user for credentials and authenticates the user
>>>>>   b. creates appropriate token/cookie and redirects back to the
>>>> AuthenticationServer endpoint
>>>>> 4. AuthenticationServer endpoint:
>>>>>   a. extracts the expected token/cookie from the incoming request
>> and
>>>> validates it
>>>>>   b. creates a hadoop id_token
>>>>>   c. acquires a hadoop access token for the id_token
>>>>>   d. creates appropriate cookie and redirects back to the original
>>>> redirect_url - being the requested resource
> >>>>> 5. WebSSOAuthenticationHandler for the original UI resource
> >>>> interrogates the incoming request again for an authcookie that contains
> >>>> an access token; upon finding one:
>>>>>   a. validates the incoming token
>>>>>   b. returns the AuthenticationToken as per AuthenticationHandler
> >>>> contract
>>>>>   c. AuthenticationFilter adds the hadoop auth cookie with the
>>> expected
>>>> token
>>>>>   d. serves requested resource for valid tokens
>>>>>   e. subsequent requests are handled by the AuthenticationFilter
>>>> recognition of the hadoop auth cookie
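Step 2a above, encoding the original redirect_url so it survives the round trip through the external IdP, could be sketched like this; the SP endpoint path and parameter name are illustrative assumptions.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch of building the redirect the WebSSOAuthenticationHandler sends
// the browser to: the originally requested url is percent-encoded so the
// AuthenticationServer can recover it on the way back to the client.
public class SsoRedirect {
    static String spRedirect(String spEndpoint, String originalUrl) {
        String encoded = URLEncoder.encode(originalUrl, StandardCharsets.UTF_8);
        return spEndpoint + "?redirect_url=" + encoded;
    }
}
```

Encoding matters here because the original url typically contains its own query string, which would otherwise be swallowed by the SP endpoint's parameters.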
>>>>> REQUIRED COMPONENTS for UI USECASES:
>>>>> COMP-12. WebSSOAuthenticationHandler
>>>>> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based
>>>> login
>>>>> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party
>>> token
>>>> federation
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
>> Brian.Swan@microsoft.com>
>>>> wrote:
>>>>> Thanks, Larry. That is what I was trying to say, but you've said it
>>>> better and in more detail. :-) To extract from what you are saying: "If
>>> we
>>>> were to reframe the immediate scope to the lowest common denominator of
>>>> what is needed for accepting tokens in authentication plugins then we
>>>> gain... an end-state for the lowest common denominator that enables
>> code
>>>> patches in the near-term is the best of both worlds."
>>>>> 
>>>>> -Brian
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>>>>> Sent: Wednesday, July 10, 2013 10:40 AM
>>>>> To: common-dev@hadoop.apache.org
>>>>> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>>>>> 
>>>>> It seems to me that we can have the best of both worlds here...it's
>> all
>>>> about the scoping.
>>>>> 
>>>>> If we were to reframe the immediate scope to the lowest common
>>>> denominator of what is needed for accepting tokens in authentication
>>>> plugins then we gain:
>>>>> 
> >>>>> 1. a very manageable scope to define and agree upon
> >>>>> 2. a deliverable that should be useful in and of itself
> >>>>> 3. a foundation for community collaboration that we build on for
> >>>> higher level solutions built on this lowest common denominator and
> >>>> experience as a working community
>>>>> 
>>>>> So, to Alejandro's point, perhaps we need to define what would make
>> #2
>>>> above true - this could serve as the "what" we are building instead of
>>> the
>>>> "how" to build it.
>>>>> Including:
> >>>>> a. project structure within hadoop-common-project/common-security or
> >>>> the like
> >>>>> b. the usecases that would need to be enabled to make it a self
> >>>> contained and useful contribution - without higher level solutions
> >>>>> c. the JIRA/s for contributing patches
> >>>>> d. what specific patches will be needed to accomplish the usecases in #b
>>>>> 
>>>>> In other words, an end-state for the lowest common denominator that
>>>> enables code patches in the near-term is the best of both worlds.
>>>>> 
>>>>> I think this may be a good way to bootstrap the collaboration process
>>>> for our emerging security community rather than trying to tackle a huge
>>>> vision all at once.
>>>>> 
>>>>> @Alejandro - if you have something else in mind that would bootstrap
>>>> this process - that would great - please advise.
>>>>> 
>>>>> thoughts?
>>>>> 
>>>>> On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
>>>> wrote:
>>>>> 
>>>>>> Hi Alejandro, all-
>>>>>> 
>>>>>> There seems to be agreement on the broad stroke description of the
>>>> components needed to achieve pluggable token authentication (I'm sure
>>> I'll
>>>> be corrected if that isn't the case). However, discussion of the
>> details
>>> of
>>>> those components doesn't seem to be moving forward. I think this is
>>> because
>>>> the details are really best understood through code. I also see *a*
>> (i.e.
>>>> one of many possible) token format and pluggable authentication
>>> mechanisms
>>>> within the RPC layer as components that can have immediate benefit to
>>>> Hadoop users AND still allow flexibility in the larger design. So, I
>>> think
>>>> the best way to move the conversation of "what we are aiming for"
>> forward
>>>> is to start looking at code for these components. I am especially
>>>> interested in moving forward with pluggable authentication mechanisms
>>>> within the RPC layer and would love to see what others have done in
>> this
>>>> area (if anything).
>>>>>> 
>>>>>> Thanks.
>>>>>> 
>>>>>> -Brian
>>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
>>>>>> Sent: Wednesday, July 10, 2013 8:15 AM
>>>>>> To: Larry McCay
>>>>>> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
>>>>>> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>>>>>> 
>>>>>> Larry, all,
>>>>>> 
> >>>>>> It is still not clear to me what end state we are aiming for, or
> >>>>>> that we even agree on that.
>>>>>> 
> >>>>>> IMO, instead of trying to agree on what to do, we should first agree
> >>>>>> on the final state, then see what should be changed to get there,
> >>>>>> then see how we change things to get there.
>>>>>> 
>>>>>> The different documents out there focus more on how.
>>>>>> 
> >>>>>> We should not try to say how before we know what.
>>>>>> 
>>>>>> Thx.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
>> lmccay@hortonworks.com
>>>> 
>>>> wrote:
>>>>>> 
>>>>>>> All -
>>>>>>> 
>>>>>>> After combing through this thread - as well as the summit session
>>>>>>> summary thread, I think that we have the following two items that
>> we
>>>>>>> can probably move forward with:
>>>>>>> 
> >>>>>>> 1. TokenAuth method - assuming this means the pluggable
> >>>>>>> authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
> >>>>>>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>>>>>>> 
>>>>>>> I propose that we attack both of these aspects as one. Let's
>> provide
>>>>>>> the structure and interfaces of the pluggable framework for use in
>>>>>>> the RPC layer through leveraging Daryn's pluggability work and POC
>>> it
>>>>>>> with a particular token format (not necessarily the only format
>> ever
>>>>>>> supported - we just need one to start). If there has already been
>>>>>>> work done in this area by anyone then please speak up and commit
>> to
>>>>>>> providing a patch - so that we don't duplicate effort.
>>>>>>> 
>>>>>>> @Daryn - is there a particular Jira or set of Jiras that we can
>> look
>>>>>>> at to discern the pluggability mechanism details? Documentation of
>>> it
>>>>>>> would be great as well.
>>>>>>> @Kai - do you have existing code for the pluggable token
>>>>>>> authentication mechanism - if not, we can take a stab at
>>> representing
>>>>>>> it with interfaces and/or POC code.
>>>>>>> I can standup and say that we have a token format that we have
>> been
>>>>>>> working with already and can provide a patch that represents it
>> as a
>>>>>>> contribution to test out the pluggable tokenAuth.
>>>>>>> 
>>>>>>> These patches will provide progress toward code being the central
>>>>>>> discussion vehicle. As a community, we can then incrementally
>> build
>>>>>>> on that foundation in order to collaboratively deliver the common
>>>> vision.
>>>>>>> 
>>>>>>> In the absence of any other home for posting such patches, let's
>>>>>>> assume that they will be attached to HADOOP-9392 - or a dedicated
>>>>>>> subtask for this particular aspect/s - I will leave that detail to
>>>> Kai.
>>>>>>> 
>>>>>>> @Alejandro, being the only voice on this thread that isn't
>>>>>>> represented in the votes above, please feel free to agree or
>>> disagree
>>>> with this direction.
>>>>>>> 
>>>>>>> thanks,
>>>>>>> 
>>>>>>> --larry
>>>>>>> 
>>>>>>> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Andy -
>>>>>>>> 
>>>>>>>>> Happy Fourth of July to you and yours.
>>>>>>>> 
>>>>>>>> Same to you and yours. :-)
>>>>>>>> We had some fun in the sun for a change - we've had nothing but
>>> rain
>>>>>>>> on
>>>>>>> the east coast lately.
>>>>>>>> 
>>>>>>>>> My concern here is there may have been a misinterpretation or
>> lack
>>>>>>>>> of consensus on what is meant by "clean slate"
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Apparently so.
>>>>>>>> On the pre-summit call, I stated that I was interested in
>>>>>>>> reconciling
>>>>>>> the jiras so that we had one to work from.
>>>>>>>> 
>>>>>>>> You recommended that we set them aside for the time being - with
>>> the
>>>>>>> understanding that work would continue on your side (and our's as
>>>>>>> well) - and approach the community discussion from a clean slate.
>>>>>>>> We seemed to do this at the summit session quite well.
>>>>>>>> It was my understanding that this community discussion would live
>>>>>>>> beyond
>>>>>>> the summit and continue on this list.
>>>>>>>> 
>>>>>>>> While closing the summit session we agreed to follow up on
>>>>>>>> common-dev
>>>>>>> with first a summary then a discussion of the moving parts.
>>>>>>>> 
>>>>>>>> I never expected the previous work to be abandoned and fully
>>>>>>>> expected it
>>>>>>> to inform the discussion that happened here.
>>>>>>>> 
>>>>>>>> If you would like to reframe what clean slate was supposed to
>> mean
>>>>>>>> or
>>>>>>> describe what it means now - that would be welcome - before I
>> waste
> >>>>>>> any more time trying to facilitate a community discussion that is
>>>>>>> apparently not wanted.
>>>>>>>> 
> >>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and
> >>>>>>>>> such, which have been disappointing to see crop up; we should be
> >>>>>>>>> collaboratively coding, not planting flags.
>>>>>>>> 
>>>>>>>> I don't know what you mean by self-appointed master JIRAs.
>>>>>>>> It has certainly not been anyone's intention to disappoint.
>>>>>>>> Any mention of a new JIRA was just to have a clear context to
>>> gather
>>>>>>>> the
>>>>>>> agreed upon points - previous and/or existing JIRAs would easily
>> be
>>>> linked.
>>>>>>>> 
>>>>>>>> Planting flags... I need to go back and read my discussion point
>>>>>>>> about the
>>>>>>> JIRA and see how this is the impression that was made.
> >>>>>>>> That is not how I define success. The only flags that count are
> >>>>>>>> code.
>>>>>>> What we are lacking is the roadmap on which to put the code.
>>>>>>>> 
>>>>>>>>> I read Kai's latest document as something approaching today's
>>>>>>>>> consensus
>>>>>>> (or
>>>>>>>>> at least a common point of view?) rather than a historical
>>> document.
>>>>>>>>> Perhaps he and it can be given equal share of the consideration.
>>>>>>>> 
>>>>>>>> I definitely read it as something that has evolved into something
>>>>>>> approaching what we have been talking about so far. There has not
>>>>>>> however been enough discussion anywhere near the level of detail
>> in
>>>>>>> that document and more details are needed for each component in
>> the
>>>> design.
>>>>>>>> Why the work in that document should not be fed into the
>> community
>>>>>>> discussion as anyone else's would be - I fail to understand.
>>>>>>>> 
>>>>>>>> My suggestion continues to be that you should take that document
>>> and
>>>>>>> speak to the inventory of moving parts as we agreed.
>>>>>>>> As these are agreed upon, we will ensure that the appropriate
>>>>>>>> subtasks
>>>>>>> are filed against whatever JIRA is to host them - don't really
>> care
>>>>>>> much which it is.
>>>>>>>> 
>>>>>>>> I don't really want to continue with two separate JIRAs - as I
>>>>>>>> stated
>>>>>>> long ago - but until we understand what the pieces are and how
>> they
> >>>>>>> relate, they can't be consolidated.
>>>>>>>> Even if 9533 ended up being repurposed as the server instance of
>>> the
>>>>>>> work - it should be a subtask of a larger one - if that is to be
>>>>>>> 9392, so be it.
>>>>>>>> We still need to define all the pieces of the larger picture
>> before
>>>>>>>> that
>>>>>>> can be done.
>>>>>>>> 
>>>>>>>> What I thought was the clean slate approach to the discussion
>>> seemed
>>>>>>>> a
>>>>>>> very reasonable way to make all this happen.
>>>>>>>> If you would like to restate what you intended by it or something
>>>>>>>> else
>>>>>>> equally as reasonable as a way to move forward that would be
>>> awesome.
>>>>>>>> 
>>>>>>>> I will be happy to work toward the roadmap with everyone once it
>> is
>>>>>>> articulated, understood and actionable.
>>>>>>>> In the meantime, I have work to do.
>>>>>>>> 
>>>>>>>> thanks,
>>>>>>>> 
>>>>>>>> --larry
>>>>>>>> 
>>>>>>>> BTW - I meant to quote you in an earlier response and ended up
>>>>>>>> saying it
>>>>>>> was Aaron instead. Not sure what happened there. :-)
>>>>>>>> 
>>>>>>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org>
>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Larry (and all),
>>>>>>>>> 
>>>>>>>>> Happy Fourth of July to you and yours.
>>>>>>>>> 
>>>>>>>>> In our shop Kai and Tianyou are already doing the coding, so I'd
>>>>>>>>> defer
>>>>>>> to
>>>>>>>>> them on the detailed points.
>>>>>>>>> 
>>>>>>>>> My concern here is there may have been a misinterpretation or
>> lack
>>>>>>>>> of consensus on what is meant by "clean slate". Hopefully that
>> can
>>>>>>>>> be
>>>>>>> quickly
>>>>>>>>> cleared up. Certainly we did not mean ignore all that came
>> before.
>>>>>>>>> The
>>>>>>> idea
>>>>>>>>> was to reset discussions to find common ground and new direction
>>>>>>>>> where
>>>>>>> we
>>>>>>>>> are working together, not in conflict, on an agreed upon set of
>>>>>>>>> design points and tasks. There's been a lot of good discussion
>> and
>>>>>>>>> design preceding that we should figure out how to port over.
>>>>>>>>> Nowhere in this picture are self appointed "master JIRAs" and
>>> such,
>>>>>>>>> which have been disappointing to see crop up, we should be
>>>>>>>>> collaboratively coding not planting flags.
>>>>>>>>> 
>>>>>>>>> I read Kai's latest document as something approaching today's
>>>>>>>>> consensus
>>>>>>> (or
>>>>>>>>> at least a common point of view?) rather than a historical
>>> document.
>>>>>>>>> Perhaps he and it can be given equal share of the consideration.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>>>>>>>>> 
>>>>>>>>>> Hey Andrew -
>>>>>>>>>> 
>>>>>>>>>> I largely agree with that statement.
>>>>>>>>>> My intention was to let the differences be worked out within
>> the
>>>>>>>>>> individual components once they were identified and subtasks
>>>> created.
>>>>>>>>>> 
>>>>>>>>>> My reference to HSSO was really referring to a SSO *server*
>> based
>>>>>>> design
>>>>>>>>>> which was not clearly articulated in the earlier documents.
>>>>>>>>>> We aren't trying to compare and contrast one design over
>> another
>>>>>>> anymore.
>>>>>>>>>> 
>>>>>>>>>> Let's move this collaboration along as we've mapped out and the
>>>>>>>>>> differences in the details will reveal themselves and be
>>> addressed
>>>>>>> within
>>>>>>>>>> their components.
>>>>>>>>>> 
>>>>>>>>>> I've actually been looking forward to you weighing in on the
>>>>>>>>>> actual discussion points in this thread.
>>>>>>>>>> Could you do that?
>>>>>>>>>> 
>>>>>>>>>> At this point, I am most interested in your thoughts on a
>> single
>>>>>>>>>> jira
>>>>>>> to
>>>>>>>>>> represent all of this work and whether we should start
>> discussing
>>>>>>>>>> the
>>>>>>> SSO
>>>>>>>>>> Tokens.
>>>>>>>>>> If you think there are discussion points missing from that
>> list,
>>>>>>>>>> feel
>>>>>>> free
>>>>>>>>>> to add to it.
>>>>>>>>>> 
>>>>>>>>>> thanks,
>>>>>>>>>> 
>>>>>>>>>> --larry
>>>>>>>>>> 
>>>>>>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
>> apurtell@apache.org>
>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Larry,
>>>>>>>>>>> 
>>>>>>>>>>> Of course I'll let Kai speak for himself. However, let me
>> point
>>>>>>>>>>> out
>>>>>>> that,
>>>>>>>>>>> while the differences between the competing JIRAs have been
>>>>>>>>>>> reduced
>>>>>>> for
>>>>>>>>>>> sure, there were some key differences that didn't just
>>> disappear.
>>>>>>>>>>> Subsequent discussion will make that clear. I also disagree
>> with
>>>>>>>>>>> your characterization that we have simply endorsed all of the
>>>>>>>>>>> design
>>>>>>> decisions
>>>>>>>>>>> of the so-called HSSO, this is taking a mile from an inch. We
>>> are
>>>>>>> here to
>>>>>>>>>>> engage in a collaborative process as peers. I've been
>> encouraged
>>>>>>>>>>> by
>>>>>>> the
>>>>>>>>>>> spirit of the discussions up to this point and hope that can
>>>>>>>>>>> continue beyond one design summit.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
>>>>>>>>>>> <lm...@hortonworks.com>
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi Kai -
>>>>>>>>>>>> 
>>>>>>>>>>>> I think that I need to clarify something...
>>>>>>>>>>>> 
>>>>>>>>>>>> This is not an update for 9533 but a continuation of the
>>>>>>>>>>>> discussions
>>>>>>>>>> that
>>>>>>>>>>>> are focused on a fresh look at a SSO for Hadoop.
>>>>>>>>>>>> We've agreed to leave our previous designs behind and
>> therefore
>>>>>>>>>>>> we
>>>>>>>>>> aren't
>>>>>>>>>>>> really seeing it as an HSSO layered on top of TAS approach or
>>> an
>>>>>>> HSSO vs
>>>>>>>>>>>> TAS discussion.
>>>>>>>>>>>> 
>>>>>>>>>>>> Your latest design revision actually makes it clear that you
>>> are
>>>>>>>>>>>> now targeting exactly what was described as HSSO - so
>> comparing
>>>>>>>>>>>> and
>>>>>>>>>> contrasting
>>>>>>>>>>>> is not going to add any value.
>>>>>>>>>>>> 
>>>>>>>>>>>> What we need you to do at this point, is to look at those
>>>>>>>>>>>> high-level components described on this thread and comment on
>>>>>>>>>>>> whether we need additional components or any that are listed
>>>>>>>>>>>> that don't seem
>>>>>>> necessary
>>>>>>>>>> to
>>>>>>>>>>>> you and why.
>>>>>>>>>>>> In other words, we need to define and agree on the work that
>>> has
>>>>>>>>>>>> to
>>>>>>> be
>>>>>>>>>>>> done.
>>>>>>>>>>>> 
>>>>>>>>>>>> We also need to determine those components that need to be
>> done
>>>>>>> before
>>>>>>>>>>>> anything else can be started.
>>>>>>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
>>>>>>>>>>>> central to
>>>>>>>>>> all
>>>>>>>>>>>> the other components and should probably be defined and POC'd
>>> in
>>>>>>> short
>>>>>>>>>>>> order.
>>>>>>>>>>>> 
>>>>>>>>>>>> Personally, I think that continuing the separation of 9533
>> and
>>>>>>>>>>>> 9392
>>>>>>> will
>>>>>>>>>>>> do this effort a disservice. There doesn't seem to be enough
>>>>>>> differences
>>>>>>>>>>>> between the two to justify separate jiras anymore. It may be
>>>>>>>>>>>> best to
>>>>>>>>>> file a
>>>>>>>>>>>> new one that reflects a single vision without the extra cruft
>>>>>>>>>>>> that
>>>>>>> has
>>>>>>>>>>>> built up in either of the existing ones. We would certainly
>>>>>>>>>>>> reference
>>>>>>>>>> the
>>>>>>>>>>>> existing ones within the new one. This approach would align
>>> with
>>>>>>>>>>>> the
>>>>>>>>>> spirit
>>>>>>>>>>>> of the discussions up to this point.
>>>>>>>>>>>> 
>>>>>>>>>>>> I am prepared to start a discussion around the shape of the
>> two
>>>>>>> Hadoop
>>>>>>>>>> SSO
>>>>>>>>>>>> tokens: identity and access. If this is what others feel the
>>>>>>>>>>>> next
>>>>>>> topic
>>>>>>>>>>>> should be.
>>>>>>>>>>>> If we can identify a jira home for it, we can do it there -
>>>>>>> otherwise we
>>>>>>>>>>>> can create another DISCUSS thread for it.
>>>>>>>>>>>> 
>>>>>>>>>>>> thanks,
>>>>>>>>>>>> 
>>>>>>>>>>>> --larry
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
>> kai.zheng@intel.com>
>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Larry,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks for the update. Good to see that with this update we
>>> are
>>>>>>>>>>>>> now
>>>>>>>>>>>> aligned on most points.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The
>>>>>>>>>>>>> new
>>>>>>>>>>>> revision incorporates feedback and suggestions in related
>>>>>>>>>>>> discussion
>>>>>>>>>> with
>>>>>>>>>>>> the community, particularly from Microsoft and others
>> attending
>>>>>>>>>>>> the Security design lounge session at the Hadoop summit.
>>> Summary
>>>>>>>>>>>> of the
>>>>>>>>>> changes:
>>>>>>>>>>>>> 1.    Revised the approach to now use two tokens, Identity
>>> Token
>>>>>>> plus
>>>>>>>>>>>> Access Token, particularly considering our authorization
>>>>>>>>>>>> framework
>>>>>>> and
>>>>>>>>>>>> compatibility with HSSO;
>>>>>>>>>>>>> 2.    Introduced Authorization Server (AS) from our
>>>> authorization
>>>>>>>>>>>> framework into the flow that issues access tokens for clients
>>>>>>>>>>>> with
>>>>>>>>>> identity
>>>>>>>>>>>> tokens to access services;
>>>>>>>>>>>>> 3.    Refined proxy access token and the proxy/impersonation
>>>> flow;
>>>>>>>>>>>>> 4.    Refined the browser web SSO flow regarding access to
>>>> Hadoop
>>>>>>> web
>>>>>>>>>>>> services;
>>>>>>>>>>>>> 5.    Added Hadoop RPC access flow regard
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> 
>>>>>>>>> - Andy
>>>>>>>>> 
>>>>>>>>> Problems worthy of attack prove their worth by hitting back. -
>>> Piet
>>>>>>>>> Hein (via Tom White)
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Alejandro
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> <Iteration1PluggableUserAuthenticationandFederation.pdf>
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> Alejandro
>>> 
>> 
> 
> 
> 
> -- 
> Alejandro


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
yep, that is what I meant. Thanks Chris


On Tue, Aug 6, 2013 at 1:12 PM, Chris Nauroth <cn...@hortonworks.com>wrote:

> Perhaps this is also a good opportunity to try out the new "branch
> committers" clause in the bylaws, enabling non-committers who are working
> on this to commit to the feature branch.
>
>
> http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
>
>
>
> On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur <tucu@cloudera.com
> >wrote:
>
> > Larry,
> >
> > Sorry for the delay in answering. Thanks for laying things out; yes, it
> makes
> > sense.
> >
> > Given the large scope of the changes, number of JIRAs and number of
> > developers involved, wouldn't it make sense to create a feature branch for
> all
> > this work not to destabilize (more ;) trunk?
> >
> > Thanks again.
> >
> >
> > On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <lm...@hortonworks.com>
> > wrote:
> >
> > > The following JIRA was filed to provide a token and basic authority
> > > implementation for this effort:
> > > https://issues.apache.org/jira/browse/HADOOP-9781
> > >
> > > I have attached an initial patch though have yet to submit it as one
> > since
> > > it is dependent on the patch for CMF that was posted to:
> > > https://issues.apache.org/jira/browse/HADOOP-9534
> > > and this patch still has a couple outstanding issues - javac warnings
> for
> > > com.sun classes for certificate generation and 11 javadoc warnings.
> > >
> > > Please feel free to review the patches and raise any questions or
> > concerns
> > > related to them.
> > >
> > > On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com>
> wrote:
> > >
> > > > Hello All -
> > > >
> > > > In an effort to scope an initial iteration that provides value to the
> > > community while focusing on the pluggable authentication aspects, I've
> > > written a description for "Iteration 1". It identifies the goal of the
> > > iteration, the endstate and a set of initial usecases. It also
> enumerates
> > > the components that are required for each usecase. There is a scope
> > section
> > > that details specific things that should be kept out of the first
> > > iteration. This is certainly up for discussion. There may be some of
> > these
> > > things that can be contributed in short order. If we can add some
> things
> > in
> > > without unnecessary complexity for the identified usecases then we
> > should.
> > > >
> > > > @Alejandro - please review this and see whether it satisfies your
> point
> > > for a definition of what we are building.
> > > >
> > > > In addition to the document that I will paste here as text and
> attach a
> > > pdf version, we have a couple patches for components that are
> identified
> > in
> > > the document.
> > > > Specifically, COMP-7 and COMP-8.
> > > >
> > > > I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was
> filed
> > > specifically for that functionality.
> > > > COMP-7 is a small set of classes to introduce JsonWebToken as the
> token
> > > format and a basic JsonWebTokenAuthority that can issue and verify
> these
> > > tokens.
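To make the COMP-7 shape more concrete, here is a minimal sketch of a token authority that issues and verifies compact signed tokens in the header.claims.signature style. Everything in it is an assumption for illustration: the class and method names are invented (not the actual JsonWebToken/JsonWebTokenAuthority patch), and HMAC-SHA256 stands in for the PKI-based signing that COMP-8 would supply.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    # URL-safe base64 without padding, as used by compact JWS serialization
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(s: str) -> bytes:
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))


class TokenAuthority:
    """Illustrative authority issuing/verifying JWT-like tokens."""

    def __init__(self, secret: bytes):
        # HMAC key; a real authority would hold a signing keypair (COMP-8)
        self.secret = secret

    def _sign(self, signing_input: bytes) -> str:
        return _b64(hmac.new(self.secret, signing_input, hashlib.sha256).digest())

    def issue(self, subject: str, ttl_secs: int = 3600) -> str:
        header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
        claims = _b64(json.dumps(
            {"sub": subject, "exp": int(time.time()) + ttl_secs}).encode())
        sig = self._sign(f"{header}.{claims}".encode())
        return f"{header}.{claims}.{sig}"

    def verify(self, token: str) -> dict:
        header, claims, sig = token.split(".")
        if not hmac.compare_digest(sig, self._sign(f"{header}.{claims}".encode())):
            raise ValueError("bad signature")
        payload = json.loads(_unb64(claims))
        if payload["exp"] < time.time():
            raise ValueError("token expired")
        return payload
```

The claim set here is the bare minimum (subject and expiry); the real token format and claims are exactly what the community review of the patches would decide.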
> > > >
> > > > Since there is no JIRA for this yet, I will likely file a new JIRA
> for
> > a
> > > SSO token implementation.
> > > >
> > > > Both of these patches assume to be modules within
> > > hadoop-common/hadoop-common-project.
> > > > While they are relatively small, I think that they will be pulled in
> by
> > > other modules such as hadoop-auth which would likely not want a
> > dependency
> > > on something larger like
> > hadoop-common/hadoop-common-project/hadoop-common.
> > > >
> > > > This is certainly something that we should discuss within the
> community
> > > for this effort though - that being, exactly how to add these libraries
> > so
> > > that they are most easily consumed by existing projects.
> > > >
> > > > Anyway, the following is the Iteration-1 document - it is also
> attached
> > > as a pdf:
> > > >
> > > > Iteration 1: Pluggable User Authentication and Federation
> > > >
> > > > Introduction
> > > > The intent of this effort is to bootstrap the development of
> pluggable
> > > token-based authentication mechanisms to support certain goals of
> > > enterprise authentication integrations. By restricting the scope of
> this
> > > effort, we hope to provide immediate benefit to the community while
> > keeping
> > > the initial contribution to a manageable size that can be easily
> > reviewed,
> > > understood and extended with further development through follow up
> JIRAs
> > > and related iterations.
> > > >
> > > > Iteration Endstate
> > > > Once complete, this effort will have extended the authentication
> > > mechanisms - for all client types - from the existing: Simple, Kerberos
> > and
> > > Plain (for RPC) to include LDAP authentication and SAML based
> federation.
> > > In addition, the ability to provide additional/custom authentication
> > > mechanisms will be enabled for users to plug in their preferred
> > mechanisms.
> > > >
> > > > Project Scope
> > > > The scope of this effort is a subset of the features covered by the
> > > overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on
> > > enabling Hadoop to issue and accept/validate SSO tokens of its own. The
> > > pluggable authentication mechanism within SASL/RPC layer and the
> > > authentication filter pluggability for REST and UI components will be
> > > leveraged and extended to support the results of this effort.
> > > >
> > > > Out of Scope
> > > > In order to scope the initial deliverable as the minimally viable
> > > product, a handful of things have been simplified or left out of scope
> > for
> > > this effort. This is not meant to say that these aspects are not useful
> > or
> > > not needed but that they are not necessary for this iteration. We do
> > > however need to ensure that we don’t do anything to preclude adding
> them
> > in
> > > future iterations.
> > > > 1. Additional Attributes - the result of authentication will continue
> > to
> > > use the existing hadoop tokens and identity representations. Additional
> > > attributes used for finer grained authorization decisions will be added
> > > through follow-up efforts.
> > > > 2. Token revocation - the ability to revoke issued identity tokens
> will
> > > be added later
> > > > 3. Multi-factor authentication - this will likely require additional
> > > attributes and is not necessary for this iteration.
> > > > 4. Authorization changes - we will require additional attributes for
> > the
> > > fine-grained access control plans. This is not needed for this
> iteration.
> > > > 5. Domains - we assume a single flat domain for all users
> > > > 6. Kinit alternative - we can leverage existing REST clients such as
> > > cURL to retrieve tokens through authentication and federation for the
> > time
> > > being
> > > > 7. A specific authentication framework isn’t really necessary within
> > the
> > > REST endpoints for this iteration. If one is available then we can use
> it
> > > otherwise we can leverage existing things like Apache Shiro within a
> > > servlet filter.
> > > >
> > > > In Scope
> > > > What is in scope for this effort is defined by the usecases described
> > > below. Components required for supporting the usecases are summarized
> for
> > > each client type. Each component is a candidate for a JIRA subtask -
> > though
> > > multiple components are likely to be included in a JIRA to represent a
> > set
> > > of functionality rather than individual JIRAs per component.
> > > >
> > > > Terminology and Naming
> > > > The terms and names of components within this document are merely
> > > descriptive of the functionality that they represent. Any similarity or
> > > difference in names or terms from those that are found in other
> documents
> > > are not intended to make any statement about those other documents or
> the
> > > descriptions within. This document represents the pluggable
> > authentication
> > > mechanisms and server functionality required to replace Kerberos.
> > > >
> > > > Ultimately, the naming of the implementation classes will be a
> product
> > > of the patches accepted by the community.
> > > >
> > > > Usecases:
> > > > client types: REST, CLI, UI
> > > > authentication types: Simple, Kerberos, authentication/LDAP,
> > > federation/SAML
> > > >
> > > > Simple and Kerberos
> > > > Simple and Kerberos usecases continue to work as they do today. The
> > > addition of Authentication/LDAP and Federation/SAML are added through
> the
> > > existing pluggability points either as they are or with required
> > extension.
> > > Either way, continued support for Simple and Kerberos must not require
> > > changes to existing deployments in the field as a result of this
> effort.
> > > >
> > > > REST
> > > > USECASE REST-1 Authentication/LDAP:
> > > > For REST clients, we will provide the ability to:
> > > > 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed
> by
> > > an AuthenticationServer instance via REST calls to:
> > > >    a. authenticate - passing username/password returning a hadoop
> > > id_token
> > > >    b. get-access-token - from the TokenGrantingService by passing the
> > > hadoop id_token as an Authorization: Bearer token along with the
> desired
> > > service name (master service name) returning a hadoop access token
> > > > 2. Successfully invoke a hadoop service REST API passing the hadoop
> > > access token through an HTTP header as an Authorization Bearer token
> > > >    a. validation of the incoming token on the service endpoint is
> > > accomplished by an SSOAuthenticationHandler
> > > > 3. Successfully block access to a REST resource when presenting a
> > hadoop
> > > access token intended for a different service
> > > >    a. validation of the incoming token on the service endpoint is
> > > accomplished by an SSOAuthenticationHandler
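The REST-1 steps above can be simulated end to end with in-memory stubs. The class below is entirely hypothetical (names and token formats are invented for illustration); its point is the property asserted in steps 2 and 3: an access token is scoped to one service and a token presented to a different service is blocked.

```python
class MiniTokenFlow:
    """In-memory stand-ins for the IdP endpoint, the TokenGrantingService,
    and the SSOAuthenticationHandler described in REST-1."""

    def __init__(self):
        self._id_tokens = {}      # id_token -> username
        self._access_tokens = {}  # access_token -> (username, service)

    def authenticate(self, username, password):
        # 1a. "authenticate": any non-empty password stands in for the LDAP bind
        if not password:
            raise PermissionError("bad credentials")
        id_token = f"id-{username}"
        self._id_tokens[id_token] = username
        return id_token

    def get_access_token(self, id_token, service):
        # 1b. "get-access-token": exchange an id_token for a service-scoped token
        user = self._id_tokens[id_token]
        access_token = f"at-{user}-{service}"
        self._access_tokens[access_token] = (user, service)
        return access_token

    def invoke(self, service, access_token):
        # 2a/3a. the handler validates the token and checks its service scope
        entry = self._access_tokens.get(access_token)
        if entry is None or entry[1] != service:
            return 401  # step 3: token for a different service is blocked
        return 200
```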
> > > >
> > > > USECASE REST-2 Federation/SAML:
> > > > We will also provide federation capabilities for REST clients such
> > that:
> > > > 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
> > > persist in a permissions-protected file - i.e.
> ~/.hadoop_tokens/.idp_token
> > > > 2. use cURL to Federate a token from a trusted IdP through an SP
> > > endpoint exposed by an AuthenticationServer(FederationServer?) instance
> > via
> > > REST calls to:
> > > >    a. federate - passing a SAML assertion as an Authorization: Bearer
> > > token returning a hadoop id_token
> > > >       - can copy and paste from commandline or use cat to include
> > > persisted token through "--Header Authorization: Bearer 'cat
> > > ~/.hadoop_tokens/.id_token'"
> > > >    b. get-access-token - from the TokenGrantingService by passing the
> > > hadoop id_token as an Authorization: Bearer token along with the
> desired
> > > service name (master service name) to the TokenGrantingService
> returning
> > a
> > > hadoop access token
> > > > 3. Successfully invoke a hadoop service REST API passing the hadoop
> > > access token through an HTTP header as an Authorization Bearer token
> > > >    a. validation of the incoming token on the service endpoint is
> > > accomplished by an SSOAuthenticationHandler
> > > > 4. Successfully block access to a REST resource when presenting a
> > hadoop
> > > access token intended for a different service
> > > >    a. validation of the incoming token on the service endpoint is
> > > accomplished by an SSOAuthenticationHandler
> > > >
> > > > REQUIRED COMPONENTS for REST USECASES:
> > > > COMP-1. REST client - cURL or similar
> > > > COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint
> > > example - returning hadoop id_token
> > > > COMP-3. REST endpoint for federation with SAML Bearer token -
> > shibboleth
> > > SP?|OpenSAML? - returning hadoop id_token
> > > > COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop access
> > > tokens from hadoop id_tokens
> > > > COMP-5. SSOAuthenticationHandler to validate incoming hadoop access
> > > tokens
> > > > COMP-6. some source of a SAML assertion - shibboleth IdP?
> > > > COMP-7. hadoop token and authority implementations
> > > > COMP-8. core services for crypto support for signing, verifying and
> PKI
> > > management
> > > >
> > > > CLI
> > > > USECASE CLI-1 Authentication/LDAP:
> > > > For CLI/RPC clients, we will provide the ability to:
> > > > 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed
> by
> > > an AuthenticationServer instance via REST calls to:
> > > >    a. authenticate - passing username/password returning a hadoop
> > > id_token
> > > >       - for RPC clients we need to persist the returned hadoop
> identity
> > > token in a file protected by fs permissions so that it may be leveraged
> > > until expiry
> > > >       - directing the returned response to a file may suffice for now
> > > something like ">~/.hadoop_tokens/.id_token"
> > > > 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> > > >    a. RPC client negotiates a TokenAuth method through SASL layer,
> > > hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed
> as
> > > Authorization: Bearer token to the get-access-token REST endpoint
> exposed
> > > by TokenGrantingService returning a hadoop access token
> > > >    b. RPC server side validates the presented hadoop access token and
> > > continues to serve request
> > > >    c. Successfully invoke a hadoop service RPC API
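Step 1 of CLI-1 notes that the returned identity token must be persisted "in a file protected by fs permissions". A minimal sketch of that, assuming a POSIX filesystem; the directory and file names simply mirror the ~/.hadoop_tokens convention above.

```python
import os


def persist_token(token_dir: str, name: str, token: str) -> str:
    """Write a token file readable only by its owner (mode 0600)."""
    os.makedirs(token_dir, mode=0o700, exist_ok=True)
    path = os.path.join(token_dir, name)
    # Passing 0600 to O_CREAT avoids a window where the file briefly
    # exists with looser, umask-derived default permissions
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)
    return path
```

Simply redirecting cURL output with ">" (as suggested above) also works for now, but leaves the file at the umask default rather than guaranteeing owner-only access.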
> > > >
> > > > USECASE CLI-2 Federation/SAML:
> > > > For CLI/RPC clients, we will provide the ability to:
> > > > 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
> > > persist in a permissions-protected file - i.e.
> ~/.hadoop_tokens/.idp_token
> > > > 2. use cURL to Federate a token from a trusted IdP through an SP
> > > endpoint exposed by an AuthenticationServer(FederationServer?) instance
> > via
> > > REST calls to:
> > > >    a. federate - passing a SAML assertion as an Authorization: Bearer
> > > token returning a hadoop id_token
> > > >       - can copy and paste from commandline or use cat to include
> > > previously persisted token through "--Header Authorization: Bearer 'cat
> > > ~/.hadoop_tokens/.id_token'"
> > > > 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> > > >    a. RPC client negotiates a TokenAuth method through SASL layer,
> > > hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed
> as
> > > Authorization: Bearer token to the get-access-token REST endpoint
> exposed
> > > by TokenGrantingService returning a hadoop access token
> > > >    b. RPC server side validates the presented hadoop access token and
> > > continues to serve request
> > > >    c. Successfully invoke a hadoop service RPC API
> > > >
> > > > REQUIRED COMPONENTS for CLI USECASES - (beyond those required for
> > REST):
> > > > COMP-9. TokenAuth Method negotiation, etc
> > > > COMP-10. Client side implementation to leverage REST endpoint for
> > > acquiring hadoop access tokens given a hadoop id_token
> > > > COMP-11. Server side implementation to validate incoming hadoop
> access
> > > tokens
> > > >
> > > > UI
> > > > Various Hadoop services have their own web UI consoles for
> > > administration and end user interactions. These consoles need to also
> > > benefit from the pluggability of authentication mechanisms to be on par
> > > with the access control of the cluster REST and RPC APIs.
> > > > Web consoles are protected with a WebSSOAuthenticationHandler which
> > > will be configured for either authentication or federation.
> > > >
> > > > USECASE UI-1 Authentication/LDAP:
> > > > For the authentication usecase:
> > > > 1. User’s browser requests access to a UI console page
> > > > 2. WebSSOAuthenticationHandler intercepts the request and redirects
> the
> > > browser to an IdP web endpoint exposed by the AuthenticationServer
> > passing
> > > the requested url as the redirect_url
> > > > 3. IdP web endpoint presents the user with a FORM over https
> > > >    a. user provides username/password and submits the FORM
> > > > 4. AuthenticationServer authenticates the user with provided
> > credentials
> > > against the configured LDAP server and:
> > > >    a. leverages a servlet filter or other authentication mechanism
> for
> > > the endpoint and authenticates the user with a simple LDAP bind with
> > > username and password
> > > >    b. acquires a hadoop id_token and uses it to acquire the required
> > > hadoop access token which is added as a cookie
> > > >    c. redirects the browser to the original service UI resource via
> the
> > > provided redirect_url
> > > > 5. WebSSOAuthenticationHandler for the original UI resource
> > interrogates
> > > the incoming request again for an authcookie that contains an access
> > token
> > > and, upon finding one:
> > > >    a. validates the incoming token
> > > >    b. returns the AuthenticationToken as per AuthenticationHandler
> > > contract
> > > >    c. AuthenticationFilter adds the hadoop auth cookie with the
> > expected
> > > token
> > > >    d. serves requested resource for valid tokens
> > > >    e. subsequent requests are handled by the AuthenticationFilter
> > > recognition of the hadoop auth cookie
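The core decision the WebSSOAuthenticationHandler makes in steps 2 and 5 can be sketched as a single function. The function name, cookie name, and IdP URL below are all assumptions for illustration, and the validation callback stands in for whatever token validation the deployment configures.

```python
from urllib.parse import urlencode

IDP_LOGIN_URL = "https://authserver.example.com/idp"  # hypothetical endpoint


def handle_request(url, cookies, validate_token):
    """Serve the resource when the auth cookie holds a valid access token;
    otherwise redirect the browser to the IdP with the requested URL as
    redirect_url (UI-1 steps 2 and 5)."""
    token = cookies.get("hadoop-auth")  # hypothetical cookie name
    if token is not None and validate_token(token):
        return ("serve", url)  # step 5: valid access token found
    # step 2: no valid token; redirect to the IdP FORM login
    return ("redirect", f"{IDP_LOGIN_URL}?{urlencode({'redirect_url': url})}")
```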
> > > >
> > > > USECASE UI-2 Federation/SAML:
> > > > For the federation usecase:
> > > > 1. User’s browser requests access to a UI console page
> > > > 2. WebSSOAuthenticationHandler intercepts the request and redirects
> the
> > > browser to an SP web endpoint exposed by the AuthenticationServer
> passing
> > > the requested url as the redirect_url. This endpoint:
> > > >    a. is dedicated to redirecting to the external IdP passing the
> > > required parameters which may include a redirect_url back to itself as
> > well
> > > as encoding the original redirect_url so that it can determine it on
> the
> > > way back to the client
> > > > 3. the IdP:
> > > >    a. challenges the user for credentials and authenticates the user
> > > >    b. creates appropriate token/cookie and redirects back to the
> > > AuthenticationServer endpoint
> > > > 4. AuthenticationServer endpoint:
> > > >    a. extracts the expected token/cookie from the incoming request
> and
> > > validates it
> > > >    b. creates a hadoop id_token
> > > >    c. acquires a hadoop access token for the id_token
> > > >    d. creates appropriate cookie and redirects back to the original
> > > redirect_url - being the requested resource
> > > > 5. WebSSOAuthenticationHandler for the original UI resource
> > interrogates
> > > the incoming request again for an authcookie that contains an access
> > token
> > > and, upon finding one:
> > > >    a. validates the incoming token
> > > >    b. returns the AuthenticationToken as per AuthenticationHandler
> > > contract
> > > >    c. AuthenticationFilter adds the hadoop auth cookie with the
> > expected
> > > token
> > > >    d. serves requested resource for valid tokens
> > > >    e. subsequent requests are handled by the AuthenticationFilter
> > > recognition of the hadoop auth cookie
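Step 2a of UI-2 hinges on encoding the original redirect_url so the SP endpoint can recover it after the external IdP round trip. A sketch of that round trip, with hypothetical endpoint URLs and parameter names:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SP_ENDPOINT = "https://authserver.example.com/sp"  # hypothetical endpoint


def outbound_idp_url(idp_login_url: str, original_url: str) -> str:
    # Step 2a: redirect to the external IdP, asking it to return to the
    # SP endpoint with the original resource URL encoded inside return_to
    return_to = f"{SP_ENDPOINT}?{urlencode({'redirect_url': original_url})}"
    return f"{idp_login_url}?{urlencode({'return': return_to})}"


def recover_original_url(sp_request_url: str) -> str:
    # Step 4d: back at the SP endpoint, extract the original redirect_url
    return parse_qs(urlsplit(sp_request_url).query)["redirect_url"][0]
```

Because the original URL is percent-encoded when nested inside return_to, it survives the double round trip intact, including its own query string.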
> > > > REQUIRED COMPONENTS for UI USECASES:
> > > > COMP-12. WebSSOAuthenticationHandler
> > > > COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based
> > > login
> > > > COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party
> > token
> > > federation
> > > >
> > > >
> > > >
> > > > On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <
> Brian.Swan@microsoft.com>
> > > wrote:
> > > > Thanks, Larry. That is what I was trying to say, but you've said it
> > > better and in more detail. :-) To extract from what you are saying: "If
> > we
> > > were to reframe the immediate scope to the lowest common denominator of
> > > what is needed for accepting tokens in authentication plugins then we
> > > gain... an end-state for the lowest common denominator that enables
> code
> > > patches in the near-term is the best of both worlds."
> > > >
> > > > -Brian
> > > >
> > > > -----Original Message-----
> > > > From: Larry McCay [mailto:lmccay@hortonworks.com]
> > > > Sent: Wednesday, July 10, 2013 10:40 AM
> > > > To: common-dev@hadoop.apache.org
> > > > Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> > > > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> > > >
> > > > It seems to me that we can have the best of both worlds here...it's
> all
> > > about the scoping.
> > > >
> > > > If we were to reframe the immediate scope to the lowest common
> > > denominator of what is needed for accepting tokens in authentication
> > > plugins then we gain:
> > > >
> > > > 1. a very manageable scope to define and agree upon
> > > > 2. a deliverable that should be useful in and of itself
> > > > 3. a foundation for community collaboration that we build on for
> > > > higher level solutions built on this lowest common denominator and
> > > > experience as a working community
> > > >
> > > > So, to Alejandro's point, perhaps we need to define what would make
> #2
> > > above true - this could serve as the "what" we are building instead of
> > the
> > > "how" to build it.
> > > > Including:
> > > > a. project structure within hadoop-common-project/common-security
> > > > or the like
> > > > b. the usecases that would need to be enabled to make it a self
> > > > contained and useful contribution - without higher level solutions
> > > > c. the JIRA/s for contributing patches
> > > > d. what specific patches will be needed to accomplish the usecases
> > > > in #b
> > > >
> > > > In other words, an end-state for the lowest common denominator that
> > > enables code patches in the near-term is the best of both worlds.
> > > >
> > > > I think this may be a good way to bootstrap the collaboration process
> > > for our emerging security community rather than trying to tackle a huge
> > > vision all at once.
> > > >
> > > > @Alejandro - if you have something else in mind that would bootstrap
> > > this process - that would great - please advise.
> > > >
> > > > thoughts?
> > > >
> > > > On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
> > > wrote:
> > > >
> > > > > Hi Alejandro, all-
> > > > >
> > > > > There seems to be agreement on the broad stroke description of the
> > > components needed to achieve pluggable token authentication (I'm sure
> > I'll
> > > be corrected if that isn't the case). However, discussion of the
> details
> > of
> > > those components doesn't seem to be moving forward. I think this is
> > because
> > > the details are really best understood through code. I also see *a*
> (i.e.
> > > one of many possible) token format and pluggable authentication
> > mechanisms
> > > within the RPC layer as components that can have immediate benefit to
> > > Hadoop users AND still allow flexibility in the larger design. So, I
> > think
> > > the best way to move the conversation of "what we are aiming for"
> forward
> > > is to start looking at code for these components. I am especially
> > > interested in moving forward with pluggable authentication mechanisms
> > > within the RPC layer and would love to see what others have done in
> this
> > > area (if anything).
> > > > >
> > > > > Thanks.
> > > > >
> > > > > -Brian
> > > > >
> > > > > -----Original Message-----
> > > > > From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> > > > > Sent: Wednesday, July 10, 2013 8:15 AM
> > > > > To: Larry McCay
> > > > > Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> > > > > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> > > > >
> > > > > Larry, all,
> > > > >
> > > > > It is still not clear to me what end state we are aiming for, or
> > > whether we even agree on that.
> > > > >
> > > > > IMO, instead of trying to agree on what to do, we should first agree on
> > > the final state, then see what needs to change to get there, then see
> > > how we change things to get there.
> > > > >
> > > > > The different documents out there focus more on how.
> > > > >
> > > > > We should not try to say how before we know what.
> > > > >
> > > > > Thx.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <
> lmccay@hortonworks.com
> > >
> > > wrote:
> > > > >
> > > > >> All -
> > > > >>
> > > > >> After combing through this thread - as well as the summit session
> > > > >> summary thread, I think that we have the following two items that
> we
> > > > >> can probably move forward with:
> > > > >>
> > > > >> 1. TokenAuth method - assuming this means the pluggable
> > > > >> authentication mechanisms within the RPC layer (2 votes: Kai and
> > > > >> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself)
> > > > >>
> > > > >> I propose that we attack both of these aspects as one. Let's
> provide
> > > > >> the structure and interfaces of the pluggable framework for use in
> > > > >> the RPC layer through leveraging Daryn's pluggability work and POC
> > it
> > > > >> with a particular token format (not necessarily the only format
> ever
> > > > >> supported - we just need one to start). If there has already been
> > > > >> work done in this area by anyone then please speak up and commit
> to
> > > > >> providing a patch - so that we don't duplicate effort.
> > > > >>
> > > > >> @Daryn - is there a particular Jira or set of Jiras that we can
> look
> > > > >> at to discern the pluggability mechanism details? Documentation of
> > it
> > > > >> would be great as well.
> > > > >> @Kai - do you have existing code for the pluggable token
> > > > >> authentication mechanism - if not, we can take a stab at
> > representing
> > > > >> it with interfaces and/or POC code.
> > > > >> I can standup and say that we have a token format that we have
> been
> > > > >> working with already and can provide a patch that represents it
> as a
> > > > >> contribution to test out the pluggable tokenAuth.
> > > > >>
> > > > >> These patches will provide progress toward code being the central
> > > > >> discussion vehicle. As a community, we can then incrementally
> build
> > > > >> on that foundation in order to collaboratively deliver the common
> > > vision.
> > > > >>
> > > > >> In the absence of any other home for posting such patches, let's
> > > > >> assume that they will be attached to HADOOP-9392 - or a dedicated
> > > > >> subtask for this particular aspect/s - I will leave that detail to
> > > Kai.
> > > > >>
> > > > >> @Alejandro, being the only voice on this thread that isn't
> > > > >> represented in the votes above, please feel free to agree or
> > disagree
> > > with this direction.
> > > > >>
> > > > >> thanks,
> > > > >>
> > > > >> --larry
> > > > >>
> > > > >> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
> > > wrote:
> > > > >>
> > > > >>> Hi Andy -
> > > > >>>
> > > > >>>> Happy Fourth of July to you and yours.
> > > > >>>
> > > > >>> Same to you and yours. :-)
> > > > >>> We had some fun in the sun for a change - we've had nothing but
> > rain
> > > > >>> on
> > > > >> the east coast lately.
> > > > >>>
> > > > >>>> My concern here is there may have been a misinterpretation or
> lack
> > > > >>>> of consensus on what is meant by "clean slate"
> > > > >>>
> > > > >>>
> > > > >>> Apparently so.
> > > > >>> On the pre-summit call, I stated that I was interested in
> > > > >>> reconciling
> > > > >> the jiras so that we had one to work from.
> > > > >>>
> > > > >>> You recommended that we set them aside for the time being - with
> > the
> > > > >> understanding that work would continue on your side (and our's as
> > > > >> well) - and approach the community discussion from a clean slate.
> > > > >>> We seemed to do this at the summit session quite well.
> > > > >>> It was my understanding that this community discussion would live
> > > > >>> beyond
> > > > >> the summit and continue on this list.
> > > > >>>
> > > > >>> While closing the summit session we agreed to follow up on
> > > > >>> common-dev
> > > > >> with first a summary then a discussion of the moving parts.
> > > > >>>
> > > > >>> I never expected the previous work to be abandoned and fully
> > > > >>> expected it
> > > > >> to inform the discussion that happened here.
> > > > >>>
> > > > >>> If you would like to reframe what clean slate was supposed to
> mean
> > > > >>> or
> > > > >> describe what it means now - that would be welcome - before I
> waste
> > > > >> anymore time trying to facilitate a community discussion that is
> > > > >> apparently not wanted.
> > > > >>>
> > > > >>>> Nowhere in this
> > > > >>>> picture are self appointed "master JIRAs" and such, which have
> > been
> > > > >>>> disappointing to see crop up, we should be collaboratively
> coding
> > > > >>>> not planting flags.
> > > > >>>
> > > > >>> I don't know what you mean by self-appointed master JIRAs.
> > > > >>> It has certainly not been anyone's intention to disappoint.
> > > > >>> Any mention of a new JIRA was just to have a clear context to
> > gather
> > > > >>> the
> > > > >> agreed upon points - previous and/or existing JIRAs would easily
> be
> > > linked.
> > > > >>>
> > > > >>> Planting flags... I need to go back and read my discussion point
> > > > >>> about the
> > > > >> JIRA and see how this is the impression that was made.
> > > > >>> That is not how I define success. The only flag that counts is
> > code.
> > > > >> What we are lacking is the roadmap on which to put the code.
> > > > >>>
> > > > >>>> I read Kai's latest document as something approaching today's
> > > > >>>> consensus
> > > > >> (or
> > > > >>>> at least a common point of view?) rather than a historical
> > document.
> > > > >>>> Perhaps he and it can be given equal share of the consideration.
> > > > >>>
> > > > >>> I definitely read it as something that has evolved into something
> > > > >> approaching what we have been talking about so far. There has not
> > > > >> however been enough discussion anywhere near the level of detail
> in
> > > > >> that document and more details are needed for each component in
> the
> > > design.
> > > > >>> Why the work in that document should not be fed into the
> community
> > > > >> discussion as anyone else's would be - I fail to understand.
> > > > >>>
> > > > >>> My suggestion continues to be that you should take that document
> > and
> > > > >> speak to the inventory of moving parts as we agreed.
> > > > >>> As these are agreed upon, we will ensure that the appropriate
> > > > >>> subtasks
> > > > >> are filed against whatever JIRA is to host them - don't really
> care
> > > > >> much which it is.
> > > > >>>
> > > > >>> I don't really want to continue with two separate JIRAs - as I
> > > > >>> stated
> > > > >> long ago - but until we understand what the pieces are and how
> they
> > > > >> relate then they can't be consolidated.
> > > > >>> Even if 9533 ended up being repurposed as the server instance of
> > the
> > > > >> work - it should be a subtask of a larger one - if that is to be
> > > > >> 9392, so be it.
> > > > >>> We still need to define all the pieces of the larger picture
> before
> > > > >>> that
> > > > >> can be done.
> > > > >>>
> > > > >>> What I thought was the clean slate approach to the discussion
> > seemed
> > > > >>> a
> > > > >> very reasonable way to make all this happen.
> > > > >>> If you would like to restate what you intended by it or something
> > > > >>> else
> > > > >> equally as reasonable as a way to move forward that would be
> > awesome.
> > > > >>>
> > > > >>> I will be happy to work toward the roadmap with everyone once it
> is
> > > > >> articulated, understood and actionable.
> > > > >>> In the meantime, I have work to do.
> > > > >>>
> > > > >>> thanks,
> > > > >>>
> > > > >>> --larry
> > > > >>>
> > > > >>> BTW - I meant to quote you in an earlier response and ended up
> > > > >>> saying it
> > > > >> was Aaron instead. Not sure what happened there. :-)
> > > > >>>
> > > > >>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org>
> > > wrote:
> > > > >>>
> > > > >>>> Hi Larry (and all),
> > > > >>>>
> > > > >>>> Happy Fourth of July to you and yours.
> > > > >>>>
> > > > >>>> In our shop Kai and Tianyou are already doing the coding, so I'd
> > > > >>>> defer
> > > > >> to
> > > > >>>> them on the detailed points.
> > > > >>>>
> > > > >>>> My concern here is there may have been a misinterpretation or
> lack
> > > > >>>> of consensus on what is meant by "clean slate". Hopefully that
> can
> > > > >>>> be
> > > > >> quickly
> > > > >>>> cleared up. Certainly we did not mean ignore all that came
> before.
> > > > >>>> The
> > > > >> idea
> > > > >>>> was to reset discussions to find common ground and new direction
> > > > >>>> where
> > > > >> we
> > > > >>>> are working together, not in conflict, on an agreed upon set of
> > > > >>>> design points and tasks. There's been a lot of good discussion
> and
> > > > >>>> design preceding that we should figure out how to port over.
> > > > >>>> Nowhere in this picture are self appointed "master JIRAs" and
> > such,
> > > > >>>> which have been disappointing to see crop up, we should be
> > > > >>>> collaboratively coding not planting flags.
> > > > >>>>
> > > > >>>> I read Kai's latest document as something approaching today's
> > > > >>>> consensus
> > > > >> (or
> > > > >>>> at least a common point of view?) rather than a historical
> > document.
> > > > >>>> Perhaps he and it can be given equal share of the consideration.
> > > > >>>>
> > > > >>>>
> > > > >>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> > > > >>>>
> > > > >>>>> Hey Andrew -
> > > > >>>>>
> > > > >>>>> I largely agree with that statement.
> > > > >>>>> My intention was to let the differences be worked out within
> the
> > > > >>>>> individual components once they were identified and subtasks
> > > created.
> > > > >>>>>
> > > > >>>>> My reference to HSSO was really referring to a SSO *server*
> based
> > > > >> design
> > > > >>>>> which was not clearly articulated in the earlier documents.
> > > > >>>>> We aren't trying to compare and contrast one design over
> another
> > > > >> anymore.
> > > > >>>>>
> > > > >>>>> Let's move this collaboration along as we've mapped out and the
> > > > >>>>> differences in the details will reveal themselves and be
> > addressed
> > > > >> within
> > > > >>>>> their components.
> > > > >>>>>
> > > > >>>>> I've actually been looking forward to you weighing in on the
> > > > >>>>> actual discussion points in this thread.
> > > > >>>>> Could you do that?
> > > > >>>>>
> > > > >>>>> At this point, I am most interested in your thoughts on a
> single
> > > > >>>>> jira
> > > > >> to
> > > > >>>>> represent all of this work and whether we should start
> discussing
> > > > >>>>> the
> > > > >> SSO
> > > > >>>>> Tokens.
> > > > >>>>> If you think there are discussion points missing from that
> list,
> > > > >>>>> feel
> > > > >> free
> > > > >>>>> to add to it.
> > > > >>>>>
> > > > >>>>> thanks,
> > > > >>>>>
> > > > >>>>> --larry
> > > > >>>>>
> > > > >>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <
> apurtell@apache.org>
> > > > >> wrote:
> > > > >>>>>
> > > > >>>>>> Hi Larry,
> > > > >>>>>>
> > > > >>>>>> Of course I'll let Kai speak for himself. However, let me
> point
> > > > >>>>>> out
> > > > >> that,
> > > > >>>>>> while the differences between the competing JIRAs have been
> > > > >>>>>> reduced
> > > > >> for
> > > > >>>>>> sure, there were some key differences that didn't just
> > disappear.
> > > > >>>>>> Subsequent discussion will make that clear. I also disagree
> with
> > > > >>>>>> your characterization that we have simply endorsed all of the
> > > > >>>>>> design
> > > > >> decisions
> > > > >>>>>> of the so-called HSSO, this is taking a mile from an inch. We
> > are
> > > > >> here to
> > > > >>>>>> engage in a collaborative process as peers. I've been
> encouraged
> > > > >>>>>> by
> > > > >> the
> > > > >>>>>> spirit of the discussions up to this point and hope that can
> > > > >>>>>> continue beyond one design summit.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> > > > >>>>>> <lm...@hortonworks.com>
> > > > >>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> Hi Kai -
> > > > >>>>>>>
> > > > >>>>>>> I think that I need to clarify something...
> > > > >>>>>>>
> > > > >>>>>>> This is not an update for 9533 but a continuation of the
> > > > >>>>>>> discussions
> > > > >>>>> that
> > > > >>>>>>> are focused on a fresh look at a SSO for Hadoop.
> > > > >>>>>>> We've agreed to leave our previous designs behind and
> therefore
> > > > >>>>>>> we
> > > > >>>>> aren't
> > > > >>>>>>> really seeing it as an HSSO layered on top of TAS approach or
> > an
> > > > >> HSSO vs
> > > > >>>>>>> TAS discussion.
> > > > >>>>>>>
> > > > >>>>>>> Your latest design revision actually makes it clear that you
> > are
> > > > >>>>>>> now targeting exactly what was described as HSSO - so
> comparing
> > > > >>>>>>> and
> > > > >>>>> contrasting
> > > > >>>>>>> is not going to add any value.
> > > > >>>>>>>
> > > > >>>>>>> What we need you to do at this point, is to look at those
> > > > >>>>>>> high-level components described on this thread and comment on
> > > > >>>>>>> whether we need additional components or any that are listed
> > > > >>>>>>> that don't seem
> > > > >> necessary
> > > > >>>>> to
> > > > >>>>>>> you and why.
> > > > >>>>>>> In other words, we need to define and agree on the work that
> > has
> > > > >>>>>>> to
> > > > >> be
> > > > >>>>>>> done.
> > > > >>>>>>>
> > > > >>>>>>> We also need to determine those components that need to be
> done
> > > > >> before
> > > > >>>>>>> anything else can be started.
> > > > >>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
> > > > >>>>>>> central to
> > > > >>>>> all
> > > > >>>>>>> the other components and should probably be defined and POC'd
> > in
> > > > >> short
> > > > >>>>>>> order.
> > > > >>>>>>>
> > > > >>>>>>> Personally, I think that continuing the separation of 9533
> and
> > > > >>>>>>> 9392
> > > > >> will
> > > > >>>>>>> do this effort a disservice. There doesn't seem to be enough
> > > > >> differences
> > > > >>>>>>> between the two to justify separate jiras anymore. It may be
> > > > >>>>>>> best to
> > > > >>>>> file a
> > > > >>>>>>> new one that reflects a single vision without the extra cruft
> > > > >>>>>>> that
> > > > >> has
> > > > >>>>>>> built up in either of the existing ones. We would certainly
> > > > >>>>>>> reference
> > > > >>>>> the
> > > > >>>>>>> existing ones within the new one. This approach would align
> > with
> > > > >>>>>>> the
> > > > >>>>> spirit
> > > > >>>>>>> of the discussions up to this point.
> > > > >>>>>>>
> > > > >>>>>>> I am prepared to start a discussion around the shape of the
> two
> > > > >> Hadoop
> > > > >>>>> SSO
> > > > >>>>>>> tokens: identity and access. If this is what others feel the
> > > > >>>>>>> next
> > > > >> topic
> > > > >>>>>>> should be.
> > > > >>>>>>> If we can identify a jira home for it, we can do it there -
> > > > >> otherwise we
> > > > >>>>>>> can create another DISCUSS thread for it.
> > > > >>>>>>>
> > > > >>>>>>> thanks,
> > > > >>>>>>>
> > > > >>>>>>> --larry
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <
> kai.zheng@intel.com>
> > > > >> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Larry,
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks for the update. Good to see that with this update we
> > are
> > > > >>>>>>>> now
> > > > >>>>>>> aligned on most points.
> > > > >>>>>>>>
> > > > >>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The
> > > > >>>>>>>> new
> > > > >>>>>>> revision incorporates feedback and suggestions in related
> > > > >>>>>>> discussion
> > > > >>>>> with
> > > > >>>>>>> the community, particularly from Microsoft and others
> attending
> > > > >>>>>>> the Security design lounge session at the Hadoop summit.
> > Summary
> > > > >>>>>>> of the
> > > > >>>>> changes:
> > > > >>>>>>>> 1.    Revised the approach to now use two tokens, Identity
> > Token
> > > > >> plus
> > > > >>>>>>> Access Token, particularly considering our authorization
> > > > >>>>>>> framework
> > > > >> and
> > > > >>>>>>> compatibility with HSSO;
> > > > >>>>>>>> 2.    Introduced Authorization Server (AS) from our
> > > authorization
> > > > >>>>>>> framework into the flow that issues access tokens for clients
> > > > >>>>>>> with
> > > > >>>>> identity
> > > > >>>>>>> tokens to access services;
> > > > >>>>>>>> 3.    Refined proxy access token and the proxy/impersonation
> > > flow;
> > > > >>>>>>>> 4.    Refined the browser web SSO flow regarding access to
> > > Hadoop
> > > > >> web
> > > > >>>>>>> services;
> > > > >>>>>>>> 5.    Added Hadoop RPC access flow regard
> > > > >>>>
> > > > >>>>
> > > > >>>>
> > > > >>>> --
> > > > >>>> Best regards,
> > > > >>>>
> > > > >>>> - Andy
> > > > >>>>
> > > > >>>> Problems worthy of attack prove their worth by hitting back. -
> > Piet
> > > > >>>> Hein (via Tom White)
> > > > >>>
> > > > >>
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > > Alejandro
> > > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > <Iteration1PluggableUserAuthenticationandFederation.pdf>
> > >
> > >
> >
> >
> > --
> > Alejandro
> >
>



-- 
Alejandro

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Chris Nauroth <cn...@hortonworks.com>.
Perhaps this is also a good opportunity to try out the new "branch
committers" clause in the bylaws, enabling non-committers who are working
on this to commit to the feature branch.

http://mail-archives.apache.org/mod_mbox/hadoop-general/201308.mbox/%3CCACO5Y4we4d8knB_xU3a=hr2gbeQO5m3vaU+inbA0Li1i9e21DQ@mail.gmail.com%3E

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Tue, Aug 6, 2013 at 1:04 PM, Alejandro Abdelnur <tu...@cloudera.com>wrote:

> Larry,
>
> Sorry for the delay answering. Thanks for laying down things, yes, it makes
> sense.
>
> Given the large scope of the changes, number of JIRAs and number of
> developers involved, wouldn't it make sense to create a feature branch for all
> this work not to destabilize (more ;) trunk?
>
> Thanks again.
>
>
> On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <lm...@hortonworks.com>
> wrote:
>
> > The following JIRA was filed to provide a token and basic authority
> > implementation for this effort:
> > https://issues.apache.org/jira/browse/HADOOP-9781
> >
> > I have attached an initial patch though have yet to submit it as one
> since
> > it is dependent on the patch for CMF that was posted to:
> > https://issues.apache.org/jira/browse/HADOOP-9534
> > and this patch still has a couple outstanding issues - javac warnings for
> > com.sun classes for certification generation and 11 javadoc warnings.
> >
> > Please feel free to review the patches and raise any questions or
> concerns
> > related to them.
> >
> > On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com> wrote:
> >
> > > Hello All -
> > >
> > > In an effort to scope an initial iteration that provides value to the
> > community while focusing on the pluggable authentication aspects, I've
> > written a description for "Iteration 1". It identifies the goal of the
> > iteration, the endstate and a set of initial usecases. It also enumerates
> > the components that are required for each usecase. There is a scope
> section
> > that details specific things that should be kept out of the first
> > iteration. This is certainly up for discussion. There may be some of
> these
> > things that can be contributed in short order. If we can add some things
> in
> > without unnecessary complexity for the identified usecases then we
> should.
> > >
> > > @Alejandro - please review this and see whether it satisfies your point
> > for a definition of what we are building.
> > >
> > > In addition to the document that I will paste here as text and attach a
> > pdf version, we have a couple patches for components that are identified
> in
> > the document.
> > > Specifically, COMP-7 and COMP-8.
> > >
> > > I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was filed
> > specifically for that functionality.
> > > COMP-7 is a small set of classes to introduce JsonWebToken as the token
> > format and a basic JsonWebTokenAuthority that can issue and verify these
> > tokens.
> > >
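Since COMP-7 has no JIRA or code posted yet, a rough sketch may help frame the discussion. The names below (`TokenAuthority`, `issue`, `verify`) are hypothetical and the eventual patch would be Java; this Python sketch only illustrates the shape of an HMAC-signed JWT-style token and the checks a verifying authority would make (signature, audience, expiry):

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWT uses unpadded base64url encoding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


class TokenAuthority:
    """Hypothetical sketch of a token authority that issues and verifies
    HMAC-SHA256 signed JWT-style tokens (the COMP-7 role)."""

    def __init__(self, secret: bytes):
        self.secret = secret

    def issue(self, subject: str, audience: str, lifetime_secs: int = 300) -> str:
        header = {"alg": "HS256", "typ": "JWT"}
        claims = {"sub": subject, "aud": audience,
                  "exp": int(time.time()) + lifetime_secs}
        signing_input = (_b64url(json.dumps(header).encode()) + "." +
                         _b64url(json.dumps(claims).encode()))
        sig = hmac.new(self.secret, signing_input.encode(), hashlib.sha256).digest()
        return signing_input + "." + _b64url(sig)

    def verify(self, token: str, audience: str) -> dict:
        signing_input, _, sig_part = token.rpartition(".")
        expected = hmac.new(self.secret, signing_input.encode(),
                            hashlib.sha256).digest()
        # constant-time comparison to avoid timing side channels
        if not hmac.compare_digest(expected, _b64url_decode(sig_part)):
            raise ValueError("bad signature")
        claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
        if claims["aud"] != audience:
            raise ValueError("token issued for a different service")
        if claims["exp"] < time.time():
            raise ValueError("token expired")
        return claims
```

The audience check is what lets a single authority scope access tokens to one service name, which the usecases below rely on to block cross-service token reuse.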
> > > Since there is no JIRA for this yet, I will likely file a new JIRA for
> a
> > SSO token implementation.
> > >
> > > Both of these patches assume to be modules within
> > hadoop-common/hadoop-common-project.
> > > While they are relatively small, I think that they will be pulled in by
> > other modules such as hadoop-auth which would likely not want a
> dependency
> > on something larger like
> hadoop-common/hadoop-common-project/hadoop-common.
> > >
> > > This is certainly something that we should discuss within the community
> > for this effort though - that being, exactly how to add these libraries
> so
> > that they are most easily consumed by existing projects.
> > >
> > > Anyway, the following is the Iteration-1 document - it is also attached
> > as a pdf:
> > >
> > > Iteration 1: Pluggable User Authentication and Federation
> > >
> > > Introduction
> > > The intent of this effort is to bootstrap the development of pluggable
> > token-based authentication mechanisms to support certain goals of
> > enterprise authentication integrations. By restricting the scope of this
> > effort, we hope to provide immediate benefit to the community while
> keeping
> > the initial contribution to a manageable size that can be easily
> reviewed,
> > understood and extended with further development through follow up JIRAs
> > and related iterations.
> > >
> > > Iteration Endstate
> > > Once complete, this effort will have extended the authentication
> > mechanisms - for all client types - from the existing: Simple, Kerberos
> and
> > Plain (for RPC) to include LDAP authentication and SAML based federation.
> > In addition, the ability to provide additional/custom authentication
> > mechanisms will be enabled for users to plug in their preferred
> mechanisms.
> > >
> > > Project Scope
> > > The scope of this effort is a subset of the features covered by the
> > overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on
> > enabling Hadoop to issue, accept/validate SSO tokens of its own. The
> > pluggable authentication mechanism within SASL/RPC layer and the
> > authentication filter pluggability for REST and UI components will be
> > leveraged and extended to support the results of this effort.
> > >
> > > Out of Scope
> > > In order to scope the initial deliverable as the minimally viable
> > product, a handful of things have been simplified or left out of scope
> for
> > this effort. This is not meant to say that these aspects are not useful
> or
> > not needed but that they are not necessary for this iteration. We do
> > however need to ensure that we don’t do anything to preclude adding them
> in
> > future iterations.
> > > 1. Additional Attributes - the result of authentication will continue
> to
> > use the existing hadoop tokens and identity representations. Additional
> > attributes used for finer grained authorization decisions will be added
> > through follow-up efforts.
> > > 2. Token revocation - the ability to revoke issued identity tokens will
> > be added later
> > > 3. Multi-factor authentication - this will likely require additional
> > attributes and is not necessary for this iteration.
> > > 4. Authorization changes - we will require additional attributes for
> the
> > fine-grained access control plans. This is not needed for this iteration.
> > > 5. Domains - we assume a single flat domain for all users
> > > 6. Kinit alternative - we can leverage existing REST clients such as
> > cURL to retrieve tokens through authentication and federation for the
> time
> > being
> > > 7. A specific authentication framework isn’t really necessary within
> the
> > REST endpoints for this iteration. If one is available then we can use it
> > otherwise we can leverage existing things like Apache Shiro within a
> > servlet filter.
> > >
> > > In Scope
> > > What is in scope for this effort is defined by the usecases described
> > below. Components required for supporting the usecases are summarized for
> > each client type. Each component is a candidate for a JIRA subtask -
> though
> > multiple components are likely to be included in a JIRA to represent a
> set
> > of functionality rather than individual JIRAs per component.
> > >
> > > Terminology and Naming
> > > The terms and names of components within this document are merely
> > descriptive of the functionality that they represent. Any similarity or
> > difference in names or terms from those that are found in other documents
> > are not intended to make any statement about those other documents or the
> > descriptions within. This document represents the pluggable
> authentication
> > mechanisms and server functionality required to replace Kerberos.
> > >
> > > Ultimately, the naming of the implementation classes will be a product
> > of the patches accepted by the community.
> > >
> > > Usecases:
> > > client types: REST, CLI, UI
> > > authentication types: Simple, Kerberos, authentication/LDAP,
> > federation/SAML
> > >
> > > Simple and Kerberos
> > > Simple and Kerberos usecases continue to work as they do today. The
> > addition of Authentication/LDAP and Federation/SAML are added through the
> > existing pluggability points either as they are or with required
> extension.
> > Either way, continued support for Simple and Kerberos must not require
> > changes to existing deployments in the field as a result of this effort.
> > >
> > > REST
> > > USECASE REST-1 Authentication/LDAP:
> > > For REST clients, we will provide the ability to:
> > > 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by
> > an AuthenticationServer instance via REST calls to:
> > >    a. authenticate - passing username/password returning a hadoop
> > id_token
> > >    b. get-access-token - from the TokenGrantingService by passing the
> > hadoop id_token as an Authorization: Bearer token along with the desired
> > service name (master service name) returning a hadoop access token
> > > 2. Successfully invoke a hadoop service REST API passing the hadoop
> > access token through an HTTP header as an Authorization Bearer token
> > >    a. validation of the incoming token on the service endpoint is
> > accomplished by an SSOAuthenticationHandler
> > > 3. Successfully block access to a REST resource when presenting a
> hadoop
> > access token intended for a different service
> > >    a. validation of the incoming token on the service endpoint is
> > accomplished by an SSOAuthenticationHandler
> > >
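The three-step exchange above (authenticate, get-access-token, validated service call) can be modeled in-process to make the token scoping concrete. Everything here is hypothetical - the real components are HTTP endpoints plus an SSOAuthenticationHandler on each service - but the blocked-access case in step 3 is the point of the sketch:

```python
import secrets

USERS = {"alice": "secret"}   # stands in for the LDAP directory
ID_TOKENS = {}                # id_token -> subject
ACCESS_TOKENS = {}            # access_token -> (subject, service)


def authenticate(username, password):
    """Step 1a: username/password in, hadoop id_token out."""
    if USERS.get(username) != password:
        raise PermissionError("bad credentials")
    token = secrets.token_hex(16)
    ID_TOKENS[token] = username
    return token


def get_access_token(id_token, service):
    """Step 1b: the TokenGrantingService exchanges an id_token for an
    access token scoped to a single (master) service name."""
    subject = ID_TOKENS.get(id_token)
    if subject is None:
        raise PermissionError("unknown id_token")
    token = secrets.token_hex(16)
    ACCESS_TOKENS[token] = (subject, service)
    return token


def sso_authentication_handler(access_token, service):
    """Steps 2a/3a: validate the incoming bearer token against this
    service's own name, blocking tokens meant for other services."""
    entry = ACCESS_TOKENS.get(access_token)
    if entry is None or entry[1] != service:
        raise PermissionError("token not valid for this service")
    return entry[0]  # the authenticated subject
```

A real deployment would carry these tokens in `Authorization: Bearer` HTTP headers rather than in-process dictionaries, and the access token would be signed rather than looked up, but the service-name scoping logic is the same.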
> > > USECASE REST-2 Federation/SAML:
> > > We will also provide federation capabilities for REST clients such
> that:
> > > 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
> > persist in a permissions protected file - ie. ~/.hadoop_tokens/.idp_token
> > > 2. use cURL to Federate a token from a trusted IdP through an SP
> > endpoint exposed by an AuthenticationServer(FederationServer?) instance
> via
> > REST calls to:
> > >    a. federate - passing a SAML assertion as an Authorization: Bearer
> > token returning a hadoop id_token
> > >       - can copy and paste from the command line or use cat to include the
> > persisted assertion through --header "Authorization: Bearer $(cat
> > ~/.hadoop_tokens/.idp_token)"
> > >    b. get-access-token - from the TokenGrantingService by passing the
> > hadoop id_token as an Authorization: Bearer token along with the desired
> > service name (master service name) to the TokenGrantingService returning
> a
> > hadoop access token
> > > 3. Successfully invoke a hadoop service REST API passing the hadoop
> > access token through an HTTP header as an Authorization Bearer token
> > >    a. validation of the incoming token on the service endpoint is
> > accomplished by an SSOAuthenticationHandler
> > > 4. Successfully block access to a REST resource when presenting a
> hadoop
> > access token intended for a different service
> > >    a. validation of the incoming token on the service endpoint is
> > accomplished by an SSOAuthenticationHandler
> > >
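Building the Authorization header from the persisted token file, as step 2a describes, could look like the following (a hypothetical helper; the real client in these usecases is cURL on the command line):

```python
import os


def bearer_header(token_path: str) -> dict:
    """Read a previously persisted token (e.g. ~/.hadoop_tokens/.idp_token)
    and build the Authorization header for the federate or
    get-access-token call."""
    with open(os.path.expanduser(token_path)) as f:
        token = f.read().strip()
    return {"Authorization": "Bearer " + token}
```

The equivalent cURL invocation would pass the same header with `--header "Authorization: Bearer $(cat ~/.hadoop_tokens/.idp_token)"`.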
> > > REQUIRED COMPONENTS for REST USECASES:
> > > COMP-1. REST client - cURL or similar
> > > COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint
> > example - returning hadoop id_token
> > > COMP-3. REST endpoint for federation with SAML Bearer token -
> shibboleth
> > SP?|OpenSAML? - returning hadoop id_token
> > > COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop access
> > tokens from hadoop id_tokens
> > > COMP-5. SSOAuthenticationHandler to validate incoming hadoop access
> > tokens
> > > COMP-6. some source of a SAML assertion - shibboleth IdP?
> > > COMP-7. hadoop token and authority implementations
> > > COMP-8. core services for crypto support for signing, verifying and PKI
> > management
> > >
> > > CLI
> > > USECASE CLI-1 Authentication/LDAP:
> > > For CLI/RPC clients, we will provide the ability to:
> > > 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by
> > an AuthenticationServer instance via REST calls to:
> > >    a. authenticate - passing username/password returning a hadoop
> > id_token
> > >       - for RPC clients we need to persist the returned hadoop identity
> > token in a file protected by fs permissions so that it may be leveraged
> > until expiry
> > >       - directing the returned response to a file may suffice for now
> > something like ">~/.hadoop_tokens/.id_token"
> > > 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> > >    a. RPC client negotiates a TokenAuth method through SASL layer,
> > hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed as
> > Authorization: Bearer token to the get-access-token REST endpoint exposed
> > by TokenGrantingService returning a hadoop access token
> > >    b. RPC server side validates the presented hadoop access token and
> > continues to serve request
> > >    c. Successfully invoke a hadoop service RPC API
> > >
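Step 1 above calls for persisting the returned id_token in a file protected by fs permissions until expiry. A minimal sketch of that persistence (hypothetical helper names; owner-only 0600 mode, with a read-side check that refuses a file readable by others):

```python
import os
import stat


def persist_token(token: str, path: str) -> None:
    """Write a token file readable only by the owner (mode 0600), per the
    fs-permission protection the usecase calls for."""
    os.makedirs(os.path.dirname(path), mode=0o700, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)


def read_token(path: str) -> str:
    """Read a persisted token, refusing files with group/other access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path} is readable by others (mode {oct(mode)})")
    with open(path) as f:
        return f.read().strip()
```

Simply redirecting the cURL response with `>~/.hadoop_tokens/.id_token`, as the usecase suggests, leaves the file at the user's umask; opening it with an explicit 0600 mode avoids depending on that.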
> > > USECASE CLI-2 Federation/SAML:
> > > For CLI/RPC clients, we will provide the ability to:
> > > 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
> > persist in a permissions protected file - ie. ~/.hadoop_tokens/.idp_token
> > > 2. use cURL to Federate a token from a trusted IdP through an SP
> > endpoint exposed by an AuthenticationServer(FederationServer?) instance
> via
> > REST calls to:
> > >    a. federate - passing a SAML assertion as an Authorization: Bearer
> > token returning a hadoop id_token
> > >       - can copy and paste from the commandline or use cat to include the
> > previously persisted token through something like
> > -H "Authorization: Bearer $(cat ~/.hadoop_tokens/.idp_token)"
> > > 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> > >    a. RPC client negotiates a TokenAuth method through the SASL layer;
> > the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed
> > as an Authorization: Bearer token to the get-access-token REST endpoint
> > exposed by the TokenGrantingService, returning a hadoop access token
> > >    b. RPC server side validates the presented hadoop access token and
> > continues to serve request
> > >    c. Successfully invoke a hadoop service RPC API
> > >
> > > REQUIRED COMPONENTS for CLI USECASES - (beyond those required for
> REST):
> > > COMP-9. TokenAuth Method negotiation, etc
> > > COMP-10. Client side implementation to leverage REST endpoint for
> > acquiring hadoop access tokens given a hadoop id_token
> > > COMP-11. Server side implementation to validate incoming hadoop access
> > tokens
> > >
> > > UI
> > > Various Hadoop services have their own web UI consoles for
> > administration and end user interactions. These consoles need to also
> > benefit from the pluggability of authentication mechanisms to be on par
> > with the access control of the cluster REST and RPC APIs.
> > > Web consoles are protected with a WebSSOAuthenticationHandler which
> > will be configured for either authentication or federation.
> > >
> > > USECASE UI-1 Authentication/LDAP:
> > > For the authentication usecase:
> > > 1. User’s browser requests access to a UI console page
> > > 2. WebSSOAuthenticationHandler intercepts the request and redirects the
> > browser to an IdP web endpoint exposed by the AuthenticationServer
> passing
> > the requested url as the redirect_url
> > > 3. IdP web endpoint presents the user with a FORM over https
> > >    a. user provides username/password and submits the FORM
> > > 4. AuthenticationServer authenticates the user with provided
> credentials
> > against the configured LDAP server and:
> > >    a. leverages a servlet filter or other authentication mechanism for
> > the endpoint and authenticates the user with a simple LDAP bind with
> > username and password
> > >    b. acquires a hadoop id_token and uses it to acquire the required
> > hadoop access token which is added as a cookie
> > >    c. redirects the browser to the original service UI resource via the
> > provided redirect_url
> > > 5. WebSSOAuthenticationHandler for the original UI resource
> > interrogates the incoming request again for an authcookie that contains
> > an access token and, upon finding one:
> > >    a. validates the incoming token
> > >    b. returns the AuthenticationToken as per AuthenticationHandler
> > contract
> > >    c. AuthenticationFilter adds the hadoop auth cookie with the
> expected
> > token
> > >    d. serves requested resource for valid tokens
> > >    e. subsequent requests are handled by the AuthenticationFilter
> > recognition of the hadoop auth cookie
> > >
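The redirect handshake in steps 2 and 4c of the UI-1 flow can be sketched as follows. The IdP endpoint URL is a made-up placeholder and the helper names are invented for illustration; only the redirect_url parameter convention comes from the flow described above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Sketch of step 2: the WebSSOAuthenticationHandler finds no valid authcookie
# on the request, so it redirects the browser to the IdP web endpoint exposed
# by the AuthenticationServer, carrying the originally requested url as
# redirect_url so the server can send the user back after FORM login.

IDP_ENDPOINT = "https://authserver.example.com/idp/login"  # assumed placeholder

def login_redirect(requested_url: str) -> str:
    """Build the Location header value for the redirect to the IdP."""
    return IDP_ENDPOINT + "?" + urlencode({"redirect_url": requested_url})

def original_url(location: str) -> str:
    """Recover the requested resource on the way back (step 4c)."""
    return parse_qs(urlsplit(location).query)["redirect_url"][0]
```

The round trip preserves the original resource URL even when it contains its own query string, since urlencode percent-encodes it inside the redirect_url parameter.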
> > > USECASE UI-2 Federation/SAML:
> > > For the federation usecase:
> > > 1. User’s browser requests access to a UI console page
> > > 2. WebSSOAuthenticationHandler intercepts the request and redirects the
> > browser to an SP web endpoint exposed by the AuthenticationServer passing
> > the requested url as the redirect_url. This endpoint:
> > >    a. is dedicated to redirecting to the external IdP passing the
> > required parameters which may include a redirect_url back to itself as
> well
> > as encoding the original redirect_url so that it can determine it on the
> > way back to the client
> > > 3. the IdP:
> > >    a. challenges the user for credentials and authenticates the user
> > >    b. creates appropriate token/cookie and redirects back to the
> > AuthenticationServer endpoint
> > > 4. AuthenticationServer endpoint:
> > >    a. extracts the expected token/cookie from the incoming request and
> > validates it
> > >    b. creates a hadoop id_token
> > >    c. acquires a hadoop access token for the id_token
> > >    d. creates appropriate cookie and redirects back to the original
> > redirect_url - being the requested resource
> > > 5. WebSSOAuthenticationHandler for the original UI resource
> > interrogates the incoming request again for an authcookie that contains
> > an access token and, upon finding one:
> > >    a. validates the incoming token
> > >    b. returns the AuthenticationToken as per AuthenticationHandler
> > contract
> > >    c. AuthenticationFilter adds the hadoop auth cookie with the
> expected
> > token
> > >    d. serves requested resource for valid tokens
> > >    e. subsequent requests are handled by the AuthenticationFilter
> > recognition of the hadoop auth cookie
> > > REQUIRED COMPONENTS for UI USECASES:
> > > COMP-12. WebSSOAuthenticationHandler
> > > COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based
> > login
> > > COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party
> token
> > federation
> > >
> > >
> > >
> > > On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <Br...@microsoft.com>
> > wrote:
> > > Thanks, Larry. That is what I was trying to say, but you've said it
> > better and in more detail. :-) To extract from what you are saying: "If
> we
> > were to reframe the immediate scope to the lowest common denominator of
> > what is needed for accepting tokens in authentication plugins then we
> > gain... an end-state for the lowest common denominator that enables code
> > patches in the near-term is the best of both worlds."
> > >
> > > -Brian
> > >
> > > -----Original Message-----
> > > From: Larry McCay [mailto:lmccay@hortonworks.com]
> > > Sent: Wednesday, July 10, 2013 10:40 AM
> > > To: common-dev@hadoop.apache.org
> > > Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> > > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> > >
> > > It seems to me that we can have the best of both worlds here...it's all
> > about the scoping.
> > >
> > > If we were to reframe the immediate scope to the lowest common
> > denominator of what is needed for accepting tokens in authentication
> > plugins then we gain:
> > >
> > > 1. a very manageable scope to define and agree upon
> > > 2. a deliverable that should be useful in and of itself
> > > 3. a foundation for community collaboration that we build on for higher
> > level solutions built on this lowest common denominator and experience as
> > a working community
> > >
> > > So, to Alejandro's point, perhaps we need to define what would make #2
> > above true - this could serve as the "what" we are building instead of
> the
> > "how" to build it.
> > > Including:
> > > a. project structure within hadoop-common-project/common-security or
> > the like
> > > b. the usecases that would need to be enabled to make it a
> > self-contained and useful contribution - without higher level solutions
> > > c. the JIRA/s for contributing patches
> > > d. what specific patches will be needed to accomplish the usecases in #b
> > >
> > > In other words, an end-state for the lowest common denominator that
> > enables code patches in the near-term is the best of both worlds.
> > >
> > > I think this may be a good way to bootstrap the collaboration process
> > for our emerging security community rather than trying to tackle a huge
> > vision all at once.
> > >
> > > @Alejandro - if you have something else in mind that would bootstrap
> > this process - that would great - please advise.
> > >
> > > thoughts?
> > >
> > > On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
> > wrote:
> > >
> > > > Hi Alejandro, all-
> > > >
> > > > There seems to be agreement on the broad stroke description of the
> > components needed to achieve pluggable token authentication (I'm sure
> I'll
> > be corrected if that isn't the case). However, discussion of the details
> of
> > those components doesn't seem to be moving forward. I think this is
> because
> > the details are really best understood through code. I also see *a* (i.e.
> > one of many possible) token format and pluggable authentication
> mechanisms
> > within the RPC layer as components that can have immediate benefit to
> > Hadoop users AND still allow flexibility in the larger design. So, I
> think
> > the best way to move the conversation of "what we are aiming for" forward
> > is to start looking at code for these components. I am especially
> > interested in moving forward with pluggable authentication mechanisms
> > within the RPC layer and would love to see what others have done in this
> > area (if anything).
> > > >
> > > > Thanks.
> > > >
> > > > -Brian
> > > >
> > > > -----Original Message-----
> > > > From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> > > > Sent: Wednesday, July 10, 2013 8:15 AM
> > > > To: Larry McCay
> > > > Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> > > > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> > > >
> > > > Larry, all,
> > > >
> > > > It is still not clear to me what end state we are aiming for, or
> > > > whether we even agree on that.
> > > >
> > > > IMO, instead of trying to agree on what to do, we should first agree
> > > > on the final state, then see what needs to change to get there, then
> > > > see how we change things to get there.
> > > >
> > > > The different documents out there focus more on how.
> > > >
> > > > We should not try to say how before we know what.
> > > >
> > > > Thx.
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lmccay@hortonworks.com
> >
> > wrote:
> > > >
> > > >> All -
> > > >>
> > > >> After combing through this thread - as well as the summit session
> > > >> summary thread, I think that we have the following two items that we
> > > >> can probably move forward with:
> > > >>
> > > >> 1. TokenAuth method - assuming this means the pluggable
> > > >> authentication mechanisms within the RPC layer (2 votes: Kai and
> > > >> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself)
> > > >>
> > > >> I propose that we attack both of these aspects as one. Let's provide
> > > >> the structure and interfaces of the pluggable framework for use in
> > > >> the RPC layer through leveraging Daryn's pluggability work and POC
> it
> > > >> with a particular token format (not necessarily the only format ever
> > > >> supported - we just need one to start). If there has already been
> > > >> work done in this area by anyone then please speak up and commit to
> > > >> providing a patch - so that we don't duplicate effort.
> > > >>
> > > >> @Daryn - is there a particular Jira or set of Jiras that we can look
> > > >> at to discern the pluggability mechanism details? Documentation of
> it
> > > >> would be great as well.
> > > >> @Kai - do you have existing code for the pluggable token
> > > >> authentication mechanism - if not, we can take a stab at
> representing
> > > >> it with interfaces and/or POC code.
> > > >> I can standup and say that we have a token format that we have been
> > > >> working with already and can provide a patch that represents it as a
> > > >> contribution to test out the pluggable tokenAuth.
> > > >>
> > > >> These patches will provide progress toward code being the central
> > > >> discussion vehicle. As a community, we can then incrementally build
> > > >> on that foundation in order to collaboratively deliver the common
> > vision.
> > > >>
> > > >> In the absence of any other home for posting such patches, let's
> > > >> assume that they will be attached to HADOOP-9392 - or a dedicated
> > > >> subtask for this particular aspect/s - I will leave that detail to
> > Kai.
> > > >>
> > > >> @Alejandro, being the only voice on this thread that isn't
> > > >> represented in the votes above, please feel free to agree or
> disagree
> > with this direction.
> > > >>
> > > >> thanks,
> > > >>
> > > >> --larry
> > > >>
> > > >> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
> > wrote:
> > > >>
> > > >>> Hi Andy -
> > > >>>
> > > >>>> Happy Fourth of July to you and yours.
> > > >>>
> > > >>> Same to you and yours. :-)
> > > >>> We had some fun in the sun for a change - we've had nothing but
> rain
> > > >>> on
> > > >> the east coast lately.
> > > >>>
> > > >>>> My concern here is there may have been a misinterpretation or lack
> > > >>>> of consensus on what is meant by "clean slate"
> > > >>>
> > > >>>
> > > >>> Apparently so.
> > > >>> On the pre-summit call, I stated that I was interested in
> > > >>> reconciling
> > > >> the jiras so that we had one to work from.
> > > >>>
> > > >>> You recommended that we set them aside for the time being - with
> the
> > > >> understanding that work would continue on your side (and our's as
> > > >> well) - and approach the community discussion from a clean slate.
> > > >>> We seemed to do this at the summit session quite well.
> > > >>> It was my understanding that this community discussion would live
> > > >>> beyond
> > > >> the summit and continue on this list.
> > > >>>
> > > >>> While closing the summit session we agreed to follow up on
> > > >>> common-dev
> > > >> with first a summary then a discussion of the moving parts.
> > > >>>
> > > >>> I never expected the previous work to be abandoned and fully
> > > >>> expected it
> > > >> to inform the discussion that happened here.
> > > >>>
> > > >>> If you would like to reframe what clean slate was supposed to mean
> > > >>> or
> > > >> describe what it means now - that would be welcome - before I waste
> > > >> any more time trying to facilitate a community discussion that is
> > > >> apparently not wanted.
> > > >>>
> > > >>>> Nowhere in this
> > > >>>> picture are self appointed "master JIRAs" and such, which have
> been
> > > >>>> disappointing to see crop up, we should be collaboratively coding
> > > >>>> not planting flags.
> > > >>>
> > > >>> I don't know what you mean by self-appointed master JIRAs.
> > > >>> It has certainly not been anyone's intention to disappoint.
> > > >>> Any mention of a new JIRA was just to have a clear context to
> gather
> > > >>> the
> > > >> agreed upon points - previous and/or existing JIRAs would easily be
> > linked.
> > > >>>
> > > >>> Planting flags... I need to go back and read my discussion point
> > > >>> about the
> > > >> JIRA and see how this is the impression that was made.
> > > >>> That is not how I define success. The only flag that counts is
> > > >>> code.
> > > >> What we are lacking is the roadmap on which to put the code.
> > > >>>
> > > >>>> I read Kai's latest document as something approaching today's
> > > >>>> consensus
> > > >> (or
> > > >>>> at least a common point of view?) rather than a historical
> document.
> > > >>>> Perhaps he and it can be given equal share of the consideration.
> > > >>>
> > > >>> I definitely read it as something that has evolved into something
> > > >> approaching what we have been talking about so far. There has not
> > > >> however been enough discussion anywhere near the level of detail in
> > > >> that document and more details are needed for each component in the
> > design.
> > > >>> Why the work in that document should not be fed into the community
> > > >> discussion as anyone else's would be - I fail to understand.
> > > >>>
> > > >>> My suggestion continues to be that you should take that document
> and
> > > >> speak to the inventory of moving parts as we agreed.
> > > >>> As these are agreed upon, we will ensure that the appropriate
> > > >>> subtasks
> > > >> are filed against whatever JIRA is to host them - don't really care
> > > >> much which it is.
> > > >>>
> > > >>> I don't really want to continue with two separate JIRAs - as I
> > > >>> stated
> > > >> long ago - but until we understand what the pieces are and how they
> > > >> relate then they can't be consolidated.
> > > >>> Even if 9533 ended up being repurposed as the server instance of
> the
> > > >> work - it should be a subtask of a larger one - if that is to be
> > > >> 9392, so be it.
> > > >>> We still need to define all the pieces of the larger picture before
> > > >>> that
> > > >> can be done.
> > > >>>
> > > >>> What I thought was the clean slate approach to the discussion
> seemed
> > > >>> a
> > > >> very reasonable way to make all this happen.
> > > >>> If you would like to restate what you intended by it or something
> > > >>> else
> > > >> equally as reasonable as a way to move forward that would be
> awesome.
> > > >>>
> > > >>> I will be happy to work toward the roadmap with everyone once it is
> > > >> articulated, understood and actionable.
> > > >>> In the meantime, I have work to do.
> > > >>>
> > > >>> thanks,
> > > >>>
> > > >>> --larry
> > > >>>
> > > >>> BTW - I meant to quote you in an earlier response and ended up
> > > >>> saying it
> > > >> was Aaron instead. Not sure what happened there. :-)
> > > >>>
> > > >>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org>
> > wrote:
> > > >>>
> > > >>>> Hi Larry (and all),
> > > >>>>
> > > >>>> Happy Fourth of July to you and yours.
> > > >>>>
> > > >>>> In our shop Kai and Tianyou are already doing the coding, so I'd
> > > >>>> defer
> > > >> to
> > > >>>> them on the detailed points.
> > > >>>>
> > > >>>> My concern here is there may have been a misinterpretation or lack
> > > >>>> of consensus on what is meant by "clean slate". Hopefully that can
> > > >>>> be
> > > >> quickly
> > > >>>> cleared up. Certainly we did not mean ignore all that came before.
> > > >>>> The
> > > >> idea
> > > >>>> was to reset discussions to find common ground and new direction
> > > >>>> where
> > > >> we
> > > >>>> are working together, not in conflict, on an agreed upon set of
> > > >>>> design points and tasks. There's been a lot of good discussion and
> > > >>>> design preceding that we should figure out how to port over.
> > > >>>> Nowhere in this picture are self appointed "master JIRAs" and
> such,
> > > >>>> which have been disappointing to see crop up, we should be
> > > >>>> collaboratively coding not planting flags.
> > > >>>>
> > > >>>> I read Kai's latest document as something approaching today's
> > > >>>> consensus
> > > >> (or
> > > >>>> at least a common point of view?) rather than a historical
> document.
> > > >>>> Perhaps he and it can be given equal share of the consideration.
> > > >>>>
> > > >>>>
> > > >>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> > > >>>>
> > > >>>>> Hey Andrew -
> > > >>>>>
> > > >>>>> I largely agree with that statement.
> > > >>>>> My intention was to let the differences be worked out within the
> > > >>>>> individual components once they were identified and subtasks
> > created.
> > > >>>>>
> > > >>>>> My reference to HSSO was really referring to a SSO *server* based
> > > >> design
> > > >>>>> which was not clearly articulated in the earlier documents.
> > > >>>>> We aren't trying to compare and contrast one design over another
> > > >> anymore.
> > > >>>>>
> > > >>>>> Let's move this collaboration along as we've mapped out and the
> > > >>>>> differences in the details will reveal themselves and be
> addressed
> > > >> within
> > > >>>>> their components.
> > > >>>>>
> > > >>>>> I've actually been looking forward to you weighing in on the
> > > >>>>> actual discussion points in this thread.
> > > >>>>> Could you do that?
> > > >>>>>
> > > >>>>> At this point, I am most interested in your thoughts on a single
> > > >>>>> jira
> > > >> to
> > > >>>>> represent all of this work and whether we should start discussing
> > > >>>>> the
> > > >> SSO
> > > >>>>> Tokens.
> > > >>>>> If you think there are discussion points missing from that list,
> > > >>>>> feel
> > > >> free
> > > >>>>> to add to it.
> > > >>>>>
> > > >>>>> thanks,
> > > >>>>>
> > > >>>>> --larry
> > > >>>>>
> > > >>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org>
> > > >> wrote:
> > > >>>>>
> > > >>>>>> Hi Larry,
> > > >>>>>>
> > > >>>>>> Of course I'll let Kai speak for himself. However, let me point
> > > >>>>>> out
> > > >> that,
> > > >>>>>> while the differences between the competing JIRAs have been
> > > >>>>>> reduced
> > > >> for
> > > >>>>>> sure, there were some key differences that didn't just
> disappear.
> > > >>>>>> Subsequent discussion will make that clear. I also disagree with
> > > >>>>>> your characterization that we have simply endorsed all of the
> > > >>>>>> design
> > > >> decisions
> > > >>>>>> of the so-called HSSO, this is taking a mile from an inch. We
> are
> > > >> here to
> > > >>>>>> engage in a collaborative process as peers. I've been encouraged
> > > >>>>>> by
> > > >> the
> > > >>>>>> spirit of the discussions up to this point and hope that can
> > > >>>>>> continue beyond one design summit.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> > > >>>>>> <lm...@hortonworks.com>
> > > >>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hi Kai -
> > > >>>>>>>
> > > >>>>>>> I think that I need to clarify something...
> > > >>>>>>>
> > > >>>>>>> This is not an update for 9533 but a continuation of the
> > > >>>>>>> discussions
> > > >>>>> that
> > > >>>>>>> are focused on a fresh look at a SSO for Hadoop.
> > > >>>>>>> We've agreed to leave our previous designs behind and therefore
> > > >>>>>>> we
> > > >>>>> aren't
> > > >>>>>>> really seeing it as an HSSO layered on top of TAS approach or
> an
> > > >> HSSO vs
> > > >>>>>>> TAS discussion.
> > > >>>>>>>
> > > >>>>>>> Your latest design revision actually makes it clear that you
> are
> > > >>>>>>> now targeting exactly what was described as HSSO - so comparing
> > > >>>>>>> and
> > > >>>>> contrasting
> > > >>>>>>> is not going to add any value.
> > > >>>>>>>
> > > >>>>>>> What we need you to do at this point, is to look at those
> > > >>>>>>> high-level components described on this thread and comment on
> > > >>>>>>> whether we need additional components or any that are listed
> > > >>>>>>> that don't seem
> > > >> necessary
> > > >>>>> to
> > > >>>>>>> you and why.
> > > >>>>>>> In other words, we need to define and agree on the work that
> has
> > > >>>>>>> to
> > > >> be
> > > >>>>>>> done.
> > > >>>>>>>
> > > >>>>>>> We also need to determine those components that need to be done
> > > >> before
> > > >>>>>>> anything else can be started.
> > > >>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
> > > >>>>>>> central to
> > > >>>>> all
> > > >>>>>>> the other components and should probably be defined and POC'd
> in
> > > >> short
> > > >>>>>>> order.
> > > >>>>>>>
> > > >>>>>>> Personally, I think that continuing the separation of 9533 and
> > > >>>>>>> 9392
> > > >> will
> > > >>>>>>> do this effort a disservice. There doesn't seem to be enough
> > > >> differences
> > > >>>>>>> between the two to justify separate jiras anymore. It may be
> > > >>>>>>> best to
> > > >>>>> file a
> > > >>>>>>> new one that reflects a single vision without the extra cruft
> > > >>>>>>> that
> > > >> has
> > > >>>>>>> built up in either of the existing ones. We would certainly
> > > >>>>>>> reference
> > > >>>>> the
> > > >>>>>>> existing ones within the new one. This approach would align
> with
> > > >>>>>>> the
> > > >>>>> spirit
> > > >>>>>>> of the discussions up to this point.
> > > >>>>>>>
> > > >>>>>>> I am prepared to start a discussion around the shape of the two
> > > >> Hadoop
> > > >>>>> SSO
> > > >>>>>>> tokens: identity and access. If this is what others feel the
> > > >>>>>>> next
> > > >> topic
> > > >>>>>>> should be.
> > > >>>>>>> If we can identify a jira home for it, we can do it there -
> > > >> otherwise we
> > > >>>>>>> can create another DISCUSS thread for it.
> > > >>>>>>>
> > > >>>>>>> thanks,
> > > >>>>>>>
> > > >>>>>>> --larry
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com>
> > > >> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi Larry,
> > > >>>>>>>>
> > > >>>>>>>> Thanks for the update. Good to see that with this update we
> are
> > > >>>>>>>> now
> > > >>>>>>> aligned on most points.
> > > >>>>>>>>
> > > >>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The
> > > >>>>>>>> new
> > > >>>>>>> revision incorporates feedback and suggestions in related
> > > >>>>>>> discussion
> > > >>>>> with
> > > >>>>>>> the community, particularly from Microsoft and others attending
> > > >>>>>>> the Security design lounge session at the Hadoop summit.
> Summary
> > > >>>>>>> of the
> > > >>>>> changes:
> > > >>>>>>>> 1.    Revised the approach to now use two tokens, Identity
> Token
> > > >> plus
> > > >>>>>>> Access Token, particularly considering our authorization
> > > >>>>>>> framework
> > > >> and
> > > >>>>>>> compatibility with HSSO;
> > > >>>>>>>> 2.    Introduced Authorization Server (AS) from our
> > authorization
> > > >>>>>>> framework into the flow that issues access tokens for clients
> > > >>>>>>> with
> > > >>>>> identity
> > > >>>>>>> tokens to access services;
> > > >>>>>>>> 3.    Refined proxy access token and the proxy/impersonation
> > flow;
> > > >>>>>>>> 4.    Refined the browser web SSO flow regarding access to
> > Hadoop
> > > >> web
> > > >>>>>>> services;
> > > >>>>>>>> 5.    Added Hadoop RPC access flow regard
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>>> --
> > > >>>> Best regards,
> > > >>>>
> > > >>>> - Andy
> > > >>>>
> > > >>>> Problems worthy of attack prove their worth by hitting back. -
> Piet
> > > >>>> Hein (via Tom White)
> > > >>>
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > Alejandro
> > > >
> > >
> > >
> > >
> > >
> > >
> > > <Iteration1PluggableUserAuthenticationandFederation.pdf>
> >
> >
>
>
> --
> Alejandro
>

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Larry,

Sorry for the delay answering. Thanks for laying down things, yes, it makes
sense.

Given the large scope of the changes, the number of JIRAs and the number of
developers involved, wouldn't it make sense to create a feature branch for
all this work so as not to destabilize (more ;) trunk?

Thanks again.


On Tue, Jul 30, 2013 at 9:43 AM, Larry McCay <lm...@hortonworks.com> wrote:

> The following JIRA was filed to provide a token and basic authority
> implementation for this effort:
> https://issues.apache.org/jira/browse/HADOOP-9781
>
> I have attached an initial patch though have yet to submit it as one since
> it is dependent on the patch for CMF that was posted to:
> https://issues.apache.org/jira/browse/HADOOP-9534
> and this patch still has a couple outstanding issues - javac warnings for
> com.sun classes for certificate generation and 11 javadoc warnings.
>
> Please feel free to review the patches and raise any questions or concerns
> related to them.
>
> On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com> wrote:
>
> > Hello All -
> >
> > In an effort to scope an initial iteration that provides value to the
> community while focusing on the pluggable authentication aspects, I've
> written a description for "Iteration 1". It identifies the goal of the
> iteration, the endstate and a set of initial usecases. It also enumerates
> the components that are required for each usecase. There is a scope section
> that details specific things that should be kept out of the first
> iteration. This is certainly up for discussion. There may be some of these
> things that can be contributed in short order. If we can add some things in
> without unnecessary complexity for the identified usecases then we should.
> >
> > @Alejandro - please review this and see whether it satisfies your point
> for a definition of what we are building.
> >
> > In addition to the document that I will paste here as text and attach a
> pdf version, we have a couple patches for components that are identified in
> the document.
> > Specifically, COMP-7 and COMP-8.
> >
> > I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was filed
> specifically for that functionality.
> > COMP-7 is a small set of classes to introduce JsonWebToken as the token
> format and a basic JsonWebTokenAuthority that can issue and verify these
> tokens.
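For discussion purposes, the issue/verify responsibility described for COMP-7 can be sketched along these lines. This is not the actual patch: it uses an HMAC secret for brevity where COMP-8 describes PKI-based signing, and the class and claim names are invented for the sketch.

```python
import base64
import hashlib
import hmac
import json
import time

# Toy illustration of a token authority that issues a signed token carrying
# subject and expiry claims, and verifies signature and expiry on the way in.
# The real COMP-7/COMP-8 work would sign with PKI-managed keys instead.

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

class ToyTokenAuthority:
    def __init__(self, secret: bytes):
        self._secret = secret

    def issue(self, subject: str, ttl_seconds: int) -> str:
        """Issue a <payload>.<signature> token with sub and exp claims."""
        payload = _b64(json.dumps(
            {"sub": subject, "exp": int(time.time()) + ttl_seconds}).encode())
        sig = _b64(hmac.new(self._secret, payload.encode(),
                            hashlib.sha256).digest())
        return payload + "." + sig

    def verify(self, token: str) -> dict:
        """Check the signature and expiry, returning the claims if valid."""
        payload, sig = token.split(".")
        expected = _b64(hmac.new(self._secret, payload.encode(),
                                 hashlib.sha256).digest())
        if not hmac.compare_digest(sig, expected):
            raise ValueError("bad signature")
        claims = json.loads(_unb64(payload))
        if claims["exp"] < time.time():
            raise ValueError("token expired")
        return claims
```

The point of the sketch is only the division of responsibility: the authority both mints tokens (for the AuthenticationServer/TokenGrantingService side) and validates them (for the SSOAuthenticationHandler side), so both halves must share trust material.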
> >
> > Since there is no JIRA for this yet, I will likely file a new JIRA for a
> SSO token implementation.
> >
> > Both of these patches assume to be modules within
> hadoop-common/hadoop-common-project.
> > While they are relatively small, I think that they will be pulled in by
> other modules such as hadoop-auth which would likely not want a dependency
> on something larger like hadoop-common/hadoop-common-project/hadoop-common.
> >
> > This is certainly something that we should discuss within the community
> for this effort though - that being, exactly how to add these libraries so
> that they are most easily consumed by existing projects.
> >
> > Anyway, the following is the Iteration-1 document - it is also attached
> as a pdf:
> >
> > Iteration 1: Pluggable User Authentication and Federation
> >
> > Introduction
> > The intent of this effort is to bootstrap the development of pluggable
> token-based authentication mechanisms to support certain goals of
> enterprise authentication integrations. By restricting the scope of this
> effort, we hope to provide immediate benefit to the community while keeping
> the initial contribution to a manageable size that can be easily reviewed,
> understood and extended with further development through follow up JIRAs
> and related iterations.
> >
> > Iteration Endstate
> > Once complete, this effort will have extended the authentication
> mechanisms - for all client types - from the existing: Simple, Kerberos and
> Plain (for RPC) to include LDAP authentication and SAML based federation.
> In addition, the ability to provide additional/custom authentication
> mechanisms will be enabled for users to plug in their preferred mechanisms.
> >
> > Project Scope
> > The scope of this effort is a subset of the features covered by the
> overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on
> enabling Hadoop to issue and accept/validate SSO tokens of its own. The
> pluggable authentication mechanism within SASL/RPC layer and the
> authentication filter pluggability for REST and UI components will be
> leveraged and extended to support the results of this effort.
> >
> > Out of Scope
> > In order to scope the initial deliverable as the minimally viable
> product, a handful of things have been simplified or left out of scope for
> this effort. This is not meant to say that these aspects are not useful or
> not needed but that they are not necessary for this iteration. We do
> however need to ensure that we don’t do anything to preclude adding them in
> future iterations.
> > 1. Additional Attributes - the result of authentication will continue to
> use the existing hadoop tokens and identity representations. Additional
> attributes used for finer grained authorization decisions will be added
> through follow-up efforts.
> > 2. Token revocation - the ability to revoke issued identity tokens will
> be added later
> > 3. Multi-factor authentication - this will likely require additional
> attributes and is not necessary for this iteration.
> > 4. Authorization changes - we will require additional attributes for the
> fine-grained access control plans. This is not needed for this iteration.
> > 5. Domains - we assume a single flat domain for all users
> > 6. Kinit alternative - we can leverage existing REST clients such as
> cURL to retrieve tokens through authentication and federation for the time
> being
> > 7. A specific authentication framework isn’t really necessary within the
> REST endpoints for this iteration. If one is available then we can use it;
> otherwise we can leverage existing things like Apache Shiro within a
> servlet filter.
> >
> > In Scope
> > What is in scope for this effort is defined by the usecases described
> below. Components required for supporting the usecases are summarized for
> each client type. Each component is a candidate for a JIRA subtask - though
> multiple components are likely to be included in a JIRA to represent a set
> of functionality rather than individual JIRAs per component.
> >
> > Terminology and Naming
> > The terms and names of components within this document are merely
> descriptive of the functionality that they represent. Any similarity or
> difference in names or terms from those that are found in other documents
> is not intended to make any statement about those other documents or the
> descriptions within. This document represents the pluggable authentication
> mechanisms and server functionality required to replace Kerberos.
> >
> > Ultimately, the naming of the implementation classes will be a product
> of the patches accepted by the community.
> >
> > Usecases:
> > client types: REST, CLI, UI
> > authentication types: Simple, Kerberos, authentication/LDAP,
> federation/SAML
> >
> > Simple and Kerberos
> > Simple and Kerberos usecases continue to work as they do today. The
> addition of Authentication/LDAP and Federation/SAML are added through the
> existing pluggability points either as they are or with required extension.
> Either way, continued support for Simple and Kerberos must not require
> changes to existing deployments in the field as a result of this effort.
> >
> > REST
> > USECASE REST-1 Authentication/LDAP:
> > For REST clients, we will provide the ability to:
> > 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by
> an AuthenticationServer instance via REST calls to:
> >    a. authenticate - passing username/password returning a hadoop
> id_token
> >    b. get-access-token - from the TokenGrantingService by passing the
> hadoop id_token as an Authorization: Bearer token along with the desired
> service name (master service name) returning a hadoop access token
> > 2. Successfully invoke a hadoop service REST API passing the hadoop
> access token through an HTTP header as an Authorization Bearer token
> >    a. validation of the incoming token on the service endpoint is
> accomplished by an SSOAuthenticationHandler
> > 3. Successfully block access to a REST resource when presenting a hadoop
> access token intended for a different service
> >    a. validation of the incoming token on the service endpoint is
> accomplished by an SSOAuthenticationHandler
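The service-binding check in steps 2 and 3 above can be sketched as follows. This is an illustrative Python sketch only, not the actual SSOAuthenticationHandler: the token layout, the claim names (sub, svc) and the HMAC signing are assumptions standing in for whatever format and signing scheme the eventual COMP-7/COMP-8 patches define.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # stands in for the authority's signing key (COMP-8)

def issue_access_token(principal, service_name):
    """Mint a toy signed access token bound to one service (hypothetical format)."""
    body = base64.urlsafe_b64encode(
        json.dumps({"sub": principal, "svc": service_name}).encode())
    sig = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    return body + b"." + sig

def validate_access_token(token, expected_service):
    """What an SSOAuthenticationHandler-style check might do: verify the
    signature, then reject tokens minted for a different service (step 3)."""
    body, sig = token.rsplit(b".", 1)
    expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(expected, sig):
        return None  # signature mismatch - token was tampered with or forged
    claims = json.loads(base64.urlsafe_b64decode(body))
    if claims["svc"] != expected_service:
        return None  # token was intended for another service - block access
    return claims["sub"]
```

A token issued for one service validates there but is rejected by any other service endpoint, which is exactly the blocking behavior the usecase calls for.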
> >
> > USECASE REST-2 Federation/SAML:
> > We will also provide federation capabilities for REST clients such that:
> > 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
> persist in a permissions-protected file - i.e. ~/.hadoop_tokens/.idp_token
> > 2. use cURL to Federate a token from a trusted IdP through an SP
> endpoint exposed by an AuthenticationServer(FederationServer?) instance via
> REST calls to:
> >    a. federate - passing a SAML assertion as an Authorization: Bearer
> token returning a hadoop id_token
> >       - can copy and paste from the command line or use cat to include the
> persisted token through '--header "Authorization: Bearer $(cat
> ~/.hadoop_tokens/.id_token)"'
> >    b. get-access-token - from the TokenGrantingService by passing the
> hadoop id_token as an Authorization: Bearer token along with the desired
> service name (master service name) to the TokenGrantingService returning a
> hadoop access token
> > 3. Successfully invoke a hadoop service REST API passing the hadoop
> access token through an HTTP header as an Authorization Bearer token
> >    a. validation of the incoming token on the service endpoint is
> accomplished by an SSOAuthenticationHandler
> > 4. Successfully block access to a REST resource when presenting a hadoop
> access token intended for a different service
> >    a. validation of the incoming token on the service endpoint is
> accomplished by an SSOAuthenticationHandler
> >
> > REQUIRED COMPONENTS for REST USECASES:
> > COMP-1. REST client - cURL or similar
> > COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint
> example - returning hadoop id_token
> > COMP-3. REST endpoint for federation with SAML Bearer token - shibboleth
> SP?|OpenSAML? - returning hadoop id_token
> > COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop access
> tokens from hadoop id_tokens
> > COMP-5. SSOAuthenticationHandler to validate incoming hadoop access
> tokens
> > COMP-6. some source of a SAML assertion - shibboleth IdP?
> > COMP-7. hadoop token and authority implementations
> > COMP-8. core services for crypto support for signing, verifying and PKI
> management
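COMP-4, the exchange of a hadoop id_token for a service-scoped access token, can be illustrated with a minimal sketch. This is not the real TokenGrantingService - the class name is taken from the descriptions above, but the opaque token strings and in-memory bookkeeping are assumptions made purely to show the shape of the exchange.

```python
class TokenGrantingService:
    """Toy sketch of COMP-4: exchange a hadoop id_token for an access
    token scoped to a named (master) service."""

    def __init__(self):
        self._id_tokens = {}  # id_token -> principal
        self._counter = 0

    def register_id_token(self, principal):
        # In the real flow the AuthenticationServer issues the id_token
        # after LDAP authentication or SAML federation; here we just mint
        # an opaque string to stand in for it.
        self._counter += 1
        tok = "id-%d" % self._counter
        self._id_tokens[tok] = principal
        return tok

    def get_access_token(self, bearer_id_token, service_name):
        # Mirrors the REST call: Authorization: Bearer <id_token> plus the
        # desired service name, returning a service-bound access token.
        principal = self._id_tokens.get(bearer_id_token)
        if principal is None:
            raise PermissionError("invalid or expired id_token")
        return {"sub": principal, "svc": service_name}
```

The point of the two-token split is visible here: one id_token can be exchanged repeatedly for access tokens bound to different services, without re-authenticating each time.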
> >
> > CLI
> > USECASE CLI-1 Authentication/LDAP:
> > For CLI/RPC clients, we will provide the ability to:
> > 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by
> an AuthenticationServer instance via REST calls to:
> >    a. authenticate - passing username/password returning a hadoop
> id_token
> >       - for RPC clients we need to persist the returned hadoop identity
> token in a file protected by fs permissions so that it may be leveraged
> until expiry
> >       - directing the returned response to a file may suffice for now
> something like ">~/.hadoop_tokens/.id_token"
> > 2. use hadoop CLI to invoke RPC API on a specific hadoop service
> >    a. RPC client negotiates a TokenAuth method through the SASL layer;
> the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed
> as an Authorization: Bearer token to the get-access-token REST endpoint
> exposed by the TokenGrantingService, returning a hadoop access token
> >    b. RPC server side validates the presented hadoop access token and
> continues to serve request
> >    c. Successfully invoke a hadoop service RPC API
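The token-persistence detail in step 1 (a file protected by fs permissions) amounts to writing the token with owner-only mode. A possible sketch, assuming the ~/.hadoop_tokens layout described above; the function name is hypothetical:

```python
import os
import stat
import tempfile

def persist_token(token, path):
    """Write a token to a file readable only by the owner (0600) - roughly
    what redirecting to ~/.hadoop_tokens/.id_token plus a chmod would give."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # O_CREAT with mode 0o600 ensures the file is never world-readable,
    # even briefly, unlike a plain shell redirect followed by chmod.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)
```

Creating the file with restrictive permissions up front avoids the window where a shell redirect would leave the token briefly readable under the default umask.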
> >
> > USECASE CLI-2 Federation/SAML:
> > For CLI/RPC clients, we will provide the ability to:
> > 1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
> persist in a permissions-protected file - i.e. ~/.hadoop_tokens/.idp_token
> > 2. use cURL to Federate a token from a trusted IdP through an SP
> endpoint exposed by an AuthenticationServer(FederationServer?) instance via
> REST calls to:
> >    a. federate - passing a SAML assertion as an Authorization: Bearer
> token returning a hadoop id_token
> >       - can copy and paste from the command line or use cat to include the
> previously persisted token through '--header "Authorization: Bearer $(cat
> ~/.hadoop_tokens/.id_token)"'
> > 3. use hadoop CLI to invoke RPC API on a specific hadoop service
> >    a. RPC client negotiates a TokenAuth method through the SASL layer;
> the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed
> as an Authorization: Bearer token to the get-access-token REST endpoint
> exposed by the TokenGrantingService, returning a hadoop access token
> >    b. RPC server side validates the presented hadoop access token and
> continues to serve request
> >    c. Successfully invoke a hadoop service RPC API
> >
> > REQUIRED COMPONENTS for CLI USECASES - (beyond those required for REST):
> > COMP-9. TokenAuth Method negotiation, etc
> > COMP-10. Client side implementation to leverage REST endpoint for
> acquiring hadoop access tokens given a hadoop id_token
> > COMP-11. Server side implementation to validate incoming hadoop access
> tokens
> >
> > UI
> > Various Hadoop services have their own web UI consoles for
> administration and end user interactions. These consoles need to also
> benefit from the pluggability of authentication mechanisms to be on par
> with the access control of the cluster REST and RPC APIs.
> > Web consoles are protected with a WebSSOAuthenticationHandler which
> will be configured for either authentication or federation.
> >
> > USECASE UI-1 Authentication/LDAP:
> > For the authentication usecase:
> > 1. User’s browser requests access to a UI console page
> > 2. WebSSOAuthenticationHandler intercepts the request and redirects the
> browser to an IdP web endpoint exposed by the AuthenticationServer passing
> the requested url as the redirect_url
> > 3. IdP web endpoint presents the user with a FORM over https
> >    a. user provides username/password and submits the FORM
> > 4. AuthenticationServer authenticates the user with provided credentials
> against the configured LDAP server and:
> >    a. leverages a servlet filter or other authentication mechanism for
> the endpoint and authenticates the user with a simple LDAP bind with
> username and password
> >    b. acquires a hadoop id_token and uses it to acquire the required
> hadoop access token which is added as a cookie
> >    c. redirects the browser to the original service UI resource via the
> provided redirect_url
> > 5. WebSSOAuthenticationHandler for the original UI resource interrogates
> the incoming request for an authcookie that contains an access token and,
> upon finding one:
> >    a. validates the incoming token
> >    b. returns the AuthenticationToken as per AuthenticationHandler
> contract
> >    c. AuthenticationFilter adds the hadoop auth cookie with the expected
> token
> >    d. serves requested resource for valid tokens
> >    e. subsequent requests are handled by the AuthenticationFilter
> recognition of the hadoop auth cookie
> >
> > USECASE UI-2 Federation/SAML:
> > For the federation usecase:
> > 1. User’s browser requests access to a UI console page
> > 2. WebSSOAuthenticationHandler intercepts the request and redirects the
> browser to an SP web endpoint exposed by the AuthenticationServer passing
> the requested url as the redirect_url. This endpoint:
> >    a. is dedicated to redirecting to the external IdP passing the
> required parameters which may include a redirect_url back to itself as well
> as encoding the original redirect_url so that it can determine it on the
> way back to the client
> > 3. the IdP:
> >    a. challenges the user for credentials and authenticates the user
> >    b. creates appropriate token/cookie and redirects back to the
> AuthenticationServer endpoint
> > 4. AuthenticationServer endpoint:
> >    a. extracts the expected token/cookie from the incoming request and
> validates it
> >    b. creates a hadoop id_token
> >    c. acquires a hadoop access token for the id_token
> >    d. creates appropriate cookie and redirects back to the original
> redirect_url - being the requested resource
> > 5. WebSSOAuthenticationHandler for the original UI resource interrogates
> the incoming request for an authcookie that contains an access token and,
> upon finding one:
> >    a. validates the incoming token
> >    b. returns the AuthenticationToken as per AuthenticationHandler
> contract
> >    c. AuthenticationFilter adds the hadoop auth cookie with the expected
> token
> >    d. serves requested resource for valid tokens
> >    e. subsequent requests are handled by the AuthenticationFilter
> recognition of the hadoop auth cookie
> > REQUIRED COMPONENTS for UI USECASES:
> > COMP-12. WebSSOAuthenticationHandler
> > COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based
> login
> > COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party token
> federation
> >
> >
> >
> > On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <Br...@microsoft.com>
> wrote:
> > Thanks, Larry. That is what I was trying to say, but you've said it
> better and in more detail. :-) To extract from what you are saying: "If we
> were to reframe the immediate scope to the lowest common denominator of
> what is needed for accepting tokens in authentication plugins then we
> gain... an end-state for the lowest common denominator that enables code
> patches in the near-term is the best of both worlds."
> >
> > -Brian
> >
> > -----Original Message-----
> > From: Larry McCay [mailto:lmccay@hortonworks.com]
> > Sent: Wednesday, July 10, 2013 10:40 AM
> > To: common-dev@hadoop.apache.org
> > Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >
> > It seems to me that we can have the best of both worlds here...it's all
> about the scoping.
> >
> > If we were to reframe the immediate scope to the lowest common
> denominator of what is needed for accepting tokens in authentication
> plugins then we gain:
> >
> > 1. a very manageable scope to define and agree upon
> > 2. a deliverable that should be useful in and of itself
> > 3. a foundation for community collaboration that we build on for higher
> level solutions built on this lowest common denominator and experience as a
> working community
> >
> > So, to Alejandro's point, perhaps we need to define what would make #2
> above true - this could serve as the "what" we are building instead of the
> "how" to build it.
> > Including:
> > a. project structure within hadoop-common-project/common-security or the
> like
> > b. the usecases that would need to be enabled to make it a self-contained
> and useful contribution - without higher level solutions
> > c. the JIRA/s for contributing patches
> > d. what specific patches will be needed to accomplish the usecases in #b
> >
> > In other words, an end-state for the lowest common denominator that
> enables code patches in the near-term is the best of both worlds.
> >
> > I think this may be a good way to bootstrap the collaboration process
> for our emerging security community rather than trying to tackle a huge
> vision all at once.
> >
> > @Alejandro - if you have something else in mind that would bootstrap
> this process - that would be great - please advise.
> >
> > thoughts?
> >
> > On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com>
> wrote:
> >
> > > Hi Alejandro, all-
> > >
> > > There seems to be agreement on the broad stroke description of the
> components needed to achieve pluggable token authentication (I'm sure I'll
> be corrected if that isn't the case). However, discussion of the details of
> those components doesn't seem to be moving forward. I think this is because
> the details are really best understood through code. I also see *a* (i.e.
> one of many possible) token format and pluggable authentication mechanisms
> within the RPC layer as components that can have immediate benefit to
> Hadoop users AND still allow flexibility in the larger design. So, I think
> the best way to move the conversation of "what we are aiming for" forward
> is to start looking at code for these components. I am especially
> interested in moving forward with pluggable authentication mechanisms
> within the RPC layer and would love to see what others have done in this
> area (if anything).
> > >
> > > Thanks.
> > >
> > > -Brian
> > >
> > > -----Original Message-----
> > > From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> > > Sent: Wednesday, July 10, 2013 8:15 AM
> > > To: Larry McCay
> > > Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> > > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> > >
> > > Larry, all,
> > >
> > > Still is not clear to me what is the end state we are aiming for, or
> that we even agree on that.
> > >
> > > IMO, instead of trying to agree on what to do, we should first agree
> on the final state, then see what should be changed to get there, then see
> how we change things to get there.
> > >
> > > The different documents out there focus more on how.
> > >
> > > We should not try to say how before we know what.
> > >
> > > Thx.
> > >
> > >
> > >
> > >
> > > On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lm...@hortonworks.com>
> wrote:
> > >
> > >> All -
> > >>
> > >> After combing through this thread - as well as the summit session
> > >> summary thread, I think that we have the following two items that we
> > >> can probably move forward with:
> > >>
> > >> 1. TokenAuth method - assuming this means the pluggable
> > >> authentication mechanisms within the RPC layer (2 votes: Kai and
> > >> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself)
> > >>
> > >> I propose that we attack both of these aspects as one. Let's provide
> > >> the structure and interfaces of the pluggable framework for use in
> > >> the RPC layer through leveraging Daryn's pluggability work and POC it
> > >> with a particular token format (not necessarily the only format ever
> > >> supported - we just need one to start). If there has already been
> > >> work done in this area by anyone then please speak up and commit to
> > >> providing a patch - so that we don't duplicate effort.
> > >>
> > >> @Daryn - is there a particular Jira or set of Jiras that we can look
> > >> at to discern the pluggability mechanism details? Documentation of it
> > >> would be great as well.
> > >> @Kai - do you have existing code for the pluggable token
> > >> authentication mechanism - if not, we can take a stab at representing
> > >> it with interfaces and/or POC code.
> > >> I can standup and say that we have a token format that we have been
> > >> working with already and can provide a patch that represents it as a
> > >> contribution to test out the pluggable tokenAuth.
> > >>
> > >> These patches will provide progress toward code being the central
> > >> discussion vehicle. As a community, we can then incrementally build
> > >> on that foundation in order to collaboratively deliver the common
> vision.
> > >>
> > >> In the absence of any other home for posting such patches, let's
> > >> assume that they will be attached to HADOOP-9392 - or a dedicated
> > >> subtask for this particular aspect/s - I will leave that detail to
> Kai.
> > >>
> > >> @Alejandro, being the only voice on this thread that isn't
> > >> represented in the votes above, please feel free to agree or disagree
> with this direction.
> > >>
> > >> thanks,
> > >>
> > >> --larry
> > >>
> > >> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com>
> wrote:
> > >>
> > >>> Hi Andy -
> > >>>
> > >>>> Happy Fourth of July to you and yours.
> > >>>
> > >>> Same to you and yours. :-)
> > >>> We had some fun in the sun for a change - we've had nothing but rain
> > >>> on
> > >> the east coast lately.
> > >>>
> > >>>> My concern here is there may have been a misinterpretation or lack
> > >>>> of consensus on what is meant by "clean slate"
> > >>>
> > >>>
> > >>> Apparently so.
> > >>> On the pre-summit call, I stated that I was interested in
> > >>> reconciling
> > >> the jiras so that we had one to work from.
> > >>>
> > >>> You recommended that we set them aside for the time being - with the
> > >> understanding that work would continue on your side (and ours as
> > >> well) - and approach the community discussion from a clean slate.
> > >>> We seemed to do this at the summit session quite well.
> > >>> It was my understanding that this community discussion would live
> > >>> beyond
> > >> the summit and continue on this list.
> > >>>
> > >>> While closing the summit session we agreed to follow up on
> > >>> common-dev
> > >> with first a summary then a discussion of the moving parts.
> > >>>
> > >>> I never expected the previous work to be abandoned and fully
> > >>> expected it
> > >> to inform the discussion that happened here.
> > >>>
> > >>> If you would like to reframe what clean slate was supposed to mean
> > >>> or
> > >> describe what it means now - that would be welcome - before I waste
> > >> any more time trying to facilitate a community discussion that is
> > >> apparently not wanted.
> > >>>
> > >>>> Nowhere in this
> > >>>> picture are self appointed "master JIRAs" and such, which have been
> > >>>> disappointing to see crop up, we should be collaboratively coding
> > >>>> not planting flags.
> > >>>
> > >>> I don't know what you mean by self-appointed master JIRAs.
> > >>> It has certainly not been anyone's intention to disappoint.
> > >>> Any mention of a new JIRA was just to have a clear context to gather
> > >>> the
> > >> agreed upon points - previous and/or existing JIRAs would easily be
> linked.
> > >>>
> > >>> Planting flags... I need to go back and read my discussion point
> > >>> about the
> > >> JIRA and see how this is the impression that was made.
> > >>> That is not how I define success. The only flags that count are code.
> > >> What we are lacking is the roadmap on which to put the code.
> > >>>
> > >>>> I read Kai's latest document as something approaching today's
> > >>>> consensus
> > >> (or
> > >>>> at least a common point of view?) rather than a historical document.
> > >>>> Perhaps he and it can be given equal share of the consideration.
> > >>>
> > >>> I definitely read it as something that has evolved into something
> > >> approaching what we have been talking about so far. There has not
> > >> however been enough discussion anywhere near the level of detail in
> > >> that document and more details are needed for each component in the
> design.
> > >>> Why the work in that document should not be fed into the community
> > >> discussion as anyone else's would be - I fail to understand.
> > >>>
> > >>> My suggestion continues to be that you should take that document and
> > >> speak to the inventory of moving parts as we agreed.
> > >>> As these are agreed upon, we will ensure that the appropriate
> > >>> subtasks
> > >> are filed against whatever JIRA is to host them - don't really care
> > >> much which it is.
> > >>>
> > >>> I don't really want to continue with two separate JIRAs - as I
> > >>> stated
> > >> long ago - but until we understand what the pieces are and how they
> > >> relate then they can't be consolidated.
> > >>> Even if 9533 ended up being repurposed as the server instance of the
> > >> work - it should be a subtask of a larger one - if that is to be
> > >> 9392, so be it.
> > >>> We still need to define all the pieces of the larger picture before
> > >>> that
> > >> can be done.
> > >>>
> > >>> What I thought was the clean slate approach to the discussion seemed
> > >>> a
> > >> very reasonable way to make all this happen.
> > >>> If you would like to restate what you intended by it or something
> > >>> else
> > >> equally as reasonable as a way to move forward that would be awesome.
> > >>>
> > >>> I will be happy to work toward the roadmap with everyone once it is
> > >> articulated, understood and actionable.
> > >>> In the meantime, I have work to do.
> > >>>
> > >>> thanks,
> > >>>
> > >>> --larry
> > >>>
> > >>> BTW - I meant to quote you in an earlier response and ended up
> > >>> saying it
> > >> was Aaron instead. Not sure what happened there. :-)
> > >>>
> > >>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org>
> wrote:
> > >>>
> > >>>> Hi Larry (and all),
> > >>>>
> > >>>> Happy Fourth of July to you and yours.
> > >>>>
> > >>>> In our shop Kai and Tianyou are already doing the coding, so I'd
> > >>>> defer
> > >> to
> > >>>> them on the detailed points.
> > >>>>
> > >>>> My concern here is there may have been a misinterpretation or lack
> > >>>> of consensus on what is meant by "clean slate". Hopefully that can
> > >>>> be
> > >> quickly
> > >>>> cleared up. Certainly we did not mean ignore all that came before.
> > >>>> The
> > >> idea
> > >>>> was to reset discussions to find common ground and new direction
> > >>>> where
> > >> we
> > >>>> are working together, not in conflict, on an agreed upon set of
> > >>>> design points and tasks. There's been a lot of good discussion and
> > >>>> design preceding that we should figure out how to port over.
> > >>>> Nowhere in this picture are self appointed "master JIRAs" and such,
> > >>>> which have been disappointing to see crop up, we should be
> > >>>> collaboratively coding not planting flags.
> > >>>>
> > >>>> I read Kai's latest document as something approaching today's
> > >>>> consensus
> > >> (or
> > >>>> at least a common point of view?) rather than a historical document.
> > >>>> Perhaps he and it can be given equal share of the consideration.
> > >>>>
> > >>>>
> > >>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> > >>>>
> > >>>>> Hey Andrew -
> > >>>>>
> > >>>>> I largely agree with that statement.
> > >>>>> My intention was to let the differences be worked out within the
> > >>>>> individual components once they were identified and subtasks
> created.
> > >>>>>
> > >>>>> My reference to HSSO was really referring to a SSO *server* based
> > >> design
> > >>>>> which was not clearly articulated in the earlier documents.
> > >>>>> We aren't trying to compare and contrast one design over another
> > >> anymore.
> > >>>>>
> > >>>>> Let's move this collaboration along as we've mapped out and the
> > >>>>> differences in the details will reveal themselves and be addressed
> > >> within
> > >>>>> their components.
> > >>>>>
> > >>>>> I've actually been looking forward to you weighing in on the
> > >>>>> actual discussion points in this thread.
> > >>>>> Could you do that?
> > >>>>>
> > >>>>> At this point, I am most interested in your thoughts on a single
> > >>>>> jira
> > >> to
> > >>>>> represent all of this work and whether we should start discussing
> > >>>>> the
> > >> SSO
> > >>>>> Tokens.
> > >>>>> If you think there are discussion points missing from that list,
> > >>>>> feel
> > >> free
> > >>>>> to add to it.
> > >>>>>
> > >>>>> thanks,
> > >>>>>
> > >>>>> --larry
> > >>>>>
> > >>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org>
> > >> wrote:
> > >>>>>
> > >>>>>> Hi Larry,
> > >>>>>>
> > >>>>>> Of course I'll let Kai speak for himself. However, let me point
> > >>>>>> out
> > >> that,
> > >>>>>> while the differences between the competing JIRAs have been
> > >>>>>> reduced
> > >> for
> > >>>>>> sure, there were some key differences that didn't just disappear.
> > >>>>>> Subsequent discussion will make that clear. I also disagree with
> > >>>>>> your characterization that we have simply endorsed all of the
> > >>>>>> design
> > >> decisions
> > >>>>>> of the so-called HSSO, this is taking a mile from an inch. We are
> > >> here to
> > >>>>>> engage in a collaborative process as peers. I've been encouraged
> > >>>>>> by
> > >> the
> > >>>>>> spirit of the discussions up to this point and hope that can
> > >>>>>> continue beyond one design summit.
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> > >>>>>> <lm...@hortonworks.com>
> > >>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi Kai -
> > >>>>>>>
> > >>>>>>> I think that I need to clarify something...
> > >>>>>>>
> > >>>>>>> This is not an update for 9533 but a continuation of the
> > >>>>>>> discussions
> > >>>>> that
> > >>>>>>> are focused on a fresh look at a SSO for Hadoop.
> > >>>>>>> We've agreed to leave our previous designs behind and therefore
> > >>>>>>> we
> > >>>>> aren't
> > >>>>>>> really seeing it as an HSSO layered on top of TAS approach or an
> > >> HSSO vs
> > >>>>>>> TAS discussion.
> > >>>>>>>
> > >>>>>>> Your latest design revision actually makes it clear that you are
> > >>>>>>> now targeting exactly what was described as HSSO - so comparing
> > >>>>>>> and
> > >>>>> contrasting
> > >>>>>>> is not going to add any value.
> > >>>>>>>
> > >>>>>>> What we need you to do at this point, is to look at those
> > >>>>>>> high-level components described on this thread and comment on
> > >>>>>>> whether we need additional components or any that are listed
> > >>>>>>> that don't seem
> > >> necessary
> > >>>>> to
> > >>>>>>> you and why.
> > >>>>>>> In other words, we need to define and agree on the work that has
> > >>>>>>> to
> > >> be
> > >>>>>>> done.
> > >>>>>>>
> > >>>>>>> We also need to determine those components that need to be done
> > >> before
> > >>>>>>> anything else can be started.
> > >>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
> > >>>>>>> central to
> > >>>>> all
> > >>>>>>> the other components and should probably be defined and POC'd in
> > >> short
> > >>>>>>> order.
> > >>>>>>>
> > >>>>>>> Personally, I think that continuing the separation of 9533 and
> > >>>>>>> 9392
> > >> will
> > >>>>>>> do this effort a disservice. There doesn't seem to be enough
> > >> differences
> > >>>>>>> between the two to justify separate jiras anymore. It may be
> > >>>>>>> best to
> > >>>>> file a
> > >>>>>>> new one that reflects a single vision without the extra cruft
> > >>>>>>> that
> > >> has
> > >>>>>>> built up in either of the existing ones. We would certainly
> > >>>>>>> reference
> > >>>>> the
> > >>>>>>> existing ones within the new one. This approach would align with
> > >>>>>>> the
> > >>>>> spirit
> > >>>>>>> of the discussions up to this point.
> > >>>>>>>
> > >>>>>>> I am prepared to start a discussion around the shape of the two
> > >> Hadoop
> > >>>>> SSO
> > >>>>>>> tokens: identity and access. If this is what others feel the
> > >>>>>>> next
> > >> topic
> > >>>>>>> should be.
> > >>>>>>> If we can identify a jira home for it, we can do it there -
> > >> otherwise we
> > >>>>>>> can create another DISCUSS thread for it.
> > >>>>>>>
> > >>>>>>> thanks,
> > >>>>>>>
> > >>>>>>> --larry
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com>
> > >> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Larry,
> > >>>>>>>>
> > >>>>>>>> Thanks for the update. Good to see that with this update we are
> > >>>>>>>> now
> > >>>>>>> aligned on most points.
> > >>>>>>>>
> > >>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The
> > >>>>>>>> new
> > >>>>>>> revision incorporates feedback and suggestions in related
> > >>>>>>> discussion
> > >>>>> with
> > >>>>>>> the community, particularly from Microsoft and others attending
> > >>>>>>> the Security design lounge session at the Hadoop summit. Summary
> > >>>>>>> of the
> > >>>>> changes:
> > >>>>>>>> 1.    Revised the approach to now use two tokens, Identity Token
> > >> plus
> > >>>>>>> Access Token, particularly considering our authorization
> > >>>>>>> framework
> > >> and
> > >>>>>>> compatibility with HSSO;
> > >>>>>>>> 2.    Introduced Authorization Server (AS) from our
> authorization
> > >>>>>>> framework into the flow that issues access tokens for clients
> > >>>>>>> with
> > >>>>> identity
> > >>>>>>> tokens to access services;
> > >>>>>>>> 3.    Refined proxy access token and the proxy/impersonation
> flow;
> > >>>>>>>> 4.    Refined the browser web SSO flow regarding access to
> Hadoop
> > >> web
> > >>>>>>> services;
> > >>>>>>>> 5.    Added Hadoop RPC access flow regard
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>>
> > >>>> - Andy
> > >>>>
> > >>>> Problems worthy of attack prove their worth by hitting back. - Piet
> > >>>> Hein (via Tom White)
> > >>>
> > >>
> > >>
> > >
> > >
> > > --
> > > Alejandro
> > >
> >
> >
> >
> >
> >
> > <Iteration1PluggableUserAuthenticationandFederation.pdf>
>
>


-- 
Alejandro

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
The following JIRA was filed to provide a token and basic authority implementation for this effort:
https://issues.apache.org/jira/browse/HADOOP-9781

I have attached an initial patch though have yet to submit it as one since it is dependent on the patch for CMF that was posted to:
https://issues.apache.org/jira/browse/HADOOP-9534
and this patch still has a couple of outstanding issues - javac warnings for com.sun classes used for certificate generation and 11 javadoc warnings.

Please feel free to review the patches and raise any questions or concerns related to them.

On Jul 26, 2013, at 8:59 PM, Larry McCay <lm...@hortonworks.com> wrote:

> Hello All -
> 
> In an effort to scope an initial iteration that provides value to the community while focusing on the pluggable authentication aspects, I've written a description for "Iteration 1". It identifies the goal of the iteration, the endstate and a set of initial usecases. It also enumerates the components that are required for each usecase. There is a scope section that details specific things that should be kept out of the first iteration. This is certainly up for discussion. There may be some of these things that can be contributed in short order. If we can add some things in without unnecessary complexity for the identified usecases then we should.
> 
> @Alejandro - please review this and see whether it satisfies your point for a definition of what we are building.
> 
> In addition to the document that I will paste here as text and attach a pdf version, we have a couple patches for components that are identified in the document.
> Specifically, COMP-7 and COMP-8.
> 
> I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was filed specifically for that functionality.
> COMP-7 is a small set of classes to introduce JsonWebToken as the token format and a basic JsonWebTokenAuthority that can issue and verify these tokens.
> 
> Since there is no JIRA for this yet, I will likely file a new JIRA for a SSO token implementation.
> 
> Both of these patches assume to be modules within hadoop-common/hadoop-common-project.
> While they are relatively small, I think that they will be pulled in by other modules such as hadoop-auth which would likely not want a dependency on something larger like hadoop-common/hadoop-common-project/hadoop-common.
> 
> This is certainly something that we should discuss within the community for this effort though - that being, exactly how to add these libraries so that they are most easily consumed by existing projects.
> 
> Anyway, the following is the Iteration-1 document - it is also attached as a pdf:
> 
> Iteration 1: Pluggable User Authentication and Federation
> 
> Introduction
> The intent of this effort is to bootstrap the development of pluggable token-based authentication mechanisms to support certain goals of enterprise authentication integrations. By restricting the scope of this effort, we hope to provide immediate benefit to the community while keeping the initial contribution to a manageable size that can be easily reviewed, understood and extended with further development through follow up JIRAs and related iterations.
> 
> Iteration Endstate
> Once complete, this effort will have extended the authentication mechanisms - for all client types - from the existing: Simple, Kerberos and Plain (for RPC) to include LDAP authentication and SAML based federation. In addition, the ability to provide additional/custom authentication mechanisms will be enabled for users to plug in their preferred mechanisms.
> 
> Project Scope
> The scope of this effort is a subset of the features covered by the overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on enabling Hadoop to issue and accept/validate SSO tokens of its own. The pluggable authentication mechanism within the SASL/RPC layer and the authentication filter pluggability for REST and UI components will be leveraged and extended to support the results of this effort.
> 
> Out of Scope
> In order to scope the initial deliverable as the minimally viable product, a handful of things have been simplified or left out of scope for this effort. This is not meant to say that these aspects are not useful or not needed but that they are not necessary for this iteration. We do however need to ensure that we don’t do anything to preclude adding them in future iterations.
> 1. Additional Attributes - the result of authentication will continue to use the existing hadoop tokens and identity representations. Additional attributes used for finer grained authorization decisions will be added through follow-up efforts.
> 2. Token revocation - the ability to revoke issued identity tokens will be added later
> 3. Multi-factor authentication - this will likely require additional attributes and is not necessary for this iteration.
> 4. Authorization changes - we will require additional attributes for the fine-grained access control plans. This is not needed for this iteration.
> 5. Domains - we assume a single flat domain for all users
> 6. Kinit alternative - we can leverage existing REST clients such as cURL to retrieve tokens through authentication and federation for the time being
> 7. A specific authentication framework isn’t really necessary within the REST endpoints for this iteration. If one is available then we can use it; otherwise we can leverage existing things like Apache Shiro within a servlet filter.
> 
> In Scope
> What is in scope for this effort is defined by the usecases described below. Components required for supporting the usecases are summarized for each client type. Each component is a candidate for a JIRA subtask - though multiple components are likely to be included in a JIRA to represent a set of functionality rather than individual JIRAs per component.
> 
> Terminology and Naming
> The terms and names of components within this document are merely descriptive of the functionality that they represent. Any similarity to, or difference from, the names or terms found in other documents is not intended to make any statement about those other documents or the descriptions within. This document represents the pluggable authentication mechanisms and server functionality required to replace Kerberos.
> 
> Ultimately, the naming of the implementation classes will be a product of the patches accepted by the community.
> 
> Usecases:
> client types: REST, CLI, UI 
> authentication types: Simple, Kerberos, authentication/LDAP, federation/SAML
> 
> Simple and Kerberos
> Simple and Kerberos usecases continue to work as they do today. Authentication/LDAP and Federation/SAML are added through the existing pluggability points, either as they are or with required extension. Either way, continued support for Simple and Kerberos must not require changes to existing deployments in the field as a result of this effort.
> 
> REST
> USECASE REST-1 Authentication/LDAP:
> For REST clients, we will provide the ability to:
> 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by an AuthenticationServer instance via REST calls to:
>    a. authenticate - passing username/password returning a hadoop id_token
>    b. get-access-token - from the TokenGrantingService by passing the hadoop id_token as an Authorization: Bearer token along with the desired service name (master service name) returning a hadoop access token
> 2. Successfully invoke a hadoop service REST API passing the hadoop access token through an HTTP header as an Authorization Bearer token
>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
> 3. Successfully block access to a REST resource when presenting a hadoop access token intended for a different service
>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
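To make steps 2 and 3 above concrete, here is a minimal sketch of how an SSOAuthenticationHandler-style check could bind an access token to a single service. The claim names (sub, svc, exp), the shared-secret HMAC signing and the token encoding are illustrative assumptions only - they do not represent the actual patches or token format under discussion:

```python
import base64
import hashlib
import hmac
import json
import time

# Stand-in for the COMP-8 signing services; a real deployment would use
# proper key management / asymmetric keys rather than a shared secret.
SECRET = b"demo-signing-key"

def _sign(payload: bytes) -> str:
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()

def issue_access_token(user, service, ttl_seconds=300):
    """TokenGrantingService side: bind the access token to one target service."""
    claims = {"sub": user, "svc": service, "exp": time.time() + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    return payload.decode() + "." + _sign(payload)

def validate_access_token(token, this_service):
    """SSOAuthenticationHandler side: check signature, expiry and audience."""
    payload_b64, sig = token.rsplit(".", 1)
    payload = payload_b64.encode()
    if not hmac.compare_digest(sig, _sign(payload)):
        raise PermissionError("invalid signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["exp"] < time.time():
        raise PermissionError("token expired")
    if claims["svc"] != this_service:
        # usecase step 3: block tokens issued for a different service
        raise PermissionError("token not issued for this service")
    return claims
```

A token minted for one service name validates at that endpoint but is rejected anywhere else, which is exactly the blocking behavior described in step 3.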
> 
> USECASE REST-2 Federation/SAML:
> We will also provide federation capabilities for REST clients such that:
> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?) and persist it in a permissions-protected file - i.e. ~/.hadoop_tokens/.idp_token
> 2. use cURL to Federate a token from a trusted IdP through an SP endpoint exposed by an AuthenticationServer(FederationServer?) instance via REST calls to:
>    a. federate - passing a SAML assertion as an Authorization: Bearer token returning a hadoop id_token
>       - can copy and paste from the command line or use cat to include the persisted token through --header "Authorization: Bearer `cat ~/.hadoop_tokens/.id_token`"
>    b. get-access-token - from the TokenGrantingService by passing the hadoop id_token as an Authorization: Bearer token along with the desired service name (master service name), returning a hadoop access token
> 3. Successfully invoke a hadoop service REST API passing the hadoop access token through an HTTP header as an Authorization Bearer token
>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
> 4. Successfully block access to a REST resource when presenting a hadoop access token intended for a different service
>    a. validation of the incoming token on the service endpoint is accomplished by an SSOAuthenticationHandler
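The Authorization: Bearer usage in the steps above amounts to reading the persisted token file and placing its contents into an HTTP header. A small sketch of that pattern (the file path and header construction follow this document's conventions and are not a fixed API):

```python
import os
import tempfile

def bearer_header_from_file(path):
    # Equivalent of: curl --header "Authorization: Bearer `cat ~/.hadoop_tokens/.id_token`"
    with open(path) as f:
        token = f.read().strip()
    return {"Authorization": "Bearer " + token}

# Demo with a throwaway file standing in for ~/.hadoop_tokens/.id_token
token_file = os.path.join(tempfile.mkdtemp(), ".id_token")
with open(token_file, "w") as f:
    f.write("example-id-token\n")

print(bearer_header_from_file(token_file))
# -> {'Authorization': 'Bearer example-id-token'}
```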
> 	
> REQUIRED COMPONENTS for REST USECASES:
> COMP-1. REST client - cURL or similar
> COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint example - returning hadoop id_token
> COMP-3. REST endpoint for federation with SAML Bearer token - shibboleth SP?|OpenSAML? - returning hadoop id_token
> COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop access tokens from hadoop id_tokens
> COMP-5. SSOAuthenticationHandler to validate incoming hadoop access tokens
> COMP-6. some source of a SAML assertion - shibboleth IdP?
> COMP-7. hadoop token and authority implementations
> COMP-8. core services for crypto support for signing, verifying and PKI management
> 
> CLI
> USECASE CLI-1 Authentication/LDAP:
> For CLI/RPC clients, we will provide the ability to:
> 1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by an AuthenticationServer instance via REST calls to:
>    a. authenticate - passing username/password returning a hadoop id_token 
>       - for RPC clients we need to persist the returned hadoop identity token in a file protected by fs permissions so that it may be leveraged until expiry
>       - directing the returned response to a file may suffice for now something like ">~/.hadoop_tokens/.id_token"
> 2. use hadoop CLI to invoke RPC API on a specific hadoop service
>    a. RPC client negotiates a TokenAuth method through the SASL layer; the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed as an Authorization: Bearer token to the get-access-token REST endpoint exposed by the TokenGrantingService, returning a hadoop access token
>    b. RPC server side validates the presented hadoop access token and continues to serve request
>    c. Successfully invoke a hadoop service RPC API
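The persistence notes in step 1a above can be sketched as follows. The directory layout, the 0700/0600 permission choice and the expiry handling are assumptions about what "protected by fs permissions" would look like on a POSIX system:

```python
import os
import stat
import tempfile
import time

def persist_id_token(token, path):
    """Write the id_token so it is readable/writable by the owner only (POSIX)."""
    os.makedirs(os.path.dirname(path), mode=0o700, exist_ok=True)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)

def load_id_token(path, expiry_epoch):
    """Reuse the persisted token until expiry, per the RPC client note above."""
    if time.time() >= expiry_epoch or not os.path.exists(path):
        return None
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError("token file is readable by group/other")
    with open(path) as f:
        return f.read()

# Demo under a temp dir standing in for the user's home directory
home = tempfile.mkdtemp()
token_path = os.path.join(home, ".hadoop_tokens", ".id_token")
persist_id_token("example-id-token", token_path)
print(load_id_token(token_path, time.time() + 3600))  # example-id-token
```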
> 
> USECASE CLI-2 Federation/SAML:
> For CLI/RPC clients, we will provide the ability to:
> 1. acquire a SAML assertion token from a trusted IdP (shibboleth?) and persist it in a permissions-protected file - i.e. ~/.hadoop_tokens/.idp_token
> 2. use cURL to Federate a token from a trusted IdP through an SP endpoint exposed by an AuthenticationServer(FederationServer?) instance via REST calls to:
>    a. federate - passing a SAML assertion as an Authorization: Bearer token returning a hadoop id_token
>       - can copy and paste from the command line or use cat to include the previously persisted token through --header "Authorization: Bearer `cat ~/.hadoop_tokens/.id_token`"
> 3. use hadoop CLI to invoke RPC API on a specific hadoop service
>    a. RPC client negotiates a TokenAuth method through the SASL layer; the hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed as an Authorization: Bearer token to the get-access-token REST endpoint exposed by the TokenGrantingService, returning a hadoop access token
>    b. RPC server side validates the presented hadoop access token and continues to serve request
>    c. Successfully invoke a hadoop service RPC API
> 
> REQUIRED COMPONENTS for CLI USECASES - (beyond those required for REST):
> COMP-9. TokenAuth Method negotiation, etc
> COMP-10. Client side implementation to leverage REST endpoint for acquiring hadoop access tokens given a hadoop id_token
> COMP-11. Server side implementation to validate incoming hadoop access tokens
> 
> UI
> Various Hadoop services have their own web UI consoles for administration and end user interactions. These consoles need to also benefit from the pluggability of authentication mechanisms to be on par with the access control of the cluster REST and RPC APIs.
> Web consoles are protected with a WebSSOAuthenticationHandler which will be configured for either authentication or federation.
> 
> USECASE UI-1 Authentication/LDAP:
> For the authentication usecase:
> 1. User’s browser requests access to a UI console page
> 2. WebSSOAuthenticationHandler intercepts the request and redirects the browser to an IdP web endpoint exposed by the AuthenticationServer passing the requested url as the redirect_url
> 3. IdP web endpoint presents the user with a FORM over https
>    a. user provides username/password and submits the FORM
> 4. AuthenticationServer authenticates the user with provided credentials against the configured LDAP server and:
>    a. leverages a servlet filter or other authentication mechanism for the endpoint and authenticates the user with a simple LDAP bind with username and password
>    b. acquires a hadoop id_token and uses it to acquire the required hadoop access token which is added as a cookie
>    c. redirects the browser to the original service UI resource via the provided redirect_url
> 5. WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token; upon finding one, it:
>    a. validates the incoming token
>    b. returns the AuthenticationToken as per AuthenticationHandler contract
>    c. AuthenticationFilter adds the hadoop auth cookie with the expected token
>    d. serves requested resource for valid tokens
>    e. subsequent requests are handled by the AuthenticationFilter recognition of the hadoop auth cookie
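The handler decision in steps 1-5 reduces to "redirect if there is no valid auth cookie, otherwise serve". A rough sketch of that logic, in which the cookie name, the IdP URL and the request/response shapes are all placeholders rather than the actual WebSSOAuthenticationHandler API:

```python
from urllib.parse import quote_plus

IDP_LOGIN_URL = "https://sso.example.com/idp/login"  # placeholder IdP endpoint

def web_sso_handle(request_url, cookies, validate_token):
    """No auth cookie (or an invalid one) -> redirect to the IdP, passing the
    original URL as redirect_url; a valid cookie -> serve the resource."""
    token = cookies.get("hadoop.auth")
    if token is None or not validate_token(token):
        return ("redirect",
                IDP_LOGIN_URL + "?redirect_url=" + quote_plus(request_url))
    return ("serve", request_url)

# First request: no cookie, so the browser is bounced to the IdP login form
action, target = web_sso_handle("https://nn.example.com:50070/dfshealth.jsp",
                                {}, lambda t: True)
print(action, target)

# After login: cookie present and valid, resource is served
action, _ = web_sso_handle("https://nn.example.com:50070/dfshealth.jsp",
                           {"hadoop.auth": "some-access-token"}, lambda t: True)
print(action)  # serve
```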
> 
> USECASE UI-2 Federation/SAML:
> For the federation usecase:
> 1. User’s browser requests access to a UI console page
> 2. WebSSOAuthenticationHandler intercepts the request and redirects the browser to an SP web endpoint exposed by the AuthenticationServer passing the requested url as the redirect_url. This endpoint:
>    a. is dedicated to redirecting to the external IdP passing the required parameters which may include a redirect_url back to itself as well as encoding the original redirect_url so that it can determine it on the way back to the client
> 3. the IdP: 
>    a. challenges the user for credentials and authenticates the user
>    b. creates appropriate token/cookie and redirects back to the AuthenticationServer endpoint
> 4. AuthenticationServer endpoint:
>    a. extracts the expected token/cookie from the incoming request and validates it
>    b. creates a hadoop id_token
>    c. acquires a hadoop access token for the id_token
>    d. creates appropriate cookie and redirects back to the original redirect_url - being the requested resource
> 5. WebSSOAuthenticationHandler for the original UI resource interrogates the incoming request again for an authcookie that contains an access token; upon finding one, it:
>    a. validates the incoming token
>    b. returns the AuthenticationToken as per the AuthenticationHandler contract
>    c. AuthenticationFilter adds the hadoop auth cookie with the expected token
>    d. serves requested resource for valid tokens
>    e. subsequent requests are handled by the AuthenticationFilter recognition of the hadoop auth cookie
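Step 2a's requirement - encoding the original redirect_url so the SP endpoint can recover it after the IdP round trip - can be sketched with ordinary URL encoding. The endpoint URLs and parameter name below are assumptions for illustration:

```python
from urllib.parse import urlencode, parse_qs, urlparse

SP_ENDPOINT = "https://authserver.example.com/sp"   # placeholder SP endpoint
IDP_ENDPOINT = "https://idp.example.com/saml/sso"   # placeholder external IdP

def redirect_to_idp(original_url):
    """Step 2a: bounce to the IdP, nesting the original resource URL inside
    the return hop to the SP endpoint so it survives the round trip."""
    return_to = SP_ENDPOINT + "?" + urlencode({"redirect_url": original_url})
    return IDP_ENDPOINT + "?" + urlencode({"redirect_url": return_to})

def original_url_at_sp(sp_request_url):
    """Step 4d: recover the originally requested resource at the SP endpoint."""
    return parse_qs(urlparse(sp_request_url).query)["redirect_url"][0]
```

Because each hop percent-encodes its inner URL, the SP endpoint can decode exactly one layer on the way back and redirect the browser to the page the user originally asked for.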
> 
> REQUIRED COMPONENTS for UI USECASES:
> COMP-12. WebSSOAuthenticationHandler
> COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based login
> COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party token federation
>  
> 
> 
> On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <Br...@microsoft.com> wrote:
> Thanks, Larry. That is what I was trying to say, but you've said it better and in more detail. :-) To extract from what you are saying: "If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain... an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds."
> 
> -Brian
> 
> -----Original Message-----
> From: Larry McCay [mailto:lmccay@hortonworks.com]
> Sent: Wednesday, July 10, 2013 10:40 AM
> To: common-dev@hadoop.apache.org
> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> 
> It seems to me that we can have the best of both worlds here...it's all about the scoping.
> 
> If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain:
> 
> 1. a very manageable scope to define and agree upon
> 2. a deliverable that should be useful in and of itself
> 3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community
> 
> So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the "what" we are building instead of the "how" to build it.
> Including:
> a. project structure within hadoop-common-project/common-security or the like
> b. the usecases that would need to be enabled to make it a self-contained and useful contribution - without higher level solutions
> c. the JIRA/s for contributing patches
> d. what specific patches will be needed to accomplish the usecases in #b
> 
> In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds.
> 
> I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once.
> 
> @Alejandro - if you have something else in mind that would bootstrap this process - that would great - please advise.
> 
> thoughts?
> 
> On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com> wrote:
> 
> > Hi Alejandro, all-
> >
> > There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of "what we are aiming for" forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything).
> >
> > Thanks.
> >
> > -Brian
> >
> > -----Original Message-----
> > From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> > Sent: Wednesday, July 10, 2013 8:15 AM
> > To: Larry McCay
> > Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >
> > Larry, all,
> >
> > It is still not clear to me what end state we are aiming for, or that we even agree on one.
> >
> > IMO, instead of trying to agree on what to do, we should first agree on the final state, then see what needs to change to get there, and then see how we make those changes.
> >
> > The different documents out there focus more on how.
> >
> > We should not try to say how before we know what.
> >
> > Thx.
> >
> >
> >
> >
> > On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lm...@hortonworks.com> wrote:
> >
> >> All -
> >>
> >> After combing through this thread - as well as the summit session
> >> summary thread, I think that we have the following two items that we
> >> can probably move forward with:
> >>
> >> 1. TokenAuth method - assuming this means the pluggable
> >> authentication mechanisms within the RPC layer (2 votes: Kai and
> >> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself)
> >>
> >> I propose that we attack both of these aspects as one. Let's provide
> >> the structure and interfaces of the pluggable framework for use in
> >> the RPC layer through leveraging Daryn's pluggability work and POC it
> >> with a particular token format (not necessarily the only format ever
> >> supported - we just need one to start). If there has already been
> >> work done in this area by anyone then please speak up and commit to
> >> providing a patch - so that we don't duplicate effort.
> >>
> >> @Daryn - is there a particular Jira or set of Jiras that we can look
> >> at to discern the pluggability mechanism details? Documentation of it
> >> would be great as well.
> >> @Kai - do you have existing code for the pluggable token
> >> authentication mechanism - if not, we can take a stab at representing
> >> it with interfaces and/or POC code.
> >> I can standup and say that we have a token format that we have been
> >> working with already and can provide a patch that represents it as a
> >> contribution to test out the pluggable tokenAuth.
> >>
> >> These patches will provide progress toward code being the central
> >> discussion vehicle. As a community, we can then incrementally build
> >> on that foundation in order to collaboratively deliver the common vision.
> >>
> >> In the absence of any other home for posting such patches, let's
> >> assume that they will be attached to HADOOP-9392 - or a dedicated
> >> subtask for this particular aspect/s - I will leave that detail to Kai.
> >>
> >> @Alejandro, being the only voice on this thread that isn't
> >> represented in the votes above, please feel free to agree or disagree with this direction.
> >>
> >> thanks,
> >>
> >> --larry
> >>
> >> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:
> >>
> >>> Hi Andy -
> >>>
> >>>> Happy Fourth of July to you and yours.
> >>>
> >>> Same to you and yours. :-)
> >>> We had some fun in the sun for a change - we've had nothing but rain
> >>> on
> >> the east coast lately.
> >>>
> >>>> My concern here is there may have been a misinterpretation or lack
> >>>> of consensus on what is meant by "clean slate"
> >>>
> >>>
> >>> Apparently so.
> >>> On the pre-summit call, I stated that I was interested in
> >>> reconciling
> >> the jiras so that we had one to work from.
> >>>
> >>> You recommended that we set them aside for the time being - with the
> >> understanding that work would continue on your side (and our's as
> >> well) - and approach the community discussion from a clean slate.
> >>> We seemed to do this at the summit session quite well.
> >>> It was my understanding that this community discussion would live
> >>> beyond
> >> the summit and continue on this list.
> >>>
> >>> While closing the summit session we agreed to follow up on
> >>> common-dev
> >> with first a summary then a discussion of the moving parts.
> >>>
> >>> I never expected the previous work to be abandoned and fully
> >>> expected it
> >> to inform the discussion that happened here.
> >>>
> >>> If you would like to reframe what clean slate was supposed to mean
> >>> or
> >> describe what it means now - that would be welcome - before I waste
> >> anymore time trying to facilitate a community discussion that is
> >> apparently not wanted.
> >>>
> >>>> Nowhere in this
> >>>> picture are self appointed "master JIRAs" and such, which have been
> >>>> disappointing to see crop up, we should be collaboratively coding
> >>>> not planting flags.
> >>>
> >>> I don't know what you mean by self-appointed master JIRAs.
> >>> It has certainly not been anyone's intention to disappoint.
> >>> Any mention of a new JIRA was just to have a clear context to gather
> >>> the
> >> agreed upon points - previous and/or existing JIRAs would easily be linked.
> >>>
> >>> Planting flags... I need to go back and read my discussion point
> >>> about the
> >> JIRA and see how this is the impression that was made.
> >>> That is not how I define success. The only flags that count is code.
> >> What we are lacking is the roadmap on which to put the code.
> >>>
> >>>> I read Kai's latest document as something approaching today's
> >>>> consensus
> >> (or
> >>>> at least a common point of view?) rather than a historical document.
> >>>> Perhaps he and it can be given equal share of the consideration.
> >>>
> >>> I definitely read it as something that has evolved into something
> >> approaching what we have been talking about so far. There has not
> >> however been enough discussion anywhere near the level of detail in
> >> that document and more details are needed for each component in the design.
> >>> Why the work in that document should not be fed into the community
> >> discussion as anyone else's would be - I fail to understand.
> >>>
> >>> My suggestion continues to be that you should take that document and
> >> speak to the inventory of moving parts as we agreed.
> >>> As these are agreed upon, we will ensure that the appropriate
> >>> subtasks
> >> are filed against whatever JIRA is to host them - don't really care
> >> much which it is.
> >>>
> >>> I don't really want to continue with two separate JIRAs - as I
> >>> stated
> >> long ago - but until we understand what the pieces are and how they
> >> relate then they can't be consolidated.
> >>> Even if 9533 ended up being repurposed as the server instance of the
> >> work - it should be a subtask of a larger one - if that is to be
> >> 9392, so be it.
> >>> We still need to define all the pieces of the larger picture before
> >>> that
> >> can be done.
> >>>
> >>> What I thought was the clean slate approach to the discussion seemed
> >>> a
> >> very reasonable way to make all this happen.
> >>> If you would like to restate what you intended by it or something
> >>> else
> >> equally as reasonable as a way to move forward that would be awesome.
> >>>
> >>> I will be happy to work toward the roadmap with everyone once it is
> >> articulated, understood and actionable.
> >>> In the meantime, I have work to do.
> >>>
> >>> thanks,
> >>>
> >>> --larry
> >>>
> >>> BTW - I meant to quote you in an earlier response and ended up
> >>> saying it
> >> was Aaron instead. Not sure what happened there. :-)
> >>>
> >>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org> wrote:
> >>>
> >>>> Hi Larry (and all),
> >>>>
> >>>> Happy Fourth of July to you and yours.
> >>>>
> >>>> In our shop Kai and Tianyou are already doing the coding, so I'd
> >>>> defer
> >> to
> >>>> them on the detailed points.
> >>>>
> >>>> My concern here is there may have been a misinterpretation or lack
> >>>> of consensus on what is meant by "clean slate". Hopefully that can
> >>>> be
> >> quickly
> >>>> cleared up. Certainly we did not mean ignore all that came before.
> >>>> The
> >> idea
> >>>> was to reset discussions to find common ground and new direction
> >>>> where
> >> we
> >>>> are working together, not in conflict, on an agreed upon set of
> >>>> design points and tasks. There's been a lot of good discussion and
> >>>> design preceeding that we should figure out how to port over.
> >>>> Nowhere in this picture are self appointed "master JIRAs" and such,
> >>>> which have been disappointing to see crop up, we should be
> >>>> collaboratively coding not planting flags.
> >>>>
> >>>> I read Kai's latest document as something approaching today's
> >>>> consensus
> >> (or
> >>>> at least a common point of view?) rather than a historical document.
> >>>> Perhaps he and it can be given equal share of the consideration.
> >>>>
> >>>>
> >>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> >>>>
> >>>>> Hey Andrew -
> >>>>>
> >>>>> I largely agree with that statement.
> >>>>> My intention was to let the differences be worked out within the
> >>>>> individual components once they were identified and subtasks created.
> >>>>>
> >>>>> My reference to HSSO was really referring to a SSO *server* based
> >> design
> >>>>> which was not clearly articulated in the earlier documents.
> >>>>> We aren't trying to compare and contrast one design over another
> >> anymore.
> >>>>>
> >>>>> Let's move this collaboration along as we've mapped out and the
> >>>>> differences in the details will reveal themselves and be addressed
> >> within
> >>>>> their components.
> >>>>>
> >>>>> I've actually been looking forward to you weighing in on the
> >>>>> actual discussion points in this thread.
> >>>>> Could you do that?
> >>>>>
> >>>>> At this point, I am most interested in your thoughts on a single
> >>>>> jira
> >> to
> >>>>> represent all of this work and whether we should start discussing
> >>>>> the
> >> SSO
> >>>>> Tokens.
> >>>>> If you think there are discussion points missing from that list,
> >>>>> feel
> >> free
> >>>>> to add to it.
> >>>>>
> >>>>> thanks,
> >>>>>
> >>>>> --larry
> >>>>>
> >>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org>
> >> wrote:
> >>>>>
> >>>>>> Hi Larry,
> >>>>>>
> >>>>>> Of course I'll let Kai speak for himself. However, let me point
> >>>>>> out
> >> that,
> >>>>>> while the differences between the competing JIRAs have been
> >>>>>> reduced
> >> for
> >>>>>> sure, there were some key differences that didn't just disappear.
> >>>>>> Subsequent discussion will make that clear. I also disagree with
> >>>>>> your characterization that we have simply endorsed all of the
> >>>>>> design
> >> decisions
> >>>>>> of the so-called HSSO, this is taking a mile from an inch. We are
> >> here to
> >>>>>> engage in a collaborative process as peers. I've been encouraged
> >>>>>> by
> >> the
> >>>>>> spirit of the discussions up to this point and hope that can
> >>>>>> continue beyond one design summit.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> >>>>>> <lm...@hortonworks.com>
> >>>>> wrote:
> >>>>>>
> >>>>>>> Hi Kai -
> >>>>>>>
> >>>>>>> I think that I need to clarify something...
> >>>>>>>
> >>>>>>> This is not an update for 9533 but a continuation of the
> >>>>>>> discussions
> >>>>> that
> >>>>>>> are focused on a fresh look at a SSO for Hadoop.
> >>>>>>> We've agreed to leave our previous designs behind and therefore
> >>>>>>> we
> >>>>> aren't
> >>>>>>> really seeing it as an HSSO layered on top of TAS approach or an
> >> HSSO vs
> >>>>>>> TAS discussion.
> >>>>>>>
> >>>>>>> Your latest design revision actually makes it clear that you are
> >>>>>>> now targeting exactly what was described as HSSO - so comparing
> >>>>>>> and
> >>>>> contrasting
> >>>>>>> is not going to add any value.
> >>>>>>>
> >>>>>>> What we need you to do at this point, is to look at those
> >>>>>>> high-level components described on this thread and comment on
> >>>>>>> whether we need additional components or any that are listed
> >>>>>>> that don't seem
> >> necessary
> >>>>> to
> >>>>>>> you and why.
> >>>>>>> In other words, we need to define and agree on the work that has
> >>>>>>> to
> >> be
> >>>>>>> done.
> >>>>>>>
> >>>>>>> We also need to determine those components that need to be done
> >> before
> >>>>>>> anything else can be started.
> >>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
> >>>>>>> central to
> >>>>> all
> >>>>>>> the other components and should probably be defined and POC'd in
> >> short
> >>>>>>> order.
> >>>>>>>
> >>>>>>> Personally, I think that continuing the separation of 9533 and
> >>>>>>> 9392
> >> will
> >>>>>>> do this effort a disservice. There doesn't seem to be enough
> >> differences
> >>>>>>> between the two to justify separate jiras anymore. It may be
> >>>>>>> best to
> >>>>> file a
> >>>>>>> new one that reflects a single vision without the extra cruft
> >>>>>>> that
> >> has
> >>>>>>> built up in either of the existing ones. We would certainly
> >>>>>>> reference
> >>>>> the
> >>>>>>> existing ones within the new one. This approach would align with
> >>>>>>> the
> >>>>> spirit
> >>>>>>> of the discussions up to this point.
> >>>>>>>
> >>>>>>> I am prepared to start a discussion around the shape of the two
> >> Hadoop
> >>>>> SSO
> >>>>>>> tokens: identity and access. If this is what others feel the
> >>>>>>> next
> >> topic
> >>>>>>> should be.
> >>>>>>> If we can identify a jira home for it, we can do it there -
> >> otherwise we
> >>>>>>> can create another DISCUSS thread for it.
> >>>>>>>
> >>>>>>> thanks,
> >>>>>>>
> >>>>>>> --larry
> >>>>>>>
> >>>>>>>
> >>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com>
> >> wrote:
> >>>>>>>
> >>>>>>>> Hi Larry,
> >>>>>>>>
> >>>>>>>> Thanks for the update. Good to see that with this update we are
> >>>>>>>> now
> >>>>>>> aligned on most points.
> >>>>>>>>
> >>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The
> >>>>>>>> new
> >>>>>>> revision incorporates feedback and suggestions in related
> >>>>>>> discussion
> >>>>> with
> >>>>>>> the community, particularly from Microsoft and others attending
> >>>>>>> the Security design lounge session at the Hadoop summit. Summary
> >>>>>>> of the
> >>>>> changes:
> >>>>>>>> 1.    Revised the approach to now use two tokens, Identity Token
> >> plus
> >>>>>>> Access Token, particularly considering our authorization
> >>>>>>> framework
> >> and
> >>>>>>> compatibility with HSSO;
> >>>>>>>> 2.    Introduced Authorization Server (AS) from our authorization
> >>>>>>> framework into the flow that issues access tokens for clients
> >>>>>>> with
> >>>>> identity
> >>>>>>> tokens to access services;
> >>>>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
> >>>>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop
> >> web
> >>>>>>> services;
> >>>>>>>> 5.    Added Hadoop RPC access flow regard
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>>
> >>>> - Andy
> >>>>
> >>>> Problems worthy of attack prove their worth by hitting back. - Piet
> >>>> Hein (via Tom White)
> >>>
> >>
> >>
> >
> >
> > --
> > Alejandro
> >
> 
> 
> 
> 
> 
> <Iteration1PluggableUserAuthenticationandFederation.pdf>


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Hello All -

In an effort to scope an initial iteration that provides value to the
community while focusing on the pluggable authentication aspects, I've
written a description for "Iteration 1". It identifies the goal of the
iteration, the endstate and a set of initial usecases. It also enumerates
the components that are required for each usecase. There is a scope section
that details specific things that should be kept out of the first
iteration. This is certainly up for discussion. There may be some of these
things that can be contributed in short order. If we can add some things in
without unnecessary complexity for the identified usecases then we should.

@Alejandro - please review this and see whether it satisfies your point for
a definition of what we are building.

In addition to the document that I will paste here as text and attach a pdf
version, we have a couple patches for components that are identified in the
document.
Specifically, COMP-7 and COMP-8.

I will be posting COMP-8 patch to the HADOOP-9534 JIRA which was filed
specifically for that functionality.
COMP-7 is a small set of classes to introduce JsonWebToken as the token
format and a basic JsonWebTokenAuthority that can issue and verify these
tokens.
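To make the COMP-7 shape concrete, here is a minimal sketch - in Python for
brevity, though the actual patch would be Java classes within hadoop-common -
of a JsonWebTokenAuthority-style component that issues and verifies
HMAC-signed, JWT-like tokens. The class, method and claim names here are
illustrative assumptions, not the patch's actual API.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    # URL-safe base64 without padding, as used by JWS
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


class JsonWebTokenAuthority:
    """Illustrative sketch of a token authority that issues and
    verifies HMAC-SHA256 signed, JWT-like tokens."""

    def __init__(self, secret: bytes, issuer: str = "hadoop-sso"):
        self.secret = secret
        self.issuer = issuer

    def issue(self, subject: str, audience: str, ttl_secs: int = 3600) -> str:
        header = {"alg": "HS256", "typ": "JWT"}
        claims = {
            "iss": self.issuer,
            "sub": subject,   # authenticated principal
            "aud": audience,  # intended service (master service name)
            "exp": int(time.time()) + ttl_secs,
        }
        signing_input = (_b64(json.dumps(header).encode()) + "." +
                         _b64(json.dumps(claims).encode()))
        sig = hmac.new(self.secret, signing_input.encode(),
                       hashlib.sha256).digest()
        return signing_input + "." + _b64(sig)

    def verify(self, token: str) -> dict:
        signing_input, _, sig = token.rpartition(".")
        expected = hmac.new(self.secret, signing_input.encode(),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(_b64(expected), sig):
            raise ValueError("bad signature")
        claims_b64 = signing_input.split(".")[1]
        # restore base64 padding before decoding
        claims = json.loads(base64.urlsafe_b64decode(
            claims_b64 + "=" * (-len(claims_b64) % 4)))
        if claims["exp"] < time.time():
            raise ValueError("token expired")
        return claims
```

The same issue/verify contract would serve for both id_tokens and access
tokens; a PKI-based signature (COMP-8) could replace the shared-secret HMAC
without changing the interface.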

Since there is no JIRA for this yet, I will likely file a new JIRA for a
SSO token implementation.

Both of these patches are assumed to be modules within
hadoop-common/hadoop-common-project.
While they are relatively small, I think that they will be pulled in by
other modules such as hadoop-auth which would likely not want a dependency
on something larger like hadoop-common/hadoop-common-project/hadoop-common.

This is certainly something that we should discuss within the community for
this effort though - that being, exactly how to add these libraries so that
they are most easily consumed by existing projects.

Anyway, the following is the Iteration-1 document - it is also attached as
a pdf:

Iteration 1: Pluggable User Authentication and Federation

Introduction
The intent of this effort is to bootstrap the development of pluggable
token-based authentication mechanisms to support certain goals of
enterprise authentication integrations. By restricting the scope of this
effort, we hope to provide immediate benefit to the community while keeping
the initial contribution to a manageable size that can be easily reviewed,
understood and extended with further development through follow up JIRAs
and related iterations.

Iteration Endstate
Once complete, this effort will have extended the authentication mechanisms
- for all client types - from the existing: Simple, Kerberos and Plain (for
RPC) to include LDAP authentication and SAML based federation. In addition,
the ability to provide additional/custom authentication mechanisms will be
enabled for users to plug in their preferred mechanisms.

Project Scope
The scope of this effort is a subset of the features covered by the
overviews of HADOOP-9392 and HADOOP-9533. This effort concentrates on
enabling Hadoop to issue and accept/validate SSO tokens of its own. The
pluggable authentication mechanism within the SASL/RPC layer and the
authentication filter pluggability for REST and UI components will be
leveraged and extended to support the results of this effort.

Out of Scope
In order to scope the initial deliverable as the minimally viable product,
a handful of things have been simplified or left out of scope for this
effort. This is not meant to say that these aspects are not useful or not
needed but that they are not necessary for this iteration. We do however
need to ensure that we don’t do anything to preclude adding them in future
iterations.
1. Additional Attributes - the result of authentication will continue to
use the existing hadoop tokens and identity representations. Additional
attributes used for finer grained authorization decisions will be added
through follow-up efforts.
2. Token revocation - the ability to revoke issued identity tokens will be
added later
3. Multi-factor authentication - this will likely require additional
attributes and is not necessary for this iteration.
4. Authorization changes - we will require additional attributes for the
fine-grained access control plans. This is not needed for this iteration.
5. Domains - we assume a single flat domain for all users
6. Kinit alternative - we can leverage existing REST clients such as cURL
to retrieve tokens through authentication and federation for the time being
7. A specific authentication framework isn’t really necessary within the
REST endpoints for this iteration. If one is available then we can use it;
otherwise we can leverage existing things like Apache Shiro within a
servlet filter.

In Scope
What is in scope for this effort is defined by the usecases described
below. Components required for supporting the usecases are summarized for
each client type. Each component is a candidate for a JIRA subtask - though
multiple components are likely to be included in a JIRA to represent a set
of functionality rather than individual JIRAs per component.

Terminology and Naming
The terms and names of components within this document are merely
descriptive of the functionality that they represent. Any similarity or
difference in names or terms from those that are found in other documents
is not intended to make any statement about those other documents or the
descriptions within. This document represents the pluggable authentication
mechanisms and server functionality required to replace Kerberos.

Ultimately, the naming of the implementation classes will be a product of
the patches accepted by the community.

Usecases:
client types: REST, CLI, UI
authentication types: Simple, Kerberos, authentication/LDAP, federation/SAML

Simple and Kerberos
Simple and Kerberos usecases continue to work as they do today. The
addition of Authentication/LDAP and Federation/SAML are added through the
existing pluggability points either as they are or with required extension.
Either way, continued support for Simple and Kerberos must not require
changes to existing deployments in the field as a result of this effort.

REST
USECASE REST-1 Authentication/LDAP:
For REST clients, we will provide the ability to:
1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by an
AuthenticationServer instance via REST calls to:
   a. authenticate - passing username/password returning a hadoop id_token
   b. get-access-token - from the TokenGrantingService by passing the
hadoop id_token as an Authorization: Bearer token along with the desired
service name (master service name) returning a hadoop access token
2. Successfully invoke a hadoop service REST API passing the hadoop access
token through an HTTP header as an Authorization Bearer token
   a. validation of the incoming token on the service endpoint is
accomplished by an SSOAuthenticationHandler
3. Successfully block access to a REST resource when presenting a hadoop
access token intended for a different service
   a. validation of the incoming token on the service endpoint is
accomplished by an SSOAuthenticationHandler

USECASE REST-2 Federation/SAML:
We will also provide federation capabilities for REST clients such that:
1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
persist in a permissions protected file - ie. ~/.hadoop_tokens/.idp_token
2. use cURL to Federate a token from a trusted IdP through an SP endpoint
exposed by an AuthenticationServer(FederationServer?) instance via REST
calls to:
   a. federate - passing a SAML assertion as an Authorization: Bearer token
returning a hadoop id_token
      - can copy and paste from commandline or use cat to include the
persisted token through --header "Authorization: Bearer $(cat
~/.hadoop_tokens/.id_token)"
   b. get-access-token - from the TokenGrantingService by passing the
hadoop id_token as an Authorization: Bearer token along with the desired
service name (master service name), returning a hadoop access token
3. Successfully invoke a hadoop service REST API passing the hadoop access
token through an HTTP header as an Authorization Bearer token
   a. validation of the incoming token on the service endpoint is
accomplished by an SSOAuthenticationHandler
4. Successfully block access to a REST resource when presenting a hadoop
access token intended for a different service
   a. validation of the incoming token on the service endpoint is
accomplished by an SSOAuthenticationHandler

REQUIRED COMPONENTS for REST USECASES:
COMP-1. REST client - cURL or similar
COMP-2. REST endpoint for BASIC authentication to LDAP - IdP endpoint
example - returning hadoop id_token
COMP-3. REST endpoint for federation with SAML Bearer token - shibboleth
SP?|OpenSAML? - returning hadoop id_token
COMP-4. REST TokenGrantingServer endpoint for acquiring hadoop access
tokens from hadoop id_tokens
COMP-5. SSOAuthenticationHandler to validate incoming hadoop access tokens
COMP-6. some source of a SAML assertion - shibboleth IdP?
COMP-7. hadoop token and authority implementations
COMP-8. core services for crypto support for signing, verifying and PKI
management
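As a sketch of the COMP-5 check that backs steps 2 and 3 of the usecases
above (accepting a valid access token and blocking one minted for a
different service), the following shows the audience comparison an
SSOAuthenticationHandler would need to perform. The function shape, claim
names and status codes are assumptions for illustration only.

```python
def authorize_request(auth_header, local_service, verify_fn):
    """Sketch of the service-side check: extract the Bearer token,
    verify it cryptographically, then require that the token's
    audience matches this service's name.

    auth_header   -- value of the incoming Authorization header
    local_service -- master service name of this endpoint, e.g. "hdfs"
    verify_fn     -- callable returning the token's claims dict,
                     raising ValueError on bad signature or expiry
    """
    if not auth_header or not auth_header.startswith("Bearer "):
        return (401, "missing bearer token")
    try:
        claims = verify_fn(auth_header[len("Bearer "):])
    except ValueError as e:
        return (401, str(e))
    if claims.get("aud") != local_service:
        # token was issued for some other service - block access
        return (403, "token not intended for this service")
    return (200, claims["sub"])
```

The key point is that signature validation alone is not enough; the handler
must also compare the token's intended service against its own.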

CLI
USECASE CLI-1 Authentication/LDAP:
For CLI/RPC clients, we will provide the ability to:
1. use cURL to Authenticate via LDAP through an IdP endpoint exposed by an
AuthenticationServer instance via REST calls to:
   a. authenticate - passing username/password returning a hadoop id_token
      - for RPC clients we need to persist the returned hadoop identity
token in a file protected by fs permissions so that it may be leveraged
until expiry
      - directing the returned response to a file may suffice for now
something like ">~/.hadoop_tokens/.id_token"
2. use hadoop CLI to invoke RPC API on a specific hadoop service
   a. RPC client negotiates a TokenAuth method through the SASL layer; the
hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed as
an Authorization: Bearer token to the get-access-token REST endpoint exposed
by the TokenGrantingService, returning a hadoop access token
   b. RPC server side validates the presented hadoop access token and
continues to serve request
   c. Successfully invoke a hadoop service RPC API
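The persistence step in item 1 above (keeping the returned id_token usable
until expiry while protected by fs permissions) might look like the
following sketch. The ~/.hadoop_tokens path follows the convention used in
this document; the 0600/0700 modes are assumptions mirroring common practice
for things like ssh keys.

```python
import os


def persist_id_token(token: str, path: str) -> None:
    """Write the hadoop id_token so that only the owning user can
    read it, allowing it to be leveraged until expiry."""
    # e.g. path = os.path.expanduser("~/.hadoop_tokens/.id_token")
    os.makedirs(os.path.dirname(path), mode=0o700, exist_ok=True)
    # create with a restrictive mode up front rather than chmod afterwards
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(token)
```

Simply redirecting cURL output with ">~/.hadoop_tokens/.id_token" as
suggested above works too, but is subject to the user's umask, so a small
helper like this gives a more predictable result.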

USECASE CLI-2 Federation/SAML:
For CLI/RPC clients, we will provide the ability to:
1. acquire SAML assertion token from a trusted IdP (shibboleth?) and
persist in a permissions protected file - ie. ~/.hadoop_tokens/.idp_token
2. use cURL to Federate a token from a trusted IdP through an SP endpoint
exposed by an AuthenticationServer(FederationServer?) instance via REST
calls to:
   a. federate - passing a SAML assertion as an Authorization: Bearer token
returning a hadoop id_token
      - can copy and paste from commandline or use cat to include the
previously persisted token through --header "Authorization: Bearer $(cat
~/.hadoop_tokens/.id_token)"
3. use hadoop CLI to invoke RPC API on a specific hadoop service
   a. RPC client negotiates a TokenAuth method through the SASL layer; the
hadoop id_token is retrieved from ~/.hadoop_tokens/.id_token and passed as
an Authorization: Bearer token to the get-access-token REST endpoint exposed
by the TokenGrantingService, returning a hadoop access token
   b. RPC server side validates the presented hadoop access token and
continues to serve request
   c. Successfully invoke a hadoop service RPC API

REQUIRED COMPONENTS for CLI USECASES - (beyond those required for REST):
COMP-9. TokenAuth Method negotiation, etc
COMP-10. Client side implementation to leverage REST endpoint for acquiring
hadoop access tokens given a hadoop id_token
COMP-11. Server side implementation to validate incoming hadoop access
tokens

UI
Various Hadoop services have their own web UI consoles for administration
and end user interactions. These consoles need to also benefit from the
pluggability of authentication mechanisms to be on par with the access
control of the cluster REST and RPC APIs.
Web consoles are protected with a WebSSOAuthenticationHandler which will
be configured for either authentication or federation.

USECASE UI-1 Authentication/LDAP:
For the authentication usecase:
1. User’s browser requests access to a UI console page
2. WebSSOAuthenticationHandler intercepts the request and redirects the
browser to an IdP web endpoint exposed by the AuthenticationServer passing
the requested url as the redirect_url
3. IdP web endpoint presents the user with a FORM over https
   a. user provides username/password and submits the FORM
4. AuthenticationServer authenticates the user with provided credentials
against the configured LDAP server and:
   a. leverages a servlet filter or other authentication mechanism for the
endpoint and authenticates the user with a simple LDAP bind with username
and password
   b. acquires a hadoop id_token and uses it to acquire the required hadoop
access token which is added as a cookie
   c. redirects the browser to the original service UI resource via the
provided redirect_url
5. WebSSOAuthenticationHandler for the original UI resource interrogates
the incoming request again for an authcookie that contains an access token
and, upon finding one:
   a. validates the incoming token
   b. returns the AuthenticationToken as per AuthenticationHandler contract
   c. AuthenticationFilter adds the hadoop auth cookie with the expected
token
   d. serves requested resource for valid tokens
   e. subsequent requests are handled by the AuthenticationFilter
recognition of the hadoop auth cookie
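The WebSSOAuthenticationHandler behavior in steps 2 and 5 above reduces to:
no valid access-token cookie means redirect the browser to the IdP endpoint
with the original URL as the redirect_url; a valid one means serve the
resource. A schematic Python sketch, with hypothetical cookie and parameter
names:

```python
from urllib.parse import quote


def handle_ui_request(cookies, requested_url, idp_login_url, validate_fn):
    """Sketch of WebSSOAuthenticationHandler for a protected console page.

    cookies     -- dict of incoming cookies
    validate_fn -- returns the principal for a valid access token,
                   raises ValueError otherwise
    """
    token = cookies.get("hadoop.auth.access")  # hypothetical cookie name
    if token:
        try:
            principal = validate_fn(token)
            return ("SERVE", principal)        # step 5d: serve the resource
        except ValueError:
            pass                               # invalid token: fall through
    # step 2: bounce the browser to the IdP, preserving the original URL
    location = idp_login_url + "?redirect_url=" + quote(requested_url, safe="")
    return ("REDIRECT", location)
```

The same dispatch applies to the federation variant below; only the login
endpoint the browser is redirected to differs.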

USECASE UI-2 Federation/SAML:
For the federation usecase:
1. User’s browser requests access to a UI console page
2. WebSSOAuthenticationHandler intercepts the request and redirects the
browser to an SP web endpoint exposed by the AuthenticationServer passing
the requested url as the redirect_url. This endpoint:
   a. is dedicated to redirecting to the external IdP passing the required
parameters which may include a redirect_url back to itself as well as
encoding the original redirect_url so that it can determine it on the way
back to the client
3. the IdP:
   a. challenges the user for credentials and authenticates the user
   b. creates appropriate token/cookie and redirects back to the
AuthenticationServer endpoint
4. AuthenticationServer endpoint:
   a. extracts the expected token/cookie from the incoming request and
validates it
   b. creates a hadoop id_token
   c. acquires a hadoop access token for the id_token
   d. creates appropriate cookie and redirects back to the original
redirect_url - being the requested resource
5. WebSSOAuthenticationHandler for the original UI resource interrogates
the incoming request again for an authcookie that contains an access token
and, upon finding one:
   a. validates the incoming token
   b. returns the AuthenticationToken as per AuthenticationHandler contract
   c. AuthenticationFilter adds the hadoop auth cookie with the expected
token
   d. serves requested resource for valid tokens
   e. subsequent requests are handled by the AuthenticationFilter
recognition of the hadoop auth cookie

REQUIRED COMPONENTS for UI USECASES:
COMP-12. WebSSOAuthenticationHandler
COMP-13. IdP Web Endpoint within AuthenticationServer for FORM based login
COMP-14. SP Web Endpoint within AuthenticationServer for 3rd party token
federation



On Wed, Jul 10, 2013 at 1:59 PM, Brian Swan <Br...@microsoft.com> wrote:

> Thanks, Larry. That is what I was trying to say, but you've said it better
> and in more detail. :-) To extract from what you are saying: "If we were to
> reframe the immediate scope to the lowest common denominator of what is
> needed for accepting tokens in authentication plugins then we gain... an
> end-state for the lowest common denominator that enables code patches in
> the near-term is the best of both worlds."
>
> -Brian
>
> -----Original Message-----
> From: Larry McCay [mailto:lmccay@hortonworks.com]
> Sent: Wednesday, July 10, 2013 10:40 AM
> To: common-dev@hadoop.apache.org
> Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
>
> It seems to me that we can have the best of both worlds here...it's all
> about the scoping.
>
> If we were to reframe the immediate scope to the lowest common denominator
> of what is needed for accepting tokens in authentication plugins then we
> gain:
>
> 1. a very manageable scope to define and agree upon
> 2. a deliverable that should be useful in and of itself
> 3. a foundation for community collaboration that we build on for higher
> level solutions built on this lowest common denominator and experience as
> a working community
>
> So, to Alejandro's point, perhaps we need to define what would make #2
> above true - this could serve as the "what" we are building instead of the
> "how" to build it.
> Including:
> a. project structure within hadoop-common-project/common-security or the like
> b. the usecases that would need to be enabled to make it a self contained
> and useful contribution - without higher level solutions
> c. the JIRA/s for contributing patches
> d. what specific patches will be needed to accomplish the usecases in #b
>
> In other words, an end-state for the lowest common denominator that
> enables code patches in the near-term is the best of both worlds.
>
> I think this may be a good way to bootstrap the collaboration process for
> our emerging security community rather than trying to tackle a huge vision
> all at once.
>
> @Alejandro - if you have something else in mind that would bootstrap this
> process - that would be great - please advise.
>
> thoughts?
>
> On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com> wrote:
>
> > Hi Alejandro, all-
> >
> > There seems to be agreement on the broad stroke description of the
> components needed to achieve pluggable token authentication (I'm sure I'll
> be corrected if that isn't the case). However, discussion of the details of
> those components doesn't seem to be moving forward. I think this is because
> the details are really best understood through code. I also see *a* (i.e.
> one of many possible) token format and pluggable authentication mechanisms
> within the RPC layer as components that can have immediate benefit to
> Hadoop users AND still allow flexibility in the larger design. So, I think
> the best way to move the conversation of "what we are aiming for" forward
> is to start looking at code for these components. I am especially
> interested in moving forward with pluggable authentication mechanisms
> within the RPC layer and would love to see what others have done in this
> area (if anything).
> >
> > Thanks.
> >
> > -Brian
> >
> > -----Original Message-----
> > From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> > Sent: Wednesday, July 10, 2013 8:15 AM
> > To: Larry McCay
> > Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> > Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> >
> > Larry, all,
> >
> > It is still not clear to me what end state we are aiming for, or whether
> > we even agree on that.
> >
> > IMO, instead of trying to agree on what to do, we should first agree on
> > the final state, then see what needs to change to get there, then see
> > how we change things to get there.
> >
> > The different documents out there focus more on how.
> >
> > We should not try to say how before we know what.
> >
> > Thx.
> >
> >
> >
> >
> > On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lm...@hortonworks.com>
> wrote:
> >
> >> All -
> >>
> >> After combing through this thread - as well as the summit session
> >> summary thread, I think that we have the following two items that we
> >> can probably move forward with:
> >>
> >> 1. TokenAuth method - assuming this means the pluggable
> >> authentication mechanisms within the RPC layer (2 votes: Kai and
> >> Kyle) 2. An actual Hadoop Token format (2 votes: Brian and myself)
> >>
> >> I propose that we attack both of these aspects as one. Let's provide
> >> the structure and interfaces of the pluggable framework for use in
> >> the RPC layer through leveraging Daryn's pluggability work and POC it
> >> with a particular token format (not necessarily the only format ever
> >> supported - we just need one to start). If there has already been
> >> work done in this area by anyone then please speak up and commit to
> >> providing a patch - so that we don't duplicate effort.
> >>
> >> @Daryn - is there a particular Jira or set of Jiras that we can look
> >> at to discern the pluggability mechanism details? Documentation of it
> >> would be great as well.
> >> @Kai - do you have existing code for the pluggable token
> >> authentication mechanism - if not, we can take a stab at representing
> >> it with interfaces and/or POC code.
> >> I can standup and say that we have a token format that we have been
> >> working with already and can provide a patch that represents it as a
> >> contribution to test out the pluggable tokenAuth.
> >>
> >> These patches will provide progress toward code being the central
> >> discussion vehicle. As a community, we can then incrementally build
> >> on that foundation in order to collaboratively deliver the common
> vision.
> >>
> >> In the absence of any other home for posting such patches, let's
> >> assume that they will be attached to HADOOP-9392 - or a dedicated
> >> subtask for this particular aspect/s - I will leave that detail to Kai.
> >>
> >> @Alejandro, being the only voice on this thread that isn't
> >> represented in the votes above, please feel free to agree or disagree
> with this direction.
> >>
> >> thanks,
> >>
> >> --larry
> >>
> >> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:
> >>
> >>> Hi Andy -
> >>>
> >>>> Happy Fourth of July to you and yours.
> >>>
> >>> Same to you and yours. :-)
> >>> We had some fun in the sun for a change - we've had nothing but rain
> >>> on
> >> the east coast lately.
> >>>
> >>>> My concern here is there may have been a misinterpretation or lack
> >>>> of consensus on what is meant by "clean slate"
> >>>
> >>>
> >>> Apparently so.
> >>> On the pre-summit call, I stated that I was interested in
> >>> reconciling
> >> the jiras so that we had one to work from.
> >>>
> >>> You recommended that we set them aside for the time being - with the
> >> understanding that work would continue on your side (and our's as
> >> well) - and approach the community discussion from a clean slate.
> >>> We seemed to do this at the summit session quite well.
> >>> It was my understanding that this community discussion would live
> >>> beyond
> >> the summit and continue on this list.
> >>>
> >>> While closing the summit session we agreed to follow up on
> >>> common-dev
> >> with first a summary then a discussion of the moving parts.
> >>>
> >>> I never expected the previous work to be abandoned and fully
> >>> expected it
> >> to inform the discussion that happened here.
> >>>
> >>> If you would like to reframe what clean slate was supposed to mean
> >>> or
> >> describe what it means now - that would be welcome - before I waste
> >> any more time trying to facilitate a community discussion that is
> >> apparently not wanted.
> >>>
> >>>> Nowhere in this
> >>>> picture are self appointed "master JIRAs" and such, which have been
> >>>> disappointing to see crop up, we should be collaboratively coding
> >>>> not planting flags.
> >>>
> >>> I don't know what you mean by self-appointed master JIRAs.
> >>> It has certainly not been anyone's intention to disappoint.
> >>> Any mention of a new JIRA was just to have a clear context to gather
> >>> the
> >> agreed upon points - previous and/or existing JIRAs would easily be
> linked.
> >>>
> >>> Planting flags... I need to go back and read my discussion point
> >>> about the
> >> JIRA and see how this is the impression that was made.
> >>> That is not how I define success. The only flags that count is code.
> >> What we are lacking is the roadmap on which to put the code.
> >>>
> >>>> I read Kai's latest document as something approaching today's
> >>>> consensus
> >> (or
> >>>> at least a common point of view?) rather than a historical document.
> >>>> Perhaps he and it can be given equal share of the consideration.
> >>>
> >>> I definitely read it as something that has evolved into something
> >> approaching what we have been talking about so far. There has not
> >> however been enough discussion anywhere near the level of detail in
> >> that document and more details are needed for each component in the
> design.
> >>> Why the work in that document should not be fed into the community
> >> discussion as anyone else's would be - I fail to understand.
> >>>
> >>> My suggestion continues to be that you should take that document and
> >> speak to the inventory of moving parts as we agreed.
> >>> As these are agreed upon, we will ensure that the appropriate
> >>> subtasks
> >> are filed against whatever JIRA is to host them - don't really care
> >> much which it is.
> >>>
> >>> I don't really want to continue with two separate JIRAs - as I
> >>> stated
> >> long ago - but until we understand what the pieces are and how they
> >> relate then they can't be consolidated.
> >>> Even if 9533 ended up being repurposed as the server instance of the
> >> work - it should be a subtask of a larger one - if that is to be
> >> 9392, so be it.
> >>> We still need to define all the pieces of the larger picture before
> >>> that
> >> can be done.
> >>>
> >>> What I thought was the clean slate approach to the discussion seemed
> >>> a
> >> very reasonable way to make all this happen.
> >>> If you would like to restate what you intended by it or something
> >>> else
> >> equally as reasonable as a way to move forward that would be awesome.
> >>>
> >>> I will be happy to work toward the roadmap with everyone once it is
> >> articulated, understood and actionable.
> >>> In the meantime, I have work to do.
> >>>
> >>> thanks,
> >>>
> >>> --larry
> >>>
> >>> BTW - I meant to quote you in an earlier response and ended up
> >>> saying it
> >> was Aaron instead. Not sure what happened there. :-)
> >>>
> >>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org>
> wrote:
> >>>
> >>>> Hi Larry (and all),
> >>>>
> >>>> Happy Fourth of July to you and yours.
> >>>>
> >>>> In our shop Kai and Tianyou are already doing the coding, so I'd
> >>>> defer
> >> to
> >>>> them on the detailed points.
> >>>>
> >>>> My concern here is there may have been a misinterpretation or lack
> >>>> of consensus on what is meant by "clean slate". Hopefully that can
> >>>> be
> >> quickly
> >>>> cleared up. Certainly we did not mean ignore all that came before.
> >>>> The
> >> idea
> >>>> was to reset discussions to find common ground and new direction
> >>>> where
> >> we
> >>>> are working together, not in conflict, on an agreed upon set of
> >>>> design points and tasks. There's been a lot of good discussion and
> >>>> design preceeding that we should figure out how to port over.
> >>>> Nowhere in this picture are self appointed "master JIRAs" and such,
> >>>> which have been disappointing to see crop up, we should be
> >>>> collaboratively coding not planting flags.
> >>>>
> >>>> I read Kai's latest document as something approaching today's
> >>>> consensus
> >> (or
> >>>> at least a common point of view?) rather than a historical document.
> >>>> Perhaps he and it can be given equal share of the consideration.
> >>>>
> >>>>
> >>>> On Wednesday, July 3, 2013, Larry McCay wrote:
> >>>>
> >>>>> Hey Andrew -
> >>>>>
> >>>>> I largely agree with that statement.
> >>>>> My intention was to let the differences be worked out within the
> >>>>> individual components once they were identified and subtasks created.
> >>>>>
> >>>>> My reference to HSSO was really referring to a SSO *server* based
> >> design
> >>>>> which was not clearly articulated in the earlier documents.
> >>>>> We aren't trying to compare and contrast one design over another
> >> anymore.
> >>>>>
> >>>>> Let's move this collaboration along as we've mapped out and the
> >>>>> differences in the details will reveal themselves and be addressed
> >> within
> >>>>> their components.
> >>>>>
> >>>>> I've actually been looking forward to you weighing in on the
> >>>>> actual discussion points in this thread.
> >>>>> Could you do that?
> >>>>>
> >>>>> At this point, I am most interested in your thoughts on a single
> >>>>> jira
> >> to
> >>>>> represent all of this work and whether we should start discussing
> >>>>> the
> >> SSO
> >>>>> Tokens.
> >>>>> If you think there are discussion points missing from that list,
> >>>>> feel
> >> free
> >>>>> to add to it.
> >>>>>
> >>>>> thanks,
> >>>>>
> >>>>> --larry
> >>>>>
> >>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org>
> >> wrote:
> >>>>>
> >>>>>> Hi Larry,
> >>>>>>
> >>>>>> Of course I'll let Kai speak for himself. However, let me point
> >>>>>> out
> >> that,
> >>>>>> while the differences between the competing JIRAs have been
> >>>>>> reduced
> >> for
> >>>>>> sure, there were some key differences that didn't just disappear.
> >>>>>> Subsequent discussion will make that clear. I also disagree with
> >>>>>> your characterization that we have simply endorsed all of the
> >>>>>> design
> >> decisions
> >>>>>> of the so-called HSSO, this is taking a mile from an inch. We are
> >> here to
> >>>>>> engage in a collaborative process as peers. I've been encouraged
> >>>>>> by
> >> the
> >>>>>> spirit of the discussions up to this point and hope that can
> >>>>>> continue beyond one design summit.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay
> >>>>>> <lm...@hortonworks.com>
> >>>>> wrote:
> >>>>>>
> >>>>>>> Hi Kai -
> >>>>>>>
> >>>>>>> I think that I need to clarify something...
> >>>>>>>
> >>>>>>> This is not an update for 9533 but a continuation of the
> >>>>>>> discussions
> >>>>> that
> >>>>>>> are focused on a fresh look at a SSO for Hadoop.
> >>>>>>> We've agreed to leave our previous designs behind and therefore
> >>>>>>> we
> >>>>> aren't
> >>>>>>> really seeing it as an HSSO layered on top of TAS approach or an
> >> HSSO vs
> >>>>>>> TAS discussion.
> >>>>>>>
> >>>>>>> Your latest design revision actually makes it clear that you are
> >>>>>>> now targeting exactly what was described as HSSO - so comparing
> >>>>>>> and
> >>>>> contrasting
> >>>>>>> is not going to add any value.
> >>>>>>>
> >>>>>>> What we need you to do at this point, is to look at those
> >>>>>>> high-level components described on this thread and comment on
> >>>>>>> whether we need additional components or any that are listed
> >>>>>>> that don't seem
> >> necessary
> >>>>> to
> >>>>>>> you and why.
> >>>>>>> In other words, we need to define and agree on the work that has
> >>>>>>> to
> >> be
> >>>>>>> done.
> >>>>>>>
> >>>>>>> We also need to determine those components that need to be done
> >> before
> >>>>>>> anything else can be started.
> >>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are
> >>>>>>> central to
> >>>>> all
> >>>>>>> the other components and should probably be defined and POC'd in
> >> short
> >>>>>>> order.
> >>>>>>>
> >>>>>>> Personally, I think that continuing the separation of 9533 and
> >>>>>>> 9392
> >> will
> >>>>>>> do this effort a disservice. There doesn't seem to be enough
> >> differences
> >>>>>>> between the two to justify separate jiras anymore. It may be
> >>>>>>> best to
> >>>>> file a
> >>>>>>> new one that reflects a single vision without the extra cruft
> >>>>>>> that
> >> has
> >>>>>>> built up in either of the existing ones. We would certainly
> >>>>>>> reference
> >>>>> the
> >>>>>>> existing ones within the new one. This approach would align with
> >>>>>>> the
> >>>>> spirit
> >>>>>>> of the discussions up to this point.
> >>>>>>>
> >>>>>>> I am prepared to start a discussion around the shape of the two
> >> Hadoop
> >>>>> SSO
> >>>>>>> tokens: identity and access. If this is what others feel the
> >>>>>>> next
> >> topic
> >>>>>>> should be.
> >>>>>>> If we can identify a jira home for it, we can do it there -
> >> otherwise we
> >>>>>>> can create another DISCUSS thread for it.
> >>>>>>>
> >>>>>>> thanks,
> >>>>>>>
> >>>>>>> --larry
> >>>>>>>
> >>>>>>>
> >>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com>
> >> wrote:
> >>>>>>>
> >>>>>>>> Hi Larry,
> >>>>>>>>
> >>>>>>>> Thanks for the update. Good to see that with this update we are
> >>>>>>>> now
> >>>>>>> aligned on most points.
> >>>>>>>>
> >>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The
> >>>>>>>> new
> >>>>>>> revision incorporates feedback and suggestions in related
> >>>>>>> discussion
> >>>>> with
> >>>>>>> the community, particularly from Microsoft and others attending
> >>>>>>> the Security design lounge session at the Hadoop summit. Summary
> >>>>>>> of the
> >>>>> changes:
> >>>>>>>> 1.    Revised the approach to now use two tokens, Identity Token
> >> plus
> >>>>>>> Access Token, particularly considering our authorization
> >>>>>>> framework
> >> and
> >>>>>>> compatibility with HSSO;
> >>>>>>>> 2.    Introduced Authorization Server (AS) from our authorization
> >>>>>>> framework into the flow that issues access tokens for clients
> >>>>>>> with
> >>>>> identity
> >>>>>>> tokens to access services;
> >>>>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
> >>>>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop
> >> web
> >>>>>>> services;
> >>>>>>>> 5.    Added Hadoop RPC access flow regard
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Best regards,
> >>>>
> >>>> - Andy
> >>>>
> >>>> Problems worthy of attack prove their worth by hitting back. - Piet
> >>>> Hein (via Tom White)
> >>>
> >>
> >>
> >
> >
> > --
> > Alejandro
> >
>
>
>
>
>

RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Brian Swan <Br...@microsoft.com>.
Thanks, Larry. That is what I was trying to say, but you've said it better and in more detail. :-) To extract from what you are saying: "If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain... an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds."

-Brian

-----Original Message-----
From: Larry McCay [mailto:lmccay@hortonworks.com] 
Sent: Wednesday, July 10, 2013 10:40 AM
To: common-dev@hadoop.apache.org
Cc: daryn@yahoo-inc.com; Kai Zheng; Alejandro Abdelnur
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

It seems to me that we can have the best of both worlds here...it's all about the scoping.

If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain:

1. a very manageable scope to define and agree upon
2. a deliverable that should be useful in and of itself
3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community

So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the "what" we are building instead of the "how" to build it.
Including:
a. project structure within hadoop-common-project/common-security or the like
b. the usecases that would need to be enabled to make it a self-contained and useful contribution - without higher level solutions
c. the JIRA/s for contributing patches
d. what specific patches will be needed to accomplish the usecases in #b

In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds.

I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once.

@Alejandro - if you have something else in mind that would bootstrap this process - that would be great - please advise.

thoughts?
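To make the "lowest common denominator" concrete, here is a minimal sketch (in Java, Hadoop's implementation language) of what a pluggable token authentication method for the RPC layer could look like. The names here (TokenAuthMethod, SimpleTokenAuthMethod) are illustrative assumptions only - not existing Hadoop APIs and not Daryn's actual pluggability work:

```java
import java.util.Map;

// Hypothetical sketch: interface and class names are illustrative,
// not actual Hadoop APIs.
interface TokenAuthMethod {
    // Mechanism name that would be advertised during RPC negotiation.
    String mechanism();

    // Validate the presented token fields; return the authenticated
    // principal, or null to reject the connection.
    String authenticate(Map<String, String> tokenFields);
}

// A trivial plugin that accepts tokens from a single trusted issuer.
class SimpleTokenAuthMethod implements TokenAuthMethod {
    private final String trustedIssuer;

    SimpleTokenAuthMethod(String trustedIssuer) {
        this.trustedIssuer = trustedIssuer;
    }

    @Override
    public String mechanism() {
        return "SIMPLE-TOKEN";
    }

    @Override
    public String authenticate(Map<String, String> tokenFields) {
        if (!trustedIssuer.equals(tokenFields.get("issuer"))) {
            return null; // unknown issuer: reject
        }
        return tokenFields.get("subject");
    }
}

public class TokenAuthDemo {
    public static void main(String[] args) {
        TokenAuthMethod method = new SimpleTokenAuthMethod("sso-server");
        System.out.println(method.authenticate(
                Map.of("issuer", "sso-server", "subject", "alice"))); // alice
    }
}
```

The point of a sketch like this is that the RPC server only depends on the interface, so any token format can be POC'd behind it.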

On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com> wrote:

> Hi Alejandro, all-
> 
> There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of "what we are aiming for" forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything).
> 
> Thanks.
> 
> -Brian
> 
> -----Original Message-----
> From: Alejandro Abdelnur [mailto:tucu@cloudera.com]
> Sent: Wednesday, July 10, 2013 8:15 AM
> To: Larry McCay
> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> 
> Larry, all,
> 
> It is still not clear to me what end state we are aiming for, or whether we even agree on that.
> 
> IMO, instead of trying to agree on what to do, we should first agree on the final state, then see what needs to change to get there, then see how we make those changes.
> 
> The different documents out there focus more on how.
> 
> We should not try to say how before we know what.
> 
> Thx.
> 
> 
> 
> 
> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lm...@hortonworks.com> wrote:
> 
>> All -
>> 
>> After combing through this thread - as well as the summit session 
>> summary thread, I think that we have the following two items that we 
>> can probably move forward with:
>> 
>> 1. TokenAuth method - assuming this means the pluggable authentication
>> mechanisms within the RPC layer (2 votes: Kai and Kyle)
>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>> 
>> I propose that we attack both of these aspects as one. Let's provide 
>> the structure and interfaces of the pluggable framework for use in 
>> the RPC layer through leveraging Daryn's pluggability work and POC it 
>> with a particular token format (not necessarily the only format ever 
>> supported - we just need one to start). If there has already been 
>> work done in this area by anyone then please speak up and commit to 
>> providing a patch - so that we don't duplicate effort.
>> 
>> @Daryn - is there a particular Jira or set of Jiras that we can look 
>> at to discern the pluggability mechanism details? Documentation of it 
>> would be great as well.
>> @Kai - do you have existing code for the pluggable token 
>> authentication mechanism - if not, we can take a stab at representing 
>> it with interfaces and/or POC code.
>> I can standup and say that we have a token format that we have been 
>> working with already and can provide a patch that represents it as a 
>> contribution to test out the pluggable tokenAuth.
>> 
>> These patches will provide progress toward code being the central 
>> discussion vehicle. As a community, we can then incrementally build 
>> on that foundation in order to collaboratively deliver the common vision.
>> 
>> In the absence of any other home for posting such patches, let's 
>> assume that they will be attached to HADOOP-9392 - or a dedicated 
>> subtask for this particular aspect/s - I will leave that detail to Kai.
>> 
>> @Alejandro, being the only voice on this thread that isn't 
>> represented in the votes above, please feel free to agree or disagree with this direction.
>> 
>> thanks,
>> 
>> --larry
>> 
>> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:
>> 
>>> Hi Andy -
>>> 
>>>> Happy Fourth of July to you and yours.
>>> 
>>> Same to you and yours. :-)
>>> We had some fun in the sun for a change - we've had nothing but rain 
>>> on
>> the east coast lately.
>>> 
>>>> My concern here is there may have been a misinterpretation or lack 
>>>> of consensus on what is meant by "clean slate"
>>> 
>>> 
>>> Apparently so.
>>> On the pre-summit call, I stated that I was interested in 
>>> reconciling
>> the jiras so that we had one to work from.
>>> 
>>> You recommended that we set them aside for the time being - with the
>> understanding that work would continue on your side (and ours as
>> well) - and approach the community discussion from a clean slate.
>>> We seemed to do this at the summit session quite well.
>>> It was my understanding that this community discussion would live 
>>> beyond
>> the summit and continue on this list.
>>> 
>>> While closing the summit session we agreed to follow up on 
>>> common-dev
>> with first a summary then a discussion of the moving parts.
>>> 
>>> I never expected the previous work to be abandoned and fully 
>>> expected it
>> to inform the discussion that happened here.
>>> 
>>> If you would like to reframe what clean slate was supposed to mean 
>>> or
>> describe what it means now - that would be welcome - before I waste 
>> any more time trying to facilitate a community discussion that is 
>> apparently not wanted.
>>> 
>>>> Nowhere in this
>>>> picture are self appointed "master JIRAs" and such, which have been 
>>>> disappointing to see crop up, we should be collaboratively coding 
>>>> not planting flags.
>>> 
>>> I don't know what you mean by self-appointed master JIRAs.
>>> It has certainly not been anyone's intention to disappoint.
>>> Any mention of a new JIRA was just to have a clear context to gather 
>>> the
>> agreed upon points - previous and/or existing JIRAs would easily be linked.
>>> 
>>> Planting flags... I need to go back and read my discussion point 
>>> about the
>> JIRA and see how this is the impression that was made.
>>> That is not how I define success. The only flag that counts is code.
>> What we are lacking is the roadmap on which to put the code.
>>> 
>>>> I read Kai's latest document as something approaching today's 
>>>> consensus
>> (or
>>>> at least a common point of view?) rather than a historical document.
>>>> Perhaps he and it can be given equal share of the consideration.
>>> 
>>> I definitely read it as something that has evolved into something
>> approaching what we have been talking about so far. There has not 
>> however been enough discussion anywhere near the level of detail in 
>> that document and more details are needed for each component in the design.
>>> Why the work in that document should not be fed into the community
>> discussion as anyone else's would be - I fail to understand.
>>> 
>>> My suggestion continues to be that you should take that document and
>> speak to the inventory of moving parts as we agreed.
>>> As these are agreed upon, we will ensure that the appropriate 
>>> subtasks
>> are filed against whatever JIRA is to host them - don't really care 
>> much which it is.
>>> 
>>> I don't really want to continue with two separate JIRAs - as I 
>>> stated
>> long ago - but until we understand what the pieces are and how they 
>> relate then they can't be consolidated.
>>> Even if 9533 ended up being repurposed as the server instance of the
>> work - it should be a subtask of a larger one - if that is to be 
>> 9392, so be it.
>>> We still need to define all the pieces of the larger picture before 
>>> that
>> can be done.
>>> 
>>> What I thought was the clean slate approach to the discussion seemed 
>>> a
>> very reasonable way to make all this happen.
>>> If you would like to restate what you intended by it or something 
>>> else
>> equally as reasonable as a way to move forward that would be awesome.
>>> 
>>> I will be happy to work toward the roadmap with everyone once it is
>> articulated, understood and actionable.
>>> In the meantime, I have work to do.
>>> 
>>> thanks,
>>> 
>>> --larry
>>> 
>>> BTW - I meant to quote you in an earlier response and ended up 
>>> saying it
>> was Aaron instead. Not sure what happened there. :-)
>>> 
>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org> wrote:
>>> 
>>>> Hi Larry (and all),
>>>> 
>>>> Happy Fourth of July to you and yours.
>>>> 
>>>> In our shop Kai and Tianyou are already doing the coding, so I'd 
>>>> defer
>> to
>>>> them on the detailed points.
>>>> 
>>>> My concern here is there may have been a misinterpretation or lack 
>>>> of consensus on what is meant by "clean slate". Hopefully that can 
>>>> be
>> quickly
>>>> cleared up. Certainly we did not mean ignore all that came before. 
>>>> The
>> idea
>>>> was to reset discussions to find common ground and new direction 
>>>> where
>> we
>>>> are working together, not in conflict, on an agreed upon set of 
>>>> design points and tasks. There's been a lot of good discussion and 
>>>> design preceding that we should figure out how to port over.
>>>> Nowhere in this picture are self appointed "master JIRAs" and such, 
>>>> which have been disappointing to see crop up, we should be 
>>>> collaboratively coding not planting flags.
>>>> 
>>>> I read Kai's latest document as something approaching today's 
>>>> consensus
>> (or
>>>> at least a common point of view?) rather than a historical document.
>>>> Perhaps he and it can be given equal share of the consideration.
>>>> 
>>>> 
>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>>>> 
>>>>> Hey Andrew -
>>>>> 
>>>>> I largely agree with that statement.
>>>>> My intention was to let the differences be worked out within the 
>>>>> individual components once they were identified and subtasks created.
>>>>> 
>>>>> My reference to HSSO was really referring to a SSO *server* based
>> design
>>>>> which was not clearly articulated in the earlier documents.
>>>>> We aren't trying to compare and contrast one design over another
>> anymore.
>>>>> 
>>>>> Let's move this collaboration along as we've mapped out and the 
>>>>> differences in the details will reveal themselves and be addressed
>> within
>>>>> their components.
>>>>> 
>>>>> I've actually been looking forward to you weighing in on the 
>>>>> actual discussion points in this thread.
>>>>> Could you do that?
>>>>> 
>>>>> At this point, I am most interested in your thoughts on a single 
>>>>> jira
>> to
>>>>> represent all of this work and whether we should start discussing 
>>>>> the
>> SSO
>>>>> Tokens.
>>>>> If you think there are discussion points missing from that list, 
>>>>> feel
>> free
>>>>> to add to it.
>>>>> 
>>>>> thanks,
>>>>> 
>>>>> --larry
>>>>> 
>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org>
>> wrote:
>>>>> 
>>>>>> Hi Larry,
>>>>>> 
>>>>>> Of course I'll let Kai speak for himself. However, let me point 
>>>>>> out
>> that,
>>>>>> while the differences between the competing JIRAs have been 
>>>>>> reduced
>> for
>>>>>> sure, there were some key differences that didn't just disappear.
>>>>>> Subsequent discussion will make that clear. I also disagree with 
>>>>>> your characterization that we have simply endorsed all of the 
>>>>>> design
>> decisions
>>>>>> of the so-called HSSO, this is taking a mile from an inch. We are
>> here to
>>>>>> engage in a collaborative process as peers. I've been encouraged 
>>>>>> by
>> the
>>>>>> spirit of the discussions up to this point and hope that can 
>>>>>> continue beyond one design summit.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay 
>>>>>> <lm...@hortonworks.com>
>>>>> wrote:
>>>>>> 
>>>>>>> Hi Kai -
>>>>>>> 
>>>>>>> I think that I need to clarify something...
>>>>>>> 
>>>>>>> This is not an update for 9533 but a continuation of the 
>>>>>>> discussions
>>>>> that
>>>>>>> are focused on a fresh look at a SSO for Hadoop.
>>>>>>> We've agreed to leave our previous designs behind and therefore 
>>>>>>> we
>>>>> aren't
>>>>>>> really seeing it as an HSSO layered on top of TAS approach or an
>> HSSO vs
>>>>>>> TAS discussion.
>>>>>>> 
>>>>>>> Your latest design revision actually makes it clear that you are 
>>>>>>> now targeting exactly what was described as HSSO - so comparing 
>>>>>>> and
>>>>> contrasting
>>>>>>> is not going to add any value.
>>>>>>> 
>>>>>>> What we need you to do at this point, is to look at those 
>>>>>>> high-level components described on this thread and comment on 
>>>>>>> whether we need additional components or any that are listed 
>>>>>>> that don't seem
>> necessary
>>>>> to
>>>>>>> you and why.
>>>>>>> In other words, we need to define and agree on the work that has 
>>>>>>> to
>> be
>>>>>>> done.
>>>>>>> 
>>>>>>> We also need to determine those components that need to be done
>> before
>>>>>>> anything else can be started.
>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are 
>>>>>>> central to
>>>>> all
>>>>>>> the other components and should probably be defined and POC'd in
>> short
>>>>>>> order.
>>>>>>> 
>>>>>>> Personally, I think that continuing the separation of 9533 and
>>>>>>> 9392
>> will
>>>>>>> do this effort a disservice. There doesn't seem to be enough
>> differences
>>>>>>> between the two to justify separate jiras anymore. It may be 
>>>>>>> best to
>>>>> file a
>>>>>>> new one that reflects a single vision without the extra cruft 
>>>>>>> that
>> has
>>>>>>> built up in either of the existing ones. We would certainly 
>>>>>>> reference
>>>>> the
>>>>>>> existing ones within the new one. This approach would align with 
>>>>>>> the
>>>>> spirit
>>>>>>> of the discussions up to this point.
>>>>>>> 
>>>>>>> I am prepared to start a discussion around the shape of the two
>> Hadoop
>>>>> SSO
>>>>>>> tokens: identity and access. If this is what others feel the 
>>>>>>> next
>> topic
>>>>>>> should be.
>>>>>>> If we can identify a jira home for it, we can do it there -
>> otherwise we
>>>>>>> can create another DISCUSS thread for it.
>>>>>>> 
>>>>>>> thanks,
>>>>>>> 
>>>>>>> --larry
>>>>>>> 
>>>>>>> 
>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com>
>> wrote:
>>>>>>> 
>>>>>>>> Hi Larry,
>>>>>>>> 
>>>>>>>> Thanks for the update. Good to see that with this update we are 
>>>>>>>> now
>>>>>>> aligned on most points.
>>>>>>>> 
>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The 
>>>>>>>> new
>>>>>>> revision incorporates feedback and suggestions in related 
>>>>>>> discussion
>>>>> with
>>>>>>> the community, particularly from Microsoft and others attending 
>>>>>>> the Security design lounge session at the Hadoop summit. Summary 
>>>>>>> of the
>>>>> changes:
>>>>>>>> 1.    Revised the approach to now use two tokens, Identity Token
>> plus
>>>>>>> Access Token, particularly considering our authorization 
>>>>>>> framework
>> and
>>>>>>> compatibility with HSSO;
>>>>>>>> 2.    Introduced Authorization Server (AS) from our authorization
>>>>>>> framework into the flow that issues access tokens for clients 
>>>>>>> with
>>>>> identity
>>>>>>> tokens to access services;
>>>>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
>>>>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop
>> web
>>>>>>> services;
>>>>>>>> 5.    Added Hadoop RPC access flow regard
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> 
>>>> - Andy
>>>> 
>>>> Problems worthy of attack prove their worth by hitting back. - Piet 
>>>> Hein (via Tom White)
>>> 
>> 
>> 
> 
> 
> --
> Alejandro
> 





Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
It seems to me that we can have the best of both worlds here…it's all about the scoping.

If we were to reframe the immediate scope to the lowest common denominator of what is needed for accepting tokens in authentication plugins then we gain:

1. a very manageable scope to define and agree upon
2. a deliverable that should be useful in and of itself
3. a foundation for community collaboration that we build on for higher level solutions built on this lowest common denominator and experience as a working community

So, to Alejandro's point, perhaps we need to define what would make #2 above true - this could serve as the "what" we are building instead of the "how" to build it.
Including:
a. project structure within hadoop-common-project/common-security or the like
b. the usecases that would need to be enabled to make it a self-contained and useful contribution - without higher level solutions
c. the JIRA/s for contributing patches
d. what specific patches will be needed to accomplish the usecases in #b

In other words, an end-state for the lowest common denominator that enables code patches in the near-term is the best of both worlds.

I think this may be a good way to bootstrap the collaboration process for our emerging security community rather than trying to tackle a huge vision all at once.

@Alejandro - if you have something else in mind that would bootstrap this process - that would be great - please advise.

thoughts?
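To seed the token-shape discussion, here is a hedged strawman (in Java, Hadoop's implementation language) of what the two proposed Hadoop SSO tokens - identity and access - might carry. Every field and class name here is an assumption for illustration, not an agreed-upon format:

```java
import java.time.Instant;

// Hypothetical identity token: proves who the client is, cluster-wide.
final class IdentityToken {
    final String subject;  // authenticated principal
    final String issuer;   // SSO server vouching for the subject
    final Instant expiry;  // lifetime bound, to limit replay exposure

    IdentityToken(String subject, String issuer, Instant expiry) {
        this.subject = subject;
        this.issuer = issuer;
        this.expiry = expiry;
    }
}

// Hypothetical access token: scoped to a single service, obtained by
// exchanging a valid identity token.
final class AccessToken {
    final String subject;
    final String service;  // the one Hadoop service this token opens
    final Instant expiry;

    AccessToken(IdentityToken id, String service, Instant expiry) {
        this.subject = id.subject;
        this.service = service;
        this.expiry = expiry;
    }

    boolean isValidAt(Instant now) {
        return now.isBefore(expiry);
    }
}

public class TokenShapeDemo {
    public static void main(String[] args) {
        Instant now = Instant.parse("2013-07-10T00:00:00Z");
        IdentityToken id = new IdentityToken("alice", "sso-server",
                now.plusSeconds(3600));
        AccessToken access = new AccessToken(id, "hdfs", now.plusSeconds(600));
        System.out.println(access.subject + "@" + access.service); // alice@hdfs
    }
}
```

A real format would also need a signature and negotiated claims, but even a strawman like this makes the identity-to-access exchange flow discussable in code.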

On Jul 10, 2013, at 1:06 PM, Brian Swan <Br...@microsoft.com> wrote:

> Hi Alejandro, all-
> 
> There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of "what we are aiming for" forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything).
> 
> Thanks.
> 
> -Brian
> 
> -----Original Message-----
> From: Alejandro Abdelnur [mailto:tucu@cloudera.com] 
> Sent: Wednesday, July 10, 2013 8:15 AM
> To: Larry McCay
> Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> 
> Larry, all,
> 
> It is still not clear to me what end state we are aiming for, or whether we even agree on that.
> 
> IMO, instead of trying to agree on what to do, we should first agree on the final state, then see what needs to change to get there, then see how we make those changes.
> 
> The different documents out there focus more on how.
> 
> We should not try to say how before we know what.
> 
> Thx.
> 
> 
> 
> 
> On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lm...@hortonworks.com> wrote:
> 
>> All -
>> 
>> After combing through this thread - as well as the summit session 
>> summary thread, I think that we have the following two items that we 
>> can probably move forward with:
>> 
>> 1. TokenAuth method - assuming this means the pluggable authentication
>> mechanisms within the RPC layer (2 votes: Kai and Kyle)
>> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>> 
>> I propose that we attack both of these aspects as one. Let's provide 
>> the structure and interfaces of the pluggable framework for use in the 
>> RPC layer through leveraging Daryn's pluggability work and POC it with 
>> a particular token format (not necessarily the only format ever 
>> supported - we just need one to start). If there has already been work 
>> done in this area by anyone then please speak up and commit to 
>> providing a patch - so that we don't duplicate effort.
>> 
>> @Daryn - is there a particular Jira or set of Jiras that we can look 
>> at to discern the pluggability mechanism details? Documentation of it 
>> would be great as well.
>> @Kai - do you have existing code for the pluggable token 
>> authentication mechanism - if not, we can take a stab at representing 
>> it with interfaces and/or POC code.
>> I can standup and say that we have a token format that we have been 
>> working with already and can provide a patch that represents it as a 
>> contribution to test out the pluggable tokenAuth.
>> 
>> These patches will provide progress toward code being the central 
>> discussion vehicle. As a community, we can then incrementally build on 
>> that foundation in order to collaboratively deliver the common vision.
>> 
>> In the absence of any other home for posting such patches, let's 
>> assume that they will be attached to HADOOP-9392 - or a dedicated 
>> subtask for this particular aspect/s - I will leave that detail to Kai.
>> 
>> @Alejandro, being the only voice on this thread that isn't represented 
>> in the votes above, please feel free to agree or disagree with this direction.
>> 
>> thanks,
>> 
>> --larry
>> 
>> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:
>> 
>>> Hi Andy -
>>> 
>>>> Happy Fourth of July to you and yours.
>>> 
>>> Same to you and yours. :-)
>>> We had some fun in the sun for a change - we've had nothing but rain 
>>> on
>> the east coast lately.
>>> 
>>>> My concern here is there may have been a misinterpretation or lack 
>>>> of consensus on what is meant by "clean slate"
>>> 
>>> 
>>> Apparently so.
>>> On the pre-summit call, I stated that I was interested in 
>>> reconciling
>> the jiras so that we had one to work from.
>>> 
>>> You recommended that we set them aside for the time being - with the
>> understanding that work would continue on your side (and ours as 
>> well) - and approach the community discussion from a clean slate.
>>> We seemed to do this at the summit session quite well.
>>> It was my understanding that this community discussion would live 
>>> beyond
>> the summit and continue on this list.
>>> 
>>> While closing the summit session we agreed to follow up on 
>>> common-dev
>> with first a summary then a discussion of the moving parts.
>>> 
>>> I never expected the previous work to be abandoned and fully 
>>> expected it
>> to inform the discussion that happened here.
>>> 
>>> If you would like to reframe what clean slate was supposed to mean 
>>> or
>> describe what it means now - that would be welcome - before I waste 
>> any more time trying to facilitate a community discussion that is 
>> apparently not wanted.
>>> 
>>>> Nowhere in this
>>>> picture are self appointed "master JIRAs" and such, which have been 
>>>> disappointing to see crop up, we should be collaboratively coding 
>>>> not planting flags.
>>> 
>>> I don't know what you mean by self-appointed master JIRAs.
>>> It has certainly not been anyone's intention to disappoint.
>>> Any mention of a new JIRA was just to have a clear context to gather 
>>> the
>> agreed upon points - previous and/or existing JIRAs would easily be linked.
>>> 
>>> Planting flags... I need to go back and read my discussion point about 
>>> the
>> JIRA and see how this is the impression that was made.
>>> That is not how I define success. The only flag that counts is code.
>> What we are lacking is the roadmap on which to put the code.
>>> 
>>>> I read Kai's latest document as something approaching today's 
>>>> consensus
>> (or
>>>> at least a common point of view?) rather than a historical document.
>>>> Perhaps he and it can be given equal share of the consideration.
>>> 
>>> I definitely read it as something that has evolved into something
>> approaching what we have been talking about so far. There has not 
>> however been enough discussion anywhere near the level of detail in 
>> that document and more details are needed for each component in the design.
>>> Why the work in that document should not be fed into the community
>> discussion as anyone else's would be - I fail to understand.
>>> 
>>> My suggestion continues to be that you should take that document and
>> speak to the inventory of moving parts as we agreed.
>>> As these are agreed upon, we will ensure that the appropriate 
>>> subtasks
>> are filed against whatever JIRA is to host them - don't really care 
>> much which it is.
>>> 
>>> I don't really want to continue with two separate JIRAs - as I 
>>> stated
>> long ago - but until we understand what the pieces are and how they 
>> relate then they can't be consolidated.
>>> Even if 9533 ended up being repurposed as the server instance of the
>> work - it should be a subtask of a larger one - if that is to be 9392, 
>> so be it.
>>> We still need to define all the pieces of the larger picture before 
>>> that
>> can be done.
>>> 
>>> What I thought was the clean slate approach to the discussion seemed 
>>> a
>> very reasonable way to make all this happen.
>>> If you would like to restate what you intended by it or something 
>>> else
>> equally as reasonable as a way to move forward that would be awesome.
>>> 
>>> I will be happy to work toward the roadmap with everyone once it is
>> articulated, understood and actionable.
>>> In the meantime, I have work to do.
>>> 
>>> thanks,
>>> 
>>> --larry
>>> 
>>> BTW - I meant to quote you in an earlier response and ended up 
>>> saying it
>> was Aaron instead. Not sure what happened there. :-)
>>> 
>>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org> wrote:
>>> 
>>>> Hi Larry (and all),
>>>> 
>>>> Happy Fourth of July to you and yours.
>>>> 
>>>> In our shop Kai and Tianyou are already doing the coding, so I'd 
>>>> defer
>> to
>>>> them on the detailed points.
>>>> 
>>>> My concern here is there may have been a misinterpretation or lack 
>>>> of consensus on what is meant by "clean slate". Hopefully that can 
>>>> be
>> quickly
>>>> cleared up. Certainly we did not mean ignore all that came before. 
>>>> The
>> idea
>>>> was to reset discussions to find common ground and new direction 
>>>> where
>> we
>>>> are working together, not in conflict, on an agreed upon set of 
>>>> design points and tasks. There's been a lot of good discussion and 
>>>> design preceding that we should figure out how to port over. 
>>>> Nowhere in this picture are self appointed "master JIRAs" and such, 
>>>> which have been disappointing to see crop up, we should be 
>>>> collaboratively coding not planting flags.
>>>> 
>>>> I read Kai's latest document as something approaching today's 
>>>> consensus
>> (or
>>>> at least a common point of view?) rather than a historical document.
>>>> Perhaps he and it can be given equal share of the consideration.
>>>> 
>>>> 
>>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>>>> 
>>>>> Hey Andrew -
>>>>> 
>>>>> I largely agree with that statement.
>>>>> My intention was to let the differences be worked out within the 
>>>>> individual components once they were identified and subtasks created.
>>>>> 
>>>>> My reference to HSSO was really referring to a SSO *server* based
>> design
>>>>> which was not clearly articulated in the earlier documents.
>>>>> We aren't trying to compare and contrast one design over another
>> anymore.
>>>>> 
>>>>> Let's move this collaboration along as we've mapped out and the 
>>>>> differences in the details will reveal themselves and be addressed
>> within
>>>>> their components.
>>>>> 
>>>>> I've actually been looking forward to you weighing in on the 
>>>>> actual discussion points in this thread.
>>>>> Could you do that?
>>>>> 
>>>>> At this point, I am most interested in your thoughts on a single 
>>>>> jira
>> to
>>>>> represent all of this work and whether we should start discussing 
>>>>> the
>> SSO
>>>>> Tokens.
>>>>> If you think there are discussion points missing from that list, 
>>>>> feel
>> free
>>>>> to add to it.
>>>>> 
>>>>> thanks,
>>>>> 
>>>>> --larry
>>>>> 
>>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org>
>> wrote:
>>>>> 
>>>>>> Hi Larry,
>>>>>> 
>>>>>> Of course I'll let Kai speak for himself. However, let me point 
>>>>>> out
>> that,
>>>>>> while the differences between the competing JIRAs have been 
>>>>>> reduced
>> for
>>>>>> sure, there were some key differences that didn't just disappear.
>>>>>> Subsequent discussion will make that clear. I also disagree with 
>>>>>> your characterization that we have simply endorsed all of the 
>>>>>> design
>> decisions
>>>>>> of the so-called HSSO, this is taking a mile from an inch. We are
>> here to
>>>>>> engage in a collaborative process as peers. I've been encouraged 
>>>>>> by
>> the
>>>>>> spirit of the discussions up to this point and hope that can 
>>>>>> continue beyond one design summit.
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay 
>>>>>> <lm...@hortonworks.com>
>>>>> wrote:
>>>>>> 
>>>>>>> Hi Kai -
>>>>>>> 
>>>>>>> I think that I need to clarify something...
>>>>>>> 
>>>>>>> This is not an update for 9533 but a continuation of the 
>>>>>>> discussions
>>>>> that
>>>>>>> are focused on a fresh look at a SSO for Hadoop.
>>>>>>> We've agreed to leave our previous designs behind and therefore 
>>>>>>> we
>>>>> aren't
>>>>>>> really seeing it as an HSSO layered on top of TAS approach or an
>> HSSO vs
>>>>>>> TAS discussion.
>>>>>>> 
>>>>>>> Your latest design revision actually makes it clear that you are 
>>>>>>> now targeting exactly what was described as HSSO - so comparing 
>>>>>>> and
>>>>> contrasting
>>>>>>> is not going to add any value.
>>>>>>> 
>>>>>>> What we need you to do at this point is to look at those 
>>>>>>> high-level components described on this thread and comment on 
>>>>>>> whether we need additional components or any that are listed 
>>>>>>> that don't seem
>> necessary
>>>>> to
>>>>>>> you and why.
>>>>>>> In other words, we need to define and agree on the work that has 
>>>>>>> to
>> be
>>>>>>> done.
>>>>>>> 
>>>>>>> We also need to determine those components that need to be done
>> before
>>>>>>> anything else can be started.
>>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are 
>>>>>>> central to
>>>>> all
>>>>>>> the other components and should probably be defined and POC'd in
>> short
>>>>>>> order.
>>>>>>> 
>>>>>>> Personally, I think that continuing the separation of 9533 and 
>>>>>>> 9392
>> will
>>>>>>> do this effort a disservice. There doesn't seem to be enough
>> differences
>>>>>>> between the two to justify separate jiras anymore. It may be 
>>>>>>> best to
>>>>> file a
>>>>>>> new one that reflects a single vision without the extra cruft 
>>>>>>> that
>> has
>>>>>>> built up in either of the existing ones. We would certainly 
>>>>>>> reference
>>>>> the
>>>>>>> existing ones within the new one. This approach would align with 
>>>>>>> the
>>>>> spirit
>>>>>>> of the discussions up to this point.
>>>>>>> 
>>>>>>> I am prepared to start a discussion around the shape of the two
>> Hadoop
>>>>> SSO
>>>>>>> tokens: identity and access. If this is what others feel the 
>>>>>>> next
>> topic
>>>>>>> should be.
>>>>>>> If we can identify a jira home for it, we can do it there -
>> otherwise we
>>>>>>> can create another DISCUSS thread for it.
>>>>>>> 
>>>>>>> thanks,
>>>>>>> 
>>>>>>> --larry
>>>>>>> 
>>>>>>> 
>>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com>
>> wrote:
>>>>>>> 
>>>>>>>> Hi Larry,
>>>>>>>> 
>>>>>>>> Thanks for the update. Good to see that with this update we are 
>>>>>>>> now
>>>>>>> aligned on most points.
>>>>>>>> 
>>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The 
>>>>>>>> new
>>>>>>> revision incorporates feedback and suggestions in related 
>>>>>>> discussion
>>>>> with
>>>>>>> the community, particularly from Microsoft and others attending 
>>>>>>> the Security design lounge session at the Hadoop summit. Summary 
>>>>>>> of the
>>>>> changes:
>>>>>>>> 1.    Revised the approach to now use two tokens, Identity Token
>> plus
>>>>>>> Access Token, particularly considering our authorization 
>>>>>>> framework
>> and
>>>>>>> compatibility with HSSO;
>>>>>>>> 2.    Introduced Authorization Server (AS) from our authorization
>>>>>>> framework into the flow that issues access tokens for clients 
>>>>>>> with
>>>>> identity
>>>>>>> tokens to access services;
>>>>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
>>>>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop
>> web
>>>>>>> services;
>>>>>>>> 5.    Added Hadoop RPC access flow regard
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> 
>>>> - Andy
>>>> 
>>>> Problems worthy of attack prove their worth by hitting back. - Piet 
>>>> Hein (via Tom White)
>>> 
>> 
>> 
> 
> 
> --
> Alejandro
> 


RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Brian Swan <Br...@microsoft.com>.
Hi Alejandro, all-

There seems to be agreement on the broad stroke description of the components needed to achieve pluggable token authentication (I'm sure I'll be corrected if that isn't the case). However, discussion of the details of those components doesn't seem to be moving forward. I think this is because the details are really best understood through code. I also see *a* (i.e. one of many possible) token format and pluggable authentication mechanisms within the RPC layer as components that can have immediate benefit to Hadoop users AND still allow flexibility in the larger design. So, I think the best way to move the conversation of "what we are aiming for" forward is to start looking at code for these components. I am especially interested in moving forward with pluggable authentication mechanisms within the RPC layer and would love to see what others have done in this area (if anything).
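To make "details are best understood through code" concrete, here is one rough sketch of what a pluggable RPC-layer authentication mechanism and a minimal token format might look like. Every name in it (IdentityToken, AuthenticationProvider, SharedSecretProvider) is hypothetical, invented purely to ground the discussion; nothing here reflects actual Hadoop APIs or any design decision already made.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenAuthSketch {

  /** Minimal identity token: a principal plus an expiry time. */
  static final class IdentityToken {
    final String principal;
    final long expiresAtMillis;
    IdentityToken(String principal, long expiresAtMillis) {
      this.principal = principal;
      this.expiresAtMillis = expiresAtMillis;
    }
    boolean isExpired(long nowMillis) {
      return nowMillis >= expiresAtMillis;
    }
  }

  /** The contract a pluggable RPC-layer authentication mechanism would implement. */
  interface AuthenticationProvider {
    String mechanismName();
    IdentityToken authenticate(Map<String, String> credentials);
  }

  /** Toy provider keyed on a shared secret; a real one would wrap LDAP, Kerberos, etc. */
  static final class SharedSecretProvider implements AuthenticationProvider {
    private final Map<String, String> secrets = new HashMap<>();
    void register(String user, String secret) {
      secrets.put(user, secret);
    }
    @Override public String mechanismName() {
      return "shared-secret";
    }
    @Override public IdentityToken authenticate(Map<String, String> credentials) {
      String user = credentials.get("user");
      String secret = credentials.get("secret");
      if (user == null || secret == null || !secret.equals(secrets.get(user))) {
        throw new SecurityException("authentication failed for " + user);
      }
      // Issue a token valid for one hour; lifetime policy is one of the open questions.
      return new IdentityToken(user, System.currentTimeMillis() + 3_600_000L);
    }
  }

  public static void main(String[] args) {
    SharedSecretProvider provider = new SharedSecretProvider();
    provider.register("alice", "s3cret");
    Map<String, String> creds = new HashMap<>();
    creds.put("user", "alice");
    creds.put("secret", "s3cret");
    IdentityToken token = provider.authenticate(creds);
    System.out.println(provider.mechanismName() + ":" + token.principal);
  }
}
```

The point of the sketch is only the shape: the RPC layer would select a provider by mechanism name and hand back a token the rest of the stack can consume, independent of which mechanism produced it.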

Thanks.

-Brian

-----Original Message-----
From: Alejandro Abdelnur [mailto:tucu@cloudera.com] 
Sent: Wednesday, July 10, 2013 8:15 AM
To: Larry McCay
Cc: common-dev@hadoop.apache.org; daryn@yahoo-inc.com; Kai Zheng
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

Larry, all,

It is still not clear to me what end state we are aiming for, or whether we even agree on that.

IMO, instead of trying to agree on what to do, we should first agree on the final state, then see what needs to change to get there, and then decide how to make those changes.

The different documents out there focus more on how.

We should not try to say how before we know what.

Thx.




On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lm...@hortonworks.com> wrote:

> All -
>
> After combing through this thread - as well as the summit session 
> summary thread, I think that we have the following two items that we 
> can probably move forward with:
>
> 1. TokenAuth method - assuming this means the pluggable authentication 
> mechanisms within the RPC layer (2 votes: Kai and Kyle) 2. An actual 
> Hadoop Token format (2 votes: Brian and myself)
>
> I propose that we attack both of these aspects as one. Let's provide 
> the structure and interfaces of the pluggable framework for use in the 
> RPC layer through leveraging Daryn's pluggability work and POC it with 
> a particular token format (not necessarily the only format ever 
> supported - we just need one to start). If there has already been work 
> done in this area by anyone then please speak up and commit to 
> providing a patch - so that we don't duplicate effort.
>
> @Daryn - is there a particular Jira or set of Jiras that we can look 
> at to discern the pluggability mechanism details? Documentation of it 
> would be great as well.
> @Kai - do you have existing code for the pluggable token 
> authentication mechanism - if not, we can take a stab at representing 
> it with interfaces and/or POC code.
> I can stand up and say that we have a token format that we have been 
> working with already and can provide a patch that represents it as a 
> contribution to test out the pluggable tokenAuth.
>
> These patches will provide progress toward code being the central 
> discussion vehicle. As a community, we can then incrementally build on 
> that foundation in order to collaboratively deliver the common vision.
>
> In the absence of any other home for posting such patches, let's 
> assume that they will be attached to HADOOP-9392 - or a dedicated 
> subtask for this particular aspect/s - I will leave that detail to Kai.
>
> @Alejandro, being the only voice on this thread that isn't represented 
> in the votes above, please feel free to agree or disagree with this direction.
>
> thanks,
>
> --larry
>
> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:
>
> > Hi Andy -
> >
> >> Happy Fourth of July to you and yours.
> >
> > Same to you and yours. :-)
> > We had some fun in the sun for a change - we've had nothing but rain 
> > on
> the east coast lately.
> >
> >> My concern here is there may have been a misinterpretation or lack 
> >> of consensus on what is meant by "clean slate"
> >
> >
> > Apparently so.
> > On the pre-summit call, I stated that I was interested in 
> > reconciling
> the jiras so that we had one to work from.
> >
> > You recommended that we set them aside for the time being - with the
> understanding that work would continue on your side (and ours as 
> well) - and approach the community discussion from a clean slate.
> > We seemed to do this at the summit session quite well.
> > It was my understanding that this community discussion would live 
> > beyond
> the summit and continue on this list.
> >
> > While closing the summit session we agreed to follow up on 
> > common-dev
> with first a summary then a discussion of the moving parts.
> >
> > I never expected the previous work to be abandoned and fully 
> > expected it
> to inform the discussion that happened here.
> >
> > If you would like to reframe what clean slate was supposed to mean 
> > or
> describe what it means now - that would be welcome - before I waste 
> any more time trying to facilitate a community discussion that is 
> apparently not wanted.
> >
> >> Nowhere in this
> >> picture are self appointed "master JIRAs" and such, which have been 
> >> disappointing to see crop up, we should be collaboratively coding 
> >> not planting flags.
> >
> > I don't know what you mean by self-appointed master JIRAs.
> > It has certainly not been anyone's intention to disappoint.
> > Any mention of a new JIRA was just to have a clear context to gather 
> > the
> agreed upon points - previous and/or existing JIRAs would easily be linked.
> >
> > Planting flags... I need to go back and read my discussion point about 
> > the
> JIRA and see how this is the impression that was made.
> > That is not how I define success. The only flags that count are code.
> What we are lacking is the roadmap on which to put the code.
> >
> >> I read Kai's latest document as something approaching today's 
> >> consensus
> (or
> >> at least a common point of view?) rather than a historical document.
> >> Perhaps he and it can be given equal share of the consideration.
> >
> > I definitely read it as something that has evolved into something
> approaching what we have been talking about so far. There has not 
> however been enough discussion anywhere near the level of detail in 
> that document and more details are needed for each component in the design.
> > Why the work in that document should not be fed into the community
> discussion as anyone else's would be - I fail to understand.
> >
> > My suggestion continues to be that you should take that document and
> speak to the inventory of moving parts as we agreed.
> > As these are agreed upon, we will ensure that the appropriate 
> > subtasks
> are filed against whatever JIRA is to host them - don't really care 
> much which it is.
> >
> > I don't really want to continue with two separate JIRAs - as I 
> > stated
> long ago - but until we understand what the pieces are and how they 
> relate then they can't be consolidated.
> > Even if 9533 ended up being repurposed as the server instance of the
> work - it should be a subtask of a larger one - if that is to be 9392, 
> so be it.
> > We still need to define all the pieces of the larger picture before 
> > that
> can be done.
> >
> > What I thought was the clean slate approach to the discussion seemed 
> > a
> very reasonable way to make all this happen.
> > If you would like to restate what you intended by it or something 
> > else
> equally as reasonable as a way to move forward that would be awesome.
> >
> > I will be happy to work toward the roadmap with everyone once it is
> articulated, understood and actionable.
> > In the meantime, I have work to do.
> >
> > thanks,
> >
> > --larry
> >
> > BTW - I meant to quote you in an earlier response and ended up 
> > saying it
> was Aaron instead. Not sure what happened there. :-)
> >
>
>


--
Alejandro


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Larry, all,

It is still not clear to me what end state we are aiming for, or whether
we even agree on that.

IMO, instead of trying to agree on what to do, we should first agree on the
final state, then see what needs to change to get there, and then decide how
to make those changes.

The different documents out there focus more on how.

We should not try to say how before we know what.

Thx.




On Wed, Jul 10, 2013 at 6:42 AM, Larry McCay <lm...@hortonworks.com> wrote:

> All -
>
> After combing through this thread - as well as the summit session summary
> thread, I think that we have the following two items that we can probably
> move forward with:
>
> 1. TokenAuth method - assuming this means the pluggable authentication
> mechanisms within the RPC layer (2 votes: Kai and Kyle)
> 2. An actual Hadoop Token format (2 votes: Brian and myself)
>
> I propose that we attack both of these aspects as one. Let's provide the
> structure and interfaces of the pluggable framework for use in the RPC
> layer through leveraging Daryn's pluggability work and POC it with a
> particular token format (not necessarily the only format ever supported -
> we just need one to start). If there has already been work done in this
> area by anyone then please speak up and commit to providing a patch - so
> that we don't duplicate effort.
>
> @Daryn - is there a particular Jira or set of Jiras that we can look at to
> discern the pluggability mechanism details? Documentation of it would be
> great as well.
> @Kai - do you have existing code for the pluggable token authentication
> mechanism - if not, we can take a stab at representing it with interfaces
> and/or POC code.
> I can standup and say that we have a token format that we have been
> working with already and can provide a patch that represents it as a
> contribution to test out the pluggable tokenAuth.
>
> These patches will provide progress toward code being the central
> discussion vehicle. As a community, we can then incrementally build on that
> foundation in order to collaboratively deliver the common vision.
>
> In the absence of any other home for posting such patches, let's assume
> that they will be attached to HADOOP-9392 - or a dedicated subtask for this
> particular aspect/s - I will leave that detail to Kai.
>
> @Alejandro, being the only voice on this thread that isn't represented in
> the votes above, please feel free to agree or disagree with this direction.
>
> thanks,
>
> --larry
>
> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:
>
> > Hi Andy -
> >
> >> Happy Fourth of July to you and yours.
> >
> > Same to you and yours. :-)
> > We had some fun in the sun for a change - we've had nothing but rain on
> the east coast lately.
> >
> >> My concern here is there may have been a misinterpretation or lack of
> >> consensus on what is meant by "clean slate"
> >
> >
> > Apparently so.
> > On the pre-summit call, I stated that I was interested in reconciling
> the jiras so that we had one to work from.
> >
> > You recommended that we set them aside for the time being - with the
> understanding that work would continue on your side (and our's as well) -
> and approach the community discussion from a clean slate.
> > We seemed to do this at the summit session quite well.
> > It was my understanding that this community discussion would live beyond
> the summit and continue on this list.
> >
> > While closing the summit session we agreed to follow up on common-dev
> with first a summary then a discussion of the moving parts.
> >
> > I never expected the previous work to be abandoned and fully expected it
> to inform the discussion that happened here.
> >
> > If you would like to reframe what clean slate was supposed to mean or
> describe what it means now - that would be welcome - before I waste anymore
> time trying to facilitate a community discussion that is apparently not
> wanted.
> >
> >> Nowhere in this
> >> picture are self appointed "master JIRAs" and such, which have been
> >> disappointing to see crop up, we should be collaboratively coding not
> >> planting flags.
> >
> > I don't know what you mean by self-appointed master JIRAs.
> > It has certainly not been anyone's intention to disappoint.
> > Any mention of a new JIRA was just to have a clear context to gather the
> agreed upon points - previous and/or existing JIRAs would easily be linked.
> >
> > Planting flags… I need to go back and read my discussion point about the
> JIRA and see how this is the impression that was made.
> > That is not how I define success. The only flags that count is code.
> What we are lacking is the roadmap on which to put the code.
> >
> >> I read Kai's latest document as something approaching today's consensus
> (or
> >> at least a common point of view?) rather than a historical document.
> >> Perhaps he and it can be given equal share of the consideration.
> >
> > I definitely read it as something that has evolved into something
> approaching what we have been talking about so far. There has not however
> been enough discussion anywhere near the level of detail in that document
> and more details are needed for each component in the design.
> > Why the work in that document should not be fed into the community
> discussion as anyone else's would be - I fail to understand.
> >
> > My suggestion continues to be that you should take that document and
> speak to the inventory of moving parts as we agreed.
> > As these are agreed upon, we will ensure that the appropriate subtasks
> are filed against whatever JIRA is to host them - don't really care much
> which it is.
> >
> > I don't really want to continue with two separate JIRAs - as I stated
> long ago - but until we understand what the pieces are and how they relate
> then they can't be consolidated.
> > Even if 9533 ended up being repurposed as the server instance of the
> work - it should be a subtask of a larger one - if that is to be 9392, so
> be it.
> > We still need to define all the pieces of the larger picture before that
> can be done.
> >
> > What I thought was the clean slate approach to the discussion seemed a
> very reasonable way to make all this happen.
> > If you would like to restate what you intended by it or something else
> equally as reasonable as a way to move forward that would be awesome.
> >
> > I will be happy to work toward the roadmap with everyone once it is
> articulated, understood and actionable.
> > In the meantime, I have work to do.
> >
> > thanks,
> >
> > --larry
> >
> > BTW - I meant to quote you in an earlier response and ended up saying it
> was Aaron instead. Not sure what happened there. :-)
> >
> > On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org> wrote:
> >
> >> Hi Larry (and all),
> >>
> >> Happy Fourth of July to you and yours.
> >>
> >> In our shop Kai and Tianyou are already doing the coding, so I'd defer
> to
> >> them on the detailed points.
> >>
> >> My concern here is there may have been a misinterpretation or lack of
> >> consensus on what is meant by "clean slate". Hopefully that can be
> quickly
> >> cleared up. Certainly we did not mean ignore all that came before. The
> idea
> >> was to reset discussions to find common ground and new direction where
> we
> >> are working together, not in conflict, on an agreed upon set of design
> >> points and tasks. There's been a lot of good discussion and design
> >> preceeding that we should figure out how to port over. Nowhere in this
> >> picture are self appointed "master JIRAs" and such, which have been
> >> disappointing to see crop up, we should be collaboratively coding not
> >> planting flags.
> >>
> >> I read Kai's latest document as something approaching today's consensus
> (or
> >> at least a common point of view?) rather than a historical document.
> >> Perhaps he and it can be given equal share of the consideration.
> >>
> >>
> >> On Wednesday, July 3, 2013, Larry McCay wrote:
> >>
> >>> Hey Andrew -
> >>>
> >>> I largely agree with that statement.
> >>> My intention was to let the differences be worked out within the
> >>> individual components once they were identified and subtasks created.
> >>>
> >>> My reference to HSSO was really referring to a SSO *server* based
> design
> >>> which was not clearly articulated in the earlier documents.
> >>> We aren't trying to compare and contrast one design over another
> anymore.
> >>>
> >>> Let's move this collaboration along as we've mapped out and the
> >>> differences in the details will reveal themselves and be addressed
> within
> >>> their components.
> >>>
> >>> I've actually been looking forward to you weighing in on the actual
> >>> discussion points in this thread.
> >>> Could you do that?
> >>>
> >>> At this point, I am most interested in your thoughts on a single jira
> to
> >>> represent all of this work and whether we should start discussing the
> SSO
> >>> Tokens.
> >>> If you think there are discussion points missing from that list, feel
> free
> >>> to add to it.
> >>>
> >>> thanks,
> >>>
> >>> --larry
> >>>
> >>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org>
> wrote:
> >>>
> >>>> Hi Larry,
> >>>>
> >>>> Of course I'll let Kai speak for himself. However, let me point out
> that,
> >>>> while the differences between the competing JIRAs have been reduced
> for
> >>>> sure, there were some key differences that didn't just disappear.
> >>>> Subsequent discussion will make that clear. I also disagree with your
> >>>> characterization that we have simply endorsed all of the design
> decisions
> >>>> of the so-called HSSO, this is taking a mile from an inch. We are
> here to
> >>>> engage in a collaborative process as peers. I've been encouraged by
> the
> >>>> spirit of the discussions up to this point and hope that can continue
> >>>> beyond one design summit.
> >>>>
> >>>>
> >>>>
> >>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com>
> >>> wrote:
> >>>>
> >>>>> Hi Kai -
> >>>>>
> >>>>> I think that I need to clarify something…
> >>>>>
> >>>>> This is not an update for 9533 but a continuation of the discussions
> >>> that
> >>>>> are focused on a fresh look at a SSO for Hadoop.
> >>>>> We've agreed to leave our previous designs behind and therefore we
> >>> aren't
> >>>>> really seeing it as an HSSO layered on top of TAS approach or an
> HSSO vs
> >>>>> TAS discussion.
> >>>>>
> >>>>> Your latest design revision actually makes it clear that you are now
> >>>>> targeting exactly what was described as HSSO - so comparing and
> >>> contrasting
> >>>>> is not going to add any value.
> >>>>>
> >>>>> What we need you to do at this point, is to look at those high-level
> >>>>> components described on this thread and comment on whether we need
> >>>>> additional components or any that are listed that don't seem
> necessary
> >>> to
> >>>>> you and why.
> >>>>> In other words, we need to define and agree on the work that has to
> be
> >>>>> done.
> >>>>>
> >>>>> We also need to determine those components that need to be done
> before
> >>>>> anything else can be started.
> >>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to
> >>> all
> >>>>> the other components and should probably be defined and POC'd in
> short
> >>>>> order.
> >>>>>
> >>>>> Personally, I think that continuing the separation of 9533 and 9392
> will
> >>>>> do this effort a disservice. There don't seem to be enough
> differences
> >>>>> between the two to justify separate jiras anymore. It may be best to
> >>> file a
> >>>>> new one that reflects a single vision without the extra cruft that
> has
> >>>>> built up in either of the existing ones. We would certainly reference
> >>> the
> >>>>> existing ones within the new one. This approach would align with the
> >>> spirit
> >>>>> of the discussions up to this point.
> >>>>>
> >>>>> I am prepared to start a discussion around the shape of the two
> Hadoop
> >>> SSO
> >>>>> tokens: identity and access, if this is what others feel the next
> topic
> >>>>> should be.
> >>>>> If we can identify a jira home for it, we can do it there -
> otherwise we
> >>>>> can create another DISCUSS thread for it.
> >>>>>
> >>>>> thanks,
> >>>>>
> >>>>> --larry
> >>>>>
> >>>>>
> >>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com>
> wrote:
> >>>>>
> >>>>>> Hi Larry,
> >>>>>>
> >>>>>> Thanks for the update. Good to see that with this update we are now
> >>>>> aligned on most points.
> >>>>>>
> >>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new
> >>>>> revision incorporates feedback and suggestions in related discussion
> >>> with
> >>>>> the community, particularly from Microsoft and others attending the
> >>>>> Security design lounge session at the Hadoop summit. Summary of the
> >>> changes:
> >>>>>> 1.    Revised the approach to now use two tokens, Identity Token
> plus
> >>>>> Access Token, particularly considering our authorization framework
> and
> >>>>> compatibility with HSSO;
> >>>>>> 2.    Introduced Authorization Server (AS) from our authorization
> >>>>> framework into the flow that issues access tokens for clients with
> >>> identity
> >>>>> tokens to access services;
> >>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
> >>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop
> web
> >>>>> services;
> >>>>>> 5.    Added Hadoop RPC access flow regard
> >>
> >>
> >>
> >> --
> >> Best regards,
> >>
> >>  - Andy
> >>
> >> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> >> (via Tom White)
> >
>
>


-- 
Alejandro

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Daryn Sharp <da...@yahoo-inc.com>.
Sorry for falling out of the loop.  I'm catching up on the jiras and discussion, and will comment this afternoon.

Daryn

On Jul 10, 2013, at 8:42 AM, Larry McCay wrote:

> All -
> 
> After combing through this thread - as well as the summit session summary thread - I think that we have the following two items that we can probably move forward with:
> 
> 1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
> 2. An actual Hadoop Token format (2 votes: Brian and myself)
> 
> I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort. 
> 
> @Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well.
> @Kai - do you have existing code for the pluggable token authentication mechanism - if not, we can take a stab at representing it with interfaces and/or POC code.
> I can stand up and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth.
> 
> These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision.
> 
> In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask for this particular aspect(s) - I will leave that detail to Kai.
> 
> @Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction.
> 
> thanks,
> 
> --larry
> 
> On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:
> 
>> Hi Andy -
>> 
>>> Happy Fourth of July to you and yours.
>> 
>> Same to you and yours. :-)
>> We had some fun in the sun for a change - we've had nothing but rain on the east coast lately.
>> 
>>> My concern here is there may have been a misinterpretation or lack of
>>> consensus on what is meant by "clean slate"
>> 
>> 
>> Apparently so.
>> On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from.
>> 
>> You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and ours as well) - and approach the community discussion from a clean slate.
>> We seemed to do this at the summit session quite well.
>> It was my understanding that this community discussion would live beyond the summit and continue on this list.
>> 
>> While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts.
>> 
>> I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here.
>> 
>> If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste any more time trying to facilitate a community discussion that is apparently not wanted.
>> 
>>> Nowhere in this
>>> picture are self appointed "master JIRAs" and such, which have been
>>> disappointing to see crop up, we should be collaboratively coding not
>>> planting flags.
>> 
>> I don't know what you mean by self-appointed master JIRAs.
>> It has certainly not been anyone's intention to disappoint.
>> Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked.
>> 
>> Planting flags… I need to go back and read my discussion point about the JIRA and see how it gave that impression.
>> That is not how I define success. The only flag that counts is code. What we are lacking is the roadmap on which to put the code.
>> 
>>> I read Kai's latest document as something approaching today's consensus (or
>>> at least a common point of view?) rather than a historical document.
>>> Perhaps he and it can be given equal share of the consideration.
>> 
>> I definitely read it as something that has evolved toward what we have been talking about so far. There has not, however, been enough discussion anywhere near the level of detail in that document, and more details are needed for each component in the design.
>> Why the work in that document should not be fed into the community discussion as anyone else's would be - I fail to understand.
>> 
>> My suggestion continues to be that you should take that document and speak to the inventory of moving parts as we agreed.
>> As these are agreed upon, we will ensure that the appropriate subtasks are filed against whatever JIRA is to host them - don't really care much which it is.
>> 
>> I don't really want to continue with two separate JIRAs - as I stated long ago - but until we understand what the pieces are and how they relate, they can't be consolidated.
>> Even if 9533 ended up being repurposed as the server instance of the work - it should be a subtask of a larger one - if that is to be 9392, so be it.
>> We still need to define all the pieces of the larger picture before that can be done.
>> 
>> What I thought was the clean slate approach to the discussion seemed a very reasonable way to make all this happen.
>> If you would like to restate what you intended by it or something else equally as reasonable as a way to move forward that would be awesome.
>> 
>> I will be happy to work toward the roadmap with everyone once it is articulated, understood and actionable.
>> In the meantime, I have work to do.
>> 
>> thanks,
>> 
>> --larry
>> 
>> BTW - I meant to quote you in an earlier response and ended up saying it was Aaron instead. Not sure what happened there. :-) 
>> 
>> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org> wrote:
>> 
>>> Hi Larry (and all),
>>> 
>>> Happy Fourth of July to you and yours.
>>> 
>>> In our shop Kai and Tianyou are already doing the coding, so I'd defer to
>>> them on the detailed points.
>>> 
>>> My concern here is there may have been a misinterpretation or lack of
>>> consensus on what is meant by "clean slate". Hopefully that can be quickly
>>> cleared up. Certainly we did not mean ignore all that came before. The idea
>>> was to reset discussions to find common ground and new direction where we
>>> are working together, not in conflict, on an agreed upon set of design
>>> points and tasks. There's been a lot of good discussion and design
>>> preceding that we should figure out how to port over. Nowhere in this
>>> picture are self appointed "master JIRAs" and such, which have been
>>> disappointing to see crop up, we should be collaboratively coding not
>>> planting flags.
>>> 
>>> I read Kai's latest document as something approaching today's consensus (or
>>> at least a common point of view?) rather than a historical document.
>>> Perhaps he and it can be given equal share of the consideration.
>>> 
>>> 
>>> On Wednesday, July 3, 2013, Larry McCay wrote:
>>> 
>>>> Hey Andrew -
>>>> 
>>>> I largely agree with that statement.
>>>> My intention was to let the differences be worked out within the
>>>> individual components once they were identified and subtasks created.
>>>> 
>>>> My reference to HSSO was really referring to a SSO *server* based design
>>>> which was not clearly articulated in the earlier documents.
>>>> We aren't trying to compare and contrast one design over another anymore.
>>>> 
>>>> Let's move this collaboration along as we've mapped out and the
>>>> differences in the details will reveal themselves and be addressed within
>>>> their components.
>>>> 
>>>> I've actually been looking forward to you weighing in on the actual
>>>> discussion points in this thread.
>>>> Could you do that?
>>>> 
>>>> At this point, I am most interested in your thoughts on a single jira to
>>>> represent all of this work and whether we should start discussing the SSO
>>>> Tokens.
>>>> If you think there are discussion points missing from that list, feel free
>>>> to add to it.
>>>> 
>>>> thanks,
>>>> 
>>>> --larry
>>>> 
>>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:
>>>> 
>>>>> Hi Larry,
>>>>> 
>>>>> Of course I'll let Kai speak for himself. However, let me point out that,
>>>>> while the differences between the competing JIRAs have been reduced for
>>>>> sure, there were some key differences that didn't just disappear.
>>>>> Subsequent discussion will make that clear. I also disagree with your
>>>>> characterization that we have simply endorsed all of the design decisions
>>>>> of the so-called HSSO, this is taking a mile from an inch. We are here to
>>>>> engage in a collaborative process as peers. I've been encouraged by the
>>>>> spirit of the discussions up to this point and hope that can continue
>>>>> beyond one design summit.
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com>
>>>> wrote:
>>>>> 
>>>>>> Hi Kai -
>>>>>> 
>>>>>> I think that I need to clarify something…
>>>>>> 
>>>>>> This is not an update for 9533 but a continuation of the discussions
>>>> that
>>>>>> are focused on a fresh look at a SSO for Hadoop.
>>>>>> We've agreed to leave our previous designs behind and therefore we
>>>> aren't
>>>>>> really seeing it as an HSSO layered on top of TAS approach or an HSSO vs
>>>>>> TAS discussion.
>>>>>> 
>>>>>> Your latest design revision actually makes it clear that you are now
>>>>>> targeting exactly what was described as HSSO - so comparing and
>>>> contrasting
>>>>>> is not going to add any value.
>>>>>> 
>>>>>> What we need you to do at this point, is to look at those high-level
>>>>>> components described on this thread and comment on whether we need
>>>>>> additional components or any that are listed that don't seem necessary
>>>> to
>>>>>> you and why.
>>>>>> In other words, we need to define and agree on the work that has to be
>>>>>> done.
>>>>>> 
>>>>>> We also need to determine those components that need to be done before
>>>>>> anything else can be started.
>>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to
>>>> all
>>>>>> the other components and should probably be defined and POC'd in short
>>>>>> order.
>>>>>> 
>>>>>> Personally, I think that continuing the separation of 9533 and 9392 will
>>>>>> do this effort a disservice. There don't seem to be enough differences
>>>>>> between the two to justify separate jiras anymore. It may be best to
>>>> file a
>>>>>> new one that reflects a single vision without the extra cruft that has
>>>>>> built up in either of the existing ones. We would certainly reference
>>>> the
>>>>>> existing ones within the new one. This approach would align with the
>>>> spirit
>>>>>> of the discussions up to this point.
>>>>>> 
>>>>>> I am prepared to start a discussion around the shape of the two Hadoop
>>>> SSO
>>>>>> tokens: identity and access, if this is what others feel the next topic
>>>>>> should be.
>>>>>> If we can identify a jira home for it, we can do it there - otherwise we
>>>>>> can create another DISCUSS thread for it.
>>>>>> 
>>>>>> thanks,
>>>>>> 
>>>>>> --larry
>>>>>> 
>>>>>> 
>>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
>>>>>> 
>>>>>>> Hi Larry,
>>>>>>> 
>>>>>>> Thanks for the update. Good to see that with this update we are now
>>>>>> aligned on most points.
>>>>>>> 
>>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new
>>>>>> revision incorporates feedback and suggestions in related discussion
>>>> with
>>>>>> the community, particularly from Microsoft and others attending the
>>>>>> Security design lounge session at the Hadoop summit. Summary of the
>>>> changes:
>>>>>>> 1.    Revised the approach to now use two tokens, Identity Token plus
>>>>>> Access Token, particularly considering our authorization framework and
>>>>>> compatibility with HSSO;
>>>>>>> 2.    Introduced Authorization Server (AS) from our authorization
>>>>>> framework into the flow that issues access tokens for clients with
>>>> identity
>>>>>> tokens to access services;
>>>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
>>>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
>>>>>> services;
>>>>>>> 5.    Added Hadoop RPC access flow regard
>>> 
>>> 
>>> 
>>> -- 
>>> Best regards,
>>> 
>>> - Andy
>>> 
>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>> (via Tom White)
>> 
> 


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
All -

After combing through this thread - as well as the summit session summary thread - I think that we have the following two items that we can probably move forward with:

1. TokenAuth method - assuming this means the pluggable authentication mechanisms within the RPC layer (2 votes: Kai and Kyle)
2. An actual Hadoop Token format (2 votes: Brian and myself)

I propose that we attack both of these aspects as one. Let's provide the structure and interfaces of the pluggable framework for use in the RPC layer through leveraging Daryn's pluggability work and POC it with a particular token format (not necessarily the only format ever supported - we just need one to start). If there has already been work done in this area by anyone then please speak up and commit to providing a patch - so that we don't duplicate effort. 
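
To make that concrete, here is a rough sketch of the kind of mechanism interface and registry I have in mind for the RPC layer. All of the names here (AuthenticationMethod, AuthMethodRegistry) are purely illustrative - they are not taken from Daryn's patches or any existing Hadoop API:

```java
// Illustrative sketch only: hypothetical shape of a pluggable authentication
// mechanism for the RPC layer, not an existing Hadoop interface.
import java.util.HashMap;
import java.util.Map;

interface AuthenticationMethod {
    // SASL-style mechanism name that client and server negotiate, e.g. "TOKEN".
    String mechanismName();

    // Validate the presented credentials; a real implementation would return
    // an authenticated principal rather than a boolean.
    boolean authenticate(byte[] credentials);
}

// Registry the RPC server could consult when negotiating a mechanism.
final class AuthMethodRegistry {
    private final Map<String, AuthenticationMethod> methods = new HashMap<>();

    void register(AuthenticationMethod method) {
        methods.put(method.mechanismName(), method);
    }

    AuthenticationMethod lookup(String name) {
        AuthenticationMethod method = methods.get(name);
        if (method == null) {
            throw new IllegalArgumentException("Unsupported mechanism: " + name);
        }
        return method;
    }
}
```

The point of a registry like this is that Kerberos, a token mechanism, or anything else would register under its own name and the rest of the RPC code would stay mechanism-agnostic.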

@Daryn - is there a particular Jira or set of Jiras that we can look at to discern the pluggability mechanism details? Documentation of it would be great as well.
@Kai - do you have existing code for the pluggable token authentication mechanism - if not, we can take a stab at representing it with interfaces and/or POC code.
I can stand up and say that we have a token format that we have been working with already and can provide a patch that represents it as a contribution to test out the pluggable tokenAuth.
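
For discussion's sake, a strawman of the sort of token format being offered could look like the following. The field names and the base64 wire encoding are assumptions for illustration - a real token would at minimum also carry a signature from the issuing SSO server so services can verify it without a callback:

```java
// Strawman token format for discussion only - the fields and encoding are
// assumptions, not the format of the actual patch being offered.
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class IdentityToken {
    private final String subject;   // authenticated principal
    private final long issuedAt;    // epoch millis
    private final long expiresAt;   // epoch millis

    private IdentityToken(String subject, long issuedAt, long expiresAt) {
        this.subject = subject;
        this.issuedAt = issuedAt;
        this.expiresAt = expiresAt;
    }

    static IdentityToken issue(String subject, long ttlMillis) {
        long now = System.currentTimeMillis();
        return new IdentityToken(subject, now, now + ttlMillis);
    }

    String subject() { return subject; }

    boolean isExpired(long nowMillis) { return nowMillis >= expiresAt; }

    // Serialize as base64("subject|issuedAt|expiresAt"); a signed real-world
    // token would append a signature over these bytes.
    String serialize() {
        String raw = subject + "|" + issuedAt + "|" + expiresAt;
        return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    static IdentityToken parse(String encoded) {
        String raw = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        String[] parts = raw.split("\\|");
        return new IdentityToken(parts[0], Long.parseLong(parts[1]), Long.parseLong(parts[2]));
    }
}
```

Even a minimal strawman like this gives the pluggable tokenAuth work something concrete to negotiate and validate while the real format is hashed out.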

These patches will provide progress toward code being the central discussion vehicle. As a community, we can then incrementally build on that foundation in order to collaboratively deliver the common vision.

In the absence of any other home for posting such patches, let's assume that they will be attached to HADOOP-9392 - or a dedicated subtask for this particular aspect(s) - I will leave that detail to Kai.

@Alejandro, being the only voice on this thread that isn't represented in the votes above, please feel free to agree or disagree with this direction.

thanks,

--larry

On Jul 5, 2013, at 3:24 PM, Larry McCay <lm...@hortonworks.com> wrote:

> Hi Andy -
> 
>> Happy Fourth of July to you and yours.
> 
> Same to you and yours. :-)
> We had some fun in the sun for a change - we've had nothing but rain on the east coast lately.
> 
>> My concern here is there may have been a misinterpretation or lack of
>> consensus on what is meant by "clean slate"
> 
> 
> Apparently so.
> On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from.
> 
> You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and ours as well) - and approach the community discussion from a clean slate.
> We seemed to do this at the summit session quite well.
> It was my understanding that this community discussion would live beyond the summit and continue on this list.
> 
> While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts.
> 
> I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here.
> 
> If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste any more time trying to facilitate a community discussion that is apparently not wanted.
> 
>> Nowhere in this
>> picture are self appointed "master JIRAs" and such, which have been
>> disappointing to see crop up, we should be collaboratively coding not
>> planting flags.
> 
> I don't know what you mean by self-appointed master JIRAs.
> It has certainly not been anyone's intention to disappoint.
> Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked.
> 
> Planting flags… I need to go back and read my discussion point about the JIRA and see how it gave that impression.
> That is not how I define success. The only flag that counts is code. What we are lacking is the roadmap on which to put the code.
> 
>> I read Kai's latest document as something approaching today's consensus (or
>> at least a common point of view?) rather than a historical document.
>> Perhaps he and it can be given equal share of the consideration.
> 
> I definitely read it as something that has evolved toward what we have been talking about so far. There has not, however, been enough discussion anywhere near the level of detail in that document, and more details are needed for each component in the design.
> Why the work in that document should not be fed into the community discussion as anyone else's would be - I fail to understand.
> 
> My suggestion continues to be that you should take that document and speak to the inventory of moving parts as we agreed.
> As these are agreed upon, we will ensure that the appropriate subtasks are filed against whatever JIRA is to host them - don't really care much which it is.
> 
> I don't really want to continue with two separate JIRAs - as I stated long ago - but until we understand what the pieces are and how they relate, they can't be consolidated.
> Even if 9533 ended up being repurposed as the server instance of the work - it should be a subtask of a larger one - if that is to be 9392, so be it.
> We still need to define all the pieces of the larger picture before that can be done.
> 
> What I thought was the clean slate approach to the discussion seemed a very reasonable way to make all this happen.
> If you would like to restate what you intended by it or something else equally as reasonable as a way to move forward that would be awesome.
> 
> I will be happy to work toward the roadmap with everyone once it is articulated, understood and actionable.
> In the meantime, I have work to do.
> 
> thanks,
> 
> --larry
> 
> BTW - I meant to quote you in an earlier response and ended up saying it was Aaron instead. Not sure what happened there. :-) 
> 
> On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org> wrote:
> 
>> Hi Larry (and all),
>> 
>> Happy Fourth of July to you and yours.
>> 
>> In our shop Kai and Tianyou are already doing the coding, so I'd defer to
>> them on the detailed points.
>> 
>> My concern here is there may have been a misinterpretation or lack of
>> consensus on what is meant by "clean slate". Hopefully that can be quickly
>> cleared up. Certainly we did not mean ignore all that came before. The idea
>> was to reset discussions to find common ground and new direction where we
>> are working together, not in conflict, on an agreed upon set of design
>> points and tasks. There's been a lot of good discussion and design
>> preceding that we should figure out how to port over. Nowhere in this
>> picture are self appointed "master JIRAs" and such, which have been
>> disappointing to see crop up, we should be collaboratively coding not
>> planting flags.
>> 
>> I read Kai's latest document as something approaching today's consensus (or
>> at least a common point of view?) rather than a historical document.
>> Perhaps he and it can be given equal share of the consideration.
>> 
>> 
>> On Wednesday, July 3, 2013, Larry McCay wrote:
>> 
>>> Hey Andrew -
>>> 
>>> I largely agree with that statement.
>>> My intention was to let the differences be worked out within the
>>> individual components once they were identified and subtasks created.
>>> 
>>> My reference to HSSO was really referring to a SSO *server* based design
>>> which was not clearly articulated in the earlier documents.
>>> We aren't trying to compare and contrast one design over another anymore.
>>> 
>>> Let's move this collaboration along as we've mapped out and the
>>> differences in the details will reveal themselves and be addressed within
>>> their components.
>>> 
>>> I've actually been looking forward to you weighing in on the actual
>>> discussion points in this thread.
>>> Could you do that?
>>> 
>>> At this point, I am most interested in your thoughts on a single jira to
>>> represent all of this work and whether we should start discussing the SSO
>>> Tokens.
>>> If you think there are discussion points missing from that list, feel free
>>> to add to it.
>>> 
>>> thanks,
>>> 
>>> --larry
>>> 
>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:
>>> 
>>>> Hi Larry,
>>>> 
>>>> Of course I'll let Kai speak for himself. However, let me point out that,
>>>> while the differences between the competing JIRAs have been reduced for
>>>> sure, there were some key differences that didn't just disappear.
>>>> Subsequent discussion will make that clear. I also disagree with your
>>>> characterization that we have simply endorsed all of the design decisions
>>>> of the so-called HSSO, this is taking a mile from an inch. We are here to
>>>> engage in a collaborative process as peers. I've been encouraged by the
>>>> spirit of the discussions up to this point and hope that can continue
>>>> beyond one design summit.
>>>> 
>>>> 
>>>> 
>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com>
>>> wrote:
>>>> 
>>>>> Hi Kai -
>>>>> 
>>>>> I think that I need to clarify something…
>>>>> 
>>>>> This is not an update for 9533 but a continuation of the discussions
>>> that
>>>>> are focused on a fresh look at a SSO for Hadoop.
>>>>> We've agreed to leave our previous designs behind and therefore we
>>> aren't
>>>>> really seeing it as an HSSO layered on top of TAS approach or an HSSO vs
>>>>> TAS discussion.
>>>>> 
>>>>> Your latest design revision actually makes it clear that you are now
>>>>> targeting exactly what was described as HSSO - so comparing and
>>> contrasting
>>>>> is not going to add any value.
>>>>> 
>>>>> What we need you to do at this point, is to look at those high-level
>>>>> components described on this thread and comment on whether we need
>>>>> additional components or any that are listed that don't seem necessary
>>> to
>>>>> you and why.
>>>>> In other words, we need to define and agree on the work that has to be
>>>>> done.
>>>>> 
>>>>> We also need to determine those components that need to be done before
>>>>> anything else can be started.
>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to
>>> all
>>>>> the other components and should probably be defined and POC'd in short
>>>>> order.
>>>>> 
>>>>> Personally, I think that continuing the separation of 9533 and 9392 will
>>>>> do this effort a disservice. There don't seem to be enough differences
>>>>> between the two to justify separate jiras anymore. It may be best to
>>> file a
>>>>> new one that reflects a single vision without the extra cruft that has
>>>>> built up in either of the existing ones. We would certainly reference
>>> the
>>>>> existing ones within the new one. This approach would align with the
>>> spirit
>>>>> of the discussions up to this point.
>>>>> 
>>>>> I am prepared to start a discussion around the shape of the two Hadoop
>>> SSO
>>>>> tokens: identity and access, if this is what others feel the next topic
>>>>> should be.
>>>>> If we can identify a jira home for it, we can do it there - otherwise we
>>>>> can create another DISCUSS thread for it.
>>>>> 
>>>>> thanks,
>>>>> 
>>>>> --larry
>>>>> 
>>>>> 
>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
>>>>> 
>>>>>> Hi Larry,
>>>>>> 
>>>>>> Thanks for the update. Good to see that with this update we are now
>>>>> aligned on most points.
>>>>>> 
>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new
>>>>> revision incorporates feedback and suggestions in related discussion
>>> with
>>>>> the community, particularly from Microsoft and others attending the
>>>>> Security design lounge session at the Hadoop summit. Summary of the
>>> changes:
>>>>>> 1.    Revised the approach to now use two tokens, Identity Token plus
>>>>> Access Token, particularly considering our authorization framework and
>>>>> compatibility with HSSO;
>>>>>> 2.    Introduced Authorization Server (AS) from our authorization
>>>>> framework into the flow that issues access tokens for clients with
>>> identity
>>>>> tokens to access services;
>>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
>>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
>>>>> services;
>>>>>> 5.    Added Hadoop RPC access flow regard
>> 
>> 
>> 
>> -- 
>> Best regards,
>> 
>>  - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
> 


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Hi Andy -

> Happy Fourth of July to you and yours.

Same to you and yours. :-)
We had some fun in the sun for a change - we've had nothing but rain on the east coast lately.

> My concern here is there may have been a misinterpretation or lack of
> consensus on what is meant by "clean slate"


Apparently so.
On the pre-summit call, I stated that I was interested in reconciling the jiras so that we had one to work from.

You recommended that we set them aside for the time being - with the understanding that work would continue on your side (and ours as well) - and approach the community discussion from a clean slate.
We seemed to do this at the summit session quite well.
It was my understanding that this community discussion would live beyond the summit and continue on this list.

While closing the summit session we agreed to follow up on common-dev with first a summary then a discussion of the moving parts.

I never expected the previous work to be abandoned and fully expected it to inform the discussion that happened here.

If you would like to reframe what clean slate was supposed to mean or describe what it means now - that would be welcome - before I waste any more time trying to facilitate a community discussion that is apparently not wanted.

> Nowhere in this
> picture are self appointed "master JIRAs" and such, which have been
> disappointing to see crop up, we should be collaboratively coding not
> planting flags.

I don't know what you mean by self-appointed master JIRAs.
It has certainly not been anyone's intention to disappoint.
Any mention of a new JIRA was just to have a clear context to gather the agreed upon points - previous and/or existing JIRAs would easily be linked.

Planting flags… I need to go back and read my discussion point about the JIRA and see how it gave that impression.
That is not how I define success. The only flag that counts is code. What we are lacking is the roadmap on which to put the code.

> I read Kai's latest document as something approaching today's consensus (or
> at least a common point of view?) rather than a historical document.
> Perhaps he and it can be given equal share of the consideration.

I definitely read it as something that has evolved toward what we have been talking about so far. There has not, however, been enough discussion anywhere near the level of detail in that document, and more details are needed for each component in the design.
Why the work in that document should not be fed into the community discussion as anyone else's would be - I fail to understand.

My suggestion continues to be that you should take that document and speak to the inventory of moving parts as we agreed.
As these are agreed upon, we will ensure that the appropriate subtasks are filed against whatever JIRA is to host them - don't really care much which it is.

I don't really want to continue with two separate JIRAs - as I stated long ago - but until we understand what the pieces are and how they relate, they can't be consolidated.
Even if 9533 ended up being repurposed as the server instance of the work - it should be a subtask of a larger one - if that is to be 9392, so be it.
We still need to define all the pieces of the larger picture before that can be done.

What I thought was the clean slate approach to the discussion seemed a very reasonable way to make all this happen.
If you would like to restate what you intended by it, or propose something else equally reasonable as a way to move forward, that would be awesome.

I will be happy to work toward the roadmap with everyone once it is articulated, understood and actionable.
In the meantime, I have work to do.

thanks,

--larry

BTW - I meant to quote you in an earlier response and ended up saying it was Aaron instead. Not sure what happened there. :-) 

On Jul 4, 2013, at 2:40 PM, Andrew Purtell <ap...@apache.org> wrote:

> Hi Larry (and all),
> 
> Happy Fourth of July to you and yours.
> 
> In our shop Kai and Tianyou are already doing the coding, so I'd defer to
> them on the detailed points.
> 
> My concern here is there may have been a misinterpretation or lack of
> consensus on what is meant by "clean slate". Hopefully that can be quickly
> cleared up. Certainly we did not mean ignore all that came before. The idea
> was to reset discussions to find common ground and new direction where we
> are working together, not in conflict, on an agreed upon set of design
> points and tasks. There's been a lot of good discussion and design
> preceeding that we should figure out how to port over. Nowhere in this
> picture are self appointed "master JIRAs" and such, which have been
> disappointing to see crop up, we should be collaboratively coding not
> planting flags.
> 
> I read Kai's latest document as something approaching today's consensus (or
> at least a common point of view?) rather than a historical document.
> Perhaps he and it can be given equal share of the consideration.
> 
> 
> On Wednesday, July 3, 2013, Larry McCay wrote:
> 
>> Hey Andrew -
>> 
>> I largely agree with that statement.
>> My intention was to let the differences be worked out within the
>> individual components once they were identified and subtasks created.
>> 
>> My reference to HSSO was really referring to a SSO *server* based design
>> which was not clearly articulated in the earlier documents.
>> We aren't trying to compare and contrast one design over another anymore.
>> 
>> Let's move this collaboration along as we've mapped out and the
>> differences in the details will reveal themselves and be addressed within
>> their components.
>> 
>> I've actually been looking forward to you weighing in on the actual
>> discussion points in this thread.
>> Could you do that?
>> 
>> At this point, I am most interested in your thoughts on a single jira to
>> represent all of this work and whether we should start discussing the SSO
>> Tokens.
>> If you think there are discussion points missing from that list, feel free
>> to add to it.
>> 
>> thanks,
>> 
>> --larry
>> 
>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:
>> 
>>> Hi Larry,
>>> 
>>> Of course I'll let Kai speak for himself. However, let me point out that,
>>> while the differences between the competing JIRAs have been reduced for
>>> sure, there were some key differences that didn't just disappear.
>>> Subsequent discussion will make that clear. I also disagree with your
>>> characterization that we have simply endorsed all of the design decisions
>>> of the so-called HSSO, this is taking a mile from an inch. We are here to
>>> engage in a collaborative process as peers. I've been encouraged by the
>>> spirit of the discussions up to this point and hope that can continue
>>> beyond one design summit.
>>> 
>>> 
>>> 
>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com>
>> wrote:
>>> 
>>>> Hi Kai -
>>>> 
>>>> I think that I need to clarify something…
>>>> 
>>>> This is not an update for 9533 but a continuation of the discussions
>> that
>>>> are focused on a fresh look at a SSO for Hadoop.
>>>> We've agreed to leave our previous designs behind and therefore we
>> aren't
>>>> really seeing it as an HSSO layered on top of TAS approach or an HSSO vs
>>>> TAS discussion.
>>>> 
>>>> Your latest design revision actually makes it clear that you are now
>>>> targeting exactly what was described as HSSO - so comparing and
>> contrasting
>>>> is not going to add any value.
>>>> 
>>>> What we need you to do at this point, is to look at those high-level
>>>> components described on this thread and comment on whether we need
>>>> additional components or any that are listed that don't seem necessary
>> to
>>>> you and why.
>>>> In other words, we need to define and agree on the work that has to be
>>>> done.
>>>> 
>>>> We also need to determine those components that need to be done before
>>>> anything else can be started.
>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to
>> all
>>>> the other components and should probably be defined and POC'd in short
>>>> order.
>>>> 
>>>> Personally, I think that continuing the separation of 9533 and 9392 will
>>>> do this effort a disservice. There doesn't seem to be enough differences
>>>> between the two to justify separate jiras anymore. It may be best to
>> file a
>>>> new one that reflects a single vision without the extra cruft that has
>>>> built up in either of the existing ones. We would certainly reference
>> the
>>>> existing ones within the new one. This approach would align with the
>> spirit
>>>> of the discussions up to this point.
>>>> 
>>>> I am prepared to start a discussion around the shape of the two Hadoop
>> SSO
>>>> tokens: identity and access. If this is what others feel the next topic
>>>> should be.
>>>> If we can identify a jira home for it, we can do it there - otherwise we
>>>> can create another DISCUSS thread for it.
>>>> 
>>>> thanks,
>>>> 
>>>> --larry
>>>> 
>>>> 
>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
>>>> 
>>>>> Hi Larry,
>>>>> 
>>>>> Thanks for the update. Good to see that with this update we are now
>>>> aligned on most points.
>>>>> 
>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new
>>>> revision incorporates feedback and suggestions in related discussion
>> with
>>>> the community, particularly from Microsoft and others attending the
>>>> Security design lounge session at the Hadoop summit. Summary of the
>> changes:
>>>>> 1.    Revised the approach to now use two tokens, Identity Token plus
>>>> Access Token, particularly considering our authorization framework and
>>>> compatibility with HSSO;
>>>>> 2.    Introduced Authorization Server (AS) from our authorization
>>>> framework into the flow that issues access tokens for clients with
>> identity
>>>> tokens to access services;
>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
>>>> services;
>>>>> 5.    Added Hadoop RPC access flow regard
> 
> 
> 
> -- 
> Best regards,
> 
>   - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)


RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by "Zheng, Kai" <ka...@intel.com>.
Hi Alejandro,

Thanks for your summary and points. I have no corrections, just some updates from our side for further discussion.

> we should make sure that UserGroupInformation and RPC security logic work with a pluggable GSS implementation.
Right. I'm working on implementing a token authn method in current Hadoop RPC and SASL framework, and changing the UGI class.
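To make the shape of such a token authn method concrete: Hadoop already authenticates delegation tokens over SASL with the JDK's DIGEST-MD5 mechanism, where the callback handler maps a token identifier to its secret. The sketch below illustrates that pattern only - the class and method names (e.g. `lookupTokenPassword`) are placeholders, not code from the HADOOP-9392 patch:

```java
import javax.security.auth.callback.*;
import javax.security.sasl.*;
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch of a token-backed SASL server, loosely mirroring how Hadoop's
 * SaslRpcServer authenticates delegation tokens with DIGEST-MD5.
 * All names here are illustrative, not from the actual patch.
 */
public class TokenSaslSketch {

  /** Placeholder: resolve a serialized token identifier to its secret. */
  static char[] lookupTokenPassword(String tokenIdentifier) {
    return "token-secret".toCharArray();
  }

  /** Callback handler that answers DIGEST-MD5 callbacks from token state. */
  static CallbackHandler tokenCallbackHandler() {
    return callbacks -> {
      NameCallback nc = null;
      PasswordCallback pc = null;
      for (Callback cb : callbacks) {
        if (cb instanceof NameCallback) nc = (NameCallback) cb;
        else if (cb instanceof PasswordCallback) pc = (PasswordCallback) cb;
        else if (cb instanceof AuthorizeCallback) {
          AuthorizeCallback ac = (AuthorizeCallback) cb;
          ac.setAuthorized(ac.getAuthenticationID().equals(ac.getAuthorizationID()));
        }
      }
      if (pc != null) {
        String id = (nc != null) ? nc.getDefaultName() : "";
        pc.setPassword(lookupTokenPassword(id));
      }
    };
  }

  public static SaslServer createTokenSaslServer() throws SaslException {
    Map<String, String> props = new HashMap<>();
    props.put(Sasl.QOP, "auth");
    // "hadoop" / "localhost" stand in for the RPC protocol and server name.
    return Sasl.createSaslServer("DIGEST-MD5", "hadoop", "localhost",
        props, tokenCallbackHandler());
  }

  public static void main(String[] args) throws Exception {
    SaslServer server = createTokenSaslServer();
    System.out.println("mechanism: " + server.getMechanismName());
  }
}
```

Because the callbacks are only invoked when a client response arrives, the same server-side plumbing can back different token formats without touching the RPC layer.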

> Create a common security component ie 'hadoop-security' to be 'the' security lib for all projects to use.
Sure, we will put our code for the new AuthN & AuthZ frameworks into the 'hadoop-security' component for the ecosystem. I guess this component should be a collection of related projects, in line with hadoop-common, right?

As we might agree, the key to all of this is to implement the token authentication method for client-to-service access to start with. Hopefully I can finish and provide my working code as a patch for the discussion.

Thanks & regards,
Kai

-----Original Message-----
From: Alejandro Abdelnur [mailto:tucu@cloudera.com] 
Sent: Friday, July 05, 2013 4:09 AM
To: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

Leaving JIRAs and design docs aside, my recollection from the f2f lounge discussion could be summarized as:

------
1* Decouple users-services authentication from (intra) services-services authentication.

The main motivation for this is to get pluggable authentication and integrated SSO experience for users.

(we never discussed if this is needed for external-apps talking with Hadoop)

2* We should leave the Hadoop delegation tokens alone

No need to make this pluggable as this is an internal authentication mechanism after the 'real' authentication happened.

(this is independent from factoring out all classes we currently have into a common implementation for Hadoop and other projects to use)

3* Being able to replace kerberos with something else for (intra) services-services authentication.

It was suggested that to support deployments where stock Kerberos may not be an option (i.e. cloud) we should make sure that UserGroupInformation and RPC security logic work with a pluggable GSS implementation.

4* Create a common security component ie 'hadoop-security' to be 'the'
security lib for all projects to use.

Create a component/project that would provide the common security pieces for all projects to use.

------

If we agree with this, after any necessary corrections, I think we could distill clear goals from it and start from there.

Thanks.

Tucu & Alejandro

On Thu, Jul 4, 2013 at 11:40 AM, Andrew Purtell <ap...@apache.org> wrote:

> Hi Larry (and all),
>
> Happy Fourth of July to you and yours.
>
> In our shop Kai and Tianyou are already doing the coding, so I'd defer 
> to them on the detailed points.
>
> My concern here is there may have been a misinterpretation or lack of 
> consensus on what is meant by "clean slate". Hopefully that can be 
> quickly cleared up. Certainly we did not mean ignore all that came 
> before. The idea was to reset discussions to find common ground and 
> new direction where we are working together, not in conflict, on an 
> agreed upon set of design points and tasks. There's been a lot of good 
> discussion and design preceeding that we should figure out how to port 
> over. Nowhere in this picture are self appointed "master JIRAs" and 
> such, which have been disappointing to see crop up, we should be 
> collaboratively coding not planting flags.
>
> I read Kai's latest document as something approaching today's 
> consensus (or at least a common point of view?) rather than a historical document.
> Perhaps he and it can be given equal share of the consideration.
>
>
> On Wednesday, July 3, 2013, Larry McCay wrote:
>
> > Hey Andrew -
> >
> > I largely agree with that statement.
> > My intention was to let the differences be worked out within the 
> > individual components once they were identified and subtasks created.
> >
> > My reference to HSSO was really referring to a SSO *server* based 
> > design which was not clearly articulated in the earlier documents.
> > We aren't trying to compare and contrast one design over another anymore.
> >
> > Let's move this collaboration along as we've mapped out and the 
> > differences in the details will reveal themselves and be addressed 
> > within their components.
> >
> > I've actually been looking forward to you weighing in on the actual 
> > discussion points in this thread.
> > Could you do that?
> >
> > At this point, I am most interested in your thoughts on a single 
> > jira to represent all of this work and whether we should start 
> > discussing the SSO Tokens.
> > If you think there are discussion points missing from that list, 
> > feel
> free
> > to add to it.
> >
> > thanks,
> >
> > --larry
> >
> > On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:
> >
> > > Hi Larry,
> > >
> > > Of course I'll let Kai speak for himself. However, let me point 
> > > out
> that,
> > > while the differences between the competing JIRAs have been 
> > > reduced for sure, there were some key differences that didn't just disappear.
> > > Subsequent discussion will make that clear. I also disagree with 
> > > your characterization that we have simply endorsed all of the 
> > > design
> decisions
> > > of the so-called HSSO, this is taking a mile from an inch. We are 
> > > here
> to
> > > engage in a collaborative process as peers. I've been encouraged 
> > > by the spirit of the discussions up to this point and hope that 
> > > can continue beyond one design summit.
> > >
> > >
> > >
> > > On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay 
> > > <lm...@hortonworks.com>
> > wrote:
> > >
> > >> Hi Kai -
> > >>
> > >> I think that I need to clarify something...
> > >>
> > >> This is not an update for 9533 but a continuation of the 
> > >> discussions
> > that
> > >> are focused on a fresh look at a SSO for Hadoop.
> > >> We've agreed to leave our previous designs behind and therefore 
> > >> we
> > aren't
> > >> really seeing it as an HSSO layered on top of TAS approach or an 
> > >> HSSO
> vs
> > >> TAS discussion.
> > >>
> > >> Your latest design revision actually makes it clear that you are 
> > >> now targeting exactly what was described as HSSO - so comparing 
> > >> and
> > contrasting
> > >> is not going to add any value.
> > >>
> > >> What we need you to do at this point, is to look at those 
> > >> high-level components described on this thread and comment on 
> > >> whether we need additional components or any that are listed that 
> > >> don't seem necessary
> > to
> > >> you and why.
> > >> In other words, we need to define and agree on the work that has 
> > >> to be done.
> > >>
> > >> We also need to determine those components that need to be done 
> > >> before anything else can be started.
> > >> I happen to agree with Brian that #4 Hadoop SSO Tokens are 
> > >> central to
> > all
> > >> the other components and should probably be defined and POC'd in 
> > >> short order.
> > >>
> > >> Personally, I think that continuing the separation of 9533 and 
> > >> 9392
> will
> > >> do this effort a disservice. There doesn't seem to be enough
> differences
> > >> between the two to justify separate jiras anymore. It may be best 
> > >> to
> > file a
> > >> new one that reflects a single vision without the extra cruft 
> > >> that has built up in either of the existing ones. We would 
> > >> certainly reference
> > the
> > >> existing ones within the new one. This approach would align with 
> > >> the
> > spirit
> > >> of the discussions up to this point.
> > >>
> > >> I am prepared to start a discussion around the shape of the two 
> > >> Hadoop
> > SSO
> > >> tokens: identity and access. If this is what others feel the next
> topic
> > >> should be.
> > >> If we can identify a jira home for it, we can do it there - 
> > >> otherwise
> we
> > >> can create another DISCUSS thread for it.
> > >>
> > >> thanks,
> > >>
> > >> --larry
> > >>
> > >>
> > >> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
> > >>
> > >>> Hi Larry,
> > >>>
> > >>> Thanks for the update. Good to see that with this update we are 
> > >>> now
> > >> aligned on most points.
> > >>>
> > >>> I have also updated our TokenAuth design in HADOOP-9392. The new
> > >> revision incorporates feedback and suggestions in related 
> > >> discussion
> > with
> > >> the community, particularly from Microsoft and others attending 
> > >> the Security design lounge session at the Hadoop summit. Summary 
> > >> of the
> > changes:
> > >>> 1.    Revised the approach to now use two tokens, Identity Token plus
> > >> Access Token, particularly considering our authorization 
> > >> framework and compatibility with HSSO;
> > >>> 2.    Introduced Authorization Server (AS) from our authorization
> > >> framework into the flow that issues access tokens for clients 
> > >> with
> > identity
> > >> tokens to access services;
> > >>> 3.    Refined proxy access token and the proxy/impersonation flow;
> > >>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
> > >> services;
> > >>> 5.    Added Hadoop RPC access flow regard
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet 
> Hein (via Tom White)
>



--
Alejandro

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Hi Alejandro -

I missed your #4 in my summary and takeaways of the session in another thread on this list.

I believe that the points of discussion were along the lines of:

* put common security libraries into common, much the same way as hadoop-auth is today, making each available as a separate maven module to be used across the ecosystem
* there was a concern raised that we need to be cognizant of not using common as a "dumping ground"
	- I believe this to mean that we need to ensure that the libraries added there are truly cross-cutting and can be used by the other projects across Hadoop
	- I think that security-related things will largely be of that nature, but we need to keep it in mind

I'm not sure whether #3 is represented in the other summary or not…

There were certainly discussions around the emerging work from Daryn related to pluggable authentication mechanisms within that layer, and we will immediately have the options of Kerberos, simple, and plain. There was also talk of how this can be leveraged to introduce a Hadoop token mechanism as well. 

At the same time, there was talk of the possibility of simply making Kerberos easy and a non-issue for intra-cluster use. Certainly we need both of these approaches.
I believe someone used ApacheDS' KDC support as an example - if we could stand up an ApacheDS-based KDC and configure it and its related keytabs easily, then the end-to-end story is more palatable to a broader user base. That story being the choice of authentication mechanisms for user authentication, and easy provisioning and management of Kerberos for intra-cluster service authentication.
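At the SASL level, the pluggability Daryn's work exposes boils down to mechanism negotiation: the client offers an ordered preference list and the installed factories satisfy the first mechanism they can. A hedged sketch using only JDK SASL (the "TOKEN" mechanism name is made up to show an unsupported option being skipped; the mapping to Hadoop auth methods is illustrative):

```java
import javax.security.auth.callback.*;
import javax.security.sasl.*;
import java.util.HashMap;

/**
 * Sketch of SASL-level mechanism negotiation: the client offers an
 * ordered preference list and Sasl.createSaslClient returns a client
 * for the first mechanism an installed factory supports.
 */
public class MechNegotiationSketch {

  public static SaslClient pickMechanism(String[] preferred) throws SaslException {
    // Supplies the identity/password a password-based mechanism needs;
    // the values here are placeholders.
    CallbackHandler handler = callbacks -> {
      for (Callback cb : callbacks) {
        if (cb instanceof NameCallback) {
          ((NameCallback) cb).setName("hdfs-user");
        } else if (cb instanceof PasswordCallback) {
          ((PasswordCallback) cb).setPassword("secret".toCharArray());
        }
      }
    };
    // Walks the preference list in order, skipping mechanisms no
    // installed provider implements.
    return Sasl.createSaslClient(preferred, null, "hadoop", "localhost",
        new HashMap<String, Object>(), handler);
  }

  public static void main(String[] args) throws SaslException {
    // The fictional "TOKEN" mechanism is skipped; PLAIN ships with the JDK.
    SaslClient client = pickMechanism(new String[] {"TOKEN", "PLAIN"});
    System.out.println("negotiated: " + client.getMechanismName());
  }
}
```

A server-side advertisement of supported mechanisms plus a client preference list like this is all the wire-level machinery that mechanism negotiation requires.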

If you agree with this extended summary then I can update the other thread with that recollection.
Thanks for providing it!

--larry

On Jul 4, 2013, at 4:09 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:

> Leaving JIRAs and design docs aside, my recollection from the f2f lounge
> discussion could be summarized as:
> 
> ------
> 1* Decouple users-services authentication from (intra) services-services
> authentication.
> 
> The main motivation for this is to get pluggable authentication and
> integrated SSO experience for users.
> 
> (we never discussed if this is needed for external-apps talking with Hadoop)
> 
> 2* We should leave the Hadoop delegation tokens alone
> 
> No need to make this pluggable as this is an internal authentication
> mechanism after the 'real' authentication happened.
> 
> (this is independent from factoring out all classes we currently have into
> a common implementation for Hadoop and other projects to use)
> 
> 3* Being able to replace kerberos with something else for (intra)
> services-services authentication.
> 
> It was suggested that to support deployments where stock Kerberos may not
> be an option (i.e. cloud) we should make sure that UserGroupInformation and
> RPC security logic work with a pluggable GSS implementation.
> 
> 4* Create a common security component ie 'hadoop-security' to be 'the'
> security lib for all projects to use.
> 
> Create a component/project that would provide the common security pieces
> for all projects to use.
> 
> ------
> 
> If we agree with this, after any necessary corrections, I think we could
> distill clear goals from it and start from there.
> 
> Thanks.
> 
> Tucu & Alejandro
> 
> On Thu, Jul 4, 2013 at 11:40 AM, Andrew Purtell <ap...@apache.org> wrote:
> 
>> Hi Larry (and all),
>> 
>> Happy Fourth of July to you and yours.
>> 
>> In our shop Kai and Tianyou are already doing the coding, so I'd defer to
>> them on the detailed points.
>> 
>> My concern here is there may have been a misinterpretation or lack of
>> consensus on what is meant by "clean slate". Hopefully that can be quickly
>> cleared up. Certainly we did not mean ignore all that came before. The idea
>> was to reset discussions to find common ground and new direction where we
>> are working together, not in conflict, on an agreed upon set of design
>> points and tasks. There's been a lot of good discussion and design
>> preceeding that we should figure out how to port over. Nowhere in this
>> picture are self appointed "master JIRAs" and such, which have been
>> disappointing to see crop up, we should be collaboratively coding not
>> planting flags.
>> 
>> I read Kai's latest document as something approaching today's consensus (or
>> at least a common point of view?) rather than a historical document.
>> Perhaps he and it can be given equal share of the consideration.
>> 
>> 
>> On Wednesday, July 3, 2013, Larry McCay wrote:
>> 
>>> Hey Andrew -
>>> 
>>> I largely agree with that statement.
>>> My intention was to let the differences be worked out within the
>>> individual components once they were identified and subtasks created.
>>> 
>>> My reference to HSSO was really referring to a SSO *server* based design
>>> which was not clearly articulated in the earlier documents.
>>> We aren't trying to compare and contrast one design over another anymore.
>>> 
>>> Let's move this collaboration along as we've mapped out and the
>>> differences in the details will reveal themselves and be addressed within
>>> their components.
>>> 
>>> I've actually been looking forward to you weighing in on the actual
>>> discussion points in this thread.
>>> Could you do that?
>>> 
>>> At this point, I am most interested in your thoughts on a single jira to
>>> represent all of this work and whether we should start discussing the SSO
>>> Tokens.
>>> If you think there are discussion points missing from that list, feel
>> free
>>> to add to it.
>>> 
>>> thanks,
>>> 
>>> --larry
>>> 
>>> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:
>>> 
>>>> Hi Larry,
>>>> 
>>>> Of course I'll let Kai speak for himself. However, let me point out
>> that,
>>>> while the differences between the competing JIRAs have been reduced for
>>>> sure, there were some key differences that didn't just disappear.
>>>> Subsequent discussion will make that clear. I also disagree with your
>>>> characterization that we have simply endorsed all of the design
>> decisions
>>>> of the so-called HSSO, this is taking a mile from an inch. We are here
>> to
>>>> engage in a collaborative process as peers. I've been encouraged by the
>>>> spirit of the discussions up to this point and hope that can continue
>>>> beyond one design summit.
>>>> 
>>>> 
>>>> 
>>>> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com>
>>> wrote:
>>>> 
>>>>> Hi Kai -
>>>>> 
>>>>> I think that I need to clarify something…
>>>>> 
>>>>> This is not an update for 9533 but a continuation of the discussions
>>> that
>>>>> are focused on a fresh look at a SSO for Hadoop.
>>>>> We've agreed to leave our previous designs behind and therefore we
>>> aren't
>>>>> really seeing it as an HSSO layered on top of TAS approach or an HSSO
>> vs
>>>>> TAS discussion.
>>>>> 
>>>>> Your latest design revision actually makes it clear that you are now
>>>>> targeting exactly what was described as HSSO - so comparing and
>>> contrasting
>>>>> is not going to add any value.
>>>>> 
>>>>> What we need you to do at this point, is to look at those high-level
>>>>> components described on this thread and comment on whether we need
>>>>> additional components or any that are listed that don't seem necessary
>>> to
>>>>> you and why.
>>>>> In other words, we need to define and agree on the work that has to be
>>>>> done.
>>>>> 
>>>>> We also need to determine those components that need to be done before
>>>>> anything else can be started.
>>>>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to
>>> all
>>>>> the other components and should probably be defined and POC'd in short
>>>>> order.
>>>>> 
>>>>> Personally, I think that continuing the separation of 9533 and 9392
>> will
>>>>> do this effort a disservice. There doesn't seem to be enough
>> differences
>>>>> between the two to justify separate jiras anymore. It may be best to
>>> file a
>>>>> new one that reflects a single vision without the extra cruft that has
>>>>> built up in either of the existing ones. We would certainly reference
>>> the
>>>>> existing ones within the new one. This approach would align with the
>>> spirit
>>>>> of the discussions up to this point.
>>>>> 
>>>>> I am prepared to start a discussion around the shape of the two Hadoop
>>> SSO
>>>>> tokens: identity and access. If this is what others feel the next
>> topic
>>>>> should be.
>>>>> If we can identify a jira home for it, we can do it there - otherwise
>> we
>>>>> can create another DISCUSS thread for it.
>>>>> 
>>>>> thanks,
>>>>> 
>>>>> --larry
>>>>> 
>>>>> 
>>>>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
>>>>> 
>>>>>> Hi Larry,
>>>>>> 
>>>>>> Thanks for the update. Good to see that with this update we are now
>>>>> aligned on most points.
>>>>>> 
>>>>>> I have also updated our TokenAuth design in HADOOP-9392. The new
>>>>> revision incorporates feedback and suggestions in related discussion
>>> with
>>>>> the community, particularly from Microsoft and others attending the
>>>>> Security design lounge session at the Hadoop summit. Summary of the
>>> changes:
>>>>>> 1.    Revised the approach to now use two tokens, Identity Token plus
>>>>> Access Token, particularly considering our authorization framework and
>>>>> compatibility with HSSO;
>>>>>> 2.    Introduced Authorization Server (AS) from our authorization
>>>>> framework into the flow that issues access tokens for clients with
>>> identity
>>>>> tokens to access services;
>>>>>> 3.    Refined proxy access token and the proxy/impersonation flow;
>>>>>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
>>>>> services;
>>>>>> 5.    Added Hadoop RPC access flow regard
>> 
>> 
>> 
>> --
>> Best regards,
>> 
>>   - Andy
>> 
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>> 
> 
> 
> 
> -- 
> Alejandro


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Leaving JIRAs and design docs aside, my recollection from the f2f lounge
discussion could be summarized as:

------
1* Decouple users-services authentication from (intra) services-services
authentication.

The main motivation for this is to get pluggable authentication and
integrated SSO experience for users.

(we never discussed if this is needed for external-apps talking with Hadoop)

2* We should leave the Hadoop delegation tokens alone

No need to make this pluggable as this is an internal authentication
mechanism after the 'real' authentication happened.

(this is independent from factoring out all classes we currently have into
a common implementation for Hadoop and other projects to use)

3* Being able to replace kerberos with something else for (intra)
services-services authentication.

It was suggested that to support deployments where stock Kerberos may not
be an option (i.e. cloud) we should make sure that UserGroupInformation and
RPC security logic work with a pluggable GSS implementation.

4* Create a common security component ie 'hadoop-security' to be 'the'
security lib for all projects to use.

Create a component/project that would provide the common security pieces
for all projects to use.

------

If we agree with this, after any necessary corrections, I think we could
distill clear goals from it and start from there.

Thanks.

Tucu & Alejandro

On Thu, Jul 4, 2013 at 11:40 AM, Andrew Purtell <ap...@apache.org> wrote:

> Hi Larry (and all),
>
> Happy Fourth of July to you and yours.
>
> In our shop Kai and Tianyou are already doing the coding, so I'd defer to
> them on the detailed points.
>
> My concern here is there may have been a misinterpretation or lack of
> consensus on what is meant by "clean slate". Hopefully that can be quickly
> cleared up. Certainly we did not mean ignore all that came before. The idea
> was to reset discussions to find common ground and new direction where we
> are working together, not in conflict, on an agreed upon set of design
> points and tasks. There's been a lot of good discussion and design
> preceeding that we should figure out how to port over. Nowhere in this
> picture are self appointed "master JIRAs" and such, which have been
> disappointing to see crop up, we should be collaboratively coding not
> planting flags.
>
> I read Kai's latest document as something approaching today's consensus (or
> at least a common point of view?) rather than a historical document.
> Perhaps he and it can be given equal share of the consideration.
>
>
> On Wednesday, July 3, 2013, Larry McCay wrote:
>
> > Hey Andrew -
> >
> > I largely agree with that statement.
> > My intention was to let the differences be worked out within the
> > individual components once they were identified and subtasks created.
> >
> > My reference to HSSO was really referring to a SSO *server* based design
> > which was not clearly articulated in the earlier documents.
> > We aren't trying to compare and contrast one design over another anymore.
> >
> > Let's move this collaboration along as we've mapped out and the
> > differences in the details will reveal themselves and be addressed within
> > their components.
> >
> > I've actually been looking forward to you weighing in on the actual
> > discussion points in this thread.
> > Could you do that?
> >
> > At this point, I am most interested in your thoughts on a single jira to
> > represent all of this work and whether we should start discussing the SSO
> > Tokens.
> > If you think there are discussion points missing from that list, feel
> free
> > to add to it.
> >
> > thanks,
> >
> > --larry
> >
> > On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:
> >
> > > Hi Larry,
> > >
> > > Of course I'll let Kai speak for himself. However, let me point out
> that,
> > > while the differences between the competing JIRAs have been reduced for
> > > sure, there were some key differences that didn't just disappear.
> > > Subsequent discussion will make that clear. I also disagree with your
> > > characterization that we have simply endorsed all of the design
> decisions
> > > of the so-called HSSO, this is taking a mile from an inch. We are here
> to
> > > engage in a collaborative process as peers. I've been encouraged by the
> > > spirit of the discussions up to this point and hope that can continue
> > > beyond one design summit.
> > >
> > >
> > >
> > > On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com>
> > wrote:
> > >
> > >> Hi Kai -
> > >>
> > >> I think that I need to clarify something…
> > >>
> > >> This is not an update for 9533 but a continuation of the discussions
> > that
> > >> are focused on a fresh look at a SSO for Hadoop.
> > >> We've agreed to leave our previous designs behind and therefore we
> > aren't
> > >> really seeing it as an HSSO layered on top of TAS approach or an HSSO
> vs
> > >> TAS discussion.
> > >>
> > >> Your latest design revision actually makes it clear that you are now
> > >> targeting exactly what was described as HSSO - so comparing and
> > contrasting
> > >> is not going to add any value.
> > >>
> > >> What we need you to do at this point, is to look at those high-level
> > >> components described on this thread and comment on whether we need
> > >> additional components or any that are listed that don't seem necessary
> > to
> > >> you and why.
> > >> In other words, we need to define and agree on the work that has to be
> > >> done.
> > >>
> > >> We also need to determine those components that need to be done before
> > >> anything else can be started.
> > >> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to
> > all
> > >> the other components and should probably be defined and POC'd in short
> > >> order.
> > >>
> > >> Personally, I think that continuing the separation of 9533 and 9392
> will
> > >> do this effort a disservice. There doesn't seem to be enough
> differences
> > >> between the two to justify separate jiras anymore. It may be best to
> > file a
> > >> new one that reflects a single vision without the extra cruft that has
> > >> built up in either of the existing ones. We would certainly reference
> > the
> > >> existing ones within the new one. This approach would align with the
> > spirit
> > >> of the discussions up to this point.
> > >>
> > >> I am prepared to start a discussion around the shape of the two Hadoop
> > SSO
> > >> tokens: identity and access. If this is what others feel the next
> topic
> > >> should be.
> > >> If we can identify a jira home for it, we can do it there - otherwise
> we
> > >> can create another DISCUSS thread for it.
> > >>
> > >> thanks,
> > >>
> > >> --larry
> > >>
> > >>
> > >> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
> > >>
> > >>> Hi Larry,
> > >>>
> > >>> Thanks for the update. Good to see that with this update we are now
> > >> aligned on most points.
> > >>>
> > >>> I have also updated our TokenAuth design in HADOOP-9392. The new
> > >> revision incorporates feedback and suggestions in related discussion
> > with
> > >> the community, particularly from Microsoft and others attending the
> > >> Security design lounge session at the Hadoop summit. Summary of the
> > changes:
> > >>> 1.    Revised the approach to now use two tokens, Identity Token plus
> > >> Access Token, particularly considering our authorization framework and
> > >> compatibility with HSSO;
> > >>> 2.    Introduced Authorization Server (AS) from our authorization
> > >> framework into the flow that issues access tokens for clients with
> > identity
> > >> tokens to access services;
> > >>> 3.    Refined proxy access token and the proxy/impersonation flow;
> > >>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
> > >> services;
> > >>> 5.    Added Hadoop RPC access flow regard
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>



-- 
Alejandro

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Andrew Purtell <ap...@apache.org>.
Hi Larry (and all),

Happy Fourth of July to you and yours.

In our shop Kai and Tianyou are already doing the coding, so I'd defer to
them on the detailed points.

My concern here is there may have been a misinterpretation or lack of
consensus on what is meant by "clean slate". Hopefully that can be quickly
cleared up. Certainly we did not mean ignore all that came before. The idea
was to reset discussions to find common ground and new direction where we
are working together, not in conflict, on an agreed upon set of design
points and tasks. There's been a lot of good prior discussion and design
that we should figure out how to port over. Nowhere in this picture are
self-appointed "master JIRAs" and such, which have been disappointing to
see crop up; we should be collaboratively coding, not planting flags.

I read Kai's latest document as something approaching today's consensus (or
at least a common point of view?) rather than a historical document.
Perhaps he and it can be given equal share of the consideration.


On Wednesday, July 3, 2013, Larry McCay wrote:

> Hey Andrew -
>
> I largely agree with that statement.
> My intention was to let the differences be worked out within the
> individual components once they were identified and subtasks created.
>
> My reference to HSSO was really referring to a SSO *server* based design
> which was not clearly articulated in the earlier documents.
> We aren't trying to compare and contrast one design over another anymore.
>
> Let's move this collaboration along as we've mapped out and the
> differences in the details will reveal themselves and be addressed within
> their components.
>
> I've actually been looking forward to you weighing in on the actual
> discussion points in this thread.
> Could you do that?
>
> At this point, I am most interested in your thoughts on a single jira to
> represent all of this work and whether we should start discussing the SSO
> Tokens.
> If you think there are discussion points missing from that list, feel free
> to add to it.
>
> thanks,
>
> --larry
>
> On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:
>
> > Hi Larry,
> >
> > Of course I'll let Kai speak for himself. However, let me point out that,
> > while the differences between the competing JIRAs have been reduced for
> > sure, there were some key differences that didn't just disappear.
> > Subsequent discussion will make that clear. I also disagree with your
> > characterization that we have simply endorsed all of the design decisions
> > of the so-called HSSO, this is taking a mile from an inch. We are here to
> > engage in a collaborative process as peers. I've been encouraged by the
> > spirit of the discussions up to this point and hope that can continue
> > beyond one design summit.
> >
> >
> >
> > On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com>
> wrote:
> >
> >> Hi Kai -
> >>
> >> I think that I need to clarify something…
> >>
> >> This is not an update for 9533 but a continuation of the discussions
> that
> >> are focused on a fresh look at a SSO for Hadoop.
> >> We've agreed to leave our previous designs behind and therefore we
> aren't
> >> really seeing it as an HSSO layered on top of TAS approach or an HSSO vs
> >> TAS discussion.
> >>
> >> Your latest design revision actually makes it clear that you are now
> >> targeting exactly what was described as HSSO - so comparing and
> contrasting
> >> is not going to add any value.
> >>
> >> What we need you to do at this point, is to look at those high-level
> >> components described on this thread and comment on whether we need
> >> additional components or any that are listed that don't seem necessary
> to
> >> you and why.
> >> In other words, we need to define and agree on the work that has to be
> >> done.
> >>
> >> We also need to determine those components that need to be done before
> >> anything else can be started.
> >> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to
> all
> >> the other components and should probably be defined and POC'd in short
> >> order.
> >>
> >> Personally, I think that continuing the separation of 9533 and 9392 will
> >> do this effort a disservice. There doesn't seem to be enough differences
> >> between the two to justify separate jiras anymore. It may be best to
> file a
> >> new one that reflects a single vision without the extra cruft that has
> >> built up in either of the existing ones. We would certainly reference
> the
> >> existing ones within the new one. This approach would align with the
> spirit
> >> of the discussions up to this point.
> >>
> >> I am prepared to start a discussion around the shape of the two Hadoop
> SSO
> >> tokens: identity and access. If this is what others feel the next topic
> >> should be.
> >> If we can identify a jira home for it, we can do it there - otherwise we
> >> can create another DISCUSS thread for it.
> >>
> >> thanks,
> >>
> >> --larry
> >>
> >>
> >> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
> >>
> >>> Hi Larry,
> >>>
> >>> Thanks for the update. Good to see that with this update we are now
> >> aligned on most points.
> >>>
> >>> I have also updated our TokenAuth design in HADOOP-9392. The new
> >> revision incorporates feedback and suggestions in related discussion
> with
> >> the community, particularly from Microsoft and others attending the
> >> Security design lounge session at the Hadoop summit. Summary of the
> changes:
> >>> 1.    Revised the approach to now use two tokens, Identity Token plus
> >> Access Token, particularly considering our authorization framework and
> >> compatibility with HSSO;
> >>> 2.    Introduced Authorization Server (AS) from our authorization
> >> framework into the flow that issues access tokens for clients with
> identity
> >> tokens to access services;
> >>> 3.    Refined proxy access token and the proxy/impersonation flow;
> >>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
> >> services;
> >>> 5.    Added Hadoop RPC access flow regarding CLI clients accessing
> >> Hadoop services via RPC/SASL;



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Hey Andrew - 

I largely agree with that statement.
My intention was to let the differences be worked out within the individual components once they were identified and subtasks created.

My reference to HSSO was really referring to a SSO *server* based design which was not clearly articulated in the earlier documents.
We aren't trying to compare and contrast one design over another anymore.

Let's move this collaboration along as we've mapped out and the differences in the details will reveal themselves and be addressed within their components.

I've actually been looking forward to you weighing in on the actual discussion points in this thread.
Could you do that?

At this point, I am most interested in your thoughts on a single jira to represent all of this work and whether we should start discussing the SSO Tokens.
If you think there are discussion points missing from that list, feel free to add to it.

thanks,

--larry

On Jul 3, 2013, at 7:35 PM, Andrew Purtell <ap...@apache.org> wrote:

> Hi Larry,
> 
> Of course I'll let Kai speak for himself. However, let me point out that,
> while the differences between the competing JIRAs have been reduced for
> sure, there were some key differences that didn't just disappear.
> Subsequent discussion will make that clear. I also disagree with your
> characterization that we have simply endorsed all of the design decisions
> of the so-called HSSO, this is taking a mile from an inch. We are here to
> engage in a collaborative process as peers. I've been encouraged by the
> spirit of the discussions up to this point and hope that can continue
> beyond one design summit.
> 
> 
> 
> On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com> wrote:
> 
>> Hi Kai -
>> 
>> I think that I need to clarify something…
>> 
>> This is not an update for 9533 but a continuation of the discussions that
>> are focused on a fresh look at a SSO for Hadoop.
>> We've agreed to leave our previous designs behind and therefore we aren't
>> really seeing it as an HSSO layered on top of TAS approach or an HSSO vs
>> TAS discussion.
>> 
>> Your latest design revision actually makes it clear that you are now
>> targeting exactly what was described as HSSO - so comparing and contrasting
>> is not going to add any value.
>> 
>> What we need you to do at this point, is to look at those high-level
>> components described on this thread and comment on whether we need
>> additional components or any that are listed that don't seem necessary to
>> you and why.
>> In other words, we need to define and agree on the work that has to be
>> done.
>> 
>> We also need to determine those components that need to be done before
>> anything else can be started.
>> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all
>> the other components and should probably be defined and POC'd in short
>> order.
>> 
>> Personally, I think that continuing the separation of 9533 and 9392 will
>> do this effort a disservice. There doesn't seem to be enough differences
>> between the two to justify separate jiras anymore. It may be best to file a
>> new one that reflects a single vision without the extra cruft that has
>> built up in either of the existing ones. We would certainly reference the
>> existing ones within the new one. This approach would align with the spirit
>> of the discussions up to this point.
>> 
>> I am prepared to start a discussion around the shape of the two Hadoop SSO
>> tokens: identity and access. If this is what others feel the next topic
>> should be.
>> If we can identify a jira home for it, we can do it there - otherwise we
>> can create another DISCUSS thread for it.
>> 
>> thanks,
>> 
>> --larry
>> 
>> 
>> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
>> 
>>> Hi Larry,
>>> 
>>> Thanks for the update. Good to see that with this update we are now
>> aligned on most points.
>>> 
>>> I have also updated our TokenAuth design in HADOOP-9392. The new
>> revision incorporates feedback and suggestions in related discussion with
>> the community, particularly from Microsoft and others attending the
>> Security design lounge session at the Hadoop summit. Summary of the changes:
>>> 1.    Revised the approach to now use two tokens, Identity Token plus
>> Access Token, particularly considering our authorization framework and
>> compatibility with HSSO;
>>> 2.    Introduced Authorization Server (AS) from our authorization
>> framework into the flow that issues access tokens for clients with identity
>> tokens to access services;
>>> 3.    Refined proxy access token and the proxy/impersonation flow;
>>> 4.    Refined the browser web SSO flow regarding access to Hadoop web
>> services;
>>> 5.    Added Hadoop RPC access flow regarding CLI clients accessing
>> Hadoop services via RPC/SASL;
>>> 6.    Added client authentication integration flow to illustrate how
>> desktop logins can be integrated into the authentication process to TAS to
>> exchange identity token;
>>> 7.    Introduced fine grained access control flow from authorization
>> framework; I have put it in the appendices section for reference;
>>> 8.    Added a detailed flow to illustrate Hadoop Simple authentication
>> over TokenAuth, in the appendices section;
>>> 9.    Added secured task launcher in appendices as possible solutions
>> for Windows platform;
>>> 10.    Moved low-level contents and less relevant parts from the main
>> body into the appendices section.
>>> 
>>> As we all think about how to layer HSSO on TAS in TokenAuth framework,
>> please take some time to look at the doc and then let's discuss the gaps we
>> might have. I would like to discuss these gaps with focus on the
>> implementations details so we are all moving towards getting code done.
>> Let's continue this part of the discussion in HADOOP-9392 to allow for
>> better tracking on the JIRA itself. For discussions related to Centralized
>> SSO server, suggest we continue to use HADOOP-9533 to consolidate all
>> discussion related to that JIRA. That way we don't need extra umbrella
>> JIRAs.
>>> 
>>> I agree we should speed up these discussions, agree on some of the
>> implementation specifics so both of us can get moving on the code while not
>> stepping on each other in our work.
>>> 
>>> Look forward to your comments and comments from others in the community.
>> Thanks.
>>> 
>>> Regards,
>>> Kai
>>> 
>>> -----Original Message-----
>>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>>> Sent: Wednesday, July 03, 2013 4:04 AM
>>> To: common-dev@hadoop.apache.org
>>> Subject: [DISCUSS] Hadoop SSO/Token Server Components
>>> 
>>> All -
>>> 
>>> As a follow up to the discussions that were had during Hadoop Summit, I
>> would like to introduce the discussion topic around the moving parts of a
>> Hadoop SSO/Token Service.
>>> There are a couple of related Jira's that can be referenced and may or
>> may not be updated as a result of this discuss thread.
>>> 
>>> https://issues.apache.org/jira/browse/HADOOP-9533
>>> https://issues.apache.org/jira/browse/HADOOP-9392
>>> 
>>> As the first aspect of the discussion, we should probably state the
>> overall goals and scoping for this effort:
>>> * An alternative authentication mechanism to Kerberos for user
>> authentication
>>> * A broader capability for integration into enterprise identity and SSO
>> solutions
>>> * Possibly the advertisement/negotiation of available authentication
>> mechanisms
>>> * Backward compatibility for the existing use of Kerberos
>>> * No (or minimal) changes to existing Hadoop tokens (delegation, job,
>> block access, etc)
>>> * Pluggable authentication mechanisms across: RPC, REST and webui
>> enforcement points
>>> * Continued support for existing authorization policy/ACLs, etc
>>> * Keeping more fine grained authorization policies in mind - like
>> attribute based access control
>>>      - fine grained access control is a separate but related effort
>> that we must not preclude with this effort
>>> * Cross cluster SSO
>>> 
>>> In order to tease out the moving parts here are a couple high level and
>> simplified descriptions of SSO interaction flow:
>>>                              +------+
>>>      +------+ credentials 1 | SSO  |
>>>      |CLIENT|-------------->|SERVER|
>>>      +------+  :tokens      +------+
>>>        2 |
>>>          | access token
>>>          V :requested resource
>>>      +-------+
>>>      |HADOOP |
>>>      |SERVICE|
>>>      +-------+
>>> 
>>> The above diagram represents the simplest interaction model for an SSO
>> service in Hadoop.
>>> 1. client authenticates to SSO service and acquires an access token
>>> a. client presents credentials to an authentication service endpoint
>> exposed by the SSO server (AS) and receives a token representing the
>> authentication event and verified identity
>>> b. client then presents the identity token from 1.a. to the token
>> endpoint exposed by the SSO server (TGS) to request an access token to a
>> particular Hadoop service and receives an access token 2. client presents
>> the Hadoop access token to the Hadoop service for which the access token
>> has been granted and requests the desired resource or services
>>> a. access token is presented as appropriate for the service endpoint
>> protocol being used
>>> b. Hadoop service token validation handler validates the token and
>> verifies its integrity and the identity of the issuer
>>> 
>>>   +------+
>>>   |  IdP |
>>>   +------+
>>>   1   ^ credentials
>>>       | :idp_token
>>>       |                      +------+
>>>      +------+  idp_token  2 | SSO  |
>>>      |CLIENT|-------------->|SERVER|
>>>      +------+  :tokens      +------+
>>>        3 |
>>>          | access token
>>>          V :requested resource
>>>      +-------+
>>>      |HADOOP |
>>>      |SERVICE|
>>>      +-------+
>>> 
>>> 
>>> The above diagram represents a slightly more complicated interaction
>> model for an SSO service in Hadoop that removes Hadoop from the credential
>> collection business.
>>> 1. client authenticates to a trusted identity provider within the
>> enterprise and acquires an IdP specific token
>>> a. client presents credentials to an enterprise IdP and receives a
>> token representing the authentication identity 2. client authenticates to
>> SSO service and acquires an access token
>>> a. client presents idp_token to an authentication service endpoint
>> exposed by the SSO server (AS) and receives a token representing the
>> authentication event and verified identity
>>> b. client then presents the identity token from 2.a. to the token
>> endpoint exposed by the SSO server (TGS) to request an access token to a
>> particular Hadoop service and receives an access token 3. client presents
>> the Hadoop access token to the Hadoop service for which the access token
>> has been granted and requests the desired resource or services
>>> a. access token is presented as appropriate for the service endpoint
>> protocol being used
>>> b. Hadoop service token validation handler validates the token and
>> verifies its integrity and the identity of the issuer
>>> 
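Both flows above reduce to the same pair of token exchanges once the client holds credentials (or an idp_token): trade credentials for an identity token at the AS endpoint, then trade the identity token for a service-scoped access token at the TGS endpoint. The following is only an in-memory illustration of that shape — the class and method names (`SsoFlowSketch`, `authenticate`, `requestAccessToken`) are invented for this sketch, and a real implementation would return signed, opaque tokens carried over the REST/SASL protocols discussed below:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/**
 * Minimal in-memory model of the two-hop SSO flow. Illustrative only:
 * tokens here are plain strings; real tokens would be signed and opaque.
 */
public class SsoFlowSketch {
    private final Map<String, String> identityTokens = new HashMap<>(); // token -> principal
    private final Map<String, String> passwords = Map.of("alice", "secret"); // stand-in credential store

    /** AS endpoint: validate credentials, issue an identity token. */
    public String authenticate(String user, String password) {
        if (!password.equals(passwords.get(user))) {
            throw new SecurityException("authentication failed for " + user);
        }
        String token = UUID.randomUUID().toString();
        identityTokens.put(token, user);
        return token;
    }

    /** TGS endpoint: exchange an identity token for a service-scoped access token. */
    public String requestAccessToken(String identityToken, String service) {
        String principal = identityTokens.get(identityToken);
        if (principal == null) {
            throw new SecurityException("unknown identity token");
        }
        return principal + "@" + service; // real token would be signed by the token authority
    }

    public static void main(String[] args) {
        SsoFlowSketch sso = new SsoFlowSketch();
        String idToken = sso.authenticate("alice", "secret");
        System.out.println("access token: " + sso.requestAccessToken(idToken, "hdfs"));
    }
}
```

The IdP variant differs only in step 1: the credential presented to the AS endpoint is the idp_token rather than a raw password.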
>>> Considering the above set of goals and high level interaction flow
>> description, we can start to discuss the component inventory required to
>> accomplish this vision:
>>> 
>>> 1. SSO Server Instance: this component must be able to expose endpoints
>> for both authentication of users by collecting and validating credentials
>> and federation of identities represented by tokens from trusted IdPs within
>> the enterprise. The endpoints should be composable so as to allow for
>> multifactor authentication mechanisms. They will also need to return tokens
>> that represent the authentication event and verified identity as well as
>> access tokens for specific Hadoop services.
>>> 
>>> 2. Authentication Providers: pluggable authentication mechanisms must be
>> easily created and configured for use within the SSO server instance. They
>> will ideally allow the enterprise to plugin their preferred components from
>> off the shelf as well as provide custom providers. Supporting existing
>> standards for such authentication providers should be a top priority
>> concern. There are a number of standard approaches in use in the Java
>> world: JAAS loginmodules, servlet filters, JASPIC authmodules, etc. A
>> pluggable provider architecture that allows the enterprise to leverage
>> existing investments in these technologies and existing skill sets would be
>> ideal.
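One possible shape for such a provider SPI — purely a sketch, with invented interface and class names, not an actual Hadoop or JAAS API — is a single authenticate contract behind which LDAP, OTP, or JAAS-bridged mechanisms can be plugged, with composition giving the multifactor behavior mentioned above:

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical provider SPI: each mechanism implements this one contract. */
interface AuthenticationProvider {
    boolean authenticate(String principal, String credential);
}

/** Composable chain: every configured provider must succeed (multifactor AND). */
public class ProviderChainSketch {
    private final List<AuthenticationProvider> providers;

    public ProviderChainSketch(List<AuthenticationProvider> providers) {
        this.providers = providers;
    }

    public boolean authenticate(String principal, String credential) {
        return providers.stream().allMatch(p -> p.authenticate(principal, credential));
    }

    public static void main(String[] args) {
        // Stand-ins for real providers (e.g. an LDAP bind, a JAAS LoginModule bridge).
        AuthenticationProvider password = (u, c) -> "secret".equals(c);
        AuthenticationProvider allowList = (u, c) -> Arrays.asList("alice", "bob").contains(u);
        ProviderChainSketch chain = new ProviderChainSketch(List.of(password, allowList));
        System.out.println(chain.authenticate("alice", "secret"));   // expected: true
        System.out.println(chain.authenticate("mallory", "secret")); // expected: false
    }
}
```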
>>> 
>>> 3. Token Authority: a token authority component would need to have the
>> ability to issue, verify and revoke tokens. This authority will need to be
>> trusted by all enforcement points that need to verify incoming tokens.
>> Using something like PKI for establishing trust will be required.
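As a concrete illustration of the issue/verify/revoke contract with PKI-based trust, the sketch below signs a token payload with an RSA private key so that any enforcement point holding only the authority's public key can verify it. Names and the token wire format (`payload.signature`) are invented for this example:

```java
import java.security.*;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical token authority: issues, verifies, and revokes signed tokens. */
public class TokenAuthoritySketch {
    private final KeyPair keyPair;
    private final Set<String> revoked = new HashSet<>();

    public TokenAuthoritySketch() throws GeneralSecurityException {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        this.keyPair = gen.generateKeyPair();
    }

    public PublicKey publicKey() { return keyPair.getPublic(); }

    /** Issue: token is payload + "." + base64url(RSA signature over payload). */
    public String issue(String payload) throws GeneralSecurityException {
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initSign(keyPair.getPrivate());
        sig.update(payload.getBytes());
        return payload + "." + Base64.getUrlEncoder().encodeToString(sig.sign());
    }

    public void revoke(String token) { revoked.add(token); }

    /** Verify: an enforcement point needs only the authority's public key. */
    public static boolean verify(String token, PublicKey key, Set<String> revokedSet)
            throws GeneralSecurityException {
        if (revokedSet.contains(token)) return false;
        int dot = token.lastIndexOf('.');
        Signature sig = Signature.getInstance("SHA256withRSA");
        sig.initVerify(key);
        sig.update(token.substring(0, dot).getBytes());
        return sig.verify(Base64.getUrlDecoder().decode(token.substring(dot + 1)));
    }

    public static void main(String[] args) throws Exception {
        TokenAuthoritySketch authority = new TokenAuthoritySketch();
        String token = authority.issue("user=alice;service=hdfs");
        System.out.println(verify(token, authority.publicKey(), authority.revoked));
    }
}
```

Revocation here is a local set for brevity; distributing revocation state to enforcement points is one of the design questions this component would need to answer.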
>>> 
>>> 4. Hadoop SSO Tokens: the exact shape and form of the sso tokens will
>> need to be considered in order to determine the means by which trust and
>> integrity are ensured while using them. There may be some abstraction of
>> the underlying format provided through interface based design but all token
>> implementations will need to have the same attributes and capabilities in
>> terms of validation and cryptographic verification.
>>> 
>>> 5. SSO Protocol: the lowest common denominator protocol for SSO server
>> interactions across client types would likely be REST. Depending on the
>> REST client in use it may require explicitly coding to the token flow
>> described in the earlier interaction descriptions or a plugin may be
>> provided for things like HTTPClient, curl, etc. RPC clients will have this
>> taken care for them within the SASL layer and will leverage the REST
>> endpoints as well. This likely implies trust requirements for the RPC
>> client to be able to trust the SSO server's identity cert that is presented
>> over SSL.
>>> 
>>> 6. REST Client Agent Plugins: required for encapsulating the interaction
>> with the SSO server for the client programming models. We may need these
>> for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
>>> 
>>> 7. Server Side Authentication Handlers: the server side of the REST, RPC
>> or webui connection will need to be able to validate and verify the
>> incoming Hadoop tokens in order to grant or deny access to requested
>> resources.
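A handler at such an enforcement point might look like the sketch below: extract the token from a transport-specific carrier (an HTTP Authorization header in this example) and defer to a pluggable validator, which in practice would be the signature check against the token authority's public key. All names here are illustrative, not an existing Hadoop API:

```java
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical enforcement-point handler for REST/webui endpoints. */
public class TokenValidationHandler {
    private final Predicate<String> validator; // e.g. token-authority signature verification

    public TokenValidationHandler(Predicate<String> validator) {
        this.validator = validator;
    }

    /** Accepts "Bearer <token>"; returns the verified token, or empty to deny access. */
    public Optional<String> authorize(String authorizationHeader) {
        if (authorizationHeader == null || !authorizationHeader.startsWith("Bearer ")) {
            return Optional.empty();
        }
        String token = authorizationHeader.substring("Bearer ".length());
        return validator.test(token) ? Optional.of(token) : Optional.empty();
    }

    public static void main(String[] args) {
        TokenValidationHandler handler = new TokenValidationHandler(t -> t.equals("good-token"));
        System.out.println(handler.authorize("Bearer good-token").isPresent()); // true
        System.out.println(handler.authorize("Bearer forged").isPresent());     // false
    }
}
```

The RPC path would do the equivalent inside the SASL negotiation rather than from an HTTP header.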
>>> 
>>> 8. Credential/Trust Management: throughout the system - on client and
>> server sides - we will need to manage and provide access to PKI and
>> potentially shared secret artifacts in order to establish the required
>> trust relationships to replace the mutual authentication that would be
>> otherwise provided by using kerberos everywhere.
>>> 
>>> So, discussion points:
>>> 
>>> 1. Are there additional components that would be required for a Hadoop
>> SSO service?
>>> 2. Should any of the above described components be considered not
>> actually necessary or poorly described?
>>> 3. Should we create a new umbrella Jira to identify each of these as a
>> subtask?
>>> 4. Should we just continue to use 9533 for the SSO server and add
>> additional subtasks?
>>> 5. What are the natural seams of separation between these components and
>> any dependencies between one and another that affect priority?
>>> 
>>> Obviously, each component that we identify will have a jira of its own -
>> more than likely - so we are only trying to identify the high level
>> descriptions for now.
>>> 
>>> Can we try and drive this discussion to a close by the end of the week?
>> This will allow us to start breaking out into component implementation
>> plans.
>>> 
>>> thanks,
>>> 
>>> --larry
>> 
>> 
> 
> 
> -- 
> Best regards,
> 
>   - Andy
> 
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Andrew Purtell <ap...@apache.org>.
Hi Larry,

Of course I'll let Kai speak for himself. However, let me point out that,
while the differences between the competing JIRAs have been reduced for
sure, there were some key differences that didn't just disappear.
Subsequent discussion will make that clear. I also disagree with your
characterization that we have simply endorsed all of the design decisions
of the so-called HSSO, this is taking a mile from an inch. We are here to
engage in a collaborative process as peers. I've been encouraged by the
spirit of the discussions up to this point and hope that can continue
beyond one design summit.



On Wed, Jul 3, 2013 at 1:10 PM, Larry McCay <lm...@hortonworks.com> wrote:

> Hi Kai -
>
> I think that I need to clarify something…
>
> This is not an update for 9533 but a continuation of the discussions that
> are focused on a fresh look at a SSO for Hadoop.
> We've agreed to leave our previous designs behind and therefore we aren't
> really seeing it as an HSSO layered on top of TAS approach or an HSSO vs
> TAS discussion.
>
> Your latest design revision actually makes it clear that you are now
> targeting exactly what was described as HSSO - so comparing and contrasting
> is not going to add any value.
>
> What we need you to do at this point, is to look at those high-level
> components described on this thread and comment on whether we need
> additional components or any that are listed that don't seem necessary to
> you and why.
> In other words, we need to define and agree on the work that has to be
> done.
>
> We also need to determine those components that need to be done before
> anything else can be started.
> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all
> the other components and should probably be defined and POC'd in short
> order.
>
> Personally, I think that continuing the separation of 9533 and 9392 will
> do this effort a disservice. There doesn't seem to be enough differences
> between the two to justify separate jiras anymore. It may be best to file a
> new one that reflects a single vision without the extra cruft that has
> built up in either of the existing ones. We would certainly reference the
> existing ones within the new one. This approach would align with the spirit
> of the discussions up to this point.
>
> I am prepared to start a discussion around the shape of the two Hadoop SSO
> tokens: identity and access. If this is what others feel the next topic
> should be.
> If we can identify a jira home for it, we can do it there - otherwise we
> can create another DISCUSS thread for it.
>
> thanks,
>
> --larry
>
>
> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
>
> > Hi Larry,
> >
> > Thanks for the update. Good to see that with this update we are now
> aligned on most points.
> >
> > I have also updated our TokenAuth design in HADOOP-9392. The new
> revision incorporates feedback and suggestions in related discussion with
> the community, particularly from Microsoft and others attending the
> Security design lounge session at the Hadoop summit. Summary of the changes:
> > 1.    Revised the approach to now use two tokens, Identity Token plus
> Access Token, particularly considering our authorization framework and
> compatibility with HSSO;
> > 2.    Introduced Authorization Server (AS) from our authorization
> framework into the flow that issues access tokens for clients with identity
> tokens to access services;
> > 3.    Refined proxy access token and the proxy/impersonation flow;
> > 4.    Refined the browser web SSO flow regarding access to Hadoop web
> services;
> > 5.    Added Hadoop RPC access flow regarding CLI clients accessing
> Hadoop services via RPC/SASL;
> > 6.    Added client authentication integration flow to illustrate how
> desktop logins can be integrated into the authentication process to TAS to
> exchange identity token;
> > 7.    Introduced fine grained access control flow from authorization
> framework; I have put it in the appendices section for reference;
> > 8.    Added a detailed flow to illustrate Hadoop Simple authentication
> over TokenAuth, in the appendices section;
> > 9.    Added secured task launcher in appendices as possible solutions
> for Windows platform;
> > 10.    Moved low-level contents and less relevant parts from the main
> body into the appendices section.
> >
> > As we all think about how to layer HSSO on TAS in TokenAuth framework,
> please take some time to look at the doc and then let's discuss the gaps we
> might have. I would like to discuss these gaps with focus on the
> implementations details so we are all moving towards getting code done.
> Let's continue this part of the discussion in HADOOP-9392 to allow for
> better tracking on the JIRA itself. For discussions related to Centralized
> SSO server, suggest we continue to use HADOOP-9533 to consolidate all
> discussion related to that JIRA. That way we don't need extra umbrella
> JIRAs.
> >
> > I agree we should speed up these discussions, agree on some of the
> implementation specifics so both of us can get moving on the code while not
> stepping on each other in our work.
> >
> > Look forward to your comments and comments from others in the community.
> Thanks.
> >
> > Regards,
> > Kai
> >
> > -----Original Message-----
> > From: Larry McCay [mailto:lmccay@hortonworks.com]
> > Sent: Wednesday, July 03, 2013 4:04 AM
> > To: common-dev@hadoop.apache.org
> > Subject: [DISCUSS] Hadoop SSO/Token Server Components
> >
> > All -
> >
> > As a follow up to the discussions that were had during Hadoop Summit, I
> would like to introduce the discussion topic around the moving parts of a
> Hadoop SSO/Token Service.
> > There are a couple of related Jira's that can be referenced and may or
> may not be updated as a result of this discuss thread.
> >
> > https://issues.apache.org/jira/browse/HADOOP-9533
> > https://issues.apache.org/jira/browse/HADOOP-9392
> >
> > As the first aspect of the discussion, we should probably state the
> overall goals and scoping for this effort:
> > * An alternative authentication mechanism to Kerberos for user
> authentication
> > * A broader capability for integration into enterprise identity and SSO
> solutions
> > * Possibly the advertisement/negotiation of available authentication
> mechanisms
> > * Backward compatibility for the existing use of Kerberos
> > * No (or minimal) changes to existing Hadoop tokens (delegation, job,
> block access, etc)
> > * Pluggable authentication mechanisms across: RPC, REST and webui
> enforcement points
> > * Continued support for existing authorization policy/ACLs, etc
> > * Keeping more fine grained authorization policies in mind - like
> attribute based access control
> >       - fine grained access control is a separate but related effort
> that we must not preclude with this effort
> > * Cross cluster SSO
> >
> > In order to tease out the moving parts here are a couple high level and
> simplified descriptions of SSO interaction flow:
> >                               +------+
> >       +------+ credentials 1 | SSO  |
> >       |CLIENT|-------------->|SERVER|
> >       +------+  :tokens      +------+
> >         2 |
> >           | access token
> >           V :requested resource
> >       +-------+
> >       |HADOOP |
> >       |SERVICE|
> >       +-------+
> >
> > The above diagram represents the simplest interaction model for an SSO
> service in Hadoop.
> > 1. client authenticates to SSO service and acquires an access token
> >  a. client presents credentials to an authentication service endpoint
> exposed by the SSO server (AS) and receives a token representing the
> authentication event and verified identity
> >  b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> > 2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
> >  a. access token is presented as appropriate for the service endpoint
> protocol being used
> >  b. Hadoop service token validation handler validates the token and
> verifies its integrity and the identity of the issuer
> >
> >    +------+
> >    |  IdP |
> >    +------+
> >    1   ^ credentials
> >        | :idp_token
> >        |                      +------+
> >       +------+  idp_token  2 | SSO  |
> >       |CLIENT|-------------->|SERVER|
> >       +------+  :tokens      +------+
> >         3 |
> >           | access token
> >           V :requested resource
> >       +-------+
> >       |HADOOP |
> >       |SERVICE|
> >       +-------+
> >
> >
> > The above diagram represents a slightly more complicated interaction
> model for an SSO service in Hadoop that removes Hadoop from the credential
> collection business.
> > 1. client authenticates to a trusted identity provider within the
> enterprise and acquires an IdP specific token
> >  a. client presents credentials to an enterprise IdP and receives a token representing the authentication identity
> > 2. client authenticates to SSO service and acquires an access token
> >  a. client presents idp_token to an authentication service endpoint
> exposed by the SSO server (AS) and receives a token representing the
> authentication event and verified identity
> >  b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> > 3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
> >  a. access token is presented as appropriate for the service endpoint
> protocol being used
> >  b. Hadoop service token validation handler validates the token and
> verifies its integrity and the identity of the issuer
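The numbered steps in the flows above can be sketched end to end in a few lines of code. Everything in this sketch - class names, method names, and the token strings themselves - is a hypothetical placeholder to illustrate the hand-offs, not a proposed implementation; real tokens would be cryptographically signed artifacts issued by the token authority:

```java
// Hypothetical walk-through of the IdP-based flow (step numbers match the text above).
public class SsoFlowSketch {
    // 1. enterprise IdP authenticates the user and returns an IdP-specific token
    static String idpAuthenticate(String user) {
        return "idp_token/" + user;
    }
    // 2a. SSO server authentication endpoint (AS) federates the idp_token
    //     into an identity token representing the verified identity
    static String federate(String idpToken) {
        return idpToken.replace("idp_token", "identity_token");
    }
    // 2b. SSO server token endpoint (TGS) exchanges the identity token
    //     for an access token scoped to one Hadoop service
    static String grantAccess(String identityToken, String service) {
        return "access_token/" + service + "/" + identityToken;
    }
    // 3. the Hadoop service's validation handler checks the presented token
    static boolean validate(String accessToken, String service) {
        return accessToken.startsWith("access_token/" + service + "/");
    }

    public static void main(String[] args) {
        String access = grantAccess(federate(idpAuthenticate("alice")), "hdfs");
        System.out.println(validate(access, "hdfs")); // prints: true
    }
}
```

The simpler first flow is the same sketch with step 1 collapsed into 2a, i.e. raw credentials rather than an idp_token presented to the AS.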
> >
> > Considering the above set of goals and high level interaction flow
> description, we can start to discuss the component inventory required to
> accomplish this vision:
> >
> > 1. SSO Server Instance: this component must be able to expose endpoints
> for both authentication of users by collecting and validating credentials
> and federation of identities represented by tokens from trusted IdPs within
> the enterprise. The endpoints should be composable so as to allow for
> multifactor authentication mechanisms. They will also need to return tokens
> that represent the authentication event and verified identity as well as
> access tokens for specific Hadoop services.
> >
> > 2. Authentication Providers: pluggable authentication mechanisms must be
> easily created and configured for use within the SSO server instance. They
> will ideally allow the enterprise to plugin their preferred components from
> off the shelf as well as provide custom providers. Supporting existing
> standards for such authentication providers should be a top priority
> concern. There are a number of standard approaches in use in the Java
> world: JAAS loginmodules, servlet filters, JASPIC authmodules, etc. A
> pluggable provider architecture that allows the enterprise to leverage
> existing investments in these technologies and existing skill sets would be
> ideal.
> >
> > 3. Token Authority: a token authority component would need to have the
> ability to issue, verify and revoke tokens. This authority will need to be
> trusted by all enforcement points that need to verify incoming tokens.
> Using something like PKI for establishing trust will be required.
> >
> > 4. Hadoop SSO Tokens: the exact shape and form of the sso tokens will
> need to be considered in order to determine the means by which trust and
> integrity are ensured while using them. There may be some abstraction of
> the underlying format provided through interface based design but all token
> implementations will need to have the same attributes and capabilities in
> terms of validation and cryptographic verification.
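As one concrete strawman only - the claim names and the use of an HMAC over a shared secret are assumptions made to keep the example short, and as noted above PKI-based signatures are the more likely trust mechanism - a token could be a set of attributes plus a cryptographic check value:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Strawman token: a claims string plus a MAC, so any enforcement point that
// shares the key can verify integrity and issuer without a callback.
public class TokenSketch {
    static String sign(String claims, byte[] key) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        String sig = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(mac.doFinal(claims.getBytes(StandardCharsets.UTF_8)));
        return claims + "." + sig;
    }

    static boolean verify(String token, byte[] key) throws Exception {
        int dot = token.lastIndexOf('.');
        return dot > 0 && token.equals(sign(token.substring(0, dot), key));
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "shared-secret".getBytes(StandardCharsets.UTF_8);
        String claims = "sub=alice&issuer=sso-server&service=hdfs&expires=1372800000";
        String token = sign(claims, key);
        System.out.println(verify(token, key));        // untampered: true
        System.out.println(verify(token + "x", key));  // tampered: false
    }
}
```

Whatever the final format, the point stands that every token implementation needs the same attributes (subject, issuer, target service, expiry) and the same validation capability.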
> >
> > 5. SSO Protocol: the lowest common denominator protocol for SSO server
> interactions across client types would likely be REST. Depending on the
> REST client in use it may require explicitly coding to the token flow
> described in the earlier interaction descriptions or a plugin may be
> provided for things like HTTPClient, curl, etc. RPC clients will have this
> taken care of for them within the SASL layer and will leverage the REST
> endpoints as well. This likely implies trust requirements for the RPC
> client to be able to trust the SSO server's identity cert that is presented
> over SSL.
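A REST binding could look something like the following; the endpoint paths, parameter names and header usage here are invented purely to illustrate the two calls a client agent (or the SASL layer on its behalf) would make, and are not an agreed API:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical REST requests for the two token-acquisition steps. This only
// builds the requests; sending them would use java.net.http.HttpClient over
// SSL against a trusted SSO server cert, as described in the text.
public class RestFlowSketch {
    // step 1: present credentials (or an idp_token) to the AS endpoint
    static HttpRequest authenticate(String ssoBase, String credentialsJson) {
        return HttpRequest.newBuilder(URI.create(ssoBase + "/api/v1/authenticate"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(credentialsJson))
                .build();
    }
    // step 2: present the identity token to the TGS endpoint to get a
    // service-scoped access token
    static HttpRequest requestAccessToken(String ssoBase, String identityToken, String service) {
        return HttpRequest.newBuilder(URI.create(ssoBase + "/api/v1/token?service=" + service))
                .header("Authorization", "Bearer " + identityToken)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        System.out.println(authenticate("https://sso.example.com", "{\"user\":\"alice\"}").uri());
        System.out.println(requestAccessToken("https://sso.example.com", "id-token", "hdfs").uri());
    }
}
```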
> >
> > 6. REST Client Agent Plugins: required for encapsulating the interaction
> with the SSO server for the client programming models. We may need these
> for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
> >
> > 7. Server Side Authentication Handlers: the server side of the REST, RPC
> or webui connection will need to be able to validate and verify the
> incoming Hadoop tokens in order to grant or deny access to requested
> resources.
> >
> > 8. Credential/Trust Management: throughout the system - on client and
> server sides - we will need to manage and provide access to PKI and
> potentially shared secret artifacts in order to establish the required
> trust relationships to replace the mutual authentication that would be
> otherwise provided by using Kerberos everywhere.
> >
> > So, discussion points:
> >
> > 1. Are there additional components that would be required for a Hadoop
> SSO service?
> > 2. Should any of the above described components be considered not
> actually necessary or poorly described?
> > 3. Should we create a new umbrella Jira to identify each of these as a subtask?
> > 4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
> > 5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?
> >
> > Obviously, each component that we identify will have a jira of its own -
> more than likely - so we are only trying to identify the high level
> descriptions for now.
> >
> > Can we try and drive this discussion to a close by the end of the week?
> This will allow us to start breaking out into component implementation
> plans.
> >
> > thanks,
> >
> > --larry
>
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Hi Tianyou -

As was discussed on the pre-summit calls, we were approaching the summit from a clean slate.
Perhaps that wasn't articulated very well in the summary of those calls - I thought that it was.

In any case, the agreed upon approach to move forward was to agree on the moving parts that needed to be worked on, prioritize them and start creating subtasks for them.
Using one of the existing jiras would work for this but using both doesn't make a lot of sense to me.

My wording regarding the alignment of 9392 and 9533 is regrettable.
The point is that the SSO server instance/s based approach that is now apparent in 9392 is very much the same thing that 9533 attempted to introduce.

Yes, there are a number of differences that exist in details that are in the documents. If we are starting from a clean slate then it is too early to talk about many of those details.
Part of the difficulty in reconciling the two jiras has been related to having to consume the whole thing at once and try and agree on all the details of all the components - much like trying to boil the ocean all at once.
Starting anew allows us to:
1. establish and agree on the components and broad stroke interaction patterns.
2. identify individual pieces to work on and agree on their finer details - this is where the differences will be rationalized
3. break up the workload and deliver the overall vision

This approach allows us to boil the ocean one pot at a time.

If you would like to keep the jiras separate - which I would see as unfortunate - then the server instance aspects should be in 9533.
This would include the endpoints used for the flows, the hosting of the pluggable authentication mechanisms created in 9392, trust relationship management required across instances, etc.
9533 is a jira for a Hadoop SSO Server.

Unfortunately, I believe this approach would leave us exactly where we started.

So, again the discussion points were not really addressed.
It seems that you and Kai have provided your preference for the jira question - though you have really added another option, which is to keep things the same - which we can make work.

We still need an opinion on the list of components in this thread.
My suggestion is that you take your document and make sure that from a high level all the major components are represented here. 
If not, describe anything else that is needed and why.

We also need to determine the first component to drill down into. Brian and I both see the HSSO Tokens as central to the implementations of the other components; they should probably be tackled first.
By the way, this drilling down into the details of each of the components is where we will rationalize the differences in implementations/approaches.

> Our updated design doc clearly addresses the authorization and proxy flow which are important for users.
Yes - this is goodness. I don't see the fact that more flows are described as a difference.
Those use cases that are needed by our users will need to be implemented.
Once we get to the components that need to provide for these flows we will need to define them for that component/s.

> HSSO can continue to be layered on top of TAS via federation.

I don't know what this actually means.
HSSO was to be an SSO server instance that hosts the endpoints for the required flows in acquiring the necessary tokens.
You would have to explain to me what layering on top of TAS via federation means.
In fact, I don't even want to reference HSSO in this thread anymore - its aspects are represented in the components list of this thread as the SSO Server Instance.

> Please review the design doc we have uploaded to understand the differences. I am sure Kai will also add more details about the differences between these JIRAs.

At this point, it is important that you make sure the components represented in this thread are sufficient for your ideas.
We will not be well served by continuing to compare and contrast.
This thread and those to follow are part of the collaboration process - once the work items are identified through this thread the collaboration on individual components can certainly happen in jiras.
If we want a new jira to host this higher level discussion that is fine too.
You should use your work on 9392 within this process to help drive the discussion and definition of the components identified here.

So…

At this point, I think that we should commit to moving this thread forward and not backward by pointing to silo'd jiras.
This highest level pass of identifying the components should have been the easy part.
We need to close down on this list and move on to the more challenging discussions of the component details.

Can we do this?
Is there another approach that folks would like to take here?

thanks,

--larry

On Jul 4, 2013, at 12:19 AM, "Li, Tianyou" <ti...@intel.com> wrote:

> Hi Larry,
> 
> I participated in the design discussion at Hadoop Summit. I do not remember any discussion of abandoning the current JIRAs, which track a lot of good input from others in the community and are important for us to consider as we move forward with the work. I recommend we continue to move forward with the two JIRAs that we have already been respectively working on, as well as the other JIRAs that others in the community continue to work on.
> 
> "Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value."
> That is not my understanding. As Kai has pointed out in response to your comment on HADOOP-9392, a lot of these updates predate last week's discussion at the summit. Fortunately the discussion at the summit was in line with our thinking on the required revisions from discussing with others in the community prior to the summit. Our updated design doc clearly addresses the authorization and proxy flow which are important for users. HSSO can continue to be layered on top of TAS via federation.
> 
> "Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore."
> Actually I see many key differences between 9392 and 9533. Andrew and Kai have also pointed out that there are key differences when comparing 9392 and 9533. Please review the design doc we have uploaded to understand the differences. I am sure Kai will also add more details about the differences between these JIRAs.
> 
> The work proposed by us on 9392 addresses additional user needs beyond what 9533 proposes to implement. We should figure out some of the implementation specifics for those JIRAs so both of us can keep moving on the code without colliding. Kai has also recommended the same as his preference in response to your comment on 9392.
> 
> Let's work that out as a community of peers so we can all agree on an approach to move forward collaboratively.
> 
> Thanks,
> Tianyou
> 
> -----Original Message-----
> From: Larry McCay [mailto:lmccay@hortonworks.com] 
> Sent: Thursday, July 04, 2013 4:10 AM
> To: Zheng, Kai
> Cc: common-dev@hadoop.apache.org
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> 
> Hi Kai -
> 
> I think that I need to clarify something...
> 
> This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at a SSO for Hadoop.
> We've agreed to leave our previous designs behind and therefore we aren't really seeing it as an HSSO layered on top of TAS approach or an HSSO vs TAS discussion.
> 
> Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.
> 
> What we need you to do at this point, is to look at those high-level components described on this thread and comment on whether we need additional components or any that are listed that don't seem necessary to you and why.
> In other words, we need to define and agree on the work that has to be done.
> 
> We also need to determine those components that need to be done before anything else can be started.
> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.
> 
> Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.
> 
> I am prepared to start a discussion around the shape of the two Hadoop SSO tokens: identity and access. If this is what others feel the next topic should be.
> If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.
> 
> thanks,
> 
> --larry
> 
> 
> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
> 
>> Hi Larry,
>> 
>> Thanks for the update. Good to see that with this update we are now aligned on most points.
>> 
>> I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
>> 1.    Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
>> 2.    Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
>> 3.    Refined proxy access token and the proxy/impersonation flow;
>> 4.    Refined the browser web SSO flow regarding access to Hadoop web services;
>> 5.    Added Hadoop RPC access flow regarding CLI clients accessing Hadoop services via RPC/SASL;
>> 6.    Added client authentication integration flow to illustrate how desktop logins can be integrated into the authentication process to TAS to exchange identity token;
>> 7.    Introduced fine grained access control flow from authorization framework, I have put it in appendices section for the reference;
>> 8.    Added a detailed flow to illustrate Hadoop Simple authentication over TokenAuth, in the appendices section;
>> 9.    Added secured task launcher in appendices as possible solutions for Windows platform;
>> 10.    Removed low level contents, and not so relevant parts into appendices section from the main body.
>> 
>> As we all think about how to layer HSSO on TAS in TokenAuth framework, please take some time to look at the doc and then let's discuss the gaps we might have. I would like to discuss these gaps with focus on the implementations details so we are all moving towards getting code done. Let's continue this part of the discussion in HADOOP-9392 to allow for better tracking on the JIRA itself. For discussions related to Centralized SSO server, suggest we continue to use HADOOP-9533 to consolidate all discussion related to that JIRA. That way we don't need extra umbrella JIRAs.
>> 
>> I agree we should speed up these discussions, agree on some of the implementation specifics so both of us can get moving on the code while not stepping on each other in our work.
>> 
>> Look forward to your comments and comments from others in the community. Thanks.
>> 
>> Regards,
>> Kai
>> 
>> -----Original Message-----
>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>> Sent: Wednesday, July 03, 2013 4:04 AM
>> To: common-dev@hadoop.apache.org
>> Subject: [DISCUSS] Hadoop SSO/Token Server Components
>> 
>> All -
>> 
>> As a follow up to the discussions that were had during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service.
>> There are a couple of related Jira's that can be referenced and may or may not be updated as a result of this discuss thread.
>> 
>> https://issues.apache.org/jira/browse/HADOOP-9533
>> https://issues.apache.org/jira/browse/HADOOP-9392
>> 
>> As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:
>> * An alternative authentication mechanism to Kerberos for user 
>> authentication
>> * A broader capability for integration into enterprise identity and 
>> SSO solutions
>> * Possibly the advertisement/negotiation of available authentication 
>> mechanisms
>> * Backward compatibility for the existing use of Kerberos
>> * No (or minimal) changes to existing Hadoop tokens (delegation, job, 
>> block access, etc)
>> * Pluggable authentication mechanisms across: RPC, REST and webui 
>> enforcement points
>> * Continued support for existing authorization policy/ACLs, etc
>> * Keeping more fine grained authorization policies in mind - like attribute based access control
>> 	- fine grained access control is a separate but related effort that 
>> we must not preclude with this effort
>> * Cross cluster SSO
>> 
>> In order to tease out the moving parts here are a couple high level and simplified descriptions of SSO interaction flow:
>>                              +------+
>> 	+------+ credentials 1 | SSO  |
>> 	|CLIENT|-------------->|SERVER|
>> 	+------+  :tokens      +------+
>> 	  2 |                    
>> 	    | access token
>> 	    V :requested resource
>> 	+-------+
>> 	|HADOOP |
>> 	|SERVICE|
>> 	+-------+
>> 	
>> The above diagram represents the simplest interaction model for an SSO service in Hadoop.
>> 1. client authenticates to SSO service and acquires an access token
>>  a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>>  b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
>> 2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>>  a. access token is presented as appropriate for the service endpoint protocol being used
>>  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
>> 
>>   +------+
>>   |  IdP |
>>   +------+
>>   1   ^ credentials
>>       | :idp_token
>>       |                      +------+
>> 	+------+  idp_token  2 | SSO  |
>> 	|CLIENT|-------------->|SERVER|
>> 	+------+  :tokens      +------+
>> 	  3 |                    
>> 	    | access token
>> 	    V :requested resource
>> 	+-------+
>> 	|HADOOP |
>> 	|SERVICE|
>> 	+-------+
>> 	
>> 
>> The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.
>> 1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
>>  a. client presents credentials to an enterprise IdP and receives a token representing the authentication identity
>> 2. client authenticates to SSO service and acquires an access token
>>  a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>>  b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
>> 3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>>  a. access token is presented as appropriate for the service endpoint protocol being used
>>  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
>> 	
>> Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:
>> 
>> 1. SSO Server Instance: this component must be able to expose endpoints for both authentication of users by collecting and validating credentials and federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity as well as access tokens for specific Hadoop services.
>> 
>> 2. Authentication Providers: pluggable authentication mechanisms must be easily created and configured for use within the SSO server instance. They will ideally allow the enterprise to plugin their preferred components from off the shelf as well as provide custom providers. Supporting existing standards for such authentication providers should be a top priority concern. There are a number of standard approaches in use in the Java world: JAAS loginmodules, servlet filters, JASPIC authmodules, etc. A pluggable provider architecture that allows the enterprise to leverage existing investments in these technologies and existing skill sets would be ideal.
>> 
>> 3. Token Authority: a token authority component would need to have the ability to issue, verify and revoke tokens. This authority will need to be trusted by all enforcement points that need to verify incoming tokens. Using something like PKI for establishing trust will be required.
>> 
>> 4. Hadoop SSO Tokens: the exact shape and form of the sso tokens will need to be considered in order to determine the means by which trust and integrity are ensured while using them. There may be some abstraction of the underlying format provided through interface based design but all token implementations will need to have the same attributes and capabilities in terms of validation and cryptographic verification.
>> 
>> 5. SSO Protocol: the lowest common denominator protocol for SSO server interactions across client types would likely be REST. Depending on the REST client in use it may require explicitly coding to the token flow described in the earlier interaction descriptions or a plugin may be provided for things like HTTPClient, curl, etc. RPC clients will have this taken care of for them within the SASL layer and will leverage the REST endpoints as well. This likely implies trust requirements for the RPC client to be able to trust the SSO server's identity cert that is presented over SSL. 
>> 
>> 6. REST Client Agent Plugins: required for encapsulating the interaction with the SSO server for the client programming models. We may need these for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
>> 
>> 7. Server Side Authentication Handlers: the server side of the REST, RPC or webui connection will need to be able to validate and verify the incoming Hadoop tokens in order to grant or deny access to requested resources.
>> 
>> 8. Credential/Trust Management: throughout the system - on client and server sides - we will need to manage and provide access to PKI and potentially shared secret artifacts in order to establish the required trust relationships to replace the mutual authentication that would be otherwise provided by using Kerberos everywhere.
>> 
>> So, discussion points:
>> 
>> 1. Are there additional components that would be required for a Hadoop SSO service?
>> 2. Should any of the above described components be considered not actually necessary or poorly described?
>> 3. Should we create a new umbrella Jira to identify each of these as a subtask?
>> 4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
>> 5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?
>> 
>> Obviously, each component that we identify will have a jira of its own - more than likely - so we are only trying to identify the high level descriptions for now.
>> 
>> Can we try and drive this discussion to a close by the end of the week? This will allow us to start breaking out into component implementation plans.
>> 
>> thanks,
>> 
>> --larry
> 


RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by "Li, Tianyou" <ti...@intel.com>.
Hi Larry,
 
I participated in the design discussion at Hadoop Summit. I do not remember any discussion of abandoning the current JIRAs, which track a lot of good input from others in the community and are important for us to consider as we move forward with the work. I recommend we continue to move forward with the two JIRAs that we have already been respectively working on, as well as the other JIRAs that others in the community continue to work on.
 
"Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value."
That is not my understanding. As Kai has pointed out in response to your comment on HADOOP-9392, a lot of these updates predate last week's discussion at the summit. Fortunately the discussion at the summit was in line with our thinking on the required revisions from discussing with others in the community prior to the summit. Our updated design doc clearly addresses the authorization and proxy flow which are important for users. HSSO can continue to be layered on top of TAS via federation.
 
"Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore."
Actually I see many key differences between 9392 and 9533. Andrew and Kai have also pointed out that there are key differences when comparing 9392 and 9533. Please review the design doc we have uploaded to understand the differences. I am sure Kai will also add more details about the differences between these JIRAs.
 
The work proposed by us on 9392 addresses additional user needs beyond what 9533 proposes to implement. We should figure out some of the implementation specifics for those JIRAs so both of us can keep moving on the code without colliding. Kai has also recommended the same as his preference in response to your comment on 9392.

Let's work that out as a community of peers so we can all agree on an approach to move forward collaboratively.

Thanks,
Tianyou

-----Original Message-----
From: Larry McCay [mailto:lmccay@hortonworks.com] 
Sent: Thursday, July 04, 2013 4:10 AM
To: Zheng, Kai
Cc: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

Hi Kai -

I think that I need to clarify something...

This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at a SSO for Hadoop.
We've agreed to leave our previous designs behind and therefore we aren't really seeing it as an HSSO layered on top of TAS approach or an HSSO vs TAS discussion.

Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.

What we need you to do at this point, is to look at those high-level components described on this thread and comment on whether we need additional components or any that are listed that don't seem necessary to you and why.
In other words, we need to define and agree on the work that has to be done.

We also need to determine those components that need to be done before anything else can be started.
I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.

Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There doesn't seem to be enough differences between the two to justify separate jiras anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.

I am prepared to start a discussion around the shape of the two Hadoop SSO tokens - identity and access - if that is what others feel the next topic should be.
If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.

thanks,

--larry


On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:

> Hi Larry,
> 
> Thanks for the update. Good to see that with this update we are now aligned on most points.
> 
> I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
> 1.    Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
> 2.    Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
> 3.    Refined proxy access token and the proxy/impersonation flow;
> 4.    Refined the browser web SSO flow regarding access to Hadoop web services;
> 5.    Added Hadoop RPC access flow regarding CLI clients accessing Hadoop services via RPC/SASL;
> 6.    Added client authentication integration flow to illustrate how desktop logins can be integrated into the authentication process to TAS to exchange identity token;
> 7.    Introduced fine grained access control flow from authorization framework, I have put it in appendices section for the reference;
> 8.    Added a detailed flow to illustrate Hadoop Simple authentication over TokenAuth, in the appendices section;
> 9.    Added secured task launcher in appendices as possible solutions for Windows platform;
> 10.    Removed low level contents, and not so relevant parts into appendices section from the main body.
> 
> As we all think about how to layer HSSO on TAS in the TokenAuth framework, please take some time to look at the doc and then let's discuss the gaps we might have. I would like to discuss these gaps with a focus on the implementation details so we are all moving towards getting code done. Let's continue this part of the discussion in HADOOP-9392 to allow for better tracking on the JIRA itself. For discussions related to the centralized SSO server, I suggest we continue to use HADOOP-9533 to consolidate all discussion related to that JIRA. That way we don't need extra umbrella JIRAs.
> 
> I agree we should speed up these discussions and agree on some of the implementation specifics so both of us can get moving on the code while not stepping on each other in our work.
> 
> Look forward to your comments and comments from others in the community. Thanks.
> 
> Regards,
> Kai
> 
> -----Original Message-----
> From: Larry McCay [mailto:lmccay@hortonworks.com]
> Sent: Wednesday, July 03, 2013 4:04 AM
> To: common-dev@hadoop.apache.org
> Subject: [DISCUSS] Hadoop SSO/Token Server Components
> 
> All -
> 
> As a follow up to the discussions that were had during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service.
> There are a couple of related Jira's that can be referenced and may or may not be updated as a result of this discuss thread.
> 
> https://issues.apache.org/jira/browse/HADOOP-9533
> https://issues.apache.org/jira/browse/HADOOP-9392
> 
> As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:
> * An alternative authentication mechanism to Kerberos for user authentication
> * A broader capability for integration into enterprise identity and SSO solutions
> * Possibly the advertisement/negotiation of available authentication mechanisms
> * Backward compatibility for the existing use of Kerberos
> * No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
> * Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
> * Continued support for existing authorization policy/ACLs, etc
> * Keeping more fine grained authorization policies in mind - like attribute based access control
> 	- fine grained access control is a separate but related effort that we must not preclude with this effort
> * Cross cluster SSO
> 
> In order to tease out the moving parts here are a couple high level and simplified descriptions of SSO interaction flow:
>                               +------+
> 	+------+ credentials 1 | SSO  |
> 	|CLIENT|-------------->|SERVER|
> 	+------+  :tokens      +------+
> 	  2 |                    
> 	    | access token
> 	    V :requested resource
> 	+-------+
> 	|HADOOP |
> 	|SERVICE|
> 	+-------+
> 	
> The above diagram represents the simplest interaction model for an SSO service in Hadoop.
> 1. client authenticates to SSO service and acquires an access token
>    a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>    b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> 2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>    a. access token is presented as appropriate for the service endpoint protocol being used
>    b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
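The two-step exchange just described - credentials to an identity token at the AS, then identity token to a service-scoped access token at the TGS - can be sketched in miniature. Everything below is illustrative only: the shared-secret HMAC signing, the field names and the function-per-endpoint shape are stand-ins for a design that has not been agreed yet, not a proposed Hadoop API.

```python
import base64, hashlib, hmac, json, time

SSO_KEY = b"demo-shared-secret"  # hypothetical; a real deployment would use PKI

def _sign(payload: dict) -> str:
    # Encode claims and append an HMAC so the service can check integrity.
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode()).decode()
    return body + "." + hmac.new(SSO_KEY, body.encode(), hashlib.sha256).hexdigest()

def _verify(token: str) -> dict:
    body, sig = token.rsplit(".", 1)
    expect = hmac.new(SSO_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expect):
        raise ValueError("bad signature")
    return json.loads(base64.urlsafe_b64decode(body))

# Step 1a: AS endpoint - credentials in, identity token out.
def authenticate(user: str, password: str) -> str:
    assert password == "secret"  # stand-in for a pluggable authentication provider
    return _sign({"sub": user, "type": "identity", "exp": time.time() + 300})

# Step 1b: TGS endpoint - identity token in, service-scoped access token out.
def request_access(identity_token: str, service: str) -> str:
    ident = _verify(identity_token)
    assert ident["type"] == "identity" and ident["exp"] > time.time()
    return _sign({"sub": ident["sub"], "type": "access", "svc": service,
                  "exp": time.time() + 300})

# Step 2: the Hadoop service validates the access token before serving the request.
def get_resource(access_token: str, service: str) -> str:
    tok = _verify(access_token)
    assert tok["type"] == "access" and tok["svc"] == service and tok["exp"] > time.time()
    return "resource-for-" + tok["sub"]
```

The point of the sketch is the separation of concerns: the service at step 2 never sees credentials, only a verifiable token scoped to it.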
> 
>    +------+
>    |  IdP |
>    +------+
>    1   ^ credentials
>        | :idp_token
>        |                      +------+
> 	+------+  idp_token  2 | SSO  |
> 	|CLIENT|-------------->|SERVER|
> 	+------+  :tokens      +------+
> 	  3 |                    
> 	    | access token
> 	    V :requested resource
> 	+-------+
> 	|HADOOP |
> 	|SERVICE|
> 	+-------+
> 	
> 
> The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.
> 1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
>    a. client presents credentials to an enterprise IdP and receives a token representing the authenticated identity
> 2. client authenticates to SSO service and acquires an access token
>    a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>    b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> 3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>    a. access token is presented as appropriate for the service endpoint protocol being used
>    b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
> 	
> Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:
> 
> 1. SSO Server Instance: this component must be able to expose endpoints for both authentication of users by collecting and validating credentials and federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity as well as access tokens for specific Hadoop services.
> 
> 2. Authentication Providers: pluggable authentication mechanisms must be easily created and configured for use within the SSO server instance. They will ideally allow the enterprise to plug in their preferred off-the-shelf components as well as provide custom providers. Supporting existing standards for such authentication providers should be a top priority concern. There are a number of standard approaches in use in the Java world: JAAS loginmodules, servlet filters, JASPIC authmodules, etc. A pluggable provider architecture that allows the enterprise to leverage existing investments in these technologies and existing skill sets would be ideal.
> 
> 3. Token Authority: a token authority component would need to have the ability to issue, verify and revoke tokens. This authority will need to be trusted by all enforcement points that need to verify incoming tokens. Using something like PKI for establishing trust will be required.
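The issue/verify/revoke triple in item 3 could look roughly like the sketch below. The class name, claim fields and HMAC key are assumptions for illustration; the design text calls for PKI-based trust, which HMAC merely approximates here.

```python
import base64, hashlib, hmac, json, time, uuid

class TokenAuthority:
    """Sketch of a token authority that can issue, verify and revoke tokens."""

    def __init__(self, key: bytes):
        self._key = key
        self._revoked = set()  # ids of revoked tokens

    def issue(self, subject: str, ttl: int = 300) -> str:
        payload = {"jti": str(uuid.uuid4()), "sub": subject, "exp": time.time() + ttl}
        body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
        return body + "." + hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()

    def verify(self, token: str) -> dict:
        body, sig = token.rsplit(".", 1)
        expect = hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expect):
            raise ValueError("signature mismatch")
        payload = json.loads(base64.urlsafe_b64decode(body))
        if payload["jti"] in self._revoked:
            raise ValueError("token revoked")
        if payload["exp"] < time.time():
            raise ValueError("token expired")
        return payload

    def revoke(self, token: str) -> None:
        self._revoked.add(self.verify(token)["jti"])
```

Note that revocation implies enforcement points must be able to consult the authority (or a replicated revocation list), which is one of the trust-distribution questions item 8 raises.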
> 
> 4. Hadoop SSO Tokens: the exact shape and form of the sso tokens will need to be considered in order to determine the means by which trust and integrity are ensured while using them. There may be some abstraction of the underlying format provided through interface based design but all token implementations will need to have the same attributes and capabilities in terms of validation and cryptographic verification.
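As a strawman for the "same attributes and capabilities" point in item 4, the common attribute set behind such an interface might look like this. Every field name here is a guess at what any format-neutral token abstraction would need, not a settled schema.

```python
from dataclasses import dataclass, asdict
import json, time

@dataclass
class HadoopSSOToken:
    """Hypothetical common attributes shared by all token implementations,
    whether identity or access, regardless of the underlying wire format."""
    token_type: str   # "identity" or "access"
    subject: str      # verified principal
    issuer: str       # token authority that signed the token
    audience: str     # target service; empty for identity tokens
    issued_at: float
    expires_at: float

    def is_valid_now(self) -> bool:
        # Temporal validity; cryptographic verification would live alongside this.
        return self.issued_at <= time.time() < self.expires_at

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)
```

Keeping the attribute set fixed while the serialization varies is what would let the abstraction of the underlying format mentioned above work.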
> 
> 5. SSO Protocol: the lowest common denominator protocol for SSO server interactions across client types would likely be REST. Depending on the REST client in use, it may require explicitly coding to the token flow described in the earlier interaction descriptions, or a plugin may be provided for things like HTTPClient, curl, etc. RPC clients will have this taken care of for them within the SASL layer and will leverage the REST endpoints as well. This likely implies trust requirements for the RPC client to be able to trust the SSO server's identity cert that is presented over SSL.
> 
> 6. REST Client Agent Plugins: required for encapsulating the interaction with the SSO server for the client programming models. We may need these for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
> 
> 7. Server Side Authentication Handlers: the server side of the REST, RPC or webui connection will need to be able to validate and verify the incoming Hadoop tokens in order to grant or deny access to requested resources.
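A grant-or-deny handler in the spirit of item 7 might be wired up as below. The `verify_fn` callable, claim names and status-code shape are all hypothetical stand-ins for whatever the token authority and enforcement points eventually define.

```python
import time

class TokenValidationHandler:
    """Illustrative server-side handler: validates the incoming token before
    the request reaches the protected resource."""

    def __init__(self, verify_fn, service_name):
        self._verify = verify_fn      # stands in for cryptographic verification
        self._service = service_name

    def handle(self, token, request_fn):
        try:
            claims = self._verify(token)  # integrity + issuer identity check
        except ValueError:
            return (401, "invalid token")
        if claims.get("svc") != self._service or claims["exp"] < time.time():
            return (403, "access denied")  # wrong audience or expired token
        return (200, request_fn(claims["sub"]))
```

The same handler logic would need to sit behind each of the REST, RPC and webui enforcement points, differing only in how the token is extracted from the connection.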
> 
> 8. Credential/Trust Management: throughout the system - on client and server sides - we will need to manage and provide access to PKI and potentially shared secret artifacts in order to establish the required trust relationships to replace the mutual authentication that would be otherwise provided by using kerberos everywhere.
> 
> So, discussion points:
> 
> 1. Are there additional components that would be required for a Hadoop SSO service?
> 2. Should any of the above described components be considered unnecessary or poorly described?
> 3. Should we create a new umbrella Jira to identify each of these as a subtask?
> 4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
> 5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?
> 
> Obviously, each component that we identify will have a jira of its own - more than likely - so we are only trying to identify the high level descriptions for now.
> 
> Can we try and drive this discussion to a close by the end of the week? This will allow us to start breaking out into component implementation plans.
> 
> thanks,
> 
> --larry


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
*sigh*

I'm not sure how I am failing to communicate this but will try to briefly do it again…

I never asked for differences between the two silo'd jiras and am attempting to not speak to them within this thread as that is causing thrashing that we can't really afford.

There have been a number of folks working on security features within the community across projects. Many of these things have been rather isolated things that needed to be done and not much community involvement was needed. As we look into these larger endeavors, working in silos without a cohesive community is a problem. We are trying to introduce a community for security as a cross cutting concern throughout the Hadoop ecosystem. 

In order to do this, we need to step back and approach the whole effort as a community. We identified a couple ways to start this:
1. using common-dev as the security community email list - at least for the time being
2. finding a wiki space to articulate a holistic view of the security model and drive changes from that common understanding
3. begin the community work by focusing on this authentication alternative to kerberos

Here is what was agreed upon to be discussed by the community for #3 above:
1. restart with a clean slate - define and meet the goals of the community with a single design/vision
2. scope the effort to authentication while keeping in mind not to preclude other related aspects of the Hadoop security roadmap - authorization, auditing, etc
3. we are looking for an alternative to kerberos authentication for users - not for services - at least for the first phase, services would continue to authenticate using kerberos - though it needs to be made easier
4. we would enumerate the high level components needed for this kerberos alternative
5. we would then drill down into the details of the components
6. finally identify the seams of separation that allow for parallel work and get the vision delivered

This email was intended to facilitate the discussion of those things.
Comparing and contrasting the two silo'd jiras sets this community work back instead of moving it forward.

We have a need with a very manageable scope and could use your help in defining it from the context of your current work.

As Aaron stated, the community discussions around this topic have been encouraging and I also hope that they and the security community continue and grow.

Regarding the discussion points that still have not been addressed, I can see one possible additional component - though perhaps it is an aspect of the authentication providers - that you list below as one of the "differences". That would be your thinking around the use of domains for multi-tenancy. I have trouble separating user domains from the IdPs deployed in the enterprise or cloud environment. Can you elaborate on how these domains relate to those that may be found within a particular IdP offering and how they work together or complement each other? We should be able to determine whether it is an aspect of the pluggable authentication providers or something that should be considered a separate component, based on that description.

I will be less available for the rest of the day - 4th of July stuff.

On Jul 4, 2013, at 7:21 AM, "Zheng, Kai" <ka...@intel.com> wrote:

> Hi Larry,
> 
> Our design from its first revision focuses on and provides comprehensive support for pluggable authentication mechanisms based on a common token, trying to address single sign on issues across the ecosystem to support access to Hadoop services via RPC, REST, and the web browser SSO flow. The updated design doc adds more text and flows to explain or illustrate these existing items in detail, as requested by some on the JIRA.
> 
> In addition to the identity token we had proposed, we adopted an access token and adapted the approach not only for the sake of making TokenAuth compatible with HSSO, but also for better support of fine grained access control and seamless integration with our authorization framework and even 3rd party authorization services like an OAuth Authorization Server. We regard these as important because Hadoop is evolving into an enterprise and cloud platform that needs a complete authN and authZ solution; without this support we would need future rework to complete the solution.
> 
> Since you asked about the differences between TokenAuth and HSSO, here are some key ones:
> 
> TokenAuth supports TAS federation to allow clients to access multiple clusters without a centralized SSO server while HSSO provides a centralized SSO server for multiple clusters.
> 
> TokenAuth integrates an authorization framework with auditing support in order to provide a complete solution for enterprise data access security. This allows administrators to administer security policies centrally and have the policies enforced consistently across components in the ecosystem in a pluggable way that supports different authorization models like RBAC, ABAC and even XACML standards.
> 
> TokenAuth targets support for domain based authN & authZ to allow multi-tenant deployments. Authentication and authorization rules can be configured and enforced per domain, which allows organizations to manage their individual policies separately while sharing a common large pool of resources.
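The per-domain idea described in the paragraph above - separate authentication and authorization rules per tenant domain over a shared resource pool - could be illustrated with a tiny registry. The domain names, provider callables and policy shape below are purely hypothetical and are not taken from the TokenAuth design doc.

```python
class DomainRegistry:
    """Sketch of domain-based multi-tenancy: each tenant domain carries its
    own authentication provider and its own authorization policy."""

    def __init__(self):
        self._domains = {}

    def register(self, domain, auth_provider, allowed_services):
        # auth_provider: callable(credentials) -> bool, per-domain pluggable
        self._domains[domain] = (auth_provider, set(allowed_services))

    def authenticate(self, domain, credentials):
        provider, _ = self._domains[domain]  # KeyError means unknown tenant
        return provider(credentials)

    def authorize(self, domain, service):
        _, allowed = self._domains[domain]
        return service in allowed
```

Whether something like this belongs inside the pluggable authentication providers or is a component of its own is exactly the open question raised earlier in the thread.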
> 
> TokenAuth addresses the proxy/impersonation case with the flow Tianyou mentioned, where a service can proxy a client to access another service in a secured and constrained way.
> 
> Regarding token based authentication plus SSO and the unified authorization framework, let's continue to use HADOOP-9392 and HADOOP-9466 as umbrella JIRAs for these efforts. HSSO targets support for a centralized SSO server for multiple clusters and, as we have pointed out before, is a nice subset of the work proposed on HADOOP-9392. Let's align these two JIRAs and address the question Kevin raised multiple times in the 9392/9533 JIRAs: "How can HSSO and TAS work together? What is the relationship?". The design update I provided was meant to provide the necessary details so we can nail down that relationship and collaborate on the implementation of these JIRAs.
> 
> As you have also confirmed, this design aligns with related community discussions, so let's continue our collaborative effort to contribute code to these JIRAs.
> 
> Regards,
> Kai
> 
> -----Original Message-----
> From: Larry McCay [mailto:lmccay@hortonworks.com] 
> Sent: Thursday, July 04, 2013 4:10 AM
> To: Zheng, Kai
> Cc: common-dev@hadoop.apache.org
> Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components
> 
> Hi Kai -
> 
> I think that I need to clarify something...
> 
> This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at an SSO for Hadoop.
> We've agreed to leave our previous designs behind and therefore we aren't really seeing it as an HSSO layered on top of TAS approach or an HSSO vs TAS discussion.
> 
> Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.
> 
> What we need you to do at this point is to look at the high-level components described on this thread and comment on whether we need additional components, or whether any of those listed don't seem necessary to you, and why.
> In other words, we need to define and agree on the work that has to be done.
> 
> We also need to determine those components that need to be done before anything else can be started.
> I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.
> 
> Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There don't seem to be enough differences between the two to justify separate jiras anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.
> 
> I am prepared to start a discussion around the shape of the two Hadoop SSO tokens - identity and access - if that is what others feel the next topic should be.
> If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.
> 
> thanks,
> 
> --larry
> 
> 
> On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:
> 
>> Hi Larry,
>> 
>> Thanks for the update. Good to see that with this update we are now aligned on most points.
>> 
>> I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
>> 1.    Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
>> 2.    Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
>> 3.    Refined proxy access token and the proxy/impersonation flow;
>> 4.    Refined the browser web SSO flow regarding access to Hadoop web services;
>> 5.    Added Hadoop RPC access flow regarding CLI clients accessing Hadoop services via RPC/SASL;
>> 6.    Added client authentication integration flow to illustrate how desktop logins can be integrated into the authentication process to TAS to exchange identity token;
>> 7.    Introduced fine grained access control flow from authorization framework, I have put it in appendices section for the reference;
>> 8.    Added a detailed flow to illustrate Hadoop Simple authentication over TokenAuth, in the appendices section;
>> 9.    Added secured task launcher in appendices as possible solutions for Windows platform;
>> 10.    Removed low level contents, and not so relevant parts into appendices section from the main body.
>> 
>> As we all think about how to layer HSSO on TAS in the TokenAuth framework, please take some time to look at the doc and then let's discuss the gaps we might have. I would like to discuss these gaps with a focus on the implementation details so we are all moving towards getting code done. Let's continue this part of the discussion in HADOOP-9392 to allow for better tracking on the JIRA itself. For discussions related to the centralized SSO server, I suggest we continue to use HADOOP-9533 to consolidate all discussion related to that JIRA. That way we don't need extra umbrella JIRAs.
>> 
>> I agree we should speed up these discussions and agree on some of the implementation specifics so both of us can get moving on the code while not stepping on each other in our work.
>> 
>> Look forward to your comments and comments from others in the community. Thanks.
>> 
>> Regards,
>> Kai
>> 
>> -----Original Message-----
>> From: Larry McCay [mailto:lmccay@hortonworks.com]
>> Sent: Wednesday, July 03, 2013 4:04 AM
>> To: common-dev@hadoop.apache.org
>> Subject: [DISCUSS] Hadoop SSO/Token Server Components
>> 
>> All -
>> 
>> As a follow up to the discussions that were had during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service.
>> There are a couple of related Jira's that can be referenced and may or may not be updated as a result of this discuss thread.
>> 
>> https://issues.apache.org/jira/browse/HADOOP-9533
>> https://issues.apache.org/jira/browse/HADOOP-9392
>> 
>> As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:
>> * An alternative authentication mechanism to Kerberos for user authentication
>> * A broader capability for integration into enterprise identity and SSO solutions
>> * Possibly the advertisement/negotiation of available authentication mechanisms
>> * Backward compatibility for the existing use of Kerberos
>> * No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
>> * Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
>> * Continued support for existing authorization policy/ACLs, etc
>> * Keeping more fine grained authorization policies in mind - like attribute based access control
>> 	- fine grained access control is a separate but related effort that we must not preclude with this effort
>> * Cross cluster SSO
>> 
>> In order to tease out the moving parts here are a couple high level and simplified descriptions of SSO interaction flow:
>>                              +------+
>> 	+------+ credentials 1 | SSO  |
>> 	|CLIENT|-------------->|SERVER|
>> 	+------+  :tokens      +------+
>> 	  2 |                    
>> 	    | access token
>> 	    V :requested resource
>> 	+-------+
>> 	|HADOOP |
>> 	|SERVICE|
>> 	+-------+
>> 	
>> The above diagram represents the simplest interaction model for an SSO service in Hadoop.
>> 1. client authenticates to SSO service and acquires an access token
>>    a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>>    b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
>> 2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>>    a. access token is presented as appropriate for the service endpoint protocol being used
>>    b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
>> 
>>   +------+
>>   |  IdP |
>>   +------+
>>   1   ^ credentials
>>       | :idp_token
>>       |                      +------+
>> 	+------+  idp_token  2 | SSO  |
>> 	|CLIENT|-------------->|SERVER|
>> 	+------+  :tokens      +------+
>> 	  3 |                    
>> 	    | access token
>> 	    V :requested resource
>> 	+-------+
>> 	|HADOOP |
>> 	|SERVICE|
>> 	+-------+
>> 	
>> 
>> The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.
>> 1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
>>    a. client presents credentials to an enterprise IdP and receives a token representing the authenticated identity
>> 2. client authenticates to SSO service and acquires an access token
>>    a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>>    b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
>> 3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>>    a. access token is presented as appropriate for the service endpoint protocol being used
>>    b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
>> 	
>> Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:
>> 
>> 1. SSO Server Instance: this component must be able to expose endpoints for both authentication of users by collecting and validating credentials and federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity as well as access tokens for specific Hadoop services.
>> 
>> 2. Authentication Providers: pluggable authentication mechanisms must be easily created and configured for use within the SSO server instance. They will ideally allow the enterprise to plug in their preferred off-the-shelf components as well as provide custom providers. Supporting existing standards for such authentication providers should be a top priority concern. There are a number of standard approaches in use in the Java world: JAAS loginmodules, servlet filters, JASPIC authmodules, etc. A pluggable provider architecture that allows the enterprise to leverage existing investments in these technologies and existing skill sets would be ideal.
>> 
>> 3. Token Authority: a token authority component would need to have the ability to issue, verify and revoke tokens. This authority will need to be trusted by all enforcement points that need to verify incoming tokens. Using something like PKI for establishing trust will be required.
>> 
>> 4. Hadoop SSO Tokens: the exact shape and form of the sso tokens will need to be considered in order to determine the means by which trust and integrity are ensured while using them. There may be some abstraction of the underlying format provided through interface based design but all token implementations will need to have the same attributes and capabilities in terms of validation and cryptographic verification.
>> 
>> 5. SSO Protocol: the lowest common denominator protocol for SSO server interactions across client types would likely be REST. Depending on the REST client in use, it may require explicitly coding to the token flow described in the earlier interaction descriptions, or a plugin may be provided for things like HTTPClient, curl, etc. RPC clients will have this taken care of for them within the SASL layer and will leverage the REST endpoints as well. This likely implies trust requirements for the RPC client to be able to trust the SSO server's identity cert that is presented over SSL.
>> 
>> 6. REST Client Agent Plugins: required for encapsulating the interaction with the SSO server for the client programming models. We may need these for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
>> 
>> 7. Server Side Authentication Handlers: the server side of the REST, RPC or webui connection will need to be able to validate and verify the incoming Hadoop tokens in order to grant or deny access to requested resources.
>> 
>> 8. Credential/Trust Management: throughout the system - on client and server sides - we will need to manage and provide access to PKI and potentially shared secret artifacts in order to establish the required trust relationships to replace the mutual authentication that would be otherwise provided by using kerberos everywhere.
>> 
>> So, discussion points:
>> 
>> 1. Are there additional components that would be required for a Hadoop SSO service?
>> 2. Should any of the above described components be considered unnecessary or poorly described?
>> 3. Should we create a new umbrella Jira to identify each of these as a subtask?
>> 4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
>> 5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?
>> 
>> Obviously, each component that we identify will have a jira of its own - more than likely - so we are only trying to identify the high level descriptions for now.
>> 
>> Can we try and drive this discussion to a close by the end of the week? This will allow us to start breaking out into component implementation plans.
>> 
>> thanks,
>> 
>> --larry
> 


RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by "Zheng, Kai" <ka...@intel.com>.
Hi Larry,
 
Our design from its first revision focuses on and provides comprehensive support for pluggable authentication mechanisms based on a common token, trying to address single sign on issues across the ecosystem to support access to Hadoop services via RPC, REST, and the web browser SSO flow. The updated design doc adds more text and flows to explain or illustrate these existing items in detail, as requested by some on the JIRA.
 
In addition to the identity token we had proposed, we adopted an access token and adapted the approach, not only for the sake of making TokenAuth compatible with HSSO, but also to better support fine-grained access control and seamless integration with our authorization framework, and even third-party authorization services like an OAuth Authorization Server. We regard these as important because Hadoop is evolving into an enterprise and cloud platform that needs a complete authN and authZ solution; without this support we would need future rework to complete the solution.
 
Since you asked about the differences between TokenAuth and HSSO, here are some key ones:
 
TokenAuth supports TAS federation to allow clients to access multiple clusters without a centralized SSO server while HSSO provides a centralized SSO server for multiple clusters.
 
TokenAuth integrates an authorization framework with auditing support in order to provide a complete solution for enterprise data access security. This allows administrators to administer security policies centrally and have the policies enforced consistently across components in the ecosystem, in a pluggable way that supports different authorization models like RBAC, ABAC and even XACML standards.
 
TokenAuth targets support for domain based authN & authZ to allow multi-tenant deployments. Authentication and authorization rules can be configured and enforced per domain, which allows organizations to manage their individual policies separately while sharing a common large pool of resources.
 
TokenAuth addresses the proxy/impersonation case with the flow Tianyou mentioned, where a service can proxy for a client to access another service in a secured and constrained way.
 
Regarding token-based authentication plus SSO and the unified authorization framework, let's continue to use HADOOP-9392 and HADOOP-9466 as the umbrella JIRAs for these efforts. HSSO targets support for a centralized SSO server for multiple clusters and, as we have pointed out before, is a nice subset of the work proposed in HADOOP-9392. Let's align these two JIRAs and address the question Kevin raised multiple times in the 9392/9533 JIRAs: "How can HSSO and TAS work together? What is the relationship?". The design update I provided was meant to provide the necessary details so we can nail down that relationship and collaborate on the implementation of these JIRAs.

As you have also confirmed, this design aligns with related community discussions, so let's continue our collaborative effort to contribute code to these JIRAs.

Regards,
Kai

-----Original Message-----
From: Larry McCay [mailto:lmccay@hortonworks.com] 
Sent: Thursday, July 04, 2013 4:10 AM
To: Zheng, Kai
Cc: common-dev@hadoop.apache.org
Subject: Re: [DISCUSS] Hadoop SSO/Token Server Components

Hi Kai -

I think that I need to clarify something...

This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at an SSO for Hadoop.
We've agreed to leave our previous designs behind, and therefore we aren't really seeing this as an "HSSO layered on top of TAS" approach or an HSSO vs. TAS discussion.

Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.

What we need you to do at this point is to look at the high-level components described on this thread and comment on whether we need additional components, or whether any that are listed don't seem necessary to you, and why.
In other words, we need to define and agree on the work that has to be done.

We also need to determine those components that need to be done before anything else can be started.
I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.

Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There don't seem to be enough differences between the two to justify separate JIRAs anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.

I am prepared to start a discussion around the shape of the two Hadoop SSO tokens, identity and access, if this is what others feel the next topic should be.
If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.

thanks,

--larry


On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:

> Hi Larry,
> 
> Thanks for the update. Good to see that with this update we are now aligned on most points.
> 
> I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
> 1.    Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
> 2.    Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
> 3.    Refined proxy access token and the proxy/impersonation flow;
> 4.    Refined the browser web SSO flow regarding access to Hadoop web services;
> 5.    Added Hadoop RPC access flow regarding CLI clients accessing Hadoop services via RPC/SASL;
> 6.    Added client authentication integration flow to illustrate how desktop logins can be integrated into the authentication process to TAS to exchange identity token;
> 7.    Introduced the fine grained access control flow from the authorization framework; I have put it in the appendices section for reference;
> 8.    Added a detailed flow to illustrate Hadoop Simple authentication over TokenAuth, in the appendices section;
> 9.    Added secured task launcher in appendices as possible solutions for Windows platform;
> 10.    Moved low level contents and less relevant parts from the main body into the appendices section.
> 
> As we all think about how to layer HSSO on TAS in the TokenAuth framework, please take some time to look at the doc and then let's discuss the gaps we might have. I would like to discuss these gaps with a focus on the implementation details so we are all moving towards getting code done. Let's continue this part of the discussion in HADOOP-9392 to allow for better tracking on the JIRA itself. For discussions related to the centralized SSO server, I suggest we continue to use HADOOP-9533 to consolidate all discussion related to that JIRA. That way we don't need extra umbrella JIRAs.
> 
> I agree we should speed up these discussions and agree on some of the implementation specifics so both of us can get moving on the code while not stepping on each other's work.
> 
> Look forward to your comments and comments from others in the community. Thanks.
> 
> Regards,
> Kai
> 
> -----Original Message-----
> From: Larry McCay [mailto:lmccay@hortonworks.com]
> Sent: Wednesday, July 03, 2013 4:04 AM
> To: common-dev@hadoop.apache.org
> Subject: [DISCUSS] Hadoop SSO/Token Server Components
> 
> All -
> 
> As a follow-up to the discussions held during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service.
> There are a couple of related JIRAs that can be referenced and may or may not be updated as a result of this discuss thread.
> 
> https://issues.apache.org/jira/browse/HADOOP-9533
> https://issues.apache.org/jira/browse/HADOOP-9392
> 
> As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:
> * An alternative authentication mechanism to Kerberos for user authentication
> * A broader capability for integration into enterprise identity and SSO solutions
> * Possibly the advertisement/negotiation of available authentication mechanisms
> * Backward compatibility for the existing use of Kerberos
> * No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
> * Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
> * Continued support for existing authorization policy/ACLs, etc
> * Keeping more fine grained authorization policies in mind - like attribute based access control
> 	- fine grained access control is a separate but related effort that we must not preclude with this effort
> * Cross cluster SSO
> 
> In order to tease out the moving parts, here are a couple of high-level, simplified descriptions of the SSO interaction flow:
>                               +------+
> 	+------+ credentials 1 | SSO  |
> 	|CLIENT|-------------->|SERVER|
> 	+------+  :tokens      +------+
> 	  2 |                    
> 	    | access token
> 	    V :requested resource
> 	+-------+
> 	|HADOOP |
> 	|SERVICE|
> 	+-------+
> 	
> The above diagram represents the simplest interaction model for an SSO service in Hadoop.
> 1. client authenticates to SSO service and acquires an access token
>  a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>  b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> 2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>  a. access token is presented as appropriate for the service endpoint protocol being used
>  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
> 
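The two-step flow above can be sketched as a toy in-memory model: the AS endpoint issues an identity token for credentials, the TGS endpoint exchanges it for a service-scoped access token, and the Hadoop service accepts only access tokens minted for it. All names, checks and the opaque-token scheme here are an editor's illustration, not a proposed implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy simulation of the AS/TGS/service interaction described in the flow.
public class SsoFlowSketch {
    private final Map<String, String> identityTokens = new HashMap<>(); // token -> user
    private final Map<String, String> accessTokens = new HashMap<>();   // token -> "user@service"

    // Step 1.a: authenticate and receive an identity token.
    public String authenticate(String user, String password) {
        if (!"secret".equals(password)) throw new SecurityException("bad credentials");
        String t = UUID.randomUUID().toString();
        identityTokens.put(t, user);
        return t;
    }

    // Step 1.b: exchange the identity token for a service-scoped access token.
    public String requestAccessToken(String identityToken, String service) {
        String user = identityTokens.get(identityToken);
        if (user == null) throw new SecurityException("unknown identity token");
        String t = UUID.randomUUID().toString();
        accessTokens.put(t, user + "@" + service);
        return t;
    }

    // Step 2: the service checks that the token was granted for it.
    public boolean serviceAccepts(String accessToken, String service) {
        String grant = accessTokens.get(accessToken);
        return grant != null && grant.endsWith("@" + service);
    }

    public static void main(String[] args) {
        SsoFlowSketch sso = new SsoFlowSketch();
        String idTok = sso.authenticate("alice", "secret");
        String accTok = sso.requestAccessToken(idTok, "hdfs");
        System.out.println(sso.serviceAccepts(accTok, "hdfs")); // true
        System.out.println(sso.serviceAccepts(accTok, "yarn")); // false: wrong audience
    }
}
```

Note that an access token scoped to one service is useless against another - the audience restriction is what makes per-service access tokens meaningful.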
>    +------+
>    |  IdP |
>    +------+
>    1   ^ credentials
>        | :idp_token
>        |                      +------+
> 	+------+  idp_token  2 | SSO  |
> 	|CLIENT|-------------->|SERVER|
> 	+------+  :tokens      +------+
> 	  3 |                    
> 	    | access token
> 	    V :requested resource
> 	+-------+
> 	|HADOOP |
> 	|SERVICE|
> 	+-------+
> 	
> 
> The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.
> 1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
>  a. client presents credentials to an enterprise IdP and receives a token representing the authentication identity
> 2. client authenticates to SSO service and acquires an access token
>  a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>  b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> 3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>  a. access token is presented as appropriate for the service endpoint protocol being used
>  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
> 	
> Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:
> 
> 1. SSO Server Instance: this component must be able to expose endpoints for both authentication of users by collecting and validating credentials and federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity as well as access tokens for specific Hadoop services.
> 
> 2. Authentication Providers: pluggable authentication mechanisms must be easily created and configured for use within the SSO server instance. They will ideally allow the enterprise to plugin their preferred components from off the shelf as well as provide custom providers. Supporting existing standards for such authentication providers should be a top priority concern. There are a number of standard approaches in use in the Java world: JAAS loginmodules, servlet filters, JASPIC authmodules, etc. A pluggable provider architecture that allows the enterprise to leverage existing investments in these technologies and existing skill sets would be ideal.
> 
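One way the pluggable provider architecture described above might look, sketched with invented interface and class names (this is not an actual Hadoop API - a real provider would delegate to JAAS LoginModules, servlet filters, or JASPIC authmodules as the text suggests):

```java
import java.util.ArrayList;
import java.util.List;

// Editor's sketch of a provider chain: each provider either handles the
// presented credential type or passes, and composing providers is what
// enables multi-factor and mixed-mechanism setups.
public class ProviderChainSketch {
    public interface AuthenticationProvider {
        // Returns the verified principal, or null if the provider does not
        // handle (or rejects) the given credential.
        String authenticate(String credentialType, String credential);
    }

    public static class PasswordProvider implements AuthenticationProvider {
        public String authenticate(String type, String cred) {
            if (!"password".equals(type)) return null;
            // A real provider would delegate to LDAP, a JAAS LoginModule, etc.
            return "alice:secret".equals(cred) ? "alice" : null;
        }
    }

    public static class IdpTokenProvider implements AuthenticationProvider {
        public String authenticate(String type, String cred) {
            if (!"idp_token".equals(type)) return null;
            // Stand-in for validating a token from a trusted enterprise IdP.
            return cred.startsWith("idp:") ? cred.substring(4) : null;
        }
    }

    public static String authenticate(List<AuthenticationProvider> chain,
                                      String type, String cred) {
        for (AuthenticationProvider p : chain) {
            String principal = p.authenticate(type, cred);
            if (principal != null) return principal;
        }
        return null; // no provider accepted the credentials
    }

    public static void main(String[] args) {
        List<AuthenticationProvider> chain = new ArrayList<>();
        chain.add(new PasswordProvider());
        chain.add(new IdpTokenProvider());
        System.out.println(authenticate(chain, "password", "alice:secret")); // alice
        System.out.println(authenticate(chain, "idp_token", "idp:bob"));     // bob
    }
}
```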
> 3. Token Authority: a token authority component would need to have the ability to issue, verify and revoke tokens. This authority will need to be trusted by all enforcement points that need to verify incoming tokens. Using something like PKI for establishing trust will be required.
> 
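A rough sketch of that issue/verify/revoke surface, using JDK RSA signatures to stand in for the PKI trust mentioned above (the class name and token wire format are invented for illustration):

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

// Tokens are payloads signed with the authority's private key; any
// enforcement point holding the public key can verify integrity, while
// revocation is tracked by token id at the authority.
public class TokenAuthority {
    private final KeyPair keys;
    private final Set<String> revoked = new HashSet<>();
    private long nextId = 0;

    public TokenAuthority() {
        try {
            KeyPairGenerator g = KeyPairGenerator.getInstance("RSA");
            g.initialize(2048);
            keys = g.generateKeyPair();
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public PublicKey publicKey() { return keys.getPublic(); }

    public String issue(String subject) {
        try {
            String payload = "id=" + (nextId++) + ";sub=" + subject;
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(keys.getPrivate());
            s.update(payload.getBytes(StandardCharsets.UTF_8));
            Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
            return enc.encodeToString(payload.getBytes(StandardCharsets.UTF_8))
                    + "." + enc.encodeToString(s.sign());
        } catch (java.security.GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public void revoke(String token) {
        revoked.add(tokenId(token));
    }

    public boolean verify(String token, PublicKey authorityKey) {
        try {
            String[] parts = token.split("\\.");
            if (parts.length != 2) return false;
            if (revoked.contains(tokenId(token))) return false;
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(authorityKey);
            s.update(Base64.getUrlDecoder().decode(parts[0]));
            return s.verify(Base64.getUrlDecoder().decode(parts[1]));
        } catch (IllegalArgumentException | java.security.GeneralSecurityException e) {
            return false; // malformed base64 or bad signature encoding
        }
    }

    private static String tokenId(String token) {
        String payload = new String(
                Base64.getUrlDecoder().decode(token.split("\\.")[0]),
                StandardCharsets.UTF_8);
        return payload.split(";")[0];
    }

    public static void main(String[] args) {
        TokenAuthority ta = new TokenAuthority();
        String t = ta.issue("alice");
        System.out.println(ta.verify(t, ta.publicKey())); // true
        ta.revoke(t);
        System.out.println(ta.verify(t, ta.publicKey())); // false once revoked
    }
}
```

The asymmetry is the key design point: verification needs only the public key, so every enforcement point can verify without being able to mint tokens; revocation, by contrast, needs a shared service or distribution mechanism.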
> 4. Hadoop SSO Tokens: the exact shape and form of the SSO tokens will need to be considered in order to determine the means by which trust and integrity are ensured while using them. There may be some abstraction of the underlying format provided through interface based design but all token implementations will need to have the same attributes and capabilities in terms of validation and cryptographic verification.
> 
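As a strawman for that common attribute set, independent of the underlying wire format (all field names here are invented, not a proposed API):

```java
// Sketch of the attributes every Hadoop SSO token implementation might need
// to expose, regardless of how it is serialized or signed.
public class SsoTokenShape {
    public final String subject;        // the verified identity
    public final String issuer;         // the token authority that minted it
    public final String audience;       // target service (access tokens only)
    public final long expiresAtMillis;  // end of the validity window

    public SsoTokenShape(String subject, String issuer,
                         String audience, long expiresAtMillis) {
        this.subject = subject;
        this.issuer = issuer;
        this.audience = audience;
        this.expiresAtMillis = expiresAtMillis;
    }

    // Every implementation needs the same validation capabilities, e.g.:
    public boolean isExpired(long nowMillis) {
        return nowMillis >= expiresAtMillis;
    }

    public boolean isForService(String service) {
        return service.equals(audience);
    }

    public static void main(String[] args) {
        SsoTokenShape t = new SsoTokenShape("alice", "sso.example.com", "hdfs", 2000L);
        System.out.println(t.isExpired(1000L));     // false: still valid
        System.out.println(t.isForService("hdfs")); // true
    }
}
```

Cryptographic verification would live behind the same kind of interface, so that JWT-like, protobuf, or Writable-based encodings are interchangeable to enforcement points.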
> 5. SSO Protocol: the lowest common denominator protocol for SSO server interactions across client types would likely be REST. Depending on the REST client in use it may require explicitly coding to the token flow described in the earlier interaction descriptions, or a plugin may be provided for things like HTTPClient, curl, etc. RPC clients will have this taken care of for them within the SASL layer and will leverage the REST endpoints as well. This likely implies trust requirements for the RPC client to be able to trust the SSO server's identity cert that is presented over SSL.
> 
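To illustrate what the two REST calls in the token flow might look like, here is a sketch built with java.net.http (Java 11+). The endpoint paths and parameter names are invented for illustration, and no request is actually sent:

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds (but does not send) the hypothetical authn and token-exchange
// requests from the flow: credentials -> identity token -> access token.
public class SsoRestSketch {
    // Step 1.a: present credentials to the AS endpoint.
    public static HttpRequest authnRequest(String ssoBase, String user, String password) {
        return HttpRequest.newBuilder(URI.create(ssoBase + "/authn"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "username=" + user + "&password=" + password))
                .build();
    }

    // Step 1.b: present the identity token to the TGS endpoint to request
    // an access token for a particular service.
    public static HttpRequest accessTokenRequest(String ssoBase,
                                                 String identityToken, String service) {
        return HttpRequest.newBuilder(URI.create(ssoBase + "/token?service=" + service))
                .header("Authorization", "Bearer " + identityToken)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest r1 = authnRequest("https://sso.example.com", "alice", "secret");
        HttpRequest r2 = accessTokenRequest("https://sso.example.com", "IDTOKEN", "hdfs");
        System.out.println(r1.uri());
        System.out.println(r2.headers().firstValue("Authorization").orElse(""));
    }
}
```

A client agent plugin (component 6) would wrap exactly this pair of calls and cache the resulting tokens; an RPC client would trigger the same flow from inside its SASL negotiation.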
> 6. REST Client Agent Plugins: required for encapsulating the interaction with the SSO server for the client programming models. We may need these for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
> 
> 7. Server Side Authentication Handlers: the server side of the REST, RPC or webui connection will need to be able to validate and verify the incoming Hadoop tokens in order to grant or deny access to requested resources.
> 
> 8. Credential/Trust Management: throughout the system - on both client and server sides - we will need to manage and provide access to PKI and potentially shared-secret artifacts in order to establish the trust relationships that replace the mutual authentication which would otherwise be provided by using Kerberos everywhere.
> 
> So, discussion points:
> 
> 1. Are there additional components that would be required for a Hadoop SSO service?
> 2. Are any of the components described above unnecessary or poorly described?
> 3. Should we create a new umbrella Jira to identify each of these as a subtask?
> 4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
> 5. What are the natural seams of separation between these components, and are there any dependencies between them that affect priority?
> 
> Obviously, each component that we identify will have a jira of its own - more than likely - so we are only trying to identify the high level descriptions for now.
> 
> Can we try and drive this discussion to a close by the end of the week? This will allow us to start breaking out into component implementation plans.
> 
> thanks,
> 
> --larry


Re: [DISCUSS] Hadoop SSO/Token Server Components

Posted by Larry McCay <lm...@hortonworks.com>.
Hi Kai -

I think that I need to clarify something…

This is not an update for 9533 but a continuation of the discussions that are focused on a fresh look at an SSO for Hadoop.
We've agreed to leave our previous designs behind, and therefore we aren't really seeing this as an "HSSO layered on top of TAS" approach or an HSSO vs. TAS discussion.

Your latest design revision actually makes it clear that you are now targeting exactly what was described as HSSO - so comparing and contrasting is not going to add any value.

What we need you to do at this point is to look at the high-level components described on this thread and comment on whether we need additional components, or whether any that are listed don't seem necessary to you, and why.
In other words, we need to define and agree on the work that has to be done.

We also need to determine those components that need to be done before anything else can be started.
I happen to agree with Brian that #4 Hadoop SSO Tokens are central to all the other components and should probably be defined and POC'd in short order.

Personally, I think that continuing the separation of 9533 and 9392 will do this effort a disservice. There don't seem to be enough differences between the two to justify separate JIRAs anymore. It may be best to file a new one that reflects a single vision without the extra cruft that has built up in either of the existing ones. We would certainly reference the existing ones within the new one. This approach would align with the spirit of the discussions up to this point.

I am prepared to start a discussion around the shape of the two Hadoop SSO tokens, identity and access, if this is what others feel the next topic should be.
If we can identify a jira home for it, we can do it there - otherwise we can create another DISCUSS thread for it.

thanks,

--larry


On Jul 3, 2013, at 2:39 PM, "Zheng, Kai" <ka...@intel.com> wrote:

> Hi Larry,
> 
> Thanks for the update. Good to see that with this update we are now aligned on most points.
> 
> I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
> 1.    Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
> 2.    Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
> 3.    Refined proxy access token and the proxy/impersonation flow;
> 4.    Refined the browser web SSO flow regarding access to Hadoop web services;
> 5.    Added Hadoop RPC access flow regarding CLI clients accessing Hadoop services via RPC/SASL;
> 6.    Added client authentication integration flow to illustrate how desktop logins can be integrated into the authentication process to TAS to exchange identity token;
> 7.    Introduced the fine grained access control flow from the authorization framework; I have put it in the appendices section for reference;
> 8.    Added a detailed flow to illustrate Hadoop Simple authentication over TokenAuth, in the appendices section;
> 9.    Added secured task launcher in appendices as possible solutions for Windows platform;
> 10.    Moved low level contents and less relevant parts from the main body into the appendices section.
> 
> As we all think about how to layer HSSO on TAS in the TokenAuth framework, please take some time to look at the doc and then let's discuss the gaps we might have. I would like to discuss these gaps with a focus on the implementation details so we are all moving towards getting code done. Let's continue this part of the discussion in HADOOP-9392 to allow for better tracking on the JIRA itself. For discussions related to the centralized SSO server, I suggest we continue to use HADOOP-9533 to consolidate all discussion related to that JIRA. That way we don't need extra umbrella JIRAs.
> 
> I agree we should speed up these discussions and agree on some of the implementation specifics so both of us can get moving on the code while not stepping on each other's work.
> 
> Look forward to your comments and comments from others in the community. Thanks.
> 
> Regards,
> Kai
> 
> -----Original Message-----
> From: Larry McCay [mailto:lmccay@hortonworks.com] 
> Sent: Wednesday, July 03, 2013 4:04 AM
> To: common-dev@hadoop.apache.org
> Subject: [DISCUSS] Hadoop SSO/Token Server Components
> 
> All -
> 
> As a follow-up to the discussions held during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service.
> There are a couple of related JIRAs that can be referenced and may or may not be updated as a result of this discuss thread.
> 
> https://issues.apache.org/jira/browse/HADOOP-9533
> https://issues.apache.org/jira/browse/HADOOP-9392
> 
> As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:
> * An alternative authentication mechanism to Kerberos for user authentication
> * A broader capability for integration into enterprise identity and SSO solutions
> * Possibly the advertisement/negotiation of available authentication mechanisms
> * Backward compatibility for the existing use of Kerberos
> * No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
> * Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
> * Continued support for existing authorization policy/ACLs, etc
> * Keeping more fine grained authorization policies in mind - like attribute based access control
> 	- fine grained access control is a separate but related effort that we must not preclude with this effort
> * Cross cluster SSO
> 
> In order to tease out the moving parts, here are a couple of high-level, simplified descriptions of the SSO interaction flow:
>                               +------+
> 	+------+ credentials 1 | SSO  |
> 	|CLIENT|-------------->|SERVER|
> 	+------+  :tokens      +------+
> 	  2 |                    
> 	    | access token
> 	    V :requested resource
> 	+-------+
> 	|HADOOP |
> 	|SERVICE|
> 	+-------+
> 	
> The above diagram represents the simplest interaction model for an SSO service in Hadoop.
> 1. client authenticates to SSO service and acquires an access token
>  a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>  b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> 2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>  a. access token is presented as appropriate for the service endpoint protocol being used
>  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
> 
>    +------+
>    |  IdP |
>    +------+
>    1   ^ credentials
>        | :idp_token
>        |                      +------+
> 	+------+  idp_token  2 | SSO  |
> 	|CLIENT|-------------->|SERVER|
> 	+------+  :tokens      +------+
> 	  3 |                    
> 	    | access token
> 	    V :requested resource
> 	+-------+
> 	|HADOOP |
> 	|SERVICE|
> 	+-------+
> 	
> 
> The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.
> 1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
>  a. client presents credentials to an enterprise IdP and receives a token representing the authentication identity
> 2. client authenticates to SSO service and acquires an access token
>  a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
>  b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
> 3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
>  a. access token is presented as appropriate for the service endpoint protocol being used
>  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
> 	
> Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:
> 
> 1. SSO Server Instance: this component must be able to expose endpoints for both authentication of users by collecting and validating credentials and federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity as well as access tokens for specific Hadoop services.
> 
> 2. Authentication Providers: pluggable authentication mechanisms must be easily created and configured for use within the SSO server instance. They will ideally allow the enterprise to plugin their preferred components from off the shelf as well as provide custom providers. Supporting existing standards for such authentication providers should be a top priority concern. There are a number of standard approaches in use in the Java world: JAAS loginmodules, servlet filters, JASPIC authmodules, etc. A pluggable provider architecture that allows the enterprise to leverage existing investments in these technologies and existing skill sets would be ideal.
> 
> 3. Token Authority: a token authority component would need to have the ability to issue, verify and revoke tokens. This authority will need to be trusted by all enforcement points that need to verify incoming tokens. Using something like PKI for establishing trust will be required.
> 
> 4. Hadoop SSO Tokens: the exact shape and form of the SSO tokens will need to be considered in order to determine the means by which trust and integrity are ensured while using them. There may be some abstraction of the underlying format provided through interface based design but all token implementations will need to have the same attributes and capabilities in terms of validation and cryptographic verification.
> 
> 5. SSO Protocol: the lowest common denominator protocol for SSO server interactions across client types would likely be REST. Depending on the REST client in use it may require explicitly coding to the token flow described in the earlier interaction descriptions, or a plugin may be provided for things like HTTPClient, curl, etc. RPC clients will have this taken care of for them within the SASL layer and will leverage the REST endpoints as well. This likely implies trust requirements for the RPC client to be able to trust the SSO server's identity cert that is presented over SSL.
> 
> 6. REST Client Agent Plugins: required for encapsulating the interaction with the SSO server for the client programming models. We may need these for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.
> 
> 7. Server Side Authentication Handlers: the server side of the REST, RPC or webui connection will need to be able to validate and verify the incoming Hadoop tokens in order to grant or deny access to requested resources.
> 
> 8. Credential/Trust Management: throughout the system - on both client and server sides - we will need to manage and provide access to PKI and potentially shared-secret artifacts in order to establish the trust relationships that replace the mutual authentication which would otherwise be provided by using Kerberos everywhere.
> 
> So, discussion points:
> 
> 1. Are there additional components that would be required for a Hadoop SSO service?
> 2. Are any of the components described above unnecessary or poorly described?
> 3. Should we create a new umbrella Jira to identify each of these as a subtask?
> 4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
> 5. What are the natural seams of separation between these components, and are there any dependencies between them that affect priority?
> 
> Obviously, each component that we identify will have a jira of its own - more than likely - so we are only trying to identify the high level descriptions for now.
> 
> Can we try and drive this discussion to a close by the end of the week? This will allow us to start breaking out into component implementation plans.
> 
> thanks,
> 
> --larry


RE: [DISCUSS] Hadoop SSO/Token Server Components

Posted by "Zheng, Kai" <ka...@intel.com>.
Hi Larry,

Thanks for the update. Good to see that with this update we are now aligned on most points.

I have also updated our TokenAuth design in HADOOP-9392. The new revision incorporates feedback and suggestions in related discussion with the community, particularly from Microsoft and others attending the Security design lounge session at the Hadoop summit. Summary of the changes:
1.    Revised the approach to now use two tokens, Identity Token plus Access Token, particularly considering our authorization framework and compatibility with HSSO;
2.    Introduced Authorization Server (AS) from our authorization framework into the flow that issues access tokens for clients with identity tokens to access services;
3.    Refined proxy access token and the proxy/impersonation flow;
4.    Refined the browser web SSO flow regarding access to Hadoop web services;
5.    Added Hadoop RPC access flow regarding CLI clients accessing Hadoop services via RPC/SASL;
6.    Added client authentication integration flow to illustrate how desktop logins can be integrated into the authentication process to TAS to exchange identity token;
7.    Introduced the fine grained access control flow from the authorization framework; I have put it in the appendices section for reference;
8.    Added a detailed flow to illustrate Hadoop Simple authentication over TokenAuth, in the appendices section;
9.    Added secured task launcher in appendices as possible solutions for Windows platform;
10.    Moved low level contents and less relevant parts from the main body into the appendices section.

As we all think about how to layer HSSO on TAS in the TokenAuth framework, please take some time to look at the doc and then let's discuss the gaps we might have. I would like to discuss these gaps with a focus on the implementation details so we are all moving towards getting code done. Let's continue this part of the discussion in HADOOP-9392 to allow for better tracking on the JIRA itself. For discussions related to the centralized SSO server, I suggest we continue to use HADOOP-9533 to consolidate all discussion related to that JIRA. That way we don't need extra umbrella JIRAs.

I agree we should speed up these discussions and agree on some of the implementation specifics so both of us can get moving on the code while not stepping on each other's work.

Look forward to your comments and comments from others in the community. Thanks.

Regards,
Kai

-----Original Message-----
From: Larry McCay [mailto:lmccay@hortonworks.com] 
Sent: Wednesday, July 03, 2013 4:04 AM
To: common-dev@hadoop.apache.org
Subject: [DISCUSS] Hadoop SSO/Token Server Components

All -

As a follow-up to the discussions held during Hadoop Summit, I would like to introduce the discussion topic around the moving parts of a Hadoop SSO/Token Service.
There are a couple of related Jira's that can be referenced and may or may not be updated as a result of this discuss thread.

https://issues.apache.org/jira/browse/HADOOP-9533
https://issues.apache.org/jira/browse/HADOOP-9392

As the first aspect of the discussion, we should probably state the overall goals and scoping for this effort:
* An alternative authentication mechanism to Kerberos for user authentication
* A broader capability for integration into enterprise identity and SSO solutions
* Possibly the advertisement/negotiation of available authentication mechanisms
* Backward compatibility for the existing use of Kerberos
* No (or minimal) changes to existing Hadoop tokens (delegation, job, block access, etc)
* Pluggable authentication mechanisms across: RPC, REST and webui enforcement points
* Continued support for existing authorization policy/ACLs, etc
* Keeping more fine grained authorization policies in mind - like attribute based access control
	- fine grained access control is a separate but related effort that we must not preclude with this effort
* Cross cluster SSO

In order to tease out the moving parts here are a couple high level and simplified descriptions of SSO interaction flow:
                               +------+
	+------+ credentials 1 | SSO  |
	|CLIENT|-------------->|SERVER|
	+------+  :tokens      +------+
	  2 |                    
	    | access token
	    V :requested resource
	+-------+
	|HADOOP |
	|SERVICE|
	+-------+
	
The above diagram represents the simplest interaction model for an SSO service in Hadoop.
1. client authenticates to SSO service and acquires an access token
  a. client presents credentials to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
  b. client then presents the identity token from 1.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
2. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
  a. access token is presented as appropriate for the service endpoint protocol being used
  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
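To make the two-step flow above concrete, here is a toy sketch of the AS/TGS exchange and the service-side check. Every class and method name here is hypothetical for illustration; none of this is an existing Hadoop API.

```java
// Illustrative sketch of the two-step token flow; all names are invented.
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class SsoFlowSketch {

    // Step 1.a: the AS endpoint validates credentials and returns an
    // identity token representing the authentication event.
    static String authenticate(Map<String, String> userDb, String user, String password) {
        if (!password.equals(userDb.get(user))) {
            throw new SecurityException("authentication failed for " + user);
        }
        return "idtoken:" + user + ":" + UUID.randomUUID();
    }

    // Step 1.b: the TGS endpoint exchanges an identity token for an
    // access token scoped to one particular Hadoop service.
    static String grantAccessToken(String identityToken, String service) {
        if (!identityToken.startsWith("idtoken:")) {
            throw new SecurityException("invalid identity token");
        }
        String user = identityToken.split(":")[1];
        return "access:" + user + "@" + service;
    }

    // Step 2.b: the service-side validation handler checks the token's
    // shape and scope before granting access to the requested resource.
    static boolean validateAccessToken(String accessToken, String service) {
        return accessToken.startsWith("access:") && accessToken.endsWith("@" + service);
    }
}
```

Note that the access token is bound to one service, so presenting it to a different service fails validation - that is the property that makes per-service access tokens meaningful.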
    
    +------+
    |  IdP |
    +------+
    1   ^ credentials
        | :idp_token
        |                      +------+
	+------+  idp_token  2 | SSO  |
	|CLIENT|-------------->|SERVER|
	+------+  :tokens      +------+
	  3 |                    
	    | access token
	    V :requested resource
	+-------+
	|HADOOP |
	|SERVICE|
	+-------+
	

The above diagram represents a slightly more complicated interaction model for an SSO service in Hadoop that removes Hadoop from the credential collection business.
1. client authenticates to a trusted identity provider within the enterprise and acquires an IdP specific token
  a. client presents credentials to an enterprise IdP and receives a token representing the authenticated identity
2. client authenticates to SSO service and acquires an access token
  a. client presents idp_token to an authentication service endpoint exposed by the SSO server (AS) and receives a token representing the authentication event and verified identity
  b. client then presents the identity token from 2.a. to the token endpoint exposed by the SSO server (TGS) to request an access token to a particular Hadoop service and receives an access token
3. client presents the Hadoop access token to the Hadoop service for which the access token has been granted and requests the desired resource or services
  a. access token is presented as appropriate for the service endpoint protocol being used
  b. Hadoop service token validation handler validates the token and verifies its integrity and the identity of the issuer
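The only new moving part relative to the first flow is the federation step (2.a), where the SSO server accepts a token from a trusted IdP instead of collecting raw credentials. A toy sketch of that exchange, with an invented "issuer/user" token format standing in for whatever the real IdP token looks like:

```java
// Toy sketch of IdP federation at the SSO server's AS endpoint.
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class FederationSketch {

    // The SSO server trusts a configured set of enterprise IdP issuers.
    private final Set<String> trustedIdps = new HashSet<>();

    FederationSketch(String... idps) {
        for (String idp : idps) trustedIdps.add(idp);
    }

    // Step 2.a: accept an idp_token of the toy form "issuer/user" and,
    // only if the issuer is trusted, mint an SSO identity token.
    String federate(String idpToken) {
        int slash = idpToken.indexOf('/');
        if (slash < 0 || !trustedIdps.contains(idpToken.substring(0, slash))) {
            throw new SecurityException("untrusted IdP token");
        }
        String user = idpToken.substring(slash + 1);
        return "idtoken:" + user + ":" + UUID.randomUUID();
    }
}
```

The point of the sketch is the trust boundary: Hadoop never sees the user's credentials, only a token from an issuer the SSO server has been configured to trust.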
	
Considering the above set of goals and high level interaction flow description, we can start to discuss the component inventory required to accomplish this vision:

1. SSO Server Instance: this component must be able to expose endpoints for both authentication of users by collecting and validating credentials and federation of identities represented by tokens from trusted IdPs within the enterprise. The endpoints should be composable so as to allow for multifactor authentication mechanisms. They will also need to return tokens that represent the authentication event and verified identity as well as access tokens for specific Hadoop services.

2. Authentication Providers: pluggable authentication mechanisms must be easy to create and configure for use within the SSO server instance. They should ideally allow the enterprise to plug in preferred off-the-shelf components as well as provide custom providers. Supporting existing standards for such authentication providers should be a top priority. There are a number of standard approaches in use in the Java world: JAAS LoginModules, servlet filters, JASPIC auth modules, etc. A pluggable provider architecture that allows the enterprise to leverage existing investments in these technologies and existing skill sets would be ideal.
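A sketch of what such a pluggable provider contract might look like; the interface name and the chaining helper are invented for illustration, and a real SPI would more likely wrap JAAS LoginModules or JASPIC auth modules than define its own contract:

```java
// Hypothetical pluggable authentication provider SPI and a chaining helper.
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ProviderChainSketch {

    // Invented single-method provider contract; implementations could
    // delegate to LDAP, JAAS LoginModules, OTP validators, etc.
    interface AuthenticationProvider {
        boolean authenticate(Map<String, String> credentials);
    }

    // Composing providers in a chain is one way to get the composable,
    // multifactor endpoints mentioned above: every provider must succeed.
    static boolean authenticate(List<AuthenticationProvider> chain,
                                Map<String, String> credentials) {
        for (AuthenticationProvider provider : chain) {
            if (!provider.authenticate(credentials)) {
                return false;
            }
        }
        return true;
    }
}
```

An enterprise could then wire a password provider plus an OTP provider into one endpoint without either provider knowing about the other.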

3. Token Authority: a token authority component would need to have the ability to issue, verify and revoke tokens. This authority will need to be trusted by all enforcement points that need to verify incoming tokens. Using something like PKI for establishing trust will be required.
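A toy token authority showing the issue/verify/revoke contract. It signs with an HMAC shared secret purely to keep the sketch short and self-contained; the proposal above calls for something like PKI to establish trust between the authority and the enforcement points:

```java
// Toy issue/verify/revoke authority; HMAC stands in for real PKI signing.
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

public class TokenAuthoritySketch {

    private final SecretKeySpec key;
    private final Set<String> revoked = new HashSet<>();

    TokenAuthoritySketch(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    // Issue: bind the subject to a signature only this authority can produce.
    String issue(String subject) {
        return subject + "." + sign(subject);
    }

    // Verify: recompute the signature and consult the revocation list.
    boolean verify(String token) {
        int dot = token.lastIndexOf('.');
        if (dot < 0 || revoked.contains(token)) {
            return false;
        }
        return token.substring(dot + 1).equals(sign(token.substring(0, dot)));
    }

    // Revoke: a previously valid token must stop verifying.
    void revoke(String token) {
        revoked.add(token);
    }

    private String sign(String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(mac.doFinal(data.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Revocation is the part that forces a design decision: verify() here needs shared state, which is exactly why every enforcement point must trust (and be able to reach, or replicate) the authority.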

4. Hadoop SSO Tokens: the exact shape and form of the SSO tokens will need to be considered in order to determine the means by which trust and integrity are ensured while using them. There may be some abstraction of the underlying format provided through interface based design, but all token implementations will need to have the same attributes and capabilities in terms of validation and cryptographic verification.
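As one illustration of the "same attributes and capabilities" point, a minimal token interface might pin down the common fields while leaving the wire format to implementations. The interface and field names below are invented, not a proposed API:

```java
// Invented sketch of a common token contract across wire formats.
public class TokenShapeSketch {

    interface SsoToken {
        String subject();     // the verified identity
        String issuer();      // the token authority that minted it
        long expiresAtMs();   // expiry, for lifetime enforcement

        // Capabilities like expiry checking live on the contract so every
        // implementation behaves identically at the enforcement points.
        default boolean isExpired(long nowMs) {
            return nowMs >= expiresAtMs();
        }
    }

    // One trivial in-memory implementation of that contract.
    static SsoToken of(String subject, String issuer, long expiresAtMs) {
        return new SsoToken() {
            public String subject() { return subject; }
            public String issuer() { return issuer; }
            public long expiresAtMs() { return expiresAtMs; }
        };
    }
}
```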

5. SSO Protocol: the lowest common denominator protocol for SSO server interactions across client types would likely be REST. Depending on the REST client in use, it may require explicitly coding to the token flow described in the earlier interaction descriptions, or a plugin may be provided for things like HTTPClient, curl, etc. RPC clients will have this taken care of for them within the SASL layer and will leverage the REST endpoints as well. This likely implies a trust requirement for the RPC client to be able to trust the SSO server's identity cert presented over SSL.

6. REST Client Agent Plugins: required for encapsulating the interaction with the SSO server for the client programming models. We may need these for many client types: e.g. Java, JavaScript, .Net, Python, cURL etc.

7. Server Side Authentication Handlers: the server side of the REST, RPC or webui connection will need to be able to validate and verify the incoming Hadoop tokens in order to grant or deny access to requested resources.
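A server-side handler sketch: whatever the transport (REST, RPC or webui), the decision reduces to validating the presented token before serving the resource. Everything named here is illustrative; a real handler would sit in a servlet filter or the SASL layer:

```java
// Illustrative enforcement-point handler, transport-agnostic.
public class AuthHandlerSketch {

    // Stand-in for the token authority's verification capability.
    interface TokenVerifier {
        boolean verify(String token);
    }

    // Deny missing or invalid tokens; otherwise let the request
    // through to the protected resource.
    static int handle(TokenVerifier verifier, String token) {
        if (token == null || !verifier.verify(token)) {
            return 401; // Unauthorized
        }
        return 200; // OK
    }
}
```

Keeping the verifier behind an interface is what lets the same handler front REST, RPC and webui endpoints while the token format evolves underneath.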

8. Credential/Trust Management: throughout the system - on client and server sides - we will need to manage and provide access to PKI and potentially shared secret artifacts in order to establish the required trust relationships to replace the mutual authentication that would otherwise be provided by using Kerberos everywhere.
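The PKI side of this is standard public-key signing: the token authority signs with its private key, and every enforcement point verifies with the authority's trusted public key, which is what stands in for Kerberos mutual authentication. A JDK-only sketch of that trust relationship:

```java
// JDK-only sketch of sign-with-private / verify-with-public trust.
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

public class TrustSketch {

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The token authority signs token bytes with its private key.
    static byte[] sign(PrivateKey key, String data) {
        try {
            Signature sig = Signature.getInstance("SHA256withRSA");
            sig.initSign(key);
            sig.update(data.getBytes(StandardCharsets.UTF_8));
            return sig.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // An enforcement point verifies with the authority's public key,
    // obtained through its local credential/trust store.
    static boolean verify(PublicKey key, String data, byte[] signature) {
        try {
            Signature sig = Signature.getInstance("SHA256withRSA");
            sig.initVerify(key);
            sig.update(data.getBytes(StandardCharsets.UTF_8));
            return sig.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The credential management problem then becomes distributing and rotating that public key (or cert chain) to every client and enforcement point - the artifacts this component exists to manage.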

So, discussion points:

1. Are there additional components that would be required for a Hadoop SSO service?
2. Should any of the above described components be considered not actually necessary or poorly described?
3. Should we create a new umbrella JIRA to identify each of these as a subtask?
4. Should we just continue to use 9533 for the SSO server and add additional subtasks?
5. What are the natural seams of separation between these components and any dependencies between one and another that affect priority?

Obviously, each component that we identify will have a jira of its own - more than likely - so we are only trying to identify the high level descriptions for now.

Can we try and drive this discussion to a close by the end of the week? This will allow us to start breaking out into component implementation plans.

thanks,

--larry