Posted to common-issues@hadoop.apache.org by "Larry McCay (JIRA)" <ji...@apache.org> on 2013/07/01 23:25:20 UTC

[jira] [Commented] (HADOOP-9533) Centralized Hadoop SSO/Token Server

    [ https://issues.apache.org/jira/browse/HADOOP-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697202#comment-13697202 ] 

Larry McCay commented on HADOOP-9533:
-------------------------------------

- Summit Summary -

Last week at Hadoop Summit there was a room dedicated as the summit Design Lounge.
This was a place where folks could get together and talk about design issues with other contributors with a simple flip-board and some beanbag chairs.
We used this as an opportunity to bootstrap some discussions within common-dev for security related topics. I'd like to summarize the security session and takeaways here for everyone.

This summary and set of takeaways are largely from memory. 
Please feel free to correct anything that is inaccurate or omitted.

Pretty well attended - don't recall all the names but some of the companies represented:
* Yahoo!
* Microsoft
* Hortonworks
* Intel
* eBay
* Voltage Security
* Flying Penguins
* EMC
* others...

We set expectations as a meet and greet/project kickoff - project being the emerging security development community.
Most folks were pretty engaged throughout the session.

In order to keep the scope of conversations manageable we tried to remain focused on authentication and the ideas around SSO and tokens.

We discussed kerberos as:
1. major pain point and barrier to entry for some
2. seemingly perfect for others
	a. obviously requiring backward compatibility

It seemed to be consensus that:
1. user authentication should be easily integrated with alternative enterprise identity solutions
2. that service identity issues should not require thousands of service identities added to enterprise user repositories
3. that customers should not be forced to install/deploy and manage a KDC for services - this implies a couple of options:
	a. alternatives to kerberos for service identities
	b. hadoop KDC implementation - e.g. ApacheDS?

There was active discussion around:
1. Hadoop SSO server
	a. acknowledgement of Hadoop SSO tokens as something that can be standardized for representing both the identity and authentication event data, as well as access tokens representing a verifiable means for the authenticated identity to access resources or services
	b. a general understanding of Hadoop SSO as being an analogue and alternative for the kerberos KDC and the related tokens being analogous to TGTs and service tickets
	c. an agreement that there are interesting attributes about the authentication event that may be useful in cross cluster trust for SSO - such as a rating of authentication strength and number of factors, etc
	d. that existing Hadoop tokens - ie. delegation, job, block access - will all continue to work and that we are initially looking at alternatives to the KDC, TGTs and service tickets
2. authentication mechanism discovery by clients - Daryn Sharp has done a bunch of work around this and our SSO solution may want to consider a similar mechanism for discovering trusted IDPs and service endpoints
3. backward compatibility - kerberos shops need to just continue to work
4. some insight into where/how folks believe that token based authentication can be accomplished within existing contracts - SASL/GSSAPI, REST, web ui
5. the establishment of a cross cutting concern community around security and what that means in terms of the Apache way - email lists, wiki, Jiras across projects, etc
6. dependencies, rolling updates, patching and how they relate to hadoop projects versus packaging
7. collaboration road ahead
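
To make item 1c above a little more concrete, here is a minimal sketch of how authentication event attributes might travel with a token and feed a cross cluster trust decision. Nothing here was agreed on in the session: the class names, the 1 to 5 strength scale, and the "corp-ldap" IDP name are all hypothetical and are not existing Hadoop APIs.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AuthenticationEvent:
    """Hypothetical record of how a user authenticated (not a Hadoop API)."""
    idp: str        # identity provider that performed the authentication
    strength: int   # assumed rating scale: 1 (weak) to 5 (strong)
    factors: int    # number of authentication factors presented


@dataclass(frozen=True)
class SsoToken:
    """Hypothetical SSO token: identity plus authentication event data."""
    principal: str
    event: AuthenticationEvent


@dataclass(frozen=True)
class TrustPolicy:
    """What a remote cluster might require before honoring a token."""
    trusted_idps: FrozenSet[str]
    min_strength: int
    min_factors: int

    def accepts(self, token: SsoToken) -> bool:
        # All three conditions must hold for the token to be trusted.
        ev = token.event
        return (ev.idp in self.trusted_idps
                and ev.strength >= self.min_strength
                and ev.factors >= self.min_factors)


# Illustrative values only.
policy = TrustPolicy(frozenset({"corp-ldap"}), min_strength=3, min_factors=2)
strong = SsoToken("alice", AuthenticationEvent("corp-ldap", strength=4, factors=2))
weak = SsoToken("bob", AuthenticationEvent("corp-ldap", strength=2, factors=1))
print(policy.accepts(strong), policy.accepts(weak))
```

A real token would also need to be cryptographically verifiable; that part is deliberately left out of this sketch since the shape and form of the tokens is still open for discussion.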

A number of breakout discussions were had outside of the designated design lounge session as well.

Takeaways for the immediate road ahead:
1. common-dev may be sufficient to discuss security related topics
	a. many developers are already subscribed to it
	b. there is not that much traffic there anyway
	c. we can discuss a more security focused list if we like
2. we will discuss the establishment of a wiki space for a holistic view of the security model, patterns, approaches, etc
3. we will begin discussion on common-dev in near-term for the following:
	a. discuss and agree on the high level moving parts required for our goals for authentication: SSO service, tokens, token validation handlers, credential management tools, etc
	b. discuss and agree on the natural seams across these moving parts and agree on collaboration by tackling various pieces in a divide and conquer approach
	c. more than likely - the first piece that will need some immediate discussion will be the shape and form of the tokens
	d. we will follow up or supplement discussions with POC code patches and/or specs attached to jiras

Overall, the design lounge was rather effective for what we wanted to do - which was to bootstrap discussions and collaboration within the community at large. As always, no specific decisions were made during this session and we can discuss any or all of this within common-dev and on related jiras.

Jiras related to the security development group and these discussions:

Centralized SSO/Token Server https://issues.apache.org/jira/browse/HADOOP-9533
Token based authentication and SSO https://issues.apache.org/jira/browse/HADOOP-9392
Document/analyze current Hadoop security model https://issues.apache.org/jira/browse/HADOOP-9621
Improve Hadoop security - Use cases https://issues.apache.org/jira/browse/HADOOP-9671

                
> Centralized Hadoop SSO/Token Server
> -----------------------------------
>
>                 Key: HADOOP-9533
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9533
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>            Reporter: Larry McCay
>         Attachments: HSSO-Interaction-Overview-rev-1.docx, HSSO-Interaction-Overview-rev-1.pdf
>
>
> This is an umbrella Jira filing to oversee a set of proposals for introducing a new master service for Hadoop Single Sign On (HSSO).
> There is an increasing need for pluggable authentication providers that authenticate both users and services as well as validate tokens in order to federate identities authenticated by trusted IDPs. These IDPs may be deployed within the enterprise or may be third-party IDPs that are external to the enterprise.
> These needs speak to a specific pain point: a narrow integration path into the enterprise identity infrastructure. Kerberos is a fine solution for those that already have it in place or are willing to adopt its use, but there remains a class of user that finds this unacceptable and needs to integrate with a wider variety of identity management solutions.
> Another specific pain point is that of rolling and distributing keys. A related and integral part of the HSSO server is a library called the Credential Management Framework (CMF), which will be a common library for easing the management of secrets, keys and credentials.
> Initially, the existing delegation, block access and job tokens will continue to be utilized. There may be some changes required to leverage a PKI based signature facility rather than shared secrets. This is a means to simplify the solution for the pain point of distributing shared secrets.
> This project will primarily centralize the responsibility of authentication and federation into a single service that is trusted across the Hadoop cluster and optionally across multiple clusters. This greatly simplifies a number of things in the Hadoop ecosystem:
> 1.	a single token format that is used across all of Hadoop regardless of authentication method
> 2.	a single service to have pluggable providers instead of all services
> 3.	a single token authority that would be trusted across the cluster/s and through PKI encryption be able to easily issue cryptographically verifiable tokens
> 4.	automatic rolling of the token authority’s keys and publishing of the public key for easy access by those parties that need to verify incoming tokens
> 5.	use of PKI for signatures eliminates the need for securely sharing and distributing shared secrets
> In addition to serving as the internal Hadoop SSO service, this service will be leveraged by the Knox Gateway from the cluster perimeter in order to acquire the Hadoop cluster tokens. The same token mechanism that is used for internal services will be used to represent user identities, which provides for interesting scenarios such as SSO across Hadoop clusters within an enterprise and/or into the cloud.
> The HSSO service will be comprised of three major components and capabilities:
> 1.	Federating IDP – authenticates users/services and issues the common Hadoop token
> 2.	Federating SP – validates the token of trusted external IDPs and issues the common Hadoop token
> 3.	Token Authority – management of the common Hadoop tokens – including: 
>     a.	Issuance 
>     b.	Renewal
>     c.	Revocation
> As this is a meta Jira for tracking this overall effort, the details of the individual efforts will be submitted along with the child Jira filings.
> Hadoop-Common would seem to be the most appropriate home for such a service and its related common facilities. We will also leverage and extend existing common mechanisms as appropriate.
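
The Token Authority responsibilities listed in the issue description (issuance, renewal, revocation) could be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the proposed HSSO design: the TokenAuthority class, its method names, and the fixed token lifetime are hypothetical, and PKI signing and key rolling are omitted entirely.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TokenRecord:
    principal: str
    expires_at: float  # seconds, on the same clock as the `now` arguments


class TokenAuthority:
    """Hypothetical in-memory authority; names are illustrative only."""

    def __init__(self, lifetime_seconds: float = 3600.0) -> None:
        self.lifetime = lifetime_seconds
        self._tokens: Dict[str, TokenRecord] = {}

    @staticmethod
    def _clock(now: Optional[float]) -> float:
        # Allow callers (and tests) to supply the current time explicitly.
        return time.time() if now is None else now

    def issue(self, principal: str, now: Optional[float] = None) -> str:
        """Issuance: create a random token id bound to a principal."""
        token_id = secrets.token_hex(16)
        expiry = self._clock(now) + self.lifetime
        self._tokens[token_id] = TokenRecord(principal, expiry)
        return token_id

    def is_valid(self, token_id: str, now: Optional[float] = None) -> bool:
        record = self._tokens.get(token_id)
        return record is not None and self._clock(now) < record.expires_at

    def renew(self, token_id: str, now: Optional[float] = None) -> bool:
        """Renewal: extend an unexpired token; refuse expired or unknown ones."""
        current = self._clock(now)
        if not self.is_valid(token_id, current):
            return False
        self._tokens[token_id].expires_at = current + self.lifetime
        return True

    def revoke(self, token_id: str) -> bool:
        """Revocation: forget the token so later validation fails."""
        return self._tokens.pop(token_id, None) is not None


authority = TokenAuthority(lifetime_seconds=60)
tid = authority.issue("alice", now=0)
print(authority.is_valid(tid, now=30))   # still inside the first lifetime
print(authority.renew(tid, now=50))      # pushes expiry to 50 + 60
print(authority.revoke(tid))
print(authority.is_valid(tid, now=55))   # revoked tokens no longer validate
```

With signed tokens, services could check validity themselves using the authority's public key rather than calling back to it, which is one reason the issue description favors PKI signatures over shared secrets.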
