Posted to common-dev@hadoop.apache.org by Larry McCay <lm...@hortonworks.com> on 2013/07/01 23:40:27 UTC

Hadoop Summit: Security Design Lounge Session

All -

Last week at Hadoop Summit there was a room dedicated as the summit Design Lounge.
This was a place where like-minded folks could get together and talk about design issues with other contributors, with a simple flip board and some beanbag chairs.
We used this as an opportunity to bootstrap some discussions within common-dev for security related topics. I'd like to summarize the security session and takeaways here for everyone.

This summary and set of takeaways are largely from memory. 
Please - anyone that attended - feel free to correct anything that is inaccurate or omitted.

Pretty well attended - companies represented:
* Yahoo!
* Microsoft
* Hortonworks
* Cloudera
* Intel
* eBay
* Voltage Security
* Flying Penguins
* EMC
* others...

Most folks were pretty engaged throughout the session.
We set expectations as a meet and greet/project kickoff - project being the emerging security development community.

In order to keep the scope of conversations manageable we tried to keep focused on authentication and the ideas around SSO and tokens.

We discussed kerberos as:
1. major pain point and barrier to entry for some
2. seemingly perfect for others
	a. obviously requiring backward compatibility

There seemed to be consensus that:
1. user authentication should be easily integrated with alternative enterprise identity solutions
2. service identity issues should not require thousands of service identities added to enterprise user repositories
3. customers should not be forced to install/deploy and manage a KDC for services - this implies a couple of options:
	a. alternatives to kerberos for service identities
	b. hadoop KDC implementation - e.g. ApacheDS?

There was active discussion around:
1. Hadoop SSO server
	a. acknowledgement of Hadoop SSO tokens as something that can be standardized for representing both the identity and authentication event data, as well as access tokens representing a verifiable means for the authenticated identity to access resources or services
	b. a general understanding of Hadoop SSO as being an analogue and alternative for the kerberos KDC and the related tokens being analogous to TGTs and service tickets
	c. an agreement that there are interesting attributes about the authentication event that may be useful in cross cluster trust for SSO - such as a rating of authentication strength and number of factors, etc
	d. that existing Hadoop tokens - e.g. delegation, job, block access - will all continue to work and that we are initially looking at alternatives to the KDC, TGTs and service tickets
2. authentication mechanism discovery by clients - Daryn Sharp has done a bunch of work around this and our SSO solution may want to consider a similar mechanism for discovering trusted IDPs and service endpoints
3. backward compatibility - kerberos shops need to just continue to work
4. some insight into where/how folks believe that token based authentication can be accomplished within existing contracts - SASL/GSSAPI, REST, web ui
5. the establishment of a cross-cutting concern community around security and what that means in terms of the Apache way - email lists, wiki, Jiras across projects, etc
6. dependencies, rolling updates, patching and how they relate to hadoop projects versus packaging
7. collaboration road ahead
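
To make the token discussion in item 1 a bit more concrete, here is a minimal sketch in Java of a token that carries the identity together with authentication event attributes (a strength rating and the number of factors) and that a service can verify. Everything in it is an assumption for illustration only: the class name SsoTokenSketch, the field names, the pipe-delimited payload and the shared HMAC-SHA256 key were not part of the session. A real design would more likely use public key signatures and would add expiry and audience restrictions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical token shape; none of these names come from Hadoop. */
final class SsoTokenSketch {
    final String principal;     // authenticated identity
    final long issuedAtMillis;  // when the authentication event happened
    final int authStrength;     // assumed rating of authentication strength
    final int factorCount;      // number of authentication factors used

    SsoTokenSketch(String principal, long issuedAtMillis, int authStrength, int factorCount) {
        if (principal.indexOf('|') >= 0) {
            throw new IllegalArgumentException("'|' is the field separator in this sketch");
        }
        this.principal = principal;
        this.issuedAtMillis = issuedAtMillis;
        this.authStrength = authStrength;
        this.factorCount = factorCount;
    }

    /** Serializes the token as base64url(payload) + "." + base64url(HMAC over the encoded payload). */
    String sign(byte[] key) {
        String payload = principal + "|" + issuedAtMillis + "|" + authStrength + "|" + factorCount;
        String body = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(payload.getBytes(StandardCharsets.UTF_8));
        String mac = Base64.getUrlEncoder().withoutPadding().encodeToString(hmac(key, body));
        return body + "." + mac;
    }

    /** Returns the token if the signature checks out, otherwise empty. */
    static Optional<SsoTokenSketch> verify(String wire, byte[] key) {
        int dot = wire.lastIndexOf('.');
        if (dot < 0) {
            return Optional.empty();
        }
        String body = wire.substring(0, dot);
        byte[] presented;
        try {
            presented = Base64.getUrlDecoder().decode(wire.substring(dot + 1));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
        // Constant-time comparison of the expected and presented MACs.
        if (!MessageDigest.isEqual(hmac(key, body), presented)) {
            return Optional.empty();
        }
        String payload = new String(Base64.getUrlDecoder().decode(body), StandardCharsets.UTF_8);
        String[] f = payload.split("\\|");
        return Optional.of(new SsoTokenSketch(
                f[0], Long.parseLong(f[1]), Integer.parseInt(f[2]), Integer.parseInt(f[3])));
    }

    private static byte[] hmac(byte[] key, String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An issuer would call sign(key) after authenticating the user, and a service holding the same key would call SsoTokenSketch.verify(wire, key) and could then apply policy on authStrength or factorCount, for example when deciding whether to trust a token from another cluster.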

A number of breakout discussions were had outside of the designated design lounge session as well.

Takeaways for the immediate road ahead:
1. common-dev may be sufficient to discuss security related topics
	a. many developers are already subscribed to it
	b. there is not that much traffic there anyway
	c. we can discuss a more security focused list if we like
2. we will discuss the establishment of a wiki space for a holistic view of security model, patterns, approaches, etc
3. we will begin discussion on common-dev in near-term for the following:
	a. discuss and agree on the high level moving parts required for our goals for authentication: SSO service, tokens, token validation handlers, credential management tools, etc
	b. discuss and agree on the natural seams across these moving parts and agree on collaboration by tackling various pieces in a divide and conquer approach
	c. more than likely - the first piece that will need some immediate discussion will be the shape and form of the tokens
	d. we will follow up or supplement discussions with POC code patches and/or specs attached to jiras

Overall, the design lounge was rather effective for what we wanted to do - which was to bootstrap discussions and collaboration within the community at large. As always, no specific decisions were made during this session and we can discuss any or all of this within common-dev and on related jiras.

Jiras related to the security development group and these discussions:

Centralized SSO/Token Server https://issues.apache.org/jira/browse/HADOOP-9533
Token based authentication and SSO https://issues.apache.org/jira/browse/HADOOP-9392
Document/analyze current Hadoop security model https://issues.apache.org/jira/browse/HADOOP-9621
Improve Hadoop security - Use cases https://issues.apache.org/jira/browse/HADOOP-9671

thanks,

--larry


Re: Hadoop Summit: Security Design Lounge Session

Posted by Larry McCay <lm...@hortonworks.com>.
Adding additional takeaways that were articulated by Alejandro and expanded by me in another thread - so that we have it all in one place…thanks again, Alejandro!

++++++++++++++++++++++++++++++++++++++++++++++++

Hi Alejandro -

I missed your #4 in my summary and takeaways of the session in another thread on this list.

I believe that the points of discussion were along the lines of:

* put common security libraries into common much the same way as hadoop-auth is today, making each available as a separate maven module to be used across the ecosystem
* there was a concern raised that we need to be cognizant of not using common as a "dumping ground"
	- I believe this to mean that we need to ensure that the libraries that are added there are truly cross cutting and can be used by the other projects across Hadoop
	- I think that security related things will largely be of that nature but we need to keep it in mind

I'm not sure whether #3 is represented in the other summary or not…

There were certainly discussions around the emerging work from Daryn related to pluggable authentication mechanisms within that layer and we will immediately have the options of kerberos, simple and plain. There was also talk of how this can be leveraged to introduce a Hadoop token mechanism as well.

At the same time, there was talk of the possibility of simply making kerberos easy and a non-issue for intra-cluster use. Certainly we need both of these approaches.
I believe someone used ApacheDS' KDC support as an example - if we could stand up an ApacheDS based KDC and configure it and related keytabs easily then the end-to-end story is more palatable to a broader user base. That story being the choice of authentication mechanisms for user authentication and easy provisioning and management of kerberos for intra-cluster service authentication.
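
As a rough illustration of what "pluggable authentication mechanisms" plus client-side mechanism selection could look like, here is a small Java sketch. It is not Daryn's implementation and not Hadoop's RPC or SASL code: the interface AuthMechanismSketch, the class MechanismRegistrySketch and all method names are hypothetical, and real mechanisms such as kerberos involve multi-step exchanges rather than a single credential string.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical plug-in point: one implementation per mechanism. */
interface AuthMechanismSketch {
    /** Returns the authenticated principal, or empty if the credential is rejected. */
    Optional<String> authenticate(String credential);
}

/** Hypothetical server-side registry of the mechanisms a service accepts. */
final class MechanismRegistrySketch {
    // LinkedHashMap keeps registration order, which doubles as server preference order.
    private final Map<String, AuthMechanismSketch> mechanisms = new LinkedHashMap<>();

    MechanismRegistrySketch register(String name, AuthMechanismSketch mechanism) {
        mechanisms.put(name, mechanism);
        return this;
    }

    /** What the server would advertise to clients, most preferred first. */
    List<String> advertised() {
        return new ArrayList<>(mechanisms.keySet());
    }

    /** Client side: pick the first advertised mechanism the client also supports. */
    static Optional<String> choose(List<String> advertised, Set<String> clientSupported) {
        return advertised.stream().filter(clientSupported::contains).findFirst();
    }

    /** Server side: dispatch to the chosen mechanism; unknown names are rejected. */
    Optional<String> authenticate(String mechanism, String credential) {
        AuthMechanismSketch m = mechanisms.get(mechanism);
        return m == null ? Optional.<String>empty() : m.authenticate(credential);
    }
}
```

With this shape, adding a Hadoop token mechanism next to kerberos, simple and plain would be one more register(...) call on the server, and clients that do not understand it would simply fall through to the next advertised mechanism they support.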

If you agree with this extended summary then I can update the other thread with that recollection.
Thanks for providing it!

--larry

On Jul 4, 2013, at 4:09 PM, Alejandro Abdelnur <tu...@cloudera.com> wrote:

> Leaving JIRAs and design docs aside, my recollection from the f2f lounge
> discussion could be summarized as:
> 
> ------
> 1* Decouple users-services authentication from (intra) services-services
> authentication.
> 
> The main motivation for this is to get pluggable authentication and
> integrated SSO experience for users.
> 
> (we never discussed if this is needed for external-apps talking with Hadoop)
> 
> 2* We should leave the Hadoop delegation tokens alone
> 
> No need to make this pluggable as this is an internal authentication
> mechanism after the 'real' authentication happened.
> 
> (this is independent from factoring out all classes we currently have into
> a common implementation for Hadoop and other projects to use)
> 
> 3* Being able to replace kerberos with something else for (intra)
> services-services authentication.
> 
> It was suggested that to support deployments where stock Kerberos may not
> be an option (e.g. cloud) we should make sure that UserGroupInformation and
> RPC security logic work with a pluggable GSS implementation.
> 
> 4* Create a common security component ie 'hadoop-security' to be 'the'
> security lib for all projects to use.
> 
> Create a component/project that would provide the common security pieces
> for all projects to use.
> 
> ------
> 
> If we agree with this, after any necessary corrections, I think we could
> distill clear goals from it and start from there.
> 
> Thanks.
> 
> Tucu & Alejandro


RE: Hadoop Summit: Security Design Lounge Session

Posted by Kyle Leckie <ky...@microsoft.com>.
Thanks for the excellent summary Larry,

Questions for the group:
I have taken a quick look at how pluggable token validation could be added to the RPC endpoints:
	- Are there any current approaches that I should examine before I continue with my investigation?
	- For server auth, I would like to consider TLS. Has there been any benchmarking of a well-implemented server stack (one that supports session caching and has algorithms configured for performance)?
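
For reference, the two properties named in the TLS question (session caching and an explicit algorithm list) correspond to standard JSSE settings. The Java sketch below only shows where those knobs live. The class name TlsServerTuningSketch and the helper names are assumptions, the values a caller passes in are arbitrary, and the sketch says nothing about Hadoop RPC and is not a benchmark.

```java
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLParameters;
import javax.net.ssl.SSLSessionContext;

/** Hypothetical helper showing the JSSE settings relevant to TLS server performance. */
final class TlsServerTuningSketch {

    /** Wraps the checked exception so callers can use the JVM default context directly. */
    static SSLContext defaultContext() {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Sizes the server-side session cache so returning clients can resume sessions. */
    static SSLSessionContext enableSessionCache(SSLContext ctx, int maxSessions, int timeoutSeconds) {
        SSLSessionContext sessions = ctx.getServerSessionContext();
        sessions.setSessionCacheSize(maxSessions);
        sessions.setSessionTimeout(timeoutSeconds);
        return sessions;
    }

    /** Builds parameters limited to the preferred suites this JVM actually supports. */
    static SSLParameters restrictTo(SSLContext ctx, String... preferredSuites) {
        Set<String> supported =
                new HashSet<>(Arrays.asList(ctx.getSupportedSSLParameters().getCipherSuites()));
        List<String> chosen = new ArrayList<>();
        for (String suite : preferredSuites) {
            if (supported.contains(suite)) {
                chosen.add(suite);
            }
        }
        SSLParameters params = ctx.getDefaultSSLParameters();
        if (!chosen.isEmpty()) {
            params.setCipherSuites(chosen.toArray(new String[0]));
        }
        // Ask the server to honor its own preference order rather than the client's.
        params.setUseCipherSuitesOrder(true);
        return params;
    }
}
```

The returned SSLParameters would then be applied to an SSLEngine or SSLServerSocket with setSSLParameters before any handshake, so that a benchmark measures the stack with resumption and the intended algorithms in place.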
--
Kyle


FW: Hadoop Summit: Security Design Lounge Session

Posted by "Mattmann, Chris A (398J)" <ch...@jpl.nasa.gov>.
Fwd'ing to Knox list.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++





