Posted to user@hadoop.apache.org by Robert Metzger <rm...@apache.org> on 2015/02/12 15:28:34 UTC

Transferring security tokens to remote machines

Hi,

I'm a committer at the Apache Flink project.
One of our users asked for adding support for reading from a secured HDFS
cluster.

Flink has a master-worker model. Since it's not really feasible for users to
log in with their Kerberos credentials on all workers, I wanted to acquire
the security token on the master and send it to all workers.
For that, I wrote the following code to get the tokens into a byte array:

UserGroupInformation.setConfiguration(hdConf);
Credentials credentials = new Credentials();
UserGroupInformation currUsr = UserGroupInformation.getCurrentUser();

Collection<Token<? extends TokenIdentifier>> usrTok = currUsr.getTokens();
for(Token<? extends TokenIdentifier> token : usrTok) {
   final Text id = new Text(token.getIdentifier());
   credentials.addToken(id, token);
}
DataOutputBuffer dob = new DataOutputBuffer();
credentials.writeTokenStorageToStream(dob);
dob.flush();

However, the collection currUsr.getTokens() is empty, hence the output
buffer doesn't contain much data.
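
My current guess (which may well be wrong) is that getTokens() only returns
tokens that were already attached to the current user, and that a plain
Kerberos login does not attach any HDFS delegation tokens by itself, so they
would have to be requested from the NameNode explicitly. A minimal sketch of
what I mean, assuming the Hadoop 2.x FileSystem API (the renewer name "yarn"
is only an illustrative placeholder):

// Sketch only; classes are the usual org.apache.hadoop.conf/fs/io/security ones.
Configuration hdConf = new Configuration();
UserGroupInformation.setConfiguration(hdConf);

Credentials credentials = new Credentials();
FileSystem fs = FileSystem.get(hdConf);
// Ask the NameNode for new delegation tokens and store them in 'credentials'.
fs.addDelegationTokens("yarn", credentials);

// Serialize the tokens into a byte array that can be shipped to the workers.
DataOutputBuffer dob = new DataOutputBuffer();
credentials.writeTokenStorageToStream(dob);
byte[] serializedTokens = Arrays.copyOf(dob.getData(), dob.getLength());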

I suspect that I haven't fully understood the Hadoop security concepts yet.
It would be great if somebody from the list could clarify how to
properly acquire the tokens.

Also, I was wondering if there is any document describing how the
UserGroupInformation class works (when does it load the credentials,
does it only work with Kerberos, ...).
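
For what it's worth, on the worker side I would expect the reverse step to
look roughly like the sketch below (again only an assumption on my part;
receiveTokensFromMaster() is a hypothetical placeholder for however the bytes
arrive). As far as I know, YARN containers are handled in a similar way: they
read a token file referenced by the HADOOP_TOKEN_FILE_LOCATION environment
variable, which UserGroupInformation picks up when it creates the login user.

// Sketch only, on a worker: rebuild the Credentials from the serialized
// bytes and attach them to the current user.
byte[] tokenBytes = receiveTokensFromMaster(); // hypothetical placeholder

DataInputBuffer dib = new DataInputBuffer();
dib.reset(tokenBytes, tokenBytes.length);

Credentials credentials = new Credentials();
credentials.readTokenStorageStream(dib);

// Tokens added here are what getCurrentUser().getTokens() returns later and
// what FileSystem calls on this worker can authenticate with.
UserGroupInformation.getCurrentUser().addCredentials(credentials);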


Best,

Robert

Re: Transferring security tokens to remote machines

Posted by Robert Metzger <rm...@apache.org>.
Hi,

thank you for the quick reply. I'll look into the links to see if we can
implement a similar mechanism.

Robert

On Thu, Feb 12, 2015 at 6:19 PM, Alexander Alten-Lorenz <wget.null@gmail.com
> wrote:

> Hi Robert,
>
> Forgive me if I’m wrong, but as far as I understand, Flink uses roughly the
> same model as HDFS: the master receives an action and distributes it to the
> workers (more or less ;)).
> HDFS, for example, does not use a push mechanism; the DN clients fetch the
> tokens from the NN when they need them. Could that be a solution, too?
>
>
> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/tools/DelegationTokenFetcher.java
>
> MapReduce gets the token from the JT with getDelegationToken()
>
>
> http://hadoop.apache.org/docs/r2.5.2/api/org/apache/hadoop/mapreduce/Cluster.html#getDelegationToken(org.apache.hadoop.io.Text)
>
>
> BR,
>  Alexander
>
>
> On 12 Feb 2015, at 15:28, Robert Metzger <rm...@apache.org> wrote:
>
> Hi,
>
> I'm a committer at the Apache Flink project.
> One of our users asked for adding support for reading from a secured HDFS
> cluster.
>
> Flink has a master-worker model. Since it's not really feasible for users
> to log in with their Kerberos credentials on all workers, I wanted to
> acquire the security token on the master and send it to all workers.
> For that, I wrote the following code to get the tokens into a byte array:
>
> UserGroupInformation.setConfiguration(hdConf);
> Credentials credentials = new Credentials();
> UserGroupInformation currUsr = UserGroupInformation.getCurrentUser();
>
> Collection<Token<? extends TokenIdentifier>> usrTok = currUsr.getTokens();
> for(Token<? extends TokenIdentifier> token : usrTok) {
>    final Text id = new Text(token.getIdentifier());
>    credentials.addToken(id, token);
> }
> DataOutputBuffer dob = new DataOutputBuffer();
> credentials.writeTokenStorageToStream(dob);
> dob.flush();
>
> However, the collection currUsr.getTokens() is empty, hence the output buffer doesn't contain much data.
>
> I suspect that I haven't fully understood the Hadoop security concepts yet.
> It would be great if somebody from the list could clarify how to properly acquire the tokens.
>
> Also, I was wondering if there is any document describing how the UserGroupInformation class works (when does it load the credentials, does it only work with Kerberos, ...).
>
>
> Best,
>
> Robert
>
>
>
>

Re: Transferring security tokens to remote machines

Posted by Alexander Alten-Lorenz <wg...@gmail.com>.
Hi Robert,

Forgive me if I’m wrong, but as far as I understand, Flink uses roughly the same model as HDFS: the master receives an action and distributes it to the workers (more or less ;)).
HDFS, for example, does not use a push mechanism; the DN clients fetch the tokens from the NN when they need them. Could that be a solution, too?

https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1/src/hdfs/org/apache/hadoop/hdfs/tools/DelegationTokenFetcher.java
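
Roughly, that fetch-and-store pattern looks like the sketch below (only an
illustration, not code taken from DelegationTokenFetcher itself; the renewer
name and the output path are placeholders):

// Sketch only: on a Kerberos-authenticated node, fetch an HDFS delegation
// token from the NameNode and persist it in Hadoop's token-file format so
// other processes can pick it up later.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);

Token<?> token = fs.getDelegationToken("some-renewer");

Credentials creds = new Credentials();
creds.addToken(token.getService(), token);
creds.writeTokenStorageFile(new Path("/tmp/hdfs.delegation.token"), conf);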

MapReduce gets the token from the JT with getDelegationToken()

http://hadoop.apache.org/docs/r2.5.2/api/org/apache/hadoop/mapreduce/Cluster.html#getDelegationToken(org.apache.hadoop.io.Text)
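
Only as an illustration again (the renewer value is a placeholder), the
MapReduce call is roughly:

// Sketch only: ask the MapReduce cluster (the JT, or the RM on YARN) for a
// delegation token.
Configuration conf = new Configuration();
Cluster cluster = new Cluster(conf);
Token<?> mrToken = cluster.getDelegationToken(new Text("some-renewer"));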


BR,
 Alexander 


> On 12 Feb 2015, at 15:28, Robert Metzger <rm...@apache.org> wrote:
> 
> Hi,
> 
> I'm a committer at the Apache Flink project.
> One of our users asked for adding support for reading from a secured HDFS cluster.
> 
> Flink has a master-worker model. Since it's not really feasible for users to log in with their Kerberos credentials on all workers, I wanted to acquire the security token on the master and send it to all workers.
> For that, I wrote the following code to get the tokens into a byte array:
> 
> UserGroupInformation.setConfiguration(hdConf);
> Credentials credentials = new Credentials();
> UserGroupInformation currUsr = UserGroupInformation.getCurrentUser();
> 
> Collection<Token<? extends TokenIdentifier>> usrTok = currUsr.getTokens();
> for(Token<? extends TokenIdentifier> token : usrTok) {
>    final Text id = new Text(token.getIdentifier());
>    credentials.addToken(id, token);
> }
> DataOutputBuffer dob = new DataOutputBuffer();
> credentials.writeTokenStorageToStream(dob);
> dob.flush();
> However, the collection currUsr.getTokens() is empty, hence the output buffer doesn't contain much data.
> I suspect that I haven't fully understood the Hadoop security concepts yet.
> It would be great if somebody from the list could clarify how to properly acquire the tokens.
> 
> Also, I was wondering if there is any document describing how the UserGroupInformation class works (when does it load the credentials, does it only work with Kerberos, ...).
> 
> Best,
> Robert
> 

