Posted to user@tez.apache.org by Subroto Sanyal <sa...@gmail.com> on 2014/09/01 05:59:23 UTC

Re: Tez with secured hadoop

Hi Bikas,

In the method
org.apache.tez.client.TezClientUtils.getAMProxy(Configuration, String, int,
Token), a UGI is created with the name of the current user. I think this
process ignores the security context and sets the authentication mode to
"SIMPLE". I have a piece of code that tries to create a TezClient, and it
keeps throwing this exception:

[anonymous]  WARN [2014-08-28 03:37:50.181] [MrPlanRunnerV2] (UserGroupInformation.java:1551) - PriviledgedActionException as:subroto (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
[anonymous]  INFO [2014-08-28 03:37:50.182] [MrPlanRunnerV2] (TezClient.java:539) - Failed to retrieve AM Status via proxy
com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "domU-12-31-39-0F-74-32/10.193.119.192"; destination host is: "domU-12-31-39-0C-7D-37":59431;
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
        at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
        at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:532)
        at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:607)
        at subroto.tez.TezClusterSession$2.run(TezClusterSession.java:180)


I'm trying to achieve impersonation. Here user "subroto" is the privileged
user, and the real user is not considered at all by the Tez code.


I would appreciate some suggestions on this.


On Tue, Aug 19, 2014 at 11:18 PM, Bikas Saha <bi...@hortonworks.com> wrote:

> There is nothing special that you need to do if you are already running
> secure MapReduce jobs. The client needs to run in a Kerberized,
> authenticated context. After that, if you are using the built-in library of
> inputs/outputs etc., they should take care of all the access
> credentials for you when using the 0.5 API.
>
>
>
> If you are using the 0.4 API to write your job, then you may need to use
> additional APIs for passing credentials to the application. Look for
> credentials in
> https://github.com/apache/tez/blob/branch-0.4.0-incubating/tez-mapreduce-examples/src/main/java/org/apache/tez/mapreduce/examples/FilterLinesByWord.java
> and also: public synchronized DAG addURIsForCredentials(Collection<URI> uris)
>
>
>
> The second method is a shortcut if you are using HDFS files for input. It
> obtains credentials for you from a collection of HDFS input URIs.
>
>
>
> Bikas
>
>
>
> *From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
> *Sent:* Tuesday, August 19, 2014 3:30 AM
> *To:* user@tez.apache.org
> *Subject:* Tez with secured hadoop
>
>
>
> hi
>
>
>
> Tez has worked on secure Hadoop clusters since tez-0.3.
>
> Is there any documentation available about configuring TezClient to make
> it work?
>
>
>
> --
> Cheers,
> *Subroto Sanyal*
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.




-- 
Cheers,
*Subroto Sanyal*

Re: Tez with secured hadoop

Posted by Subroto Sanyal <sa...@gmail.com>.
hi Bikas,

Thanks for summarising this. Indeed, it was a problem with the class loader
on the client side.

Further (unrelated to Tez), I have seen that the private static variable
org.apache.hadoop.security.SecurityUtil.securityInfoProviders, which holds
all the SecurityInfo classes, doesn't get updated when the Tez jars are
loaded (this happens at a later stage in my client framework), which led to
the problem.
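
One way to picture the effect described above: a java.util.ServiceLoader only sees provider files that are reachable from the class loader it was created with. The sketch below uses only the JDK. Info, LateInfo, and the temporary directory are hypothetical stand-ins for a service interface such as SecurityInfo, a provider shipped in a jar, and a jar that joins the classpath later; none of this is Hadoop or Tez code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

final class LateJarDemo {
    /** Hypothetical stand-in for a service interface such as SecurityInfo. */
    public interface Info {
        String name();
    }

    /** Hypothetical provider that is only registered through the late class loader. */
    public static class LateInfo implements Info {
        public LateInfo() {}

        @Override
        public String name() {
            return "late";
        }
    }

    /** Names of all providers that a ServiceLoader bound to 'loader' can find. */
    static List<String> providerNames(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (Info info : ServiceLoader.load(Info.class, loader)) {
            names.add(info.name());
        }
        return names;
    }

    /** Builds a class loader whose only extra entry is a directory holding the provider file. */
    static URLClassLoader lateLoader(ClassLoader parent) {
        try {
            Path root = Files.createTempDirectory("late-jar");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // The provider file is named after the service interface and lists the provider class.
            Files.write(services.resolve(Info.class.getName()),
                    LateInfo.class.getName().getBytes(StandardCharsets.UTF_8));
            return new URLClassLoader(new URL[] {root.toUri().toURL()}, parent);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader early = LateJarDemo.class.getClassLoader();
        System.out.println("early loader sees: " + providerNames(early));
        System.out.println("late loader sees:  " + providerNames(lateLoader(early)));
    }
}
```

A loader created before the extra entry exists keeps returning the shorter list, which is consistent with a static field that is initialised once and never refreshed.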




-- 
Cheers,
*Subroto Sanyal*

RE: Tez with secured hadoop

Posted by Bikas Saha <bi...@hortonworks.com>.
To close this thread, this turned out to be a user issue.



Any users who want to use impersonation can follow the suggested steps in
the thread to do it. Users need to be careful about the client side
classpath. The tez-api jars have META-INF properties that define the Hadoop
security provider. If the client side classpath has issues that make this
provider inaccessible then it can cause issues. Among other things,
combining the tez jars in a shaded/uber jar can cause such issues.
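
A quick client-side check for this kind of classpath problem is to list the provider files that the class loader can actually see. The sketch below is a JDK-only diagnostic and rests on an assumption: that the provider is registered through the standard ServiceLoader file META-INF/services/org.apache.hadoop.security.SecurityInfo. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ProviderFileCheck {
    /** Assumed location of the provider file; adjust if your jars register it elsewhere. */
    static final String SECURITY_INFO_FILE =
            "META-INF/services/org.apache.hadoop.security.SecurityInfo";

    /** Returns every class name listed in any copy of 'resource' visible to 'loader'. */
    static List<String> listedProviders(ClassLoader loader, String resource) {
        List<String> providers = new ArrayList<>();
        try {
            for (URL url : Collections.list(loader.getResources(resource))) {
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        int comment = line.indexOf('#'); // '#' starts a comment in provider files
                        if (comment >= 0) {
                            line = line.substring(0, comment);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            providers.add(line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return providers;
    }

    public static void main(String[] args) {
        ClassLoader loader = ProviderFileCheck.class.getClassLoader();
        List<String> found = listedProviders(loader, SECURITY_INFO_FILE);
        if (found.isEmpty()) {
            System.out.println("No SecurityInfo provider files are visible on this classpath");
        }
        for (String provider : found) {
            System.out.println(provider);
        }
    }
}
```

If the list is empty, or shorter than expected after building a shaded jar, the provider files were probably dropped or overwritten during the merge. With the Maven Shade plugin, the ServicesResourceTransformer is the usual way to concatenate them instead.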



Thanks Subroto for bringing this up. This thread will be useful for anyone
who wants to play with security using Tez/Hadoop.



Bikas



*From:* Bikas Saha [mailto:bikas@hortonworks.com]
*Sent:* Wednesday, October 08, 2014 1:02 PM
*To:* user@tez.apache.org
*Subject:* RE: Tez with secured hadoop



I have responded on the JIRA. Let's take the discussion there.



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Tuesday, October 07, 2014 10:44 PM
*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



hi,



Please let me know how to get tokens and related items in the Tez
framework.



On Tue, Oct 7, 2014 at 9:11 AM, Subroto Sanyal <sa...@gmail.com>
wrote:

hi Vinod, hi Bikas



Thanks for your inputs.

Although I am able to spawn the DAGAppMaster with the proxy user (I can see
the DAGAppMaster running as the proxy user in the ResourceManager UI), the
call to TezClient.getAppMasterStatus fails with security issues.

I have raised a ticket with my client code and more details:

https://issues.apache.org/jira/browse/TEZ-1640

(UserGroupInformation.java:1551) - PriviledgedActionException as:qa (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
Failed to retrieve AM Status via proxy
com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "ip-10-178-144-254/10.178.144.254"; destination host is: "ip-10-187-33-206":56660;
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
        at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
        at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:522)







On Tue, Oct 7, 2014 at 4:25 AM, Bikas Saha <bi...@hortonworks.com> wrote:

Thanks Vinod.



Subroto, looks like there are 2 issues here.



One is running as user foo but submitting the job as user bar. Vinod’s code
snippet is relevant to that (please see my earlier answer with
createProxyUser and Vinod’s exact code below). The real user is foo but the
effective user is bar. So when the process connects to YARN, YARN will use
the effective user bar as the job user instead of foo.



The second is how to obtain credentials for bar while running as foo. That
is covered by the credentials file path ("tez.credentials.path") solution
mentioned earlier. This allows user foo to collect credentials from HDFS for
user bar (foo is expected to be a trusted proxy user in HDFS to be allowed
to do this, like Oozie). The credentials are then passed to the client via
the file.



Hope both these clarify your situation and move you forward.



Bikas



*From:* Vinod Kumar Vavilapalli [mailto:vinodkv@hortonworks.com]
*Sent:* Monday, October 06, 2014 11:39 AM


*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



That is not the way to do impersonation. Please see the following:

             // The login user is subroto, the realUser; "qa" is the expected effective user.
             UserGroupInformation ugi =
                     UserGroupInformation.createProxyUser("qa", UserGroupInformation.getLoginUser());
             ugi.doAs(new PrivilegedExceptionAction<Void>() {
               public Void run() throws Exception {
                 ..
               }
             });


+Vinod
Hortonworks Inc.
http://hortonworks.com/



On Thu, Oct 2, 2014 at 6:22 AM, Subroto Sanyal <sa...@gmail.com>
wrote:

hi Bikas,



My code snippet to create TezClient looks like (TEZ-0.5):

new PrivilegedExceptionAction<TezClient>() {
    @Override
    public TezClient run() throws Exception {
        UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
        LOG.info("Current User:" + currentUser);
        File tokenFile = new File(System.getProperty("java.io.tmpdir"),
                tezSessionName.replaceAll("[^a-zA-Z0-9]", ""));
        LOG.info("Token File:" + tokenFile.getAbsolutePath());
        currentUser.getCredentials().writeTokenStorageFile(
                UriUtil.toPath(tokenFile.getAbsoluteFile()), conf);
        tezConf.set(TezConfiguration.TEZ_CREDENTIALS_PATH, tokenFile.getAbsolutePath());
        return TezClient.create(tezSessionName, tezConf, createSession,
                localResourceMap, credentials);
    }
}



The logs generated from this piece of code during execution look like:

(TezClientFacade.java:142) - Current User:qa (auth:PROXY) via subroto@EC2.INTERNAL (auth:KERBEROS)
(TezClientFacade.java:144) - Token File:/home/subroto/tmp/testTezJob
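
For readers unfamiliar with these log lines: the "Current User" text names the effective user first, then the real user after "via", each with its authentication method. The JDK-only sketch below splits such a string into its parts. It assumes the exact textual format shown in the log line above; UgiDescription is a hypothetical helper, not part of Hadoop.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UgiDescription {
    // "<user> (auth:<METHOD>)", optionally followed by " via <user> (auth:<METHOD>)"
    private static final Pattern FORMAT = Pattern.compile(
            "(\\S+) \\(auth:(\\w+)\\)(?: via (\\S+) \\(auth:(\\w+)\\))?");

    final String effectiveUser;
    final String effectiveAuth;
    final String realUser; // null when the user is not a proxy
    final String realAuth; // null when the user is not a proxy

    private UgiDescription(String effectiveUser, String effectiveAuth,
                           String realUser, String realAuth) {
        this.effectiveUser = effectiveUser;
        this.effectiveAuth = effectiveAuth;
        this.realUser = realUser;
        this.realAuth = realAuth;
    }

    /** Parses text such as "qa (auth:PROXY) via subroto@EC2.INTERNAL (auth:KERBEROS)". */
    static UgiDescription parse(String text) {
        Matcher m = FORMAT.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised user description: " + text);
        }
        return new UgiDescription(m.group(1), m.group(2), m.group(3), m.group(4));
    }

    boolean isProxy() {
        return realUser != null;
    }

    public static void main(String[] args) {
        UgiDescription d = parse("qa (auth:PROXY) via subroto@EC2.INTERNAL (auth:KERBEROS)");
        System.out.println("effective=" + d.effectiveUser + " real=" + d.realUser);
    }
}
```

Read this way, the line above says the effective user is "qa" and the Kerberos-authenticated real user is "subroto@EC2.INTERNAL". The failing RPC elsewhere in this thread reports "as:qa (auth:SIMPLE)" instead.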



When this runs on the cluster, the job runs as "subroto", but I expect it to
run as "qa".



Please let me know if there is something missing or wrong in the code.



On Sat, Sep 13, 2014 at 3:33 AM, Bikas Saha <bi...@hortonworks.com> wrote:

If by impersonation you mean what Oozie does, where Oozie runs as Oozie but
gets delegation tokens for user FOO, then you will need to follow the
mechanism that Oozie uses. Oozie writes the delegation tokens into a file
and puts that file in a specific path, which is picked up by the application
(in this case TezClient), and the application loads credentials from that
file. In the case of Tez, the location of the credentials file is the value
of the config "tez.credentials.path".



Bikas



*From:* Bikas Saha [mailto:bikas@hortonworks.com]
*Sent:* Monday, September 01, 2014 5:34 PM
*To:* user@tez.apache.org
*Subject:* RE: Tez with secured hadoop



The way this is supposed to work in a secure cluster is the following.

1)      The user that is running TezClient/DAGClient needs to be Kerberos
authenticated. This allows the process running DAGClient/TezClient to
contact the RM and get tokens to communicate with the AM.

2)      The TezClient/DAGClient uses the tokens obtained from the RM and
populates them into the current user's UGI (i.e. the user who is running
TezClient/DAGClient). The RPC to the AM will try to authenticate the
current user using the tokens just added to the current user's UGI.



In a non-secure environment, no tokens are needed. So I am guessing that
you are running in a secure env.



Given the above info, what is happening in your case? Whichever user the
client is running under, it looks like it can authenticate to the RM to get
the app report. So it should have gotten tokens to access the AM. It's not
clear what you mean by user “subroto” being privileged and the real user
not being considered by Tez. It looks like you are running the client as
user “subroto”. Who is “subroto” and who is the real user?



Does this happen always or only occasionally? There is a known race
condition in YARN where the client gets tokens before the AM gets the key to
validate the tokens.
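
If the failure turns out to be only occasional, one possible client-side mitigation is to retry the status call a few times before giving up. This is an assumption on top of the message above, which does not itself recommend retrying; it would not help with a persistent failure such as a classpath problem. RetryingCall and its parameters are hypothetical and use only the JDK.

```java
import java.util.concurrent.Callable;

final class RetryingCall {
    /**
     * Invokes 'call' up to 'maxAttempts' times, sleeping 'delayMillis' between failed
     * attempts, and rethrows the last failure if no attempt succeeds.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // fixed delay; a backoff could be substituted
                }
            }
        }
        throw last;
    }
}
```

A client could wrap its own status call in such a helper, for example with three attempts one second apart, and treat a failure on every attempt as a configuration problem to be debugged instead.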



You can turn on debug logging and look at the SASL negotiation logs to get
more info on what's happening. You may add a debug log in getAMProxy() to
verify that tokens were obtained from the RM and added to the UGI.



It may help if you describe your scenario. What are you trying to achieve
by impersonation, and how are you trying to do that? We recently added ACLs,
in case that works for your scenario.



-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.

RE: Tez with secured hadoop

Posted by Bikas Saha <bi...@hortonworks.com>.
I have responded on the JIRA. Let's take the discussion there.



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Tuesday, October 07, 2014 10:44 PM
*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



hi,



Please let me know how to obtain tokens and the related credentials in the
Tez framework.



On Tue, Oct 7, 2014 at 9:11 AM, Subroto Sanyal <sa...@gmail.com>
wrote:

hi Vinod, hi Bikas



Thanks for your inputs.

Although I am able to spawn the DAGAppMaster as the proxy user (I can see the
DAGAppMaster running as the proxy user in the ResourceManager UI), the call
TezClient.getAppMasterStatus fails with a security error.

I have raised a ticket with my client code and more details:

https://issues.apache.org/jira/browse/TEZ-1640

(UserGroupInformation.java:1551) - PriviledgedActionException as:qa
(auth:SIMPLE) cause:java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]

Failed to retrieve AM Status via proxy

com.google.protobuf.ServiceException: java.io.IOException: Failed on
local exception: java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
"ip-10-178-144-254/10.178.144.254"; destination host is:
"ip-10-187-33-206":56660;

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)

        at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)

        at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:522)







On Tue, Oct 7, 2014 at 4:25 AM, Bikas Saha <bi...@hortonworks.com> wrote:

Thanks Vinod.



Subroto, looks like there are 2 issues here.



One is running as user foo but submitting the job as user bar. Vinod’s code
snippet is relevant to that (please see my earlier answer about createProxyUser
and Vinod’s exact code below). The real user is foo but the effective user is
bar. So when the process connects to YARN, YARN will use the effective user
bar as the job user instead of foo.



Second is how to obtain credentials for bar while running as foo. That is
covered by the credentials file solution (tez.credentials.path) mentioned
earlier. This allows user foo to collect credentials from HDFS for user bar
(foo is expected to be a trusted proxy user in HDFS to be allowed to do this,
like Oozie). The credentials are then passed to the client via the file.
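
For the trusted proxy user requirement, Hadoop reads the
hadoop.proxyuser.<user>.hosts and hadoop.proxyuser.<user>.groups properties
from core-site.xml on the cluster services. A minimal sketch, assuming the
real user is foo as in the example above; the host name and group name are
placeholders and are not taken from this thread:

```xml
<!-- core-site.xml fragment; "foo" is the real (proxy) user from the example. -->
<!-- The values below are hypothetical placeholders. -->
<property>
  <name>hadoop.proxyuser.foo.hosts</name>
  <value>client-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.foo.groups</name>
  <value>bar-group</value>
</property>
```

With entries like these, foo would be allowed to impersonate members of
bar-group when connecting from the listed host. The services typically need a
restart or a configuration refresh before the change takes effect.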



Hope both these clarify your situation and move you forward.



Bikas



*From:* Vinod Kumar Vavilapalli [mailto:vinodkv@hortonworks.com]
*Sent:* Monday, October 06, 2014 11:39 AM


*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



That is not the way to do impersonation. Please see the following:

             import java.security.PrivilegedExceptionAction;
             import org.apache.hadoop.security.UserGroupInformation;

             // Login user is subroto, the realUser. "qa" is the expected effective user.
             UserGroupInformation ugi = UserGroupInformation.createProxyUser(
                     "qa", UserGroupInformation.getLoginUser());
             ugi.doAs(new PrivilegedExceptionAction<Void>() {
               public Void run() throws Exception {
                 ..
               }
             });


+Vinod
Hortonworks Inc.
http://hortonworks.com/



On Thu, Oct 2, 2014 at 6:22 AM, Subroto Sanyal <sa...@gmail.com>
wrote:

hi Bikas,



My code snippet to create TezClient looks like (TEZ-0.5):

new PrivilegedExceptionAction<TezClient>() {
                @Override
                public TezClient run() throws Exception {
                    UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
                    LOG.info("Current User:" + currentUser);
                    File tokenFile = new File(System.getProperty("java.io.tmpdir"),
                            tezSessionName.replaceAll("[^a-zA-Z0-9]", ""));
                    LOG.info("Token File:" + tokenFile.getAbsolutePath());
                    currentUser.getCredentials().writeTokenStorageFile(
                            UriUtil.toPath(tokenFile.getAbsoluteFile()), conf);
                    tezConf.set(TezConfiguration.TEZ_CREDENTIALS_PATH, tokenFile.getAbsolutePath());
                    return TezClient.create(tezSessionName, tezConf,
                            createSession, localResourceMap, credentials);
                }
            }



The logs generated by this piece of code during execution look like:

(TezClientFacade.java:142) - Current User:qa (auth:PROXY) via
subroto@EC2.INTERNAL (auth:KERBEROS)

(TezClientFacade.java:144) - Token File:/home/subroto/tmp/testTezJob
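
One way to read the first log line: Hadoop prints a UGI as the effective user
with its authentication method, followed by "via" and the real user when
impersonation is in effect. Here the effective user is qa and the real user is
subroto@EC2.INTERNAL. The sketch below only parses that text form so the two
identities can be compared across log lines. UgiLogLine is a hypothetical
helper written for illustration and is not part of Hadoop or Tez.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log-reading helper; it parses text only and does not call Hadoop. */
final class UgiLogLine {
    // "<user> (auth:<METHOD>)", optionally followed by " via <realUser> (auth:<METHOD>)".
    private static final Pattern UGI = Pattern.compile(
            "(\\S+) \\(auth:(\\w+)\\)(?: via (\\S+) \\(auth:(\\w+)\\))?");

    final String effectiveUser;
    final String effectiveAuth;
    final String realUser;  // null when the text has no "via" part
    final String realAuth;  // null when the text has no "via" part

    private UgiLogLine(String effectiveUser, String effectiveAuth,
                       String realUser, String realAuth) {
        this.effectiveUser = effectiveUser;
        this.effectiveAuth = effectiveAuth;
        this.realUser = realUser;
        this.realAuth = realAuth;
    }

    /** Parses the printed form of a UGI; rejects anything that does not match. */
    static UgiLogLine parse(String text) {
        Matcher m = UGI.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a UGI string: " + text);
        }
        return new UgiLogLine(m.group(1), m.group(2), m.group(3), m.group(4));
    }

    public static void main(String[] args) {
        UgiLogLine proxied =
                parse("qa (auth:PROXY) via subroto@EC2.INTERNAL (auth:KERBEROS)");
        System.out.println(proxied.effectiveUser + " via " + proxied.realUser);

        // A UGI without a real user carries no Kerberos identity behind it.
        UgiLogLine plain = parse("qa (auth:SIMPLE)");
        System.out.println(plain.effectiveUser + " real=" + plain.realUser);
    }
}
```

A UGI printed as "qa (auth:SIMPLE)" with no "via" part has no real user behind
it, which is worth checking for when comparing the client-side log with the
failing RPC.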



When this runs on the cluster, the job runs as "subroto", but what I expect is
for it to run as "qa".



Please let me know if there is something missing or wrong in the code.



On Sat, Sep 13, 2014 at 3:33 AM, Bikas Saha <bi...@hortonworks.com> wrote:

If by impersonation you mean what Oozie does, where Oozie runs as Oozie but
gets delegation tokens for user FOO, then you will need to follow the
mechanism that Oozie uses. Oozie writes the delegation tokens into a file
and puts that file in a specific path, which is picked up by the application
(in this case TezClient), and the application loads credentials from that
file. In the case of Tez, the location of the credentials file is the value of
the config "tez.credentials.path".



Bikas



*From:* Bikas Saha [mailto:bikas@hortonworks.com]
*Sent:* Monday, September 01, 2014 5:34 PM
*To:* user@tez.apache.org
*Subject:* RE: Tez with secured hadoop



The way this is supposed to work in a secure cluster is the following.

1)      The user that is running TezClient/DAGClient needs to be Kerberos
authenticated. This allows the process running DAGClient/TezClient to
contact the RM and get tokens to communicate with the AM.

2)      The TezClient/DAGClient uses the tokens obtained from the RM and
populates them into the current user's UGI (i.e. the user who is running
TezClient/DAGClient). The RPC to the AM will try to authenticate the
current user using the tokens just added to the current user's UGI.



In a non-secure environment, no tokens are needed. So I am guessing that
you are running in a secure env.



Given the above info, what is happening in your case? Whichever user the
client is running under, it looks like it can authenticate to the RM to get
the app report. So it should have gotten tokens to access the AM. It's not
clear what you mean by user “subroto” being privileged and the real user
not considered by Tez. It looks like you are running the client as user
“subroto”. Who is “subroto” and who is the real user?



Does this happen always or occasionally? There is a known race condition in
YARN where the client gets tokens before the AM gets the key to validate
the tokens.
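
If the failure turns out to be occasional, one way to tell a race like this
apart from a persistent configuration problem is to retry the status call a
few times with a short delay. A minimal sketch in plain Java: RetrySketch and
its retry method are hypothetical names, and the lambda in main is a stand-in
for the real call, not Tez code.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of Tez or Hadoop: retries a call a fixed number of times. */
final class RetrySketch {

    /**
     * Runs the call until it succeeds or maxAttempts is reached, sleeping
     * delayMillis between attempts. Rethrows the last failure if all attempts fail.
     */
    static <T> T retry(Callable<T> call, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real status call: fails twice, then succeeds.
        final int[] calls = {0};
        String status = retry(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException(
                        "Client cannot authenticate via:[TOKEN, KERBEROS]");
            }
            return "READY";
        }, 5, 100L);
        System.out.println(status + " after " + calls[0] + " attempts");
    }
}
```

If every attempt fails in the same way, a race is unlikely to be the cause and
the credentials passed to the client are the better place to look.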



You can turn on debug logging and see the SASL negotiation logs to get more
info on what's happening. You may add a debug log in getAMProxy() to verify
that tokens were obtained from the RM and added to the UGI.



It may help if you describe your scenario. What are you trying to achieve
by impersonation and how are you trying to do that? We recently added ACLs
in case that works for your scenario.



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Sunday, August 31, 2014 8:59 PM
*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



Hi Bikas,



In the method

org.apache.tez.client.TezClientUtils.getAMProxy(Configuration, String, int,
Token), a UGI is created with the name of the current user. I think this
process ignores all the security information and sets the authentication
mode to "SIMPLE". I have a piece of code which tries to create a TezClient
and it keeps throwing the exception:



[anonymous]  WARN [2014-08-28 03:37:50.181] [MrPlanRunnerV2]
(UserGroupInformation.java:1551) - PriviledgedActionException as:subroto
(auth:SIMPLE) cause:java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]

[anonymous]  INFO [2014-08-28 03:37:50.182] [MrPlanRunnerV2]
(TezClient.java:539) - Failed to retrieve AM Status via proxy

com.google.protobuf.ServiceException: java.io.IOException: Failed on local
exception: java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
"domU-12-31-39-0F-74-32/10.193.119.192"; destination host is:
"domU-12-31-39-0C-7D-37":59431;

at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)

at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)

at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:532)

at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:607)

at subroto.tez.TezClusterSession$2.run(TezClusterSession.java:180)



I am trying to achieve impersonation. Here user "subroto" is the privileged
user, and the real user is not considered at all by the Tez code.



Any suggestions on this would be appreciated.



On Tue, Aug 19, 2014 at 11:18 PM, Bikas Saha <bi...@hortonworks.com> wrote:

There is nothing special that you need to do if you are already running
secure Map Reduce jobs. The client needs to run in a Kerberized
authenticated context. After that if you are using the built-in library of
inputs/outputs etc then they should be taking care of all the access
credentials for you when using the 0.5 API.



If you are using 0.4 API to write your job then you may need to use
additional APIs for passing credentials to the application. Look for
credentials in
https://github.com/apache/tez/blob/branch-0.4.0-incubating/tez-mapreduce-examples/src/main/java/org/apache/tez/mapreduce/examples/FilterLinesByWord.java
and also public synchronized DAG addURIsForCredentials(Collection<URI> uris)



The second method is a shortcut if you are using HDFS files for input. It
obtains credentials for you from a collection of HDFS input URIs.



Bikas



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Tuesday, August 19, 2014 3:30 AM
*To:* user@tez.apache.org
*Subject:* Tez with secured hadoop



hi



Tez works on secure hadoop cluster since tez-0.3.

Is there any documentation available about configuring TezClient to make it
work?



-- 
Cheers,
*Subroto Sanyal*



Re: Tez with secured hadoop

Posted by Subroto Sanyal <sa...@gmail.com>.
hi,

Please let me know how to obtain tokens and the related credentials in the
Tez framework.

On Tue, Oct 7, 2014 at 9:11 AM, Subroto Sanyal <sa...@gmail.com>
wrote:

> hi Vinod, hi Bikas
>
> Thanks for your inputs.
> Although I am able to spawn the DAGAppMaster as the proxy user (I can see the
> DAGAppMaster running as the proxy user in the ResourceManager UI), the call
> TezClient.getAppMasterStatus fails with a security error.
> I have raised a ticket with my client code and more details:
> https://issues.apache.org/jira/browse/TEZ-1640
>
> (UserGroupInformation.java:1551) - PriviledgedActionException as:qa (auth:SIMPLE) cause:java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
> Failed to retrieve AM Status via proxy
> com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "ip-10-178-144-254/10.178.144.254"; destination host is: "ip-10-187-33-206":56660;
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
>         at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
>         at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:522)
>
>
>
>
> On Tue, Oct 7, 2014 at 4:25 AM, Bikas Saha <bi...@hortonworks.com> wrote:
>
>> Thanks Vinod.
>>
>>
>>
>> Subroto, looks like there are 2 issues here.
>>
>>
>>
>> One is running as user foo but submitting the job as user bar. Vinod’s
>> code snippet is relevant to that (please see my earlier answer about
>> createProxyUser and Vinod’s exact code below). The real user is foo but the
>> effective user is bar. So when the process connects to YARN, YARN will use
>> the effective user bar as the job user instead of foo.
>>
>>
>>
>> Second is how to obtain credentials for bar while running as foo. That is
>> covered by the credentials file solution (tez.credentials.path) mentioned
>> earlier. This allows user foo to collect credentials from HDFS for user bar
>> (foo is expected to be a trusted proxy user in HDFS to be allowed to do
>> this, like Oozie). The credentials are then passed to the client via the file.
>>
>>
>>
>> Hope both these clarify your situation and move you forward.
>>
>>
>>
>> Bikas
>>
>>
>>
>> *From:* Vinod Kumar Vavilapalli [mailto:vinodkv@hortonworks.com]
>> *Sent:* Monday, October 06, 2014 11:39 AM
>>
>> *To:* user@tez.apache.org
>> *Subject:* Re: Tez with secured hadoop
>>
>>
>>
>> That is not the way to do impersonation. Please see the following:
>>
>>              import java.security.PrivilegedExceptionAction;
>>              import org.apache.hadoop.security.UserGroupInformation;
>>
>>              // Login user is subroto, the realUser. "qa" is the expected effective user.
>>              UserGroupInformation ugi = UserGroupInformation.createProxyUser(
>>                      "qa", UserGroupInformation.getLoginUser());
>>              ugi.doAs(new PrivilegedExceptionAction<Void>() {
>>                public Void run() throws Exception {
>>                  ..
>>                }
>>              });
>>
>>
>> +Vinod
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>>
>>
>> On Thu, Oct 2, 2014 at 6:22 AM, Subroto Sanyal <sa...@gmail.com>
>> wrote:
>>
>> hi Bikas,
>>
>>
>>
>> My code snippet to create TezClient looks like (TEZ-0.5):
>>
>> new PrivilegedExceptionAction<TezClient>() {
>>                 @Override
>>                 public TezClient run() throws Exception {
>>                     UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
>>                     LOG.info("Current User:" + currentUser);
>>                     File tokenFile = new File(System.getProperty("java.io.tmpdir"),
>>                             tezSessionName.replaceAll("[^a-zA-Z0-9]", ""));
>>                     LOG.info("Token File:" + tokenFile.getAbsolutePath());
>>                     currentUser.getCredentials().writeTokenStorageFile(
>>                             UriUtil.toPath(tokenFile.getAbsoluteFile()), conf);
>>                     tezConf.set(TezConfiguration.TEZ_CREDENTIALS_PATH, tokenFile.getAbsolutePath());
>>                     return TezClient.create(tezSessionName, tezConf,
>>                             createSession, localResourceMap, credentials);
>>                 }
>>             }
>>
>>
>>
>> The logs generated by this piece of code during execution look like:
>>
>> (TezClientFacade.java:142) - Current User:qa (auth:PROXY) via
>> subroto@EC2.INTERNAL (auth:KERBEROS)
>>
>> (TezClientFacade.java:144) - Token File:/home/subroto/tmp/testTezJob
>>
>>
>>
>> When this runs on the cluster, the job runs as "subroto", but what I expect
>> is for it to run as "qa".
>>
>>
>>
>> Please let me know if there is something missing or wrong in the code.
>>
>>
>>
>> On Sat, Sep 13, 2014 at 3:33 AM, Bikas Saha <bi...@hortonworks.com>
>> wrote:
>>
>> If by impersonation you mean what Oozie does, where Oozie runs as Oozie
>> but gets delegation tokens for user FOO, then you will need to follow the
>> mechanism that Oozie uses. Oozie writes the delegation tokens into a file
>> and puts that file in a specific path, which is picked up by the application
>> (in this case TezClient), and the application loads credentials from that
>> file. In the case of Tez, the location of the credentials file is the value
>> of the config "tez.credentials.path".
>>
>>
>>
>> Bikas
>>
>>
>>
>> *From:* Bikas Saha [mailto:bikas@hortonworks.com]
>> *Sent:* Monday, September 01, 2014 5:34 PM
>> *To:* user@tez.apache.org
>> *Subject:* RE: Tez with secured hadoop
>>
>>
>>
>> The way this is supposed to work in a secure cluster is the following.
>>
>> 1)      The user that is running TezClient/DAGClient needs to be
>> Kerberos authenticated. This allows the process running DAGClient/TezClient
>> to contact the RM and get tokens to communicate with the AM.
>>
>> 2)      The TezClient/DAGClient uses the tokens obtained from the RM and
>> populates them into the current user's UGI (i.e. the user who is running
>> TezClient/DAGClient). The RPC to the AM will try to authenticate the
>> current user using the tokens just added to the current user's UGI.
>>
>>
>>
>> In a non-secure environment, no tokens are needed. So I am guessing that
>> you are running in a secure env.
>>
>>
>>
>> Given the above info, what is happening in your case? Whichever user the
>> client is running under, it looks like it can authenticate to the RM to get
>> the app report. So it should have gotten tokens to access the AM. It's not
>> clear what you mean by user “subroto” being privileged and the real user
>> not considered by Tez. It looks like you are running the client as user
>> “subroto”. Who is “subroto” and who is the real user?
>>
>>
>>
>> Does this happen always or occasionally? There is a known race condition
>> in YARN where the client gets tokens before the AM gets the key to validate
>> the tokens.
>>
>>
>>
>> You can turn on debug logging and see the SASL negotiation logs to get
>> more info on what's happening. You may add a debug log in getAMProxy() to
>> verify that tokens were obtained from the RM and added to the UGI.
>>
>>
>>
>> It may help if you describe your scenario. What are you trying to achieve
>> by impersonation and how are you trying to do that? We recently added ACLs
>> in case that works for your scenario.
>>
>>
>>
>> *From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
>> *Sent:* Sunday, August 31, 2014 8:59 PM
>> *To:* user@tez.apache.org
>> *Subject:* Re: Tez with secured hadoop
>>
>>
>>
>> Hi Bikas,
>>
>>
>>
>> In the method
>>
>> org.apache.tez.client.TezClientUtils.getAMProxy(Configuration, String,
>> int, Token), a UGI is created with the name of the current user. I think
>> this process ignores all the security information and sets the
>> authentication mode to "SIMPLE". I have a piece of code which tries to
>> create a TezClient and it keeps throwing the exception:
>>
>>
>>
>> [anonymous]  WARN [2014-08-28 03:37:50.181] [MrPlanRunnerV2]
>> (UserGroupInformation.java:1551) - PriviledgedActionException as:subroto
>> (auth:SIMPLE) cause:java.io.IOException:
>> org.apache.hadoop.security.AccessControlException: Client cannot
>> authenticate via:[TOKEN, KERBEROS]
>>
>> [anonymous]  INFO [2014-08-28 03:37:50.182] [MrPlanRunnerV2]
>> (TezClient.java:539) - Failed to retrieve AM Status via proxy
>>
>> com.google.protobuf.ServiceException: java.io.IOException: Failed on
>> local exception: java.io.IOException:
>> org.apache.hadoop.security.AccessControlException: Client cannot
>> authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
>> "domU-12-31-39-0F-74-32/10.193.119.192"; destination host is:
>> "domU-12-31-39-0C-7D-37":59431;
>>
>> at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
>>
>> at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
>>
>> at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:532)
>>
>> at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:607)
>>
>> at subroto.tez.TezClusterSession$2.run(TezClusterSession.java:180)
>>
>>
>>
>> I am trying to achieve impersonation. Here user "subroto" is the privileged
>> user, and the real user is not considered at all by the Tez code.
>>
>>
>>
>> Any suggestions on this would be appreciated.
>>
>>
>>
>> On Tue, Aug 19, 2014 at 11:18 PM, Bikas Saha <bi...@hortonworks.com>
>> wrote:
>>
>> There is nothing special that you need to do if you are already running
>> secure Map Reduce jobs. The client needs to run in a Kerberized
>> authenticated context. After that if you are using the built-in library of
>> inputs/outputs etc then they should be taking care of all the access
>> credentials for you when using the 0.5 API.
>>
>>
>>
>> If you are using 0.4 API to write your job then you may need to use
>> additional APIs for passing credentials to the application. Look for
>> credentials in
>> https://github.com/apache/tez/blob/branch-0.4.0-incubating/tez-mapreduce-examples/src/main/java/org/apache/tez/mapreduce/examples/FilterLinesByWord.java
>> and also public synchronized DAG addURIsForCredentials(Collection<URI> uris)
>>
>>
>>
>> The second method is a shortcut if you are using HDFS files for input. It
>> obtains credentials for you from a collection of HDFS input URIs.
>>
>>
>>
>> Bikas
>>
>>
>>
>> *From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
>> *Sent:* Tuesday, August 19, 2014 3:30 AM
>> *To:* user@tez.apache.org
>> *Subject:* Tez with secured hadoop
>>
>>
>>
>> hi
>>
>>
>>
>> Tez works on secure hadoop cluster since tez-0.3.
>>
>> Is there any documentation available about configuring TezClient to make
>> it work?
>>
>>
>>
>> --
>> Cheers,
>> *Subroto Sanyal*
>>
>>
>>
>
>
>
> --
> Cheers,
> *Subroto Sanyal*
>



-- 
Cheers,
*Subroto Sanyal*

Re: Tez with secured hadoop

Posted by Subroto Sanyal <sa...@gmail.com>.
hi Vinod, hi Bikas

Thanks for your inputs.
Although I am able to spawn the DAGAppMaster as the proxy user (I can see the
DAGAppMaster running as the proxy user in the ResourceManager UI), the call
TezClient.getAppMasterStatus fails with a security error.
I have raised a ticket with my client code and more details:
https://issues.apache.org/jira/browse/TEZ-1640

(UserGroupInformation.java:1551) - PriviledgedActionException as:qa
(auth:SIMPLE) cause:java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]
Failed to retrieve AM Status via proxy
com.google.protobuf.ServiceException: java.io.IOException: Failed on
local exception: java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
"ip-10-178-144-254/10.178.144.254"; destination host is:
"ip-10-187-33-206":56660;
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
        at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
        at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:522)




On Tue, Oct 7, 2014 at 4:25 AM, Bikas Saha <bi...@hortonworks.com> wrote:

> Thanks Vinod.
>
>
>
> Subroto, looks like there are 2 issues here.
>
>
>
> One is running as user foo but submitting the job as user bar. Vinod’s code
> snippet is relevant to that (please see my earlier answer about
> createProxyUser and Vinod’s exact code below). The real user is foo but the
> effective user is bar. So when the process connects to YARN, YARN will use
> the effective user bar as the job user instead of foo.
>
>
>
> Second is how to obtain credentials for bar while running as foo. That is
> covered by the credentials file solution (tez.credentials.path) mentioned
> earlier. This allows user foo to collect credentials from HDFS for user bar
> (foo is expected to be a trusted proxy user in HDFS to be allowed to do this,
> like Oozie). The credentials are then passed to the client via the file.
>
>
>
> Hope both these clarify your situation and move you forward.
>
>
>
> Bikas



-- 
Cheers,
*Subroto Sanyal*

RE: Tez with secured hadoop

Posted by Bikas Saha <bi...@hortonworks.com>.
Thanks Vinod.



Subroto, looks like there are 2 issues here.



One is running as user foo but submitting the job as user bar. Vinod’s code
snippet is relevant to that (please see my earlier answer about
createProxyUser and Vinod’s exact code below). The real user is foo but the
effective user is bar. So when the process connects to YARN, YARN will use
the effective user bar as the job user instead of foo.



Second is how to obtain credentials for bar while running as foo. That is
covered by the credentials file solution (tez.credentials.path) mentioned
earlier. This allows user foo to collect credentials from HDFS for user bar
(foo must be a trusted proxy user in HDFS to be allowed to do this, like
Oozie). The credentials are then passed to the client via the file.
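
To make the real/effective split concrete, here is a minimal sketch in plain
Java that pulls both names out of a UGI description string of the form seen
in this thread's logs ("qa (auth:PROXY) via subroto@EC2.INTERNAL
(auth:KERBEROS)"). UgiDescription and its regular expression are hypothetical
helpers written for this example, not part of Hadoop or Tez, and the pattern
assumes the description keeps the format quoted in the logs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Hadoop or Tez class): splits a UGI description
// string into the effective user and, for proxy users, the real user.
final class UgiDescription {
    // Assumed format, taken from the log line quoted in this thread:
    //   "<effective> (auth:<METHOD>)[ via <real> (auth:<METHOD>)]"
    private static final Pattern FORMAT = Pattern.compile(
            "^(\\S+) \\(auth:(\\w+)\\)(?: via (\\S+) \\(auth:(\\w+)\\))?$");

    final String effectiveUser;
    final String effectiveAuth;
    final String realUser; // null when there is no "via" part
    final String realAuth; // null when there is no "via" part

    private UgiDescription(Matcher m) {
        effectiveUser = m.group(1);
        effectiveAuth = m.group(2);
        realUser = m.group(3);
        realAuth = m.group(4);
    }

    // Parses the description or fails loudly if the format is not recognised.
    static UgiDescription parse(String text) {
        Matcher m = FORMAT.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised UGI description: " + text);
        }
        return new UgiDescription(m);
    }

    // True when the description names a real user behind the effective user.
    boolean isProxy() {
        return realUser != null;
    }
}
```

Parsing a description such as "subroto (auth:SIMPLE)" yields no real user,
which is one quick way to notice that the proxy relationship is missing from
the UGI used for a given call.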



Hope both these clarify your situation and move you forward.



Bikas



*From:* Vinod Kumar Vavilapalli [mailto:vinodkv@hortonworks.com]
*Sent:* Monday, October 06, 2014 11:39 AM
*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



That is not the way to do impersonation. Please see the following:

             UserGroupInformation ugi =
                     UserGroupInformation.createProxyUser("qa",
                             UserGroupInformation.getLoginUser());
             // The login user is subroto, the realUser. "qa" is the expected
             // effective user.
             ugi.doAs(new PrivilegedExceptionAction<Void>() {
               public Void run() throws Exception {
                 ..
               }
             });


+Vinod
Hortonworks Inc.
http://hortonworks.com/



On Thu, Oct 2, 2014 at 6:22 AM, Subroto Sanyal <sa...@gmail.com>
wrote:

hi Bikas,



My code snippet to create the TezClient looks like this (Tez 0.5):

new PrivilegedExceptionAction<TezClient>() {
    @Override
    public TezClient run() throws Exception {
        UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
        LOG.info("Current User:" + currentUser);
        File tokenFile = new File(System.getProperty("java.io.tmpdir"),
                tezSessionName.replaceAll("[^a-zA-Z0-9]", ""));
        LOG.info("Token File:" + tokenFile.getAbsolutePath());
        currentUser.getCredentials().writeTokenStorageFile(
                UriUtil.toPath(tokenFile.getAbsoluteFile()), conf);
        tezConf.set(TezConfiguration.TEZ_CREDENTIALS_PATH, tokenFile.getAbsolutePath());
        return TezClient.create(tezSessionName, tezConf, createSession,
                localResourceMap, credentials);
    }
}



The logs generated by this code during execution look like this:

(TezClientFacade.java:142) - Current User:qa (auth:PROXY) via subroto@EC2.INTERNAL (auth:KERBEROS)

(TezClientFacade.java:144) - Token File:/home/subroto/tmp/testTezJob
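
For reference, the token file name in that log line comes from stripping
every non-alphanumeric character from the session name. A minimal, standalone
sketch of that naming logic follows. TokenFileNames and its methods are
hypothetical names for this example, and the empty-name check is an addition
that the snippet above does not have.

```java
import java.io.File;

// Hypothetical helper mirroring the file-name logic in the snippet above.
final class TokenFileNames {
    private TokenFileNames() {}

    // Drops every character that is not an ASCII letter or digit.
    static String sanitize(String sessionName) {
        return sessionName.replaceAll("[^a-zA-Z0-9]", "");
    }

    // Resolves the sanitized session name against the given directory.
    static File tokenFile(File directory, String sessionName) {
        String name = sanitize(sessionName);
        if (name.isEmpty()) {
            throw new IllegalArgumentException(
                    "Session name has no letters or digits: " + sessionName);
        }
        return new File(directory, name);
    }
}
```

Session names that differ only in punctuation or spaces (for example
"test-Tez-Job" and "test Tez Job") resolve to the same file, so concurrent
sessions could overwrite each other's token file.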



When this runs on the cluster, the job runs as "subroto", but I expect it
to run as "qa".



Please let me know if there is something missing or wrong in the code.



On Sat, Sep 13, 2014 at 3:33 AM, Bikas Saha <bi...@hortonworks.com> wrote:

If by impersonation you mean what Oozie does, where Oozie runs as Oozie but
gets delegation tokens for user FOO, then you will need to follow the
mechanism that Oozie uses. Oozie writes the delegation tokens into a file
and puts that file in a specific path, which is picked up by the application
(in this case TezClient), and the application loads credentials from that
file. In the case of Tez, the location of the credentials file is the value
of the config "tez.credentials.path".



Bikas



*From:* Bikas Saha [mailto:bikas@hortonworks.com]
*Sent:* Monday, September 01, 2014 5:34 PM
*To:* user@tez.apache.org
*Subject:* RE: Tez with secured hadoop



The way this is supposed to work in a secure cluster is the following.

1)      The user that is running TezClient/DAGClient needs to be Kerberos
authenticated. This allows the process running DAGClient/TezClient to
contact the RM and get tokens to communicate with the AM.

2)      The TezClient/DAGClient uses the tokens obtained from the RM and
populates them into the current user's UGI (i.e. the user who is running
TezClient/DAGClient). The RPC to the AM will try to authenticate the
current user using the tokens just added to the current user's UGI.



In a non-secure environment, no tokens are needed. So I am guessing that
you are running in a secure env.



Given the above info, what is happening in your case? Whichever user the
client is running under, it looks like it can authenticate to the RM to get
the app report, so it should have gotten tokens to access the AM. It's not
clear what you mean by user “subroto” being privileged and the real user
not considered by Tez. It looks like you are running the client as user
“subroto”. Who is “subroto” and who is the real user?



Does this happen always or only occasionally? There is a known race condition
in YARN where the client gets tokens before the AM gets the key to validate
the tokens.
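
If the failure turns out to be occasional, one way to tell the race apart
from a configuration problem is to retry the status call a few times with a
short delay. The sketch below is a generic retry helper in plain Java.
RetryingCall, its parameters, and the choice of a fixed delay are assumptions
made for this example; it is not a Tez or YARN API.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: retries a call a bounded number of times, sleeping a
// fixed delay between attempts, and rethrows the last failure if all fail.
final class RetryingCall {
    private RetryingCall() {}

    static <T> T call(Callable<T> action, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException e) {
                // Do not retry after an interrupt; restore the flag and stop.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e; // remember the failure and try again
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

A call such as RetryingCall.call(() -> tezClient.getAppMasterStatus(), 5, 1000L),
with tezClient standing in for your own client instance, would apply it to the
status check from the stack trace. If every attempt fails the same way, the
race is an unlikely explanation.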



You can turn on debug logging and look at the SASL negotiation logs to get
more info on what's happening. You may also add a debug log in getAMProxy()
to verify that tokens were obtained from the RM and added to the UGI.



It may help if you describe your scenario. What are you trying to achieve
by impersonation, and how are you trying to do it? We recently added ACLs,
in case that works for your scenario.



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Sunday, August 31, 2014 8:59 PM
*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



Hi Bikas,



In the method

org.apache.tez.client.TezClientUtils.getAMProxy(Configuration, String, int,
Token), a UGI is created with the name of the current user. I think this
step ignores the security context and sets the authentication mode to
"SIMPLE". I have a piece of code that tries to create a TezClient, and it
keeps throwing this exception:



[anonymous]  WARN [2014-08-28 03:37:50.181] [MrPlanRunnerV2]
(UserGroupInformation.java:1551) - PriviledgedActionException as:subroto
(auth:SIMPLE) cause:java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]

[anonymous]  INFO [2014-08-28 03:37:50.182] [MrPlanRunnerV2]
(TezClient.java:539) - Failed to retrieve AM Status via proxy

com.google.protobuf.ServiceException: java.io.IOException: Failed on local
exception: java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
"domU-12-31-39-0F-74-32/10.193.119.192"; destination host is:
"domU-12-31-39-0C-7D-37":59431;

        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
        at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)
        at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:532)
        at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:607)
        at subroto.tez.TezClusterSession$2.run(TezClusterSession.java:180)



I am trying to achieve impersonation. Here user "subroto" is the privileged
user, and the real user is not considered at all by the Tez code.



Any suggestions on this would be appreciated.



On Tue, Aug 19, 2014 at 11:18 PM, Bikas Saha <bi...@hortonworks.com> wrote:

There is nothing special that you need to do if you are already running
secure MapReduce jobs. The client needs to run in a Kerberized
authenticated context. After that, if you are using the built-in library of
inputs/outputs etc., they should take care of all the access
credentials for you when using the 0.5 API.



If you are using the 0.4 API to write your job, you may need to use
additional APIs for passing credentials to the application. Look for
credentials in
https://github.com/apache/tez/blob/branch-0.4.0-incubating/tez-mapreduce-examples/src/main/java/org/apache/tez/mapreduce/examples/FilterLinesByWord.java
and also public synchronized DAG addURIsForCredentials(Collection<URI> uris)



The second method is a shortcut if you are using HDFS files for input. It
obtains credentials for you from a collection of HDFS input URIs.



Bikas



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Tuesday, August 19, 2014 3:30 AM
*To:* user@tez.apache.org
*Subject:* Tez with secured hadoop



hi



Tez has worked on secure Hadoop clusters since tez-0.3.

Is there any documentation available on configuring TezClient to make it
work?



-- 
Cheers,
*Subroto Sanyal*


CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to
which it is addressed and may contain information that is confidential,
privileged and exempt from disclosure under applicable law. If the reader
of this message is not the intended recipient, you are hereby notified that
any printing, copying, dissemination, distribution, disclosure or
forwarding of this communication is strictly prohibited. If you have
received this communication in error, please contact the sender immediately
and delete it from your system. Thank You.





-- 
Cheers,
*Subroto Sanyal*







-- 
Cheers,
*Subroto Sanyal*






Re: Tez with secured hadoop

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
That is not the way to do impersonation. Please see the following:

             UserGroupInformation ugi =
                     UserGroupInformation.createProxyUser("qa",
                             UserGroupInformation.getLoginUser());
             // The login user is subroto, the realUser. "qa" is the expected
             // effective user.
             ugi.doAs(new PrivilegedExceptionAction<Void>() {
               public Void run() throws Exception {
                 ..
               }
             });

+Vinod
Hortonworks Inc.
http://hortonworks.com/



Re: Tez with secured hadoop

Posted by Subroto Sanyal <sa...@gmail.com>.
hi Bikas,

"Probably involves fiddling around with the createProxyUser() method in
UserGroupInformation and starting TezClient under that UGI."

I am already using a proxy user to create the TezClient. The proxy user is
backed by a valid Kerberos user (credential), as this log line shows:
==============================================================================================
(TezClientFacade.java:142) - Current User:qa (auth:PROXY) via
subroto@EC2.INTERNAL (auth:KERBEROS)
==============================================================================================

Is there any other way to create the token file that TEZ_CREDENTIALS_PATH
points to? MapReduce supports impersonation in the same way (UGI.doAs).
Has the impersonation feature been tested/verified with Tez?


On Fri, Oct 3, 2014 at 2:47 AM, Bikas Saha <bi...@hortonworks.com> wrote:

> The application will be run by YARN as the user that submits the
> application. So if you want the app to run as FOO then it must be submitted
> as FOO. A proxy user has the privilege to run as BAR but get credentials for
> FOO and give them to the app. A proxy user can also pass the real user via
> the UserGroupInformation. I am not entirely familiar with the exact flow of
> this, but you could look at the Oozie code to check how Oozie does it.
> Probably involves fiddling around with the createProxyUser() method in
> UserGroupInformation and starting TezClient under that UGI.
>
>
>
> Bikas
>
>
>



-- 
Cheers,
*Subroto Sanyal*

RE: Tez with secured hadoop

Posted by Bikas Saha <bi...@hortonworks.com>.
The application will be run by YARN as the user that submits the
application. So if you want the app to run as FOO then it must be submitted
as FOO. A proxy user has the privilege to run as BAR but get credentials for
FOO and give them to the app. A proxy user can also pass the real user via
the UserGroupInformation. I am not entirely familiar with the exact flow of
this, but you could look at the Oozie code to check how Oozie does it.
Probably involves fiddling around with the createProxyUser() method in
UserGroupInformation and starting TezClient under that UGI.



Bikas
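
For reference, Hadoop only honors a proxy UGI (such as "qa via subroto") when
the real user is allowed to impersonate others in the cluster-side
core-site.xml. The fragment below is a minimal sketch, assuming the real
user's short name is "subroto"; the host and group values are placeholders,
not values taken from this thread. It is a prerequisite for impersonation in
general and not necessarily the cause of the behavior discussed here.

```xml
<!-- core-site.xml on the cluster side (values are placeholders) -->
<property>
  <name>hadoop.proxyuser.subroto.hosts</name>
  <value>client-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.subroto.groups</name>
  <value>qa-users</value>
</property>
```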




Re: Tez with secured hadoop

Posted by Subroto Sanyal <sa...@gmail.com>.
hi Bikas,

My code snippet to create TezClient looks like (TEZ-0.5):

new PrivilegedExceptionAction<TezClient>() {

    @Override
    public TezClient run() throws Exception {
        UserGroupInformation currentUser = UserGroupInformation.getCurrentUser();
        LOG.info("Current User:" + currentUser);

        // Token file named after the session, under java.io.tmpdir.
        File tokenFile = new File(System.getProperty("java.io.tmpdir"),
                tezSessionName.replaceAll("[^a-zA-Z0-9]", ""));
        LOG.info("Token File:" + tokenFile.getAbsolutePath());

        // Write the current user's credentials to the file and point Tez at it.
        currentUser.getCredentials().writeTokenStorageFile(
                UriUtil.toPath(tokenFile.getAbsoluteFile()), conf);
        tezConf.set(TezConfiguration.TEZ_CREDENTIALS_PATH,
                tokenFile.getAbsolutePath());

        return TezClient.create(tezSessionName, tezConf, createSession,
                localResourceMap, credentials);
    }
}


The logs generated by this code during execution look like:

(TezClientFacade.java:142) - Current User:qa (auth:PROXY) via
subroto@EC2.INTERNAL (auth:KERBEROS)

(TezClientFacade.java:144) - Token File:/home/subroto/tmp/testTezJob


When this runs on the cluster, the job runs as "subroto", but I expect it to
run as "qa".


Please let me know if there is something missing or wrong in the code.
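
One detail of the snippet that is independent of Tez: the token file gets a
predictable name in the shared java.io.tmpdir. The sketch below uses only the
JDK and shows one way to create a uniquely named file that is readable by its
owner alone on POSIX file systems. The class and method names are
hypothetical and not part of Tez or Hadoop.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

/** Hypothetical helper, not part of Tez or Hadoop. */
final class TokenFiles {

    /** Keeps only ASCII letters and digits, as the snippet in the thread does. */
    static String sanitize(String sessionName) {
        return sessionName.replaceAll("[^a-zA-Z0-9]", "");
    }

    /**
     * Creates an empty, uniquely named file under java.io.tmpdir. On POSIX
     * file systems the file is created with owner-only read/write access.
     */
    static Path createPrivateTokenFile(String sessionName) throws IOException {
        Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir"));
        String prefix = sanitize(sessionName) + "-";
        boolean posix = FileSystems.getDefault()
                .supportedFileAttributeViews().contains("posix");
        if (posix) {
            Set<PosixFilePermission> ownerOnly =
                    PosixFilePermissions.fromString("rw-------");
            FileAttribute<Set<PosixFilePermission>> attr =
                    PosixFilePermissions.asFileAttribute(ownerOnly);
            return Files.createTempFile(tmpDir, prefix, ".token", attr);
        }
        // Non-POSIX file systems: fall back to the platform defaults.
        return Files.createTempFile(tmpDir, prefix, ".token");
    }

    public static void main(String[] args) throws IOException {
        Path tokenFile = createPrivateTokenFile("test Tez-Job");
        System.out.println("Token File:" + tokenFile.toAbsolutePath());
        Files.delete(tokenFile);
    }
}
```

The resulting path could then be handed to writeTokenStorageFile and
TEZ_CREDENTIALS_PATH as in the snippet. Hadoop's local file system may apply
its own permissions when it writes the file, so the permissions are worth
checking after the credentials have been written.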

On Sat, Sep 13, 2014 at 3:33 AM, Bikas Saha <bi...@hortonworks.com> wrote:

> If by impersonation you mean what Oozie does, where Oozie runs as Oozie but
> gets delegation tokens for user FOO, then you will need to follow the
> mechanism that Oozie uses. Oozie writes the delegation tokens into a file
> and puts that file in a specific path, which is picked up by the application
> (in this case TezClient), and the application loads credentials from that
> file. In the case of Tez, the location of the credentials file is the value
> of the config "tez.credentials.path".
>
>
>
> Bikas
>
>
>



-- 
Cheers,
*Subroto Sanyal*

RE: Tez with secured hadoop

Posted by Bikas Saha <bi...@hortonworks.com>.
If by impersonation you mean what Oozie does, where Oozie runs as Oozie but
gets delegation tokens for user FOO, then you will need to follow the
mechanism that Oozie uses. Oozie writes the delegation tokens into a file
and puts that file at a specific path, which is picked up by the application
(in this case TezClient); the application then loads credentials from that
file. In the case of Tez, the location of the credentials file is the value
of the config "tez.credentials.path".
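
As a rough illustration of that last point, and assuming the client
configuration is supplied as a Hadoop-style XML resource (the same key could
equally be set on the Configuration object handed to TezClient), the property
might look like the fragment below. The file path is a made-up placeholder,
not something taken from this thread.

```xml
<!-- Hypothetical location: use the path the delegation token file was actually written to. -->
<property>
  <name>tez.credentials.path</name>
  <value>/tmp/foo-delegation-tokens.bin</value>
</property>
```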



Bikas



*From:* Bikas Saha [mailto:bikas@hortonworks.com]
*Sent:* Monday, September 01, 2014 5:34 PM
*To:* user@tez.apache.org
*Subject:* RE: Tez with secured hadoop


The way this is supposed to work in a secure cluster is the following.

1)      The user that is running TezClient/DAGClient needs to be Kerberos
authenticated. This allows the process running DAGClient/TezClient to
contact the RM and get tokens to communicate with the AM.

2)      The TezClient/DAGClient takes the tokens obtained from the RM and
adds them to the current user's UGI (i.e. the user who is running
TezClient/DAGClient). The RPC to the AM will try to authenticate the
current user using the tokens just added to the current user's UGI.



In a non-secure environment, no tokens are needed. So I am guessing that
you are running in a secure environment.



Given the above info, what is happening in your case? Whichever user the
client is running under, it looks like it can authenticate to the RM to get
the app report. So it should have gotten tokens to access the AM. It's not
clear what you mean by user “subroto” being privileged and the real user
not being considered by Tez. It looks like you are running the client as
user “subroto”. Who is “subroto” and who is the real user?



Does this happen always or only occasionally? There is a known race
condition in YARN where the client gets tokens before the AM gets the key
to validate the tokens.
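
One way to tell a transient failure of that kind from a persistent one is to
repeat the failing call a few times with a pause in between. The sketch below
is only a generic illustration of that idea: RetryProbe and callWithRetry are
hypothetical names, nothing here is Tez or Hadoop API, and the call to the AM
is stood in for by an arbitrary Callable.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical diagnostic helper; not part of Tez or Hadoop. */
public class RetryProbe {

    /**
     * Calls the action up to maxAttempts times, pausing delayMillis after each
     * failed attempt. Returns the first successful result; if every attempt
     * fails, throws IllegalStateException wrapping the last failure.
     */
    public static <T> T callWithRetry(Callable<T> action, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                System.err.println("Attempt " + attempt + " of " + maxAttempts + " failed: " + e);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    // Preserve the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", ie);
                }
            }
        }
        throw new IllegalStateException("All " + maxAttempts + " attempts failed", last);
    }

    public static void main(String[] args) {
        // Simulated call that fails twice before succeeding, standing in for
        // a status request that is rejected until the server side is ready.
        int[] calls = {0};
        String status = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated authentication failure");
            }
            return "ready";
        }, 5, 10L);
        System.out.println(status + " after " + calls[0] + " calls");
    }
}
```

If every attempt over a reasonable window fails in the same way, the race
condition is an unlikely explanation and the token setup itself is the more
probable cause.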



You can turn on debug logging and look at the SASL negotiation logs to get
more info on what's happening. You may also add a debug log in getAMProxy()
to verify that tokens were obtained from the RM and added to the UGI.
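
As a sketch of what turning on debug logging could look like, assuming the
client process logs through log4j 1.x and reads a log4j.properties file (an
assumption about the setup, not something stated in this thread), the
following logger levels cover the packages that contain the UGI, SASL and RPC
client classes:

```properties
# Assumes log4j 1.x configured through log4j.properties on the client side.
# UserGroupInformation and the SASL RPC client live in this package.
log4j.logger.org.apache.hadoop.security=DEBUG
# RPC client connection setup, where the SASL handshake is initiated.
log4j.logger.org.apache.hadoop.ipc=DEBUG
# Tez client classes, including TezClient and TezClientUtils.
log4j.logger.org.apache.tez.client=DEBUG
```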



It may help if you describe your scenario. What are you trying to achieve
by impersonation, and how are you trying to do it? We recently added ACLs,
in case that works for your scenario.



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Sunday, August 31, 2014 8:59 PM
*To:* user@tez.apache.org
*Subject:* Re: Tez with secured hadoop



Hi Bikas,



In the method
org.apache.tez.client.TezClientUtils.getAMProxy(Configuration, String, int,
Token), a UGI is created with the name of the current user. I think this
process ignores the security settings and sets the authentication mode to
"SIMPLE". I have a piece of code which tries to create a TezClient, and it
keeps throwing this exception:



[anonymous]  WARN [2014-08-28 03:37:50.181] [MrPlanRunnerV2]
(UserGroupInformation.java:1551) - PriviledgedActionException as:subroto
(auth:SIMPLE) cause:java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]

[anonymous]  INFO [2014-08-28 03:37:50.182] [MrPlanRunnerV2]
(TezClient.java:539) - Failed to retrieve AM Status via proxy

com.google.protobuf.ServiceException: java.io.IOException: Failed on local
exception: java.io.IOException:
org.apache.hadoop.security.AccessControlException: Client cannot
authenticate via:[TOKEN, KERBEROS]; Host Details : local host is:
"domU-12-31-39-0F-74-32/10.193.119.192"; destination host is:
"domU-12-31-39-0C-7D-37":59431;

at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)

at com.sun.proxy.$Proxy111.getAMStatus(Unknown Source)

at org.apache.tez.client.TezClient.getAppMasterStatus(TezClient.java:532)

at org.apache.tez.client.TezClient.waitTillReady(TezClient.java:607)

at subroto.tez.TezClusterSession$2.run(TezClusterSession.java:180)



I'm trying to achieve impersonation. Here user "subroto" is the privileged
user, and the real user is not considered at all by the Tez code.



Any suggestions on this would be appreciated.



On Tue, Aug 19, 2014 at 11:18 PM, Bikas Saha <bi...@hortonworks.com> wrote:

There is nothing special that you need to do if you are already running
secure MapReduce jobs. The client needs to run in a Kerberos-authenticated
context. After that, if you are using the built-in library of
inputs/outputs etc., they should take care of all the access credentials
for you when using the 0.5 API.



If you are using the 0.4 API to write your job then you may need to use
additional APIs for passing credentials to the application. Look for
credentials in
https://github.com/apache/tez/blob/branch-0.4.0-incubating/tez-mapreduce-examples/src/main/java/org/apache/tez/mapreduce/examples/FilterLinesByWord.java
and also at
public synchronized DAG addURIsForCredentials(Collection<URI> uris)



The second method is a shortcut if you are using HDFS files for input. It
obtains credentials for you from a collection of HDFS input URIs.



Bikas



*From:* Subroto Sanyal [mailto:sanyalsubroto@gmail.com]
*Sent:* Tuesday, August 19, 2014 3:30 AM
*To:* user@tez.apache.org
*Subject:* Tez with secured hadoop



hi



Tez has supported secure Hadoop clusters since tez-0.3.

Is there any documentation available about configuring TezClient to make it
work?



-- 
Cheers,
*Subroto Sanyal*







-- 
Cheers,
*Subroto Sanyal*


