Posted to user@hadoop.apache.org by Sunil Govind <su...@gmail.com> on 2016/06/10 13:07:06 UTC

Re: Verifying the authenticity of submitted AM

Hi Mingyu,

Maybe you can take a look at the link below:
https://steveloughran.gitbooks.io/kerberos_and_hadoop/content/sections/yarn.html

It will give a fair idea of the security you can get for an application.

- Sunil

On Fri, Jun 10, 2016 at 3:54 AM Mingyu Kim <mk...@palantir.com> wrote:

> // forking for clarity
>
>
>
> Related to the question I had below, I’m wondering how I can verify the
> authenticity of the submitted AM. (For example, when I’m making a call to
> the AM, I’d like to verify that I’m talking to the AM that I submitted, not
> someone else who hijacked my network traffic. Also, when the AM makes a
> callback to a server outside YARN, I’d like to verify that it’s the AM I
> submitted, not someone else who’s spoofing it.) This can generally be
> achieved by sending a secret (whether that’s a one-time secret that the
> server outside YARN can verify or an SSL keystore) to the AM. Do you know
> how one can securely send the secret to the AM? Or is there an existing
> YARN mechanism I can rely on to verify the authenticity? (I saw
> ApplicationReport.getClientToAMToken(), but that seems to be for the AM to
> verify the authenticity of the client.) Again, any pointer will be
> appreciated.
>
>
>
> Thanks,
>
> Mingyu
>
>
>
> *From: *Rohith Sharma K S <ro...@huawei.com>
> *Date: *Wednesday, June 8, 2016 at 11:15 PM
> *To: *Mingyu Kim <mk...@palantir.com>, "user@hadoop.apache.org" <
> user@hadoop.apache.org>
> *Cc: *Matt Cheah <mc...@palantir.com>
> *Subject: *RE: Securely discovering Application Master's metadata or
> sending a secret to Application Master at submission
>
>
>
> Hi
>
>
>
> Do you know how I can extend the client interface of the RPC port?
>
> >>> YARN provides the YarnClient library, which uses
> ApplicationClientProtocol. For more details, refer to
> https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html#Writing_a_simple_Client
>
>
>
> I know AM has some endpoints exposed through the RPC port for internal
> YARN communications, but was not sure how I can extend it to expose a
> custom endpoint.
>
> >>> I am not sure what you mean by internal YARN communication. The AM can
> connect to the RM only via the AM-RM interface for register/unregister and
> heartbeat, and the details sent to the RM are limited. It is up to the AM
> to expose a client interface for providing metadata.
>
> Thanks & Regards
>
> Rohith Sharma K S
>
> *From:* Mingyu Kim [mailto:mkim@palantir.com]
> *Sent:* 09 June 2016 11:21
> *To:* Rohith Sharma K S; user@hadoop.apache.org
> *Cc:* Matt Cheah
> *Subject:* Re: Securely discovering Application Master's metadata or
> sending a secret to Application Master at submission
>
>
>
> Hi Rohith,
>
>
>
> Thanks for the quick response. That sounds promising. Do you know how I
> can extend the client interface of the RPC port? I know AM has some
> endpoints exposed through the RPC port for internal YARN communications,
> but was not sure how I can extend it to expose a custom endpoint. Any
> pointer would be appreciated!
>
>
>
> Mingyu
>
>
>
> *From: *Rohith Sharma K S <ro...@huawei.com>
> *Date: *Wednesday, June 8, 2016 at 10:39 PM
> *To: *Mingyu Kim <mk...@palantir.com>, "user@hadoop.apache.org" <
> user@hadoop.apache.org>
> *Cc: *Matt Cheah <mc...@palantir.com>
> *Subject: *RE: Securely discovering Application Master's metadata or
> sending a secret to Application Master at submission
>
>
>
> Hi
>
>
>
> Apart from the AM address and tracking URL, no other metadata of the
> ApplicationMaster is stored in YARN. Maybe the AM can expose a client
> interface so that AM clients can interact with the running AM to retrieve
> specific AM details.
>
>
>
> The RPC port of the AM can be obtained from the YARN client interface, via
> ApplicationClientProtocol#getApplicationReport() or
> ApplicationClientProtocol#getApplicationAttemptReport().
>
>
>
> Thanks & Regards
>
> Rohith Sharma K S
>
>
>
> *From:* Mingyu Kim [mailto:mkim@palantir.com]
> *Sent:* 09 June 2016 10:36
> *To:* user@hadoop.apache.org
> *Cc:* Matt Cheah
> *Subject:* Securely discovering Application Master's metadata or sending
> a secret to Application Master at submission
>
>
>
> Hi all,
>
>
>
> To provide a bit of background, I’m trying to deploy a REST server on the
> Application Master and discover the randomly assigned port number securely.
> I can easily discover the host name of the AM through the YARN REST API,
> but the port number needs to be discovered separately. (The port number is
> assigned within a specified range, with retries to avoid port conflicts.)
> An easy solution would be to have the Application Master make a callback
> with the port number, but I’d like to design it such that YARN nodes don’t
> talk back to the node that submitted the YARN application. So, this problem
> reduces to securely discovering a small piece of metadata of the
> Application Master. To be clear, by being secure, I’m less concerned about
> exposing the information to others and more concerned about the integrity
> of the data (e.g. that the metadata actually originated from the
> Application Master).
>
>
>
> I was hoping that there is a way to register some Application Master
> metadata with the Resource Manager, but there doesn’t seem to be one.
> Another option I considered was to write the information to an HDFS file,
> but in order to verify the integrity of the content, I need a way to
> securely send a private key to the Application Master, and I’m not sure
> what the best way to do that is.
>
>
>
> To recap, does anyone know if there is a way
>
> · To register small metadata securely from the Application Master to the
>   Resource Manager so that it can be discovered by the YARN application
>   submitter?
>
> · Or, to securely send a private key to the Application Master at
>   application submission time?
>
>
>
> Thanks a lot,
>
> Mingyu
>

Re: Verifying the authenticity of submitted AM

Posted by Mingyu Kim <mk...@palantir.com>.
Sorry for the late response. I finally caught up on most chapters of the gitbook you linked. This was super helpful. Thanks for the pointer.

 

Just to make sure I understood it correctly,

 

1. One can securely send a secret to the AM as a command-line argument or environment variable by setting up Kerberos and setting hadoop.rpc.protection=privacy, because the application submission request blob will then be sent to the node manager encrypted.

2. A client outside YARN can make a REST call to the AM and verify the identity of the AM via SPNEGO (assuming the REST server is set up to use Kerberos).

3. A REST server outside YARN can verify the identity of the AM via SPNEGO when the AM makes a callback. However, authenticity can be verified only at the user-identity level. For example, if two applications are submitted under user A, one application can pretend to be the other because they are authenticated as the same user.
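
To make point 1 concrete, here is a minimal sketch of the client side: generating a fresh one-time secret and putting it into the environment map that would be handed to the AM's launch context. The class name, the helper methods, and the AM_ONE_TIME_SECRET variable name are all hypothetical and not YARN conventions. The sketch uses only the JDK; in a real client the map would be passed to ContainerLaunchContext.setEnvironment(...) before submission, and the transport protection discussed in point 1 is assumed rather than shown.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

class OneTimeSecret {
    // Hypothetical environment variable name; not a YARN convention.
    static final String SECRET_ENV = "AM_ONE_TIME_SECRET";

    /** Generates a fresh 256-bit secret, URL-safe Base64 encoded without padding. */
    static String newSecret() {
        byte[] raw = new byte[32];
        new SecureRandom().nextBytes(raw);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }

    /** Builds the environment map intended for the AM's launch context. */
    static Map<String, String> amEnvironment(String secret) {
        Map<String, String> env = new HashMap<>();
        env.put(SECRET_ENV, secret);
        return env;
    }

    public static void main(String[] args) {
        String secret = newSecret();
        Map<String, String> env = amEnvironment(secret);
        // Print only the key names; the secret itself should never be logged.
        System.out.println("env keys: " + env.keySet());
    }
}
```

On the AM side, the secret would then be read back with System.getenv under the same (hypothetical) variable name.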

 

So, it sounds like if I’d like to verify the authenticity of the particular AM I submitted, rather than relying on the user-identity-level check provided by SPNEGO, using option #1 to securely pass a one-time secret would be the right way to go. Please correct me if any of my understanding is wrong.
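
One way such a one-time secret could be used for per-application verification is a simple challenge-response: the verifier sends a nonce, the AM returns an HMAC over the application ID and nonce, and the verifier recomputes it. This is only a sketch under assumptions; the class and method names, the "appId|nonce" message layout, and the placeholder application ID are all hypothetical, and nonce generation and replay tracking are left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class AmProof {
    /** Computes HMAC-SHA256(secret, message). */
    static byte[] hmac(byte[] secret, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA256 unavailable", e);
        }
    }

    /** AM side: proves knowledge of the secret for this application ID and nonce. */
    static byte[] prove(byte[] secret, String appId, String nonce) {
        return hmac(secret, appId + "|" + nonce);
    }

    /** Verifier side: recomputes the tag and compares it in constant time. */
    static boolean verify(byte[] secret, String appId, String nonce, byte[] tag) {
        return MessageDigest.isEqual(prove(secret, appId, nonce), tag);
    }

    public static void main(String[] args) {
        byte[] secret = "demo-secret-not-for-production".getBytes(StandardCharsets.UTF_8);
        byte[] other = "some-other-secret".getBytes(StandardCharsets.UTF_8);
        String appId = "application_0000000000000_0001"; // placeholder ID
        byte[] tag = prove(secret, appId, "nonce-1");
        System.out.println(verify(secret, appId, "nonce-1", tag)); // prints "true"
        System.out.println(verify(other, appId, "nonce-1", tag));  // prints "false"
    }
}
```

Because each application gets its own secret, two applications submitted by the same user cannot produce each other's tags, which is the gap in the SPNEGO-only approach described in point 3.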

 

Thanks,

Mingyu

 
