Posted to user@hadoop.apache.org by Blaze Spinnaker <bl...@gmail.com> on 2017/10/30 19:28:50 UTC

Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

Hi,

We are submitting critical UserGroupInformation credentials and wanted to
know how these are protected in a Spark cluster.

Questions:

Are the credentials persisted to disk at any point?  If so, where?
If they are persisted, are they encrypted or just obfuscated? Is the
encryption key accessible?
Are they only protected by file permissions?

Are they only in memory?

How would you securely propagate UGI / credentials to Spark executors?

Regards,

Tim

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

Posted by Ravi Prakash <ra...@gmail.com>.
Hi Blaze!

Thanks for the link, although it did not have anything I didn't already
know. I'm afraid I don't quite follow what your concern is here. The files
are protected using UNIX permissions on the worker nodes. Is that not what
you are seeing? Are you using the LinuxContainerExecutor? Are the YARN
containers running as the user who launched the YARN application? Are you
saying the permissions on the file should be different?

Ravi
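
The permission question above can be checked directly on a worker node. The following is a minimal sketch, not part of the original thread: it assumes the container's token file is exposed through the HADOOP_TOKEN_FILE_LOCATION environment variable (the name Hadoop's UserGroupInformation uses), and the helper name describe_token_file_access is hypothetical. It only reports the UNIX mode bits; it says nothing about who can become the owning user.

```python
import os
import stat


def describe_token_file_access(path):
    """Report owner and read permissions of the file at `path` from its mode bits."""
    st = os.stat(path)
    mode = st.st_mode
    return {
        "owner_uid": st.st_uid,
        "mode": stat.filemode(mode),                  # e.g. "-rw-------"
        "group_readable": bool(mode & stat.S_IRGRP),  # readable by the file's group
        "world_readable": bool(mode & stat.S_IROTH),  # readable by any local user
    }


if __name__ == "__main__":
    # Assumption: inside a YARN container this variable points at container_tokens.
    token_path = os.environ.get("HADOOP_TOKEN_FILE_LOCATION")
    if token_path and os.path.exists(token_path):
        print(describe_token_file_access(token_path))
    else:
        print("HADOOP_TOKEN_FILE_LOCATION is not set; run this inside a container")
```

If group_readable or world_readable comes back true for the token file, other local accounts could read the tokens, which is the situation the questions above are probing.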

On Mon, Oct 30, 2017 at 9:22 PM, Blaze Spinnaker <bl...@gmail.com>
wrote:

> Ravi,
>
> The code and architecture are based on the Hadoop source code submitted
> through the YARN client. This is an issue for MapReduce as well, e.g.:
> https://pravinchavan.wordpress.com/2013/04/25/223/

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

Posted by Blaze Spinnaker <bl...@gmail.com>.
Ravi,

The code and architecture are based on the Hadoop source code submitted
through the YARN client. This is an issue for MapReduce as well, e.g.:
https://pravinchavan.wordpress.com/2013/04/25/223/

On Mon, Oct 30, 2017 at 1:15 PM, Ravi Prakash <ra...@gmail.com> wrote:

> Hi Blaze!
>
> Thanks for digging into this. I'm sure security-related features could use
> more attention. Tokens for one user should be isolated from other users.
> I'm sorry I don't know how Spark uses them.
>
> Would this question be more appropriate on the Spark mailing list?
> https://spark.apache.org/community.html
>
> Thanks
> Ravi

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

Posted by Ravi Prakash <ra...@gmail.com>.
Hi Blaze!

Thanks for digging into this. I'm sure security-related features could use
more attention. Tokens for one user should be isolated from other users.
I'm sorry I don't know how Spark uses them.

Would this question be more appropriate on the Spark mailing list?
https://spark.apache.org/community.html

Thanks
Ravi

On Mon, Oct 30, 2017 at 12:43 PM, Blaze Spinnaker <bl...@gmail.com>
wrote:

> I looked at this a bit more and I see a container_tokens file in the Spark
> directory. Does this contain the credentials that are added by
> addCredentials? Is this file accessible to the Spark executors?
>
> It looks like just a cleartext protobuf file.
>
> https://github.com/apache/hadoop/blob/82cb2a6497caa7c5e693aa41ad18e92f1c7eb16a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Credentials.java#L221
>
> This means that anyone with access to the user can read credentials from
> any other user.  Correct?

Re: Vulnerabilities to UserGroupInformation / credentials in a Spark Cluster

Posted by Blaze Spinnaker <bl...@gmail.com>.
I looked at this a bit more and I see a container_tokens file in the Spark
directory. Does this contain the credentials that are added by
addCredentials? Is this file accessible to the Spark executors?

It looks like just a cleartext protobuf file.

https://github.com/apache/hadoop/blob/82cb2a6497caa7c5e693aa41ad18e92f1c7eb16a/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Credentials.java#L221

This means that anyone with access to the user can read credentials from
any other user.  Correct?
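
One way to test the "cleartext file" observation is to look at the file header. The sketch below is an illustration under stated assumptions, not something from the thread: it assumes the token storage format written by the linked Credentials.java starts with the 4-byte magic "HDTS" followed by a one-byte format marker (0 for the older Writable layout, 1 for protobuf), which should be verified against the Hadoop version in use. The function name sniff_token_storage is hypothetical.

```python
# Assumption: magic bytes and format markers as read from Hadoop's Credentials.java;
# confirm them against the Hadoop release actually deployed.
TOKEN_STORAGE_MAGIC = b"HDTS"
KNOWN_FORMATS = {0: "Writable", 1: "protobuf"}


def sniff_token_storage(data):
    """Classify the leading bytes of a candidate token storage file."""
    if len(data) < 5 or data[:4] != TOKEN_STORAGE_MAGIC:
        return "not a token storage file"
    marker = data[4]  # indexing bytes yields an int in Python 3
    name = KNOWN_FORMATS.get(marker, "unknown (%d)" % marker)
    return "token storage, format: " + name


def sniff_token_storage_file(path):
    """Read only the 5-byte header of `path` and classify it."""
    with open(path, "rb") as handle:
        return sniff_token_storage(handle.read(5))
```

If the header matches, the token contents that follow are serialized rather than encrypted, so whoever can read the file can read the tokens. That makes the file's ownership and mode bits the control that matters.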

On Mon, Oct 30, 2017 at 12:28 PM, Blaze Spinnaker <bl...@gmail.com>
wrote:

> Hi,
>
> We are submitting critical UserGroupInformation credentials and wanted to
> know how these are protected in a Spark cluster.
>
> Questions:
>
> Are the credentials persisted to disk at any point?  If so, where?
> If they are persisted, are they encrypted or just obfuscated? Is the
> encryption key accessible?
> Are they only protected by file permissions?
>
> Are they only in memory?
>
> How would you securely propagate UGI / credentials to Spark executors?
>
> Regards,
>
> Tim
>