Posted to common-issues@hadoop.apache.org by "Eric Payne (Jira)" <ji...@apache.org> on 2021/08/19 18:54:00 UTC

[jira] [Commented] (HADOOP-17857) Check real user ACLs in addition to proxied user ACLs

    [ https://issues.apache.org/jira/browse/HADOOP-17857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401820#comment-17401820 ] 

Eric Payne commented on HADOOP-17857:
-------------------------------------

I suggest that we define the ACLs so that a special character tells the AccessControlList system to check the ACLs for the real user and not those for the proxied user.

Let's take an example of submitting jobs to the {{dataload}} queue in the Capacity Scheduler. The use case is to allow only {{adm}} to submit to the {{dataload}} queue, but once the apps are submitted, they run as a proxy user (like {{headless1}}).
The following is the current syntax; it allows only the {{adm}} user, as itself, to submit jobs to the {{dataload}} queue:
{code:xml}
  <property>
    <name>yarn.scheduler.capacity.root.dataload.acl_submit_applications</name>
    <value>adm</value>
  </property>
{code}
With this syntax, if the {{adm}} user proxies to {{headless1}} and submits the job, the Capacity Scheduler will reject the submission because {{headless1}} does not have submit ACL permissions.

*PROPOSED CHANGES:*
- Add a tilde (~) to the beginning of the {{adm}} user in the value section of the property.
Applied to the example above, note the addition of the tilde (~):
{code:xml}
  <property>
    <name>yarn.scheduler.capacity.root.dataload.acl_submit_applications</name>
    <value>~adm</value>
  </property>
{code}
  - With the tilde (~), any app that {{adm}} submits as a proxied user will be allowed to run in the {{dataload}} queue.
  - That same proxied user will _not_ be allowed to submit by themselves if they are not first proxied by {{adm}}.
  - NOTE: with this syntax, {{adm}} will not be able to directly submit as itself to the {{dataload}} queue. In order to both submit as {{adm}} and also allow an {{adm}}-proxied user to submit to the {{dataload}} queue, both {{~adm}} and {{adm}} must be specified, as follows:
{code:xml}
  <property>
    <name>yarn.scheduler.capacity.root.dataload.acl_submit_applications</name>
    <value>~adm,adm</value>
  </property>
{code}
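
To make the three cases above concrete, here is a minimal sketch of the proposed check. It is a standalone model and not Hadoop's actual {{AccessControlList}} code: the class name {{RealUserAclSketch}} and the method {{isAllowed}} are hypothetical, and the sketch assumes a comma-separated user list only (it ignores groups and wildcards).

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the proposed "~" syntax; not Hadoop's AccessControlList. */
final class RealUserAclSketch {
    // Plain entries are matched against the effective (proxied) user.
    private final Set<String> effectiveUsers = new HashSet<>();
    // Entries prefixed with "~" are matched against the real user.
    private final Set<String> realUsers = new HashSet<>();

    RealUserAclSketch(String aclValue) {
        for (String entry : aclValue.split(",")) {
            String name = entry.trim();
            if (name.isEmpty()) {
                continue;
            }
            if (name.startsWith("~")) {
                realUsers.add(name.substring(1));
            } else {
                effectiveUsers.add(name);
            }
        }
    }

    /**
     * @param effectiveUser the user the app would run as
     * @param realUser      the user doing the proxying, or null if not proxied
     */
    boolean isAllowed(String effectiveUser, String realUser) {
        if (effectiveUsers.contains(effectiveUser)) {
            return true;
        }
        // A "~" entry only matches when there is a real user behind a proxy.
        return realUser != null && realUsers.contains(realUser);
    }
}
```

Under these assumptions, {{new RealUserAclSketch("~adm").isAllowed("headless1", "adm")}} is true, while the same ACL rejects both {{headless1}} on its own and {{adm}} submitting directly; {{"~adm,adm"}} accepts {{adm}} in both roles.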

This example could be extended to other ACL properties in other Hadoop systems.

We have been running with this change in production for over a year now, and it works well.


> Check real user ACLs in addition to proxied user ACLs
> -----------------------------------------------------
>
>                 Key: HADOOP-17857
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17857
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 3.2.2, 2.10.1, 3.3.1
>            Reporter: Eric Payne
>            Priority: Major
>
> In a secure cluster, it is possible to configure the services to allow a super-user to proxy to a regular user and perform actions on behalf of the proxied user (see [Proxy user - Superusers Acting On Behalf Of Other Users|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html]).
> This is useful for automating server access for multiple different users in a multi-tenant cluster. For example, this can be used by a super-user submitting jobs to a YARN queue, accessing HDFS files, scheduling Oozie workflows, etc., which will then execute the service as the proxied user.
> Usually when these services check ACLs to determine if the user has access to the requested resources, the service only needs to check the ACLs for the proxied user. However, it is sometimes desirable to allow the proxied user to have access to the resources when only the real user has open ACLs.
> For instance, let's say the user {{adm}} is the only user with submit ACLs to the {{dataload}} queue, and the {{adm}} user wants to submit apps to the {{dataload}} queue on behalf of users {{headless1}} and {{headless2}}. In addition, we want to be able to bill {{headless1}} and {{headless2}} separately for the YARN resources used in the {{dataload}} queue. In order to do this, the apps need to run in the {{dataload}} queue as the respective headless users. We could open up the ACLs to the {{dataload}} queue to allow {{headless1}} and {{headless2}} to submit apps. But this would allow those users to submit any app to that queue, and not be limited to just the data loading apps, and we don't trust the {{headless1}} and {{headless2}} owners to honor that restriction.
> This JIRA proposes that we define a way to set up ACLs to restrict a resource's access to a super-user, but when the access happens, run it as the proxied user.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
