Posted to dev@knox.apache.org by larry mccay <lm...@apache.org> on 2022/02/05 18:36:11 UTC

[DISCUSS] KIP-16 - Virtual Groups in Apache Knox

All -

I've thrown together a proposal KIP for adding Virtual Group Mapping to
Knox identity assertion providers. [1]

Being able to create virtual groups based on aspects of the established
security context (identity, group memberships, and attributes from the
request or other sources) will enable a number of new capabilities, such
as more advanced and dynamic authorization policies and ACLs, custom
routing, throttling, QoS levels, etc.

Thoughts?

thanks,

--larry


[1] https://cwiki.apache.org/confluence/display/KNOX/KIP-16+-+Virtual+Groups+in+Apache+Knox

Re: [DISCUSS] KIP-16 - Virtual Groups in Apache Knox

Posted by Attila Magyar <am...@cloudera.com.INVALID>.
I came up with a few examples to demonstrate this language. This
is effectively a tiny Lisp-like expression evaluator. It has very minimal
syntax and simple semantics.

    <param>
        <!--
            1. Users lmccay and pzampino are explicitly added to the
               Virtual Group (datalake-admin1) regardless of their LDAP groups
            2. All users that are members of both the admin and datalake
               LDAP groups will be added to the Virtual Group
        -->
        <name>virtual.group.mapping.datalake-admin1</name>
        <value>
            (or
                (and
                    (member 'admin')
                    (member 'datalake'))
                (or
                    (username 'lmccay')
                    (username 'pzampino')))
        </value>
    </param>

    <param>
        <!-- Add all users to datalake-admin2 virtual group -->
        <name>virtual.group.mapping.datalake-admin2</name>
        <value>true</value>
    </param>

    <param>
        <!--
            Add all users who are a member of any group to datalake-admin3
            virtual group.
            Alternatively we could also express this the following way:
            (member '*')
        -->
        <name>virtual.group.mapping.datalake-admin3</name>
        <value>(!= 0 (size groups))</value>
    </param>


    <!--
        Any user that is a member of the group of the same name as their
        username is added to datalake-admin4
    -->
    <param>
        <name>virtual.group.mapping.datalake-admin4</name>
        <value>(member username)</value>
    </param>


Checking session attributes:
    (= (session 'name') 'value')

Checking request header or request method:
    (= (request-header 'name') 'value')
    (= (request-method 'name') 'value')

A few notes about the syntax:
  * Everything in the language is either a list or an atom.
  * A list - denoted as ( ... ) - contains atoms or other lists.
  * The general evaluation rule: the head of the list is the name of a
    function, the rest are the parameters. Before we call the function we
    evaluate the parameters (recursively).
  * An atom can be a symbol (name of a variable/constant), a string, a
    number or a boolean.
  * The "or" and "and" are short-circuit conditionals (Lisp calls them
    special forms). Their arguments are not evaluated before the call,
    because we want to stop the evaluation as soon as we can calculate the
    final result.
       E.g.: (or true false false false true) - here we can stop at the
       first true value.
  * The "or" and "and" support a variable number of arguments.
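
As an illustration only, the rules above can be sketched in Python roughly
as follows. This is not the POC mentioned in this thread. The tuple-based
representation of atoms and the helper names (tokenize, parse, evaluate,
make_context, in_virtual_group) are assumptions made for the example, and
wildcards such as (member '*') are not handled.

```python
import re

# Tokens: parentheses, single-quoted strings, or runs of other characters.
TOKEN = re.compile(r"\(|\)|'[^']*'|[^\s()']+")


def tokenize(text):
    return TOKEN.findall(text)


def parse(tokens):
    """Parse one expression, consuming tokens from the front of the list."""
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    if tok.startswith("'"):
        return ("str", tok[1:-1])
    if tok in ("true", "false"):
        return tok == "true"
    if tok.lstrip("-").isdigit():
        return int(tok)
    return ("sym", tok)


def evaluate(expr, ctx):
    if isinstance(expr, tuple):  # string literal or symbol
        kind, value = expr
        return value if kind == "str" else ctx["vars"][value]
    if not isinstance(expr, list):  # number or boolean
        return expr
    head, args = expr[0][1], expr[1:]
    # "or" / "and" are special forms: arguments are evaluated lazily.
    if head == "or":
        return any(evaluate(a, ctx) for a in args)
    if head == "and":
        return all(evaluate(a, ctx) for a in args)
    # Ordinary function: evaluate every argument first, then call.
    return ctx["funcs"][head](*[evaluate(a, ctx) for a in args])


def make_context(username, groups):
    return {
        "vars": {"username": username, "groups": groups},
        "funcs": {
            "member": lambda g: g in groups,
            "username": lambda u: u == username,
            "size": len,
            "=": lambda a, b: a == b,
            "!=": lambda a, b: a != b,
        },
    }


def in_virtual_group(expression, username, groups):
    ctx = make_context(username, groups)
    return bool(evaluate(parse(tokenize(expression)), ctx))
```

With this sketch, in_virtual_group("(member username)", "bob", ["bob"])
evaluates the datalake-admin4 predicate for a user named bob. Because
any() and all() consume their generators lazily, "or" and "and" stop at
the first deciding argument.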

Variables and functions can have the same name without clashing. In the
above examples username is both a function and a variable. So it's a
Lisp-2, not a Lisp-1. However, this isn't strictly necessary.
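
One way to picture the separate namespaces, as a hedged Python sketch: a
symbol in head position is looked up among the functions, anywhere else
among the variables. The lookup helper, the two dictionaries and the
sample group set are hypothetical and exist only for this example.

```python
def lookup(symbol, position, functions, variables):
    """Resolve a symbol by where it appears: list head -> function, else variable."""
    table = functions if position == "head" else variables
    return table[symbol]


variables = {"username": "lmccay"}
functions = {
    "username": lambda expected: expected == variables["username"],
    "member": lambda group: group in {"lmccay", "admin"},  # sample groups
}

# (username 'lmccay'): head position, so `username` is the function.
is_lmccay = lookup("username", "head", functions, variables)("lmccay")

# (member username): argument position, so `username` is the variable.
in_own_group = lookup("member", "head", functions, variables)(
    lookup("username", "arg", functions, variables)
)
```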

The whole thing is tiny. In my POC implementation, the parser is about 70
lines of code and the interpreter is about 80. The language is also easy
to extend to meet further business needs.


On Tue, Feb 8, 2022 at 6:05 PM larry mccay <lm...@apache.org> wrote:

> Agreed, I think these are some great suggestions.
> Couple things to consider:
>
> 1. management app abilities to add individual params per mapping - think
> Ambari and CM
> 2. parsing complexity - I found the splitting on ORs simplified what
> needed to be done there
> 3. whether we need to front load simple cases to meet some specific needs



Re: [DISCUSS] KIP-16 - Virtual Groups in Apache Knox

Posted by larry mccay <lm...@apache.org>.
Agreed, I think these are some great suggestions.
Couple things to consider:

1. management app abilities to add individual params per mapping - think
Ambari and CM
2. parsing complexity - I found the splitting on ORs simplified what
needed to be done there
3. whether we need to front load simple cases to meet some specific needs


On Tue, Feb 8, 2022 at 8:57 AM Phil Zampino <pz...@apache.org> wrote:

> I personally prefer the provider-param-per-virtual-group concept over the
> one param for all group mappings approach. I think it is more readable and
> easier to modify with less potential for error.
> However, we should thoroughly think through any implications this choice
> may have for Knox deployments.
>
> I also like the prefix notation for describing the virtual group
> associations, whether they are specified in one or numerous params.

Re: [DISCUSS] KIP-16 - Virtual Groups in Apache Knox

Posted by Phil Zampino <pz...@apache.org>.
I personally prefer the provider-param-per-virtual-group concept over the
one-param-for-all-group-mappings approach. I think it is more readable and
easier to modify, with less potential for error.
However, we should thoroughly think through any implications this choice
may have for Knox deployments.

I also like the prefix notation for describing the virtual group
associations, whether they are specified in one or numerous params.

On Tue, Feb 8, 2022 at 6:47 AM Attila Magyar <am...@cloudera.com.invalid>
wrote:

> I find this part a little bit difficult to read.
>
>     <param>
>         <name>virtual.group.mapping</name>
>         <value>user:lmccay,pzampino=datalake-admin||group:admin&&group:datalake=datalake-admin</value>
>     </param>
>
> Maybe it would be simpler if we put the virtual group name into the
> property param name:
>
>     <param>
>         <name>virtual.group.mapping.datalake-admin</name>
>         <value>user:lmccay,pzampino||group:admin&&group:datalake</value>
>     </param>
>
> This way the value only contains the predicate while the virtual group name
> comes from the property name.
>
> If we want to support arbitrary logical expressions, we should consider
> using prefix or postfix notation instead of infix. With infix we need to
> deal with operator precedences and parentheses. Parsing logic can be
> significantly simpler with pre- and postfix notation.
>
> LDAP search filters
> <https://confluence.atlassian.com/kb/how-to-write-ldap-search-filters-792496933.html>
> also use prefix notation so users might be already familiar with that.
>
> With prefix notation the above example would look something like this:
>
>     <param>
>         <name>virtual.group.mapping.datalake-admin</name>
>         <value>(or (and group:admin group:datalake) user:lmccay,pzampino)</value>
>     </param>
>
> Or a more explicit and flexible version would look like this:
>
> (or
>     (and
>         (= group "admin")
>         (= group "datalake"))
>     (or
>         (= user "lmccay")
>         (= user "pzampino")))
>
> You can write it in one line, I just formatted it for ease of readability.
>
> This syntax is unambiguous and trivial to parse and interpret. We don't
> need to deal with precedence rules. Plus it allows us to extend the syntax
> further with no limitations.
>
> For example:
>
>    (or (starts-with "aprefix-" group) (matches "user[0-9]+" user))
>
> This is just a demonstration on language extension, with regexp and string
> manipulation support.
>
>
> Even if we don't need this amount of flexibility I think we should still
> consider this because of the low implementation cost.
>
> Postfix notation is even simpler to implement but not widely used so some
> users might find it unusual. Other than this it has the same
> characteristics as the prefix version.
>
> group "admin" = group "datalake" = and user "lmccay" = user "pzampino" = or or
>
> Thoughts?

Re: [DISCUSS] KIP-16 - Virtual Groups in Apache Knox

Posted by Attila Magyar <am...@cloudera.com.INVALID>.
I find this part a little bit difficult to read.

    <param>
        <name>virtual.group.mapping</name>
        <value>user:lmccay,pzampino=datalake-admin||group:admin&&group:datalake=datalake-admin</value>
    </param>

Maybe it would be simpler if we put the virtual group name into the
property param name:

    <param>
        <name>virtual.group.mapping.datalake-admin</name>
        <value>user:lmccay,pzampino||group:admin&&group:datalake</value>
    </param>

This way the value only contains the predicate while the virtual group name
comes from the property name.
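
A rough sketch of how the group name could be split off the property
name, written in Python for brevity (Knox itself is Java). The PREFIX
constant, the function name and the plain dictionary standing in for the
provider params are all assumptions for the example, not existing Knox
code.

```python
PREFIX = "virtual.group.mapping."


def virtual_group_predicates(params):
    """Map each virtual group name to its predicate string.

    Only params whose name starts with PREFIX and has a non-empty
    remainder are treated as virtual group mappings.
    """
    return {
        name[len(PREFIX):]: value.strip()
        for name, value in params.items()
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


params = {
    "virtual.group.mapping.datalake-admin":
        "user:lmccay,pzampino||group:admin&&group:datalake",
    "some.other.param": "ignored",
}
mappings = virtual_group_predicates(params)
```

Here mappings holds a single entry keyed by datalake-admin, and unrelated
params are skipped.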

If we want to support arbitrary logical expressions, we should consider
using prefix or postfix notation instead of infix. With infix we need to
deal with operator precedence and parentheses. Parsing logic can be
significantly simpler with prefix or postfix notation.

LDAP search filters
<https://confluence.atlassian.com/kb/how-to-write-ldap-search-filters-792496933.html>
also use prefix notation, so users might already be familiar with it.

With prefix notation the above example would look something like this:

    <param>
        <name>virtual.group.mapping.datalake-admin</name>
        <value>(or (and group:admin group:datalake) user:lmccay,pzampino)</value>
    </param>

Or a more explicit and flexible version would look like this:

(or
    (and
        (= group "admin")
        (= group "datalake"))
    (or
        (= user "lmccay")
        (= user "pzampino")))

You can write it on one line; I just formatted it for readability.

This syntax is unambiguous and trivial to parse and interpret. We don't
need to deal with precedence rules. Plus it allows us to extend the syntax
further with no limitations.

For example:

   (or (starts-with "aprefix-" group) (matches "user[0-9]+" user))

This is just a demonstration of language extension, with regexp and string
manipulation support.
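
To give a feel for what such extensions might involve, here is a hedged
Python sketch of the two functions as plain callables in a lookup table.
The semantics are assumptions: starts-with is taken to succeed if any of
the user's groups has the prefix, and matches requires the whole string
to match (as Java's String.matches does). The EXTENSIONS table and the
example helper are hypothetical.

```python
import re


def starts_with(prefix, values):
    """True if any value begins with prefix; a lone string counts as one value."""
    if isinstance(values, str):
        values = [values]
    return any(v.startswith(prefix) for v in values)


def matches(pattern, value):
    """True if the whole value matches the regular expression."""
    return re.fullmatch(pattern, value) is not None


# Extending the language amounts to adding entries to the function table.
EXTENSIONS = {"starts-with": starts_with, "matches": matches}


def example(user, groups):
    # Hand-evaluated form of:
    # (or (starts-with "aprefix-" group) (matches "user[0-9]+" user))
    return (EXTENSIONS["starts-with"]("aprefix-", groups)
            or EXTENSIONS["matches"]("user[0-9]+", user))
```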


Even if we don't need this amount of flexibility I think we should still
consider this because of the low implementation cost.

Postfix notation is even simpler to implement but not widely used so some
users might find it unusual. Other than this it has the same
characteristics as the prefix version.

    group "admin" = group "datalake" = and user "lmccay" = user "pzampino" = or or
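
For illustration, a postfix expression like this one can be evaluated
with a single stack, sketched here in Python. The sketch assumes that
tokens are separated by whitespace, that string literals contain no
spaces, and that "=" applied to group means "is a member of that group";
none of this is specified in the thread, and the function name is
hypothetical.

```python
def eval_postfix(expression, user, groups):
    """Evaluate a whitespace-separated postfix predicate with a stack."""
    stack = []
    for tok in expression.split():
        if tok.startswith('"'):
            stack.append(tok.strip('"'))      # string literal
        elif tok == "user":
            stack.append(user)
        elif tok == "group":
            stack.append(frozenset(groups))   # all groups of the user
        elif tok == "=":
            right, left = stack.pop(), stack.pop()
            if isinstance(left, frozenset):
                stack.append(right in left)   # membership test for groups
            else:
                stack.append(left == right)
        elif tok in ("and", "or"):
            right, left = stack.pop(), stack.pop()
            stack.append((left and right) if tok == "and" else (left or right))
        else:
            raise ValueError(f"unknown token: {tok}")
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]
```

Note that in this form both operands of "and" and "or" are already on
the stack when the operator is reached, so there is no short-circuiting
as there is in the prefix version.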

Thoughts?



On Sat, Feb 5, 2022 at 7:36 PM larry mccay <lm...@apache.org> wrote:

> All -
>
> I've thrown together a proposal KIP for adding Virtual Group Mapping to
> Knox identity assertion providers. [1]
>
> Being able to create virtual groups based on aspects of the established
> security context (identity, group memberships, and attributes from the
> request or other sources) will enable a number of new capabilities, such
> as more advanced and dynamic authorization policies and ACLs, custom
> routing, throttling, QoS levels, etc.
>
> Thoughts?
>
> thanks,
>
> --larry
>
>
> [1] https://cwiki.apache.org/confluence/display/KNOX/KIP-16+-+Virtual+Groups+in+Apache+Knox
>

