Posted to oak-dev@jackrabbit.apache.org by Jukka Zitting <ju...@gmail.com> on 2013/10/04 21:04:03 UTC

Rethinking access control evaluation

Hi,

I was looking at OAK-1046 and OAK-774, and thinking about how we could
avoid the current heavy performance hit on access control evaluation.

I think the various caching approaches suggested in the above issues
and in other discussions are just addressing the symptoms of the
problem instead of the root cause, which I believe is the way we
currently do permission lookups.

For example, consider a simple content tree with a node like
/content/site/page/paragraph, with an ACL at /content/site that grants
everyone read access to that subtree. Assuming no other applicable
ACLs, currently (AFAICT) each property access on the paragraph node
will require permission store lookups for
/content/site/page/paragraph, /content/site/page, /content/site,
/content and / for all principals of the user until a match is found
for the everyone principal at /content/site. For a simple case with
just one user and one group principal before everyone, that's still 12
failed lookups before the match is found. The number can get a lot
higher for a deeper tree or a user with more principals. And that work
is done over and over again for each property that is being accessed.
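
Roughly, the current lookup pattern is something like this (just a
sketch with made-up names, not the actual permission store code):

    // sketch only: PermissionStore, PermissionEntry and lookup() are placeholders
    boolean canRead(PermissionStore store, List<String> principals, String path) {
        for (String principal : principals) {                 // user, group, everyone
            for (String p = path; p != null; p = parentOf(p)) {
                PermissionEntry entry = store.lookup(principal, p);
                if (entry != null) {
                    return entry.isAllow();                   // first matching entry wins
                }
            }
        }
        return false;
    }

    String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return "/".equals(path) ? null : (i == 0 ? "/" : path.substring(0, i));
    }

With principals [user, group, everyone] and a single entry for everyone
at /content/site, that's 5 + 5 + 2 = 12 misses before the hit, repeated
for every property read.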

Given such an access pattern it's no wonder that the access control
evaluation is so expensive. We could of course speed things up a lot
by caching the results of frequent lookups, but I think there's a much
simpler and more elegant solution.

Instead of repeatedly looking up things from the permission store
using a sequence of (principal, ancestorPath) keys for each property
being accessed, I suggest that we collect all the potentially
applicable ACEs along the path as we descend it through successive
SecurityContext.getChildContext() calls. This way when accessing
/content/site/page/paragraph, we'd end up looking up ACEs from /,
/content, /content/site, /content/site/page and
/content/site/page/paragraph. Most of those lookups could be
short-circuited by noticing that there is no rep:policy child node.
Once that's done (and since NodeStates are immutable) we'd already
know that the only matching ACE grants us read access, and thus the
extra overhead for reading properties of the paragraph node would be
essentially zero.
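
As a rough sketch (AccessContext, ACE and readEntries() are made-up
names here, not an actual Oak API):

    // sketch only: accumulate the ACEs seen on the way down the tree
    class AccessContext {

        private final List<ACE> entries;   // ACEs collected along the path so far

        AccessContext(List<ACE> entries) {
            this.entries = entries;
        }

        AccessContext getChildContext(String name, NodeState child) {
            if (!child.hasChildNode("rep:policy")) {
                return this;               // short-circuit: no ACL here, keep parent entries
            }
            // readEntries() would parse the ACL node into ACE objects
            List<ACE> combined = new ArrayList<ACE>(entries);
            combined.addAll(readEntries(child.getChildNode("rep:policy")));
            return new AccessContext(combined);
        }
    }

Checking a property of the paragraph node then only consults the
(typically tiny) list of collected entries, with no further store
lookups.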

Such an approach adds some up-front cost in the form of the ACE
lookups during path evaluation, but that cost is quickly amortized
since it's easy to memoize the results and since typical client
access patterns are highly localized. For example after accessing the
paragraph node reading /content/site/page/paragraph2 would only
require one extra lookup for ACEs under paragraph2, which also could
be short-circuited in the common case that there's no rep:policy child
node.

BR,

Jukka Zitting

Re: Rethinking access control evaluation

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Mon, Oct 7, 2013 at 3:41 AM, Angela Schreiber <an...@adobe.com> wrote:
> we never have aces that affect just one single node.

I included it just as the simplest possible example of how custom
restrictions could be implemented with the proposed state machine
model.

> similarly it's one of the core requirements of oak that we no longer
> have a limitation on the number of access control entries
> defined in the content tree.

The algorithm I outlined is not limited by the total number of ACEs in
the tree, just by the number of ACEs in the ancestor nodes. And if we
expect that to be a limiting factor, we can always pre-sort the ACEs
by the affected principal, as is now done with the permission store.
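
Something as simple as this would do (ACE again being a placeholder for
whatever entry type we end up with):

    // sketch: group the collected entries by principal name, so that a session
    // only needs to look at the entries of its own principals
    Map<String, List<ACE>> byPrincipal = new HashMap<String, List<ACE>>();
    for (ACE ace : collectedEntries) {
        List<ACE> list = byPrincipal.get(ace.getPrincipalName());
        if (list == null) {
            list = new ArrayList<ACE>();
            byPrincipal.put(ace.getPrincipalName(), list);
        }
        list.add(ace);
    }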

> we definitely do need to support a huge amount of ac content without
> any performance drawback... it might well be that you are not aware
> of these use cases, but rest assured that they exist and that i am
> not going to invent use cases that make the whole story complicated
> if i could have it simple.

Could we have at least rough descriptions of the main use cases you're
thinking about? Just simple drafts like the paragraph example I used
should be a good start. Otherwise it's very difficult to estimate how
well a particular design will work in practice.

See http://markmail.org/message/7nn6kiotvoallrfw for my related
message months ago. If we had done that exercise back then, we could
already have spotted the current performance issue with a simple
calculation like the one I did earlier in this thread. I'd hate to
see us spend another few months implementing something different, only
to realize that it won't perform well enough for one use case or
another.

> tobi and i had a lot of discussions during the oakathon last
> week regarding access control and security requirements in general.
> no final decisions yet... but a lot of ideas that we kept collecting
> over the last couple of years.

Can you write up a quick summary of the discussions?

BR,

Jukka Zitting

Re: Rethinking access control evaluation

Posted by Angela Schreiber <an...@adobe.com>.
hi michael

i don't want to create yet another implementation that doesn't fit
our requirements... we had too many escalations due to the limitations
of jackrabbit core without being able to address them.

you have been involved in the recent discussion with our soco team
and they are definitely looking for a scalable repository without
limitations in terms of access control content.

in other words: i want to ship oak with an ac setup that covers more
than just the default setup. already with jackrabbit-core it was possible
to configure custom implementations of the permission evaluation and i
know of just one single case where this was actually considered an option.
that's why i came to the conclusion that we have to make sure that
we get rid of these limitations out of the box.

at the same time the permission eval should still be pluggable such that we
can extend or even replace the current model. but i consider this
an option for completely replacing the evaluation model (e.g.
each document comes with its own permissions and there is no
inheritance at all)... i don't want to offer pluggability as
a workaround for limitations of the implementation. that's what
we had in the past: the limitations were known and communicated even
before releasing jackrabbit 2.x and i was always told that
it was not a problem. this was simply not true... i learned my lesson :-)

kind regards
angela


On 10/7/13 9:59 AM, "Michael Marth" <mm...@adobe.com> wrote:

>Hi Jukka,
>
>you are right that the majority of repositories we see (or at least that
>I see) have few principals and few ACLs. But as Angela mentioned there is
>a not-so-small number of cases with a very large number of principals
>(>100000, e.g. a public portal or forum) and/or a large number of ACLs
>(>50000, e.g. Intranet where ACLs are not hierarchic).
>From my POV it makes sense (as it was suggested on this thread) to
>optimize for the normal case (few ACLs) out of the box, but make the ACL
>evaluation pluggable, so that different strategies could be used in the
>different scenarios.
>
>my2c
>Michael
>
>On Oct 5, 2013, at 5:31 AM, Jukka Zitting wrote:
>
>Do we have real-world examples of such ACL-heavy repositories? Do they
>also require optimum performance? I'm not aware of any such use cases,
>but that of course doesn't mean they don't exist.
>
>If possible I'd rather avoid the extra complexity and go with just a
>single strategy that's optimized for the kinds of repositories we
>normally see.
>


Re: Rethinking access control evaluation

Posted by Michael Marth <mm...@adobe.com>.
Hi Jukka,

you are right that the majority of repositories we see (or at least that I see) have few principals and few ACLs. But as Angela mentioned there is a not-so-small number of cases with a very large number of principals (>100000, e.g. a public portal or forum) and/or a large number of ACLs (>50000, e.g. Intranet where ACLs are not hierarchic).
From my POV it makes sense (as it was suggested on this thread) to optimize for the normal case (few ACLs) out of the box, but make the ACL evaluation pluggable, so that different strategies could be used in the different scenarios.

my2c
Michael

On Oct 5, 2013, at 5:31 AM, Jukka Zitting wrote:

Do we have real-world examples of such ACL-heavy repositories? Do they
also require optimum performance? I'm not aware of any such use cases,
but that of course doesn't mean they don't exist.

If possible I'd rather avoid the extra complexity and go with just a
single strategy that's optimized for the kinds of repositories we
normally see.


Re: Rethinking access control evaluation

Posted by Angela Schreiber <an...@adobe.com>.
hi jukka

we never have aces that affect just one single node. the effect
is *always* inherited to the complete subtree unless there are
additional restrictions that narrow the effect to those nodes
and properties in the subtree that match a given pattern.

this is how a backwards compatible implementation must behave
and i don't remember how many times i have already explained this.

similarly it's one of the core requirements of oak that we no longer
have a limitation on the number of access control entries
defined in the content tree. there are in fact a lot of customers
that are suffering from that limitation nowadays and it's our
own product that is limited by the current shortcoming of jackrabbit.
we definitely do need to support a huge amount of ac content without
any performance drawback... it might well be that you are not aware
of these use cases, but rest assured that they exist and that i am
not going to invent use cases that make the whole story complicated
if i could have it simple.

one additional thing we must be aware of:
SlingRepository#loginAdministrative
has finally been deprecated as it's always associated with critical
security issues. the replacement will be a setup with service
users that have dedicated permissions. in other words: we will
have a lot more access control content for all the service users
and we will have long-living service user sessions that register
observation listeners. this is a different situation than we have
nowadays where all long-living sessions are associated with the
admin session.

apart from existing requirements and constraints we already have
(there are quite a lot), i am also collecting use-cases for all
kinds of access control setup scenarios where a different access
control model might be desirable underneath a given subtree of
the repository. in other words: the new OAK setup must allow
for custom extensions and it's very likely that we will have to
add even more complexity in terms of effective policies present
at a given subtree.

tobi and i had a lot of discussions during the oakathon last
week regarding access control and security requirements in general.
no final decisions yet... but a lot of ideas that we kept collecting
over the last couple of years.

kind regards
angela


On 10/5/13 5:31 AM, "Jukka Zitting" <ju...@gmail.com> wrote:

>Hi,
>
>On Fri, Oct 4, 2013 at 8:29 PM, Tobias Bocanegra <tr...@apache.org>
>wrote:
>> that's exactly the new algorithm Angela is working on right now
>
>Great!
>
>> However, it's not so simple since we also need to consider higher
>> level ACEs with restrictions. I.e. even if you know there are no more
>> ACLs beyond a specific node, there could still be an earlier ACE
>> with a restriction that matches an item path further down.
>
>A few years ago in Apache Tika I solved a structurally similar problem
>with a state machine. The idea, applied to Oak, would be to model the
>effect of ACEs as state machines that switch from one state to another
>as we descend down the tree. Something like this:
>
>    interface EffectivePermission {
>
>        /**
>         * Returns the effect of this permission on the specified child node,
>         * or {@code null} if this permission does not apply to that subtree.
>         */
>        EffectivePermission getPermissionForChildNode(
>            String name, NodeState state);
>
>    }
>
>The effect of a normal ACE that applies to the entire subtree would
>remain unchanged for all descendants:
>
>    public EffectivePermission getPermissionForChildNode(
>            String name, NodeState state) {
>        return this;
>    }
>
>An ACE that's limited to just the node on which it was defined, would
>have no effect on descendants:
>
>    public EffectivePermission getPermissionForChildNode(
>            String name, NodeState state) {
>        return null;
>    }
>
>More complex restrictions (glob patterns, type matches, etc.) can be
>implemented with this same pattern. As an example, the mentioned
>solution in Tika implements a fairly complete XPath subset using this
>approach: https://github.com/apache/tika/tree/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath.
>
>> Also keep in mind, that the performance of the algorithm really
>> depends on the ACL distribution. for example, repositories with very
>> restrictive ACLs on every "document" would benefit from the bottom-up
>> approach. So I think that we need both and find a way to choose the
>> strategy automatically or by configuration.
>
>Do we have real-world examples of such ACL-heavy repositories? Do they
>also require optimum performance? I'm not aware of any such use cases,
>but that of course doesn't mean they don't exist.
>
>If possible I'd rather avoid the extra complexity and go with just a
>single strategy that's optimized for the kinds of repositories we
>normally see.
>
>BR,
>
>Jukka Zitting


Re: Rethinking access control evaluation

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Oct 4, 2013 at 8:29 PM, Tobias Bocanegra <tr...@apache.org> wrote:
> that's exactly the new algorithm Angela is working on right now

Great!

> However, it's not so simple since we also need to consider higher
> level ACEs with restrictions. I.e. even if you know there are no more
> ACLs beyond a specific node, there could still be an earlier ACE
> with a restriction that matches an item path further down.

A few years ago in Apache Tika I solved a structurally similar problem
with a state machine. The idea, applied to Oak, would be to model the
effect of ACEs as state machines that switch from one state to another
as we descend down the tree. Something like this:

    interface EffectivePermission {

        /**
         * Returns the effect of this permission on the specified child node,
         * or {@code null} if this permission does not apply to that subtree.
         */
        EffectivePermission getPermissionForChildNode(
            String name, NodeState state);

    }

The effect of a normal ACE that applies to the entire subtree would
remain unchanged for all descendants:

    public EffectivePermission getPermissionForChildNode(
            String name, NodeState state) {
        return this;
    }

An ACE that's limited to just the node on which it was defined, would
have no effect on descendants:

    public EffectivePermission getPermissionForChildNode(
            String name, NodeState state) {
        return null;
    }

More complex restrictions (glob patterns, type matches, etc.) can be
implemented with this same pattern. As an example, the mentioned
solution in Tika implements a fairly complete XPath subset using this
approach: https://github.com/apache/tika/tree/trunk/tika-core/src/main/java/org/apache/tika/sax/xpath.
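
For example, a restriction that only takes effect below nodes with a
given name could look roughly like this (simplified; a real rep:glob
restriction would need a proper pattern automaton):

    // sketch: an ACE restricted to subtrees below nodes named e.g. "jcr:content"
    class NameRestrictedPermission implements EffectivePermission {

        private final String restrictedName;  // e.g. "jcr:content"
        private final boolean active;         // true once inside a matching subtree

        NameRestrictedPermission(String restrictedName, boolean active) {
            this.restrictedName = restrictedName;
            this.active = active;
        }

        public EffectivePermission getPermissionForChildNode(
                String name, NodeState state) {
            if (active) {
                return this;   // already matched: effect stays the same for the subtree
            }
            if (restrictedName.equals(name)) {
                // the restriction matches here: from this child on the ACE takes effect
                return new NameRestrictedPermission(restrictedName, true);
            }
            return this;       // no match yet, but a deeper descendant may still match
        }
    }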

> Also keep in mind, that the performance of the algorithm really
> depends on the ACL distribution. for example, repositories with very
> restrictive ACLs on every "document" would benefit from the bottom-up
> approach. So I think that we need both and find a way to choose the
> strategy automatically or by configuration.

Do we have real-world examples of such ACL-heavy repositories? Do they
also require optimum performance? I'm not aware of any such use cases,
but that of course doesn't mean they don't exist.

If possible I'd rather avoid the extra complexity and go with just a
single strategy that's optimized for the kinds of repositories we
normally see.

BR,

Jukka Zitting

Re: Rethinking access control evaluation

Posted by Tobias Bocanegra <tr...@apache.org>.
Hi Jukka,
that's exactly the new algorithm Angela is working on right now and we
think that the cumulative ACL aggregation is indeed cheaper than the
current bottom-up approach.
However, it's not so simple since we also need to consider higher
level ACEs with restrictions. I.e. even if you know there are no more
ACLs beyond a specific node, there could still be an earlier ACE
with a restriction that matches an item path further down.
for example you could have an ACE on the root node that denies read
access on all jcr:content/jcr:title properties for a user.
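
In JCR terms such an entry could be set up roughly like this (given a
session and the target principal; the exact rep:glob value is only
indicative and would need checking):

    // sketch: deny read on jcr:content/jcr:title properties below the root
    // for the given principal, using a rep:glob restriction
    JackrabbitAccessControlList acl =
            AccessControlUtils.getAccessControlList(session, "/");
    Privilege[] read =
            AccessControlUtils.privilegesFromNames(session, Privilege.JCR_READ);
    Map<String, Value> restrictions = Collections.singletonMap(
            "rep:glob", session.getValueFactory().createValue("*/jcr:content/jcr:title"));
    acl.addEntry(principal, read, false /* deny */, restrictions);
    session.getAccessControlManager().setPolicy("/", acl);
    session.save();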

Also keep in mind, that the performance of the algorithm really
depends on the ACL distribution. for example, repositories with very
restrictive ACLs on every "document" would benefit from the bottom-up
approach. So I think that we need both and find a way to choose the
strategy automatically or by configuration.

Regards, Toby

On Fri, Oct 4, 2013 at 12:04 PM, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> I was looking at OAK-1046 and OAK-774, and thinking about how we could
> avoid the current heavy performance hit on access control evaluation.
>
> I think the various caching approaches suggested in the above issues
> and in other discussions are just addressing the symptoms of the
> problem instead of the root cause, which I believe is the way we
> currently do permission lookups.
>
> For example, consider a simple content tree with a node like
> /content/site/page/paragraph, with an ACL at /content/site that grants
> everyone read access to that subtree. Assuming no other applicable
> ACLs, currently (AFAICT) each property access on the paragraph node
> will require permission store lookups for
> /content/site/page/paragraph, /content/site/page, /content/site,
> /content and / for all principals of the user until a match is found
> for the everyone principal at /content/site. For a simple case with
> just one user and one group principal before everyone, that's still 12
> failed lookups before the match is found. The number can get a lot
> higher for a deeper tree or a user with more principals. And that work
> is done over and over again for each property that is being accessed.
>
> Given such an access pattern it's no wonder that the access control
> evaluation is so expensive. We could of course speed things up a lot
> by caching the results of frequent lookups, but I think there's a much
> simpler and more elegant solution.
>
> Instead of repeatedly looking up things from the permission store
> using a sequence of (principal, ancestorPath) keys for each property
> being accessed, I suggest that we collect all the potentially
> applicable ACEs along the path as we descend it through successive
> SecurityContext.getChildContext() calls. This way when accessing
> /content/site/page/paragraph, we'd end up looking up ACEs from /,
> /content, /content/site, /content/site/page and
> /content/site/page/paragraph. Most of those lookups could be
> short-circuited by noticing that there is no rep:policy child node.
> Once that's done (and since NodeStates are immutable) we'd already
> know that the only matching ACE grants us read access, and thus the
> extra overhead for reading properties of the paragraph node would be
> essentially zero.
>
> Such an approach adds some up-front cost in the form of the ACE
> lookups during path evaluation, but that cost is quickly amortized
> since it's easy to memoize the results and since typical client
> access patterns are highly localized. For example after accessing the
> paragraph node reading /content/site/page/paragraph2 would only
> require one extra lookup for ACEs under paragraph2, which also could
> be short-circuited in the common case that there's no rep:policy child
> node.
>
> BR,
>
> Jukka Zitting