Posted to user@accumulo.apache.org by John Stoneham <ly...@lyrically.net> on 2011/12/15 19:50:49 UTC

Passing scan authorizations that exceed the Accumulo user's authorizations

Hi,

I'm wondering about the expected behavior when creating a Scanner with
authorizations that exceed the Accumulo user's authorizations. (For
example, if some authentication mechanism gave a user AUTH1, AUTH2, and
AUTH3, but the particular Accumulo user in use has had AUTH3 removed from
its authorizations temporarily.) Current behavior on the 1.3 line is to
throw an exception if a scan is attempted with any authorizations which the
Accumulo user does not possess.

The docs are inconsistent. The manual on the 1.4 line reads: When a user
creates a scanner a set of Authorizations is passed. If the authorizations
passed to the scanner are not a subset of the users authorizations, then an
exception will be thrown.

However, the Javadocs on Connector.getBatchScanner read: A set of
authorization labels that will be checked against the column visibility of
each key inorder to filter data. The authorizations passed in for scanning
are intersected with the accumulo users set of authorizations. So if the
accumulo user has authorizations (A1, A2) and authorizations (A2,A3) are
passed, then (A2) will be used for the scan.
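
The Javadoc example can be worked through with plain sets. This is only an
illustrative sketch: the class AuthIntersection and its method are
hypothetical, and authorizations are modeled as Set<String> rather than
Accumulo's own Authorizations class.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Accumulo: models authorizations as plain strings.
class AuthIntersection {
    // Returns the labels present in both sets, leaving the inputs unmodified.
    static Set<String> intersect(Set<String> userAuths, Set<String> requestedAuths) {
        Set<String> result = new TreeSet<>(requestedAuths);
        result.retainAll(userAuths);
        return result;
    }
}
```

With user authorizations (A1, A2) and requested authorizations (A2, A3), the
method returns only A2, which matches the behavior the Javadoc describes.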

As an Accumulo user I'd prefer the behavior documented in getBatchScanner
(that is, intersect on the server side the Accumulo user's authorizations
with the authorizations passed). In that situation, I can safely pass all
the authorizations my end-user might have, including any unexpected new (or
dynamic) ones that weren't known when I started the application. Updating
the Accumulo user's authorizations (addition or removal) would not require
an application restart.

With the current situation, I have the following less happy scenarios to
choose from:
1) retrieve the Accumulo user's authorizations at startup and perform this
intersection logic in application code each time. New authorizations added
to the Accumulo user won't be effective until an application restart.
Authorizations removed from the Accumulo user have the potential to cause
application errors until an application restart.
2) retrieve the Accumulo user's authorizations periodically and do the
same. Same characteristics as #1 except that the time window is reduced.
3) retrieve the Accumulo user's authorizations for each scan, then
intersect myself. Adds an extra round trip to every scan, and there's a
race condition if auths are being modified simultaneously.
4) whitelist authorizations coming from authentication layer to ones I know
are on the Accumulo user, keep the whitelist config in sync with the server
config (or retrieve them at startup) and just take down the application
before any authorization changes.
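
Option 2 above could look roughly like the following. This is a sketch under
assumptions, not a recommended implementation: CachedUserAuths is a
hypothetical class, authorizations are modeled as Set<String>, and the
Supplier is assumed to wrap whatever call actually fetches the Accumulo
user's authorizations. The clock value is passed in so the staleness window
is easy to see.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

// Hypothetical sketch of option 2: cache the Accumulo user's authorizations
// and re-fetch them once the cached copy is at least maxAgeMillis old.
class CachedUserAuths {
    private final Supplier<Set<String>> fetcher; // assumed to wrap the real lookup
    private final long maxAgeMillis;
    private Set<String> cached;
    private long fetchedAt;

    CachedUserAuths(Supplier<Set<String>> fetcher, long maxAgeMillis) {
        this.fetcher = fetcher;
        this.maxAgeMillis = maxAgeMillis;
    }

    // Intersects the end user's authorizations with the (possibly stale) cached set.
    synchronized Set<String> scanAuthsFor(Set<String> endUserAuths, long nowMillis) {
        if (cached == null || nowMillis - fetchedAt >= maxAgeMillis) {
            cached = new TreeSet<>(fetcher.get());
            fetchedAt = nowMillis;
        }
        Set<String> result = new TreeSet<>(endUserAuths);
        result.retainAll(cached);
        return result;
    }
}
```

Until the cache expires, an authorization removed on the server is still
passed to the scan, so the exception described above can still occur inside
that window.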

I thought I'd ask the list for thoughts on this before I file an issue.
Perhaps there are constraints or solutions I haven't thought of.

Thanks,

- John

-- 
John Stoneham
lyric@lyrically.net

Re: Passing scan authorizations that exceed the Accumulo user's authorizations

Posted by Keith Turner <ke...@deenlo.com>.
One thought about the old behavior of intersecting scan auths with
user auths.  This is similar to a file system that gives you a
zero-length file when you try to open a file you cannot read.

Re: Passing scan authorizations that exceed the Accumulo user's authorizations

Posted by Keith Turner <ke...@deenlo.com>.
Another option is to retrieve the Accumulo user's auths when you get an
exception.  This makes the problems with keeping things in sync go away.
It would be slightly painful, though: you would probably need to catch a
RuntimeException and get its cause.  Since the scanner implements the
Iterator interface, it can only throw runtime exceptions.
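
A rough sketch of that retry idea follows. Everything named here is a
hypothetical stand-in: AuthsRejectedException takes the place of whatever
security exception the client actually wraps, the scan is modeled as a
Function over a Set<String> of authorizations, and the Supplier is assumed
to wrap the real lookup of the Accumulo user's authorizations.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for the security exception reported by the server.
class AuthsRejectedException extends Exception {
    AuthsRejectedException(String message) { super(message); }
}

class RetryingScan {
    // Runs the scan with cached auths; if it fails with a RuntimeException
    // caused by a rejection, re-fetches the user's auths and retries once.
    static <T> T scanWithRefresh(Set<String> endUserAuths,
                                 Set<String> cachedUserAuths,
                                 Supplier<Set<String>> fetchUserAuths,
                                 Function<Set<String>, T> scan) {
        try {
            return scan.apply(intersect(endUserAuths, cachedUserAuths));
        } catch (RuntimeException e) {
            if (!(e.getCause() instanceof AuthsRejectedException)) {
                throw e; // unrelated failure: do not mask it
            }
            return scan.apply(intersect(endUserAuths, fetchUserAuths.get()));
        }
    }

    private static Set<String> intersect(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }
}
```

The retry happens only once, so a second rejection (for example, if the
auths change again between the fetch and the retry) still propagates to the
caller.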

As far as changing the behavior, it is something to consider.  We
decided to go with the current behavior after many, many instances of
people not getting data back and not knowing why (silent intersection
was causing data to be dropped).  We could make the behavior
configurable: by default it throws an exception, but the user is
allowed to turn this off.  What do you think about this?  I think it's
worth opening a ticket and continuing the conversation there.
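
To make the configurable idea concrete, here is one possible shape for it.
This is purely a sketch: ScanAuthPolicy, its Mode enum, and the use of
SecurityException are assumptions for illustration, and no such setting
exists in Accumulo as far as this thread states.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a configurable check; none of these names exist in Accumulo.
class ScanAuthPolicy {
    enum Mode { STRICT, INTERSECT }

    // STRICT rejects requested auths the user lacks; INTERSECT silently drops them.
    static Set<String> resolve(Mode mode, Set<String> userAuths, Set<String> requested) {
        if (mode == Mode.STRICT && !userAuths.containsAll(requested)) {
            Set<String> missing = new TreeSet<>(requested);
            missing.removeAll(userAuths);
            throw new SecurityException("User lacks authorizations: " + missing);
        }
        Set<String> effective = new TreeSet<>(requested);
        effective.retainAll(userAuths);
        return effective;
    }
}
```

Keeping STRICT as the default preserves the loud failure that prevents
silently dropped data, while INTERSECT is an explicit opt-in.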

I think the batch scanner documentation is just wrong.  It describes how
it used to work; that's a documentation bug.  I will open a bug for this.

Re: Passing scan authorizations that exceed the Accumulo user's authorizations

Posted by John Stoneham <ly...@lyrically.net>.
On Fri, Dec 16, 2011 at 9:18 AM, Billie J Rinaldi <billie.j.rinaldi@ugov.gov> wrote:

> It sounds like you always want to scan with all the authorizations your
> user has.  In that case, you don't need a list of all possible
> authorizations to pass in -- just pass in the user's actual authorizations,
> which can be retrieved with
> connector.securityOperations().getUserAuthorizations(user).
>

Currently, I want to scan using the intersection of my (human) user's
authorizations and the Accumulo (application) user's authorizations. What
you've listed is the call I'm using to get the Accumulo user's
authorizations. But if my user were to have an authorization that, for some
reason, had been removed from the Accumulo user, or that was available on
one deployment or cluster but not on another, I'd have a problem unless I
performed this intersection in the application.

-- 
John Stoneham
lyric@lyrically.net

Re: Passing scan authorizations that exceed the Accumulo user's authorizations

Posted by Billie J Rinaldi <bi...@ugov.gov>.
On Thursday, December 15, 2011 1:50:49 PM, "John Stoneham" <ly...@lyrically.net> wrote:
> As an Accumulo user I'd prefer the behavior documented in
> getBatchScanner (that is, intersect on the server side the Accumulo
> user's authorizations with the authorizations passed). In that
> situation, I can safely pass all the authorizations my end-user might
> have, including any unexpected new (or dynamic) ones that weren't
> known when I started the application. Updating the Accumulo user's
> authorizations (addition or removal) would not require an application
> restart.

It sounds like you always want to scan with all the authorizations your user has.  In that case, you don't need a list of all possible authorizations to pass in -- just pass in the user's actual authorizations, which can be retrieved with
connector.securityOperations().getUserAuthorizations(user).

Billie