Posted to dev@hbase.apache.org by Marcelo Vanzin <va...@cloudera.com> on 2012/07/12 22:26:09 UTC

Enhancing AccessController

Hi all,

I've been doing some work around HBase security and I've identified a
few enhancements that would make AccessController provide better audit
information for people interested in that. There are three different
items which are not necessarily related to each other.


(i) Lack of column family information in audit logs

I'd actually consider this a bug, more than an enhancement. The
culprit here is AccessController.permissionGranted(). If you look at
that method, it will generate audit events containing column family /
qualifier information only when the required permission for that
specific family / qualifier is denied. If permission is granted, you
get an audit message that contains the table name, but no family /
qualifier.

I think the correct thing here would be to list the affected family /
qualifiers when permission is granted, too. In the deny case, it's a
little bit hazier: do we list all families / qualifiers, the first one
to have a permission denied, or all families / qualifiers for which
permission was denied?

This question is also mildly related to (iii) below.
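
To make the granted case concrete, here is a minimal Java sketch of an
audit line that carries family / qualifier information regardless of the
outcome. The class AuditMessageSketch, its format() method and the
message layout are all hypothetical; this is not how
AccessController.permissionGranted() is written, just an illustration
of the kind of output I have in mind.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical helper, not part of HBase: builds one audit line that
// includes the affected families / qualifiers for both outcomes.
final class AuditMessageSketch {
    private AuditMessageSketch() {}

    // families maps a column family name to its qualifiers; an empty or
    // null qualifier list means the whole family was touched.
    static String format(boolean allowed, String user, String action,
                         String table, Map<String, List<String>> families) {
        StringBuilder sb = new StringBuilder();
        sb.append(allowed ? "granted" : "denied")
          .append("; user=").append(user)
          .append("; action=").append(action)
          .append("; table=").append(table);
        if (families != null && !families.isEmpty()) {
            StringJoiner columns = new StringJoiner(",");
            // Copy into a TreeMap so the output order is stable.
            for (Map.Entry<String, List<String>> e
                    : new TreeMap<>(families).entrySet()) {
                List<String> qualifiers = e.getValue();
                if (qualifiers == null || qualifiers.isEmpty()) {
                    columns.add(e.getKey());
                } else {
                    for (String q : qualifiers) {
                        columns.add(e.getKey() + ":" + q);
                    }
                }
            }
            sb.append("; columns=").append(columns);
        }
        return sb.toString();
    }
}
```

Which families to pass in for the deny case is exactly the open
question above; the sketch simply prints whatever it is given.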


(ii) The access controller does not work if authentication is disabled.

This sounds obvious, right? Why would you need an access controller if
there is no security configured?

I think it would still be useful to collect auditable events in this
case. The user information will be bogus or non-existent, but at least
anyone interested will be able to collect access information for their
service. This could be done by creating another coprocessor, but that
would require replicating some of the logic in the access controller (to
decide which permissions to log in the audit event), or refactoring
that logic into a helper class or something.
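
As a rough sketch of the "helper class" option: the request-to-permission
table could live in one small class that both the access controller and
an audit-only coprocessor consult. Everything below (RequestKind,
RequiredAction, RequiredPermissions) is a hypothetical name, and the
four mappings are illustrative rather than copied from the
AccessController source.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical request kinds; a real coprocessor sees many more hooks.
enum RequestKind { GET, SCAN, PUT, DELETE }

// Hypothetical stand-in for the permission actions being checked.
enum RequiredAction { READ, WRITE }

// Hypothetical helper keeping the request-to-permission table in one place.
final class RequiredPermissions {
    private static final Map<RequestKind, RequiredAction> TABLE;

    static {
        Map<RequestKind, RequiredAction> m = new EnumMap<>(RequestKind.class);
        m.put(RequestKind.GET, RequiredAction.READ);
        m.put(RequestKind.SCAN, RequiredAction.READ);
        m.put(RequestKind.PUT, RequiredAction.WRITE);
        m.put(RequestKind.DELETE, RequiredAction.WRITE);
        TABLE = Collections.unmodifiableMap(m);
    }

    private RequiredPermissions() {}

    // Returns the action an audit event should record for this request.
    static RequiredAction forRequest(RequestKind kind) {
        RequiredAction action = TABLE.get(kind);
        if (action == null) {
            throw new IllegalArgumentException("No mapping for " + kind);
        }
        return action;
    }
}
```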


(iii) There's no easy way to customize processing of audit events.

Audit events are written to a log appender in a private method in
AccessController.java; this means anyone who wants something
different, like writing this data to a database, has to go through the
logging system to do it. This is sub-optimal since it means having to
parse a log message, and potentially losing information in the
process.

My preferred approach is to separate audit event creation
(AccessController.java) from audit event storage (currently also in
AccessController.java) by means of an "audit logger" interface. A new
config option can tell the AccessController to instantiate one (or
more, although I don't see much use in that) "audit logger", and it
would then call that logger instead of (or in addition to) sending the
log message to the logging subsystem. I actually have a working
prototype for this approach on top of HBase 0.92; I can post the patch
somewhere if anyone is interested.
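
For illustration only, the shape of that interface could be something
like the following. This is not the prototype itself: AuditEvent,
AuditLogger, InMemoryAuditLogger, AuditLoggers and the config key are
all hypothetical names, and java.util.Properties stands in for the
Hadoop Configuration object so that the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Properties;

// Hypothetical typed event handed to loggers instead of a formatted string.
final class AuditEvent {
    final boolean allowed;
    final String user;
    final String action;
    final String table;

    AuditEvent(boolean allowed, String user, String action, String table) {
        this.allowed = allowed;
        this.user = user;
        this.action = action;
        this.table = table;
    }
}

// Hypothetical "audit logger" interface: storage is decoupled from creation.
interface AuditLogger {
    void log(AuditEvent event);
}

// Example implementation; a database writer would look the same from outside.
final class InMemoryAuditLogger implements AuditLogger {
    final List<AuditEvent> events = new ArrayList<>();

    @Override
    public void log(AuditEvent event) {
        events.add(event);
    }
}

final class AuditLoggers {
    // Hypothetical config key, not an existing HBase option.
    static final String LOGGER_CLASS_KEY = "example.audit.logger.class";

    private AuditLoggers() {}

    // Instantiates the configured logger, or returns empty when the key is
    // unset, in which case only the existing logging path would be used.
    static Optional<AuditLogger> fromConfig(Properties conf) {
        String className = conf.getProperty(LOGGER_CLASS_KEY);
        if (className == null) {
            return Optional.empty();
        }
        try {
            AuditLogger logger = Class.forName(className)
                .asSubclass(AuditLogger.class)
                .getDeclaredConstructor()
                .newInstance();
            return Optional.of(logger);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                "Cannot instantiate audit logger " + className, e);
        }
    }
}
```

The point of passing a typed event rather than a string is that the
implementation never has to parse a log message to get at the fields.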

A different approach would be to make logResult() in AccessController
protected, so that it can be subclassed, achieving similar
functionality. But I don't like how this would create tight coupling
between AccessController and the audit logging code.


So I think that covers what I've been looking at; sorry for the
long-ish e-mail. Feedback always welcome.

-- 
Marcelo

Re: Enhancing AccessController

Posted by Marcelo Vanzin <va...@cloudera.com>.
Hello,

On Thu, Jul 12, 2012 at 3:54 PM, Andrew Purtell <ap...@apache.org> wrote:
>> For example, HDFS logs audit messages at INFO level today (IIRC), while HBase does so at TRACE level.
> This has been fixed.

Ah, good to know. It seems our git mirrors are a little bit out of date.

>> Well, the logging path wouldn't go away; this would just be an
>> extension for people who have more complicated needs than just
>> writing to log files. We're looking at maybe providing a similar thing
>> for HDFS. In the end, we don't want the easy way to be any different
>> than it is today, but at the same time have a system where doing more
>> complicated things is possible.
>
> This is the right approach, IMHO: build it into Hadoop core and then
> we can use it in a manner consistent with how core does it.

My concern with trying to come up with a common solution for core
Hadoop and HBase is that the data being logged is fundamentally
different. Sure, you could have a silly logger that just takes a
string, but that's no better than hacking through the logging system,
which can be done today.

A proper interface would have proper types provided to the logger
(e.g., the "AuthResult" class currently private in AccessController).
And those cannot be shared among different services; maybe some base
type with common audit-related fields, but not much more than that.
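
A small sketch of what I mean by "some base type": the shared part is
tiny, and each service adds its own fields on top. All names here
(BaseAuditEvent, TableAuditEvent, PathAuditEvent, TypedAuditLogger,
TableAuditPrinter) are hypothetical and exist only for illustration;
none of them is an existing Hadoop or HBase class.

```java
// Hypothetical base type with the fields any service's audit event shares.
abstract class BaseAuditEvent {
    final String user;
    final boolean allowed;

    BaseAuditEvent(String user, boolean allowed) {
        this.user = user;
        this.allowed = allowed;
    }
}

// Hypothetical HBase-style event: table and column family are meaningful.
final class TableAuditEvent extends BaseAuditEvent {
    final String table;
    final String family;

    TableAuditEvent(String user, boolean allowed, String table, String family) {
        super(user, allowed);
        this.table = table;
        this.family = family;
    }
}

// Hypothetical HDFS-style event: only a path is meaningful.
final class PathAuditEvent extends BaseAuditEvent {
    final String path;

    PathAuditEvent(String user, boolean allowed, String path) {
        super(user, allowed);
        this.path = path;
    }
}

// A logger is typed by the event it understands, so nothing is stringified.
interface TypedAuditLogger<E extends BaseAuditEvent> {
    void log(E event);
}

// Example logger that can rely on table / family being present.
final class TableAuditPrinter implements TypedAuditLogger<TableAuditEvent> {
    final StringBuilder out = new StringBuilder();

    @Override
    public void log(TableAuditEvent e) {
        out.append(e.user)
           .append(e.allowed ? " allowed on " : " denied on ")
           .append(e.table).append('/').append(e.family)
           .append('\n');
    }
}
```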

Anyway, I'll clean up my code and post it on Jira instead of
elongating this thread. :-)

-- 
Marcelo

Re: Enhancing AccessController

Posted by Andrew Purtell <ap...@apache.org>.
Hi Marcelo,

> For example, HDFS logs audit messages at INFO level today (IIRC), while HBase does so at TRACE level.

This has been fixed.

> Well, the logging path wouldn't go away; this would just be an
> extension for people who have more complicated needs than just
> writing to log files. We're looking at maybe providing a similar thing
> for HDFS. In the end, we don't want the easy way to be any different
> than it is today, but at the same time have a system where doing more
> complicated things is possible.

This is the right approach, IMHO: build it into Hadoop core and then
we can use it in a manner consistent with how core does it.

Otherwise, thanks a lot for your attention to this area.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)

Re: Enhancing AccessController

Posted by Marcelo Vanzin <va...@cloudera.com>.
Hi Andrew, thanks for the feedback.

On Thu, Jul 12, 2012 at 2:56 PM, Andrew Purtell <ap...@apache.org> wrote:
> I'd argue the entire security side of Hadoop is in need of some
> serious work with regard to audit. For starters, consistent audit logging
> formats: success is logged at INFO level, failure is logged via
> exception.

I won't dispute that. :-) Consistent behavior is a good thing. For
example, HDFS logs audit messages at INFO level today (IIRC), while
HBase does so at TRACE level. For starters, that means HBase audit
logs won't be available by default in most installations.

>> (i) Lack of column family information in audit logs
> Consider filing a JIRA for this as a subtask under
> https://issues.apache.org/jira/browse/HBASE-6096.

Will do.

>> (ii) The access controller does not work if authentication is disabled.
>
> IMHO, doing anything with authentication disabled is out of design
> scope. Reasonable people may disagree.

I don't have a strong opinion about this being a feature of the
AccessController. It can be done easily enough with a custom
coprocessor. The only thing that is kinda sketchy in the custom
coprocessor approach is the definition of "what requests map to what
required permissions", something that is baked into the
AccessController code today.

That's not too much information to replicate, but having it available
in an easier manner would help a lot here.

>> (iii) There's no easy way to customize processing of audit events.
>>
>> Audit events are written to a log appender in a private method in
>> AccessController.java; this means anyone who wants something
>> different, like writing this data to a database, has to go through the
>> logging system to do it.
>
> This is consistent with how all of Hadoop does logging. I don't think
> we should roll our own. That doesn't improve the situation for system
> operators, it means they have to deal with all other parts of Hadoop
> then do something else for HBase specifically. That said,

Well, the logging path wouldn't go away; this would just be an
extension for people who have more complicated needs than just
writing to log files. We're looking at maybe providing a similar thing
for HDFS. In the end, we don't want the easy way to be any different
than it is today, but at the same time have a system where doing more
complicated things is possible.

>> I actually have a working
>> prototype for this approach on top of HBase 0.92; I can post the patch
>> somewhere if anyone is interested.
>
> Suggest putting it up as another subtask under
> https://issues.apache.org/jira/browse/HBASE-6096 so we can review it.

I'll play with it some more and post something.


-- 
Marcelo

Re: Enhancing AccessController

Posted by Andrew Purtell <ap...@apache.org>.
All of the below are good suggestions.

I'd argue the entire security side of Hadoop is in need of some
serious work with regard to audit. For starters, consistent audit logging
formats: success is logged at INFO level, failure is logged via
exception.

> (i) Lack of column family information in audit logs
> I'd actually consider this a bug, more than an enhancement. The
> culprit here is AccessController.permissionGranted().

Consider filing a JIRA for this as a subtask under
https://issues.apache.org/jira/browse/HBASE-6096.

> (ii) The access controller does not work if authentication is disabled.

IMHO, doing anything with authentication disabled is out of design
scope. Reasonable people may disagree.

> (iii) There's no easy way to customize processing of audit events.
>
> Audit events are written to a log appender in a private method in
> AccessController.java; this means anyone who wants something
> different, like writing this data to a database, has to go through the
> logging system to do it.

This is consistent with how all of Hadoop does logging. I don't think
we should roll our own. That doesn't improve the situation for system
operators, it means they have to deal with all other parts of Hadoop
then do something else for HBase specifically. That said,

> I actually have a working
> prototype for this approach on top of HBase 0.92; I can post the patch
> somewhere if anyone is interested.

Suggest putting it up as another subtask under
https://issues.apache.org/jira/browse/HBASE-6096 so we can review it.


On Thu, Jul 12, 2012 at 1:26 PM, Marcelo Vanzin <va...@cloudera.com> wrote:
> Hi all,
>
> I've been doing some work around HBase security and I've identified a
> few enhancements that would make AccessController provide better audit
> information for people interested in that. There are three different
> items which are not necessarily related to each other.
>
>
> (i) Lack of column family information in audit logs
>
> I'd actually consider this a bug, more than an enhancement. The
> culprit here is AccessController.permissionGranted(). If you look at
> that method, it will generate audit events containing column family /
> qualifier information only when the required permission for that
> specific family / qualifier is denied. If permission is granted, you
> get an audit message that contains the table name, but no family /
> qualifier.
>
> I think the correct thing here would be to list the affected family /
> qualifiers when permission is granted, too. In the deny case, it's a
> little bit hazier: do we list all families / qualifiers, the first one
> to have a permission denied, or all families / qualifiers for which
> permission was denied?
>
> This question is also mildly related to (iii) below.
>
>
> (ii) The access controller does not work if authentication is disabled.
>
> This sounds obvious, right? Why would you need an access controller if
> there is no security configured?
>
> I think it would still be useful to collect auditable events in this
> case. The user information will be bogus or non-existent, but at least
> anyone interested will be able to collect access information for their
> service. This could be done by creating another coprocessor, but that
> would require replicating some of the logic in the access controller (to
> decide which permissions to log in the audit event), or refactoring
> that logic into a helper class or something.
>
>
> (iii) There's no easy way to customize processing of audit events.
>
> Audit events are written to a log appender in a private method in
> AccessController.java; this means anyone who wants something
> different, like writing this data to a database, has to go through the
> logging system to do it. This is sub-optimal since it means having to
> parse a log message, and potentially losing information in the
> process.
>
> My preferred approach is to separate audit event creation
> (AccessController.java) from audit event storage (currently also in
> AccessController.java) by means of an "audit logger" interface. A new
> config option can tell the AccessController to instantiate one (or
> more, although I don't see much use in that) "audit logger", and it
> would then call that logger instead of (or in addition to) sending the
> log message to the logging subsystem. I actually have a working
> prototype for this approach on top of HBase 0.92; I can post the patch
> somewhere if anyone is interested.
>
> A different approach would be to make logResult() in AccessController
> protected, so that it can be subclassed, achieving similar
> functionality. But I don't like how this would create tight coupling
> between AccessController and the audit logging code.
>
>
> So I think that covers what I've been looking at; sorry for the
> long-ish e-mail. Feedback always welcome.
>
> --
> Marcelo



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)